Workflow Administration Guide

Informatica® PowerCenter®
(Version 8.6)
PowerCenter Workflow Administration Guide
Version 8.6
June 2008
This software and documentation contain proprietary information of Informatica Corporation and are provided under a license agreement containing restrictions on use and disclosure and are also
protected by copyright law. Reverse engineering of the software is prohibited. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying,
recording or otherwise) without prior consent of Informatica Corporation. This Software may be protected by U.S. and/or international Patents and other Patents Pending.

Use, duplication, or disclosure of the Software by the U.S. Government is subject to the restrictions set forth in the applicable software license agreement and as provided in DFARS 227.7202-1(a) and
227.7202-3(a) (1995), DFARS 252.227-7013(c)(1)(ii) (OCT 1988), FAR 12.212(a) (1995), FAR 52.227-19, or FAR 52.227-14 (ALT III), as applicable.

The information in this product or documentation is subject to change without notice. If you find any problems in this product or documentation, please report them to us in writing.
Informatica, PowerCenter, PowerCenterRT, PowerCenter Connect, PowerCenter Data Analyzer, PowerExchange, PowerMart, Metadata Manager, Informatica Data Quality, Informatica Data Explorer,
Informatica Complex Data Exchange and Informatica On Demand Data Replicator are trademarks or registered trademarks of Informatica Corporation in the United States and in jurisdictions
throughout the world. All other company and product names may be trade names or trademarks of their respective owners.

Portions of this software and/or documentation are subject to copyright held by third parties, including without limitation: Copyright DataDirect Technologies. All rights reserved. Copyright © 2007
Adobe Systems Incorporated. All rights reserved. Copyright © Sun Microsystems. All rights reserved. Copyright © RSA Security Inc. All Rights Reserved. Copyright © Ordinal Technology Corp. All
rights reserved. Copyright © Platon Data Technology GmbH. All rights reserved. Copyright © Melissa Data Corporation. All rights reserved. Copyright © Aandacht c.v. All rights reserved. Copyright
1996-2007 ComponentSource®. All rights reserved. Copyright Genivia, Inc. All rights reserved. Copyright 2007 Isomorphic Software. All rights reserved. Copyright © Meta Integration Technology,
Inc. All rights reserved. Copyright © MySQL AB. All rights reserved. Copyright © Microsoft. All rights reserved. Copyright © Oracle. All rights reserved. Copyright © AKS-Labs. All rights reserved.
Copyright © Quovadx, Inc. All rights reserved. Copyright © SAP. All rights reserved. Copyright 2003, 2007 Instantiations, Inc. All rights reserved.

This product includes software developed by the Apache Software Foundation (http://www.apache.org/), software copyright 2004-2005 Open Symphony (all rights reserved) and other software which is
licensed under the Apache License, Version 2.0 (the “License”). You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0. Unless required by applicable law or agreed to in
writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.

This product includes software which was developed by Mozilla (http://www.mozilla.org/), software copyright The JBoss Group, LLC, all rights reserved; software copyright Red Hat Middleware, LLC,
all rights reserved; software copyright © 1999-2006 by Bruno Lowagie and Paulo Soares and other software which is licensed under the GNU Lesser General Public License Agreement, which may be
found at http://www.gnu.org/licenses/lgpl.html. The materials are provided free of charge by Informatica, “as-is”, without warranty of any kind, either express or implied, including but not limited to the
implied warranties of merchantability and fitness for a particular purpose.

The product includes ACE(TM) and TAO(TM) software copyrighted by Douglas C. Schmidt and his research group at Washington University, University of California, Irvine, and Vanderbilt
University, Copyright (c) 1993-2006, all rights reserved.

This product includes software copyright (c) 2003-2007, Terence Parr. All rights reserved. Your right to use such materials is set forth in the license which may be found at http://www.antlr.org/license.html. The materials are provided free of charge by Informatica, “as-is”, without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose.

This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit (copyright The OpenSSL Project. All Rights Reserved) and redistribution of this software is subject to
terms available at http://www.openssl.org.

This product includes Curl software which is Copyright 1996-2007, Daniel Stenberg, <daniel@haxx.se>. All Rights Reserved. Permissions and limitations regarding this software are subject to terms
available at http://curl.haxx.se/docs/copyright.html. Permission to use, copy, modify, and distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright
notice and this permission notice appear in all copies.

The product includes software copyright 2001-2005 (C) MetaStuff, Ltd. All Rights Reserved. Permissions and limitations regarding this software are subject to terms available at http://www.dom4j.org/license.html.

The product includes software copyright (c) 2004-2007, The Dojo Foundation. All Rights Reserved. Permissions and limitations regarding this software are subject to terms available at http://svn.dojotoolkit.org/dojo/trunk/LICENSE.

This product includes ICU software which is copyright (c) 1995-2003 International Business Machines Corporation and others. All rights reserved. Permissions and limitations regarding this software are subject to terms available at http://www-306.ibm.com/software/globalization/icu/license.jsp.

This product includes software copyright (C) 1996-2006 Per Bothner. All rights reserved. Your right to use such materials is set forth in the license which may be found at http://www.gnu.org/software/kawa/Software-License.html.

This product includes OSSP UUID software which is Copyright (c) 2002 Ralf S. Engelschall, Copyright (c) 2002 The OSSP Project, and Copyright (c) 2002 Cable & Wireless Deutschland. Permissions and
limitations regarding this software are subject to terms available at http://www.opensource.org/licenses/mit-license.php.

This product includes software developed by Boost (http://www.boost.org/). Permissions and limitations regarding this software are subject to terms available at http://www.boost.org/LICENSE_1_0.txt.

This product includes software copyright © 1997-2007 University of Cambridge. Permissions and limitations regarding this software are subject to terms available at http://www.pcre.org/license.txt.

This product includes software copyright (c) 2007 The Eclipse Foundation. All Rights Reserved. Permissions and limitations regarding this software are subject to terms available at http://www.eclipse.org/org/documents/epl-v10.php.

The product includes the zlib library copyright (c) 1995-2005 Jean-loup Gailly and Mark Adler.

This product includes software licensed under the terms at http://www.tcl.tk/software/tcltk/license.html.

This product includes software licensed under the terms at http://www.bosrup.com/web/overlib/?License.

This product includes software licensed under the terms at http://www.stlport.org/doc/license.html.

This product includes software licensed under the Academic Free License (http://www.opensource.org/licenses/afl-3.0.php).

This product includes software developed by the Indiana University Extreme! Lab. For further information please visit http://www.extreme.indiana.edu/.

This Software is protected by U.S. Patent Numbers 6,208,990; 6,044,374; 6,014,670; 6,032,158; 5,794,246; 6,339,775; 6,850,947; 6,895,471 and other U.S. Patents Pending.

DISCLAIMER: Informatica Corporation provides this documentation “as is” without warranty of any kind, either express or implied, including, but not limited to, the implied warranties of non-infringement, merchantability, or use for a particular purpose. Informatica Corporation does not warrant that this software or documentation is error free. The information provided in this software or documentation may include technical inaccuracies or typographical errors. The information in this software and documentation is subject to change at any time without notice.

Part Number: PC-WAG-86000-0001


Table of Contents
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxiii
Informatica Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxiii
Informatica Customer Portal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxiii
Informatica Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxiii
Informatica Web Site . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxiii
Informatica Knowledge Base . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxiii
Informatica Global Customer Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxiv

Chapter 1: Using the Workflow Manager. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Workflow Manager Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Workflow Manager Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Workflow Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Workflow Manager Windows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Setting the Date/Time Display Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Removing an Integration Service from the Workflow Manager . . . . . . . . . . . . . . . . . . . . . 3
Customizing Workflow Manager Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Configuring General Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Configuring Format Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Configuring Miscellaneous Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Enabling Enhanced Security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Configuring Page Setup Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Navigating the Workspace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Customizing Workflow Manager Windows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Using Toolbars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Searching for Items . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Arranging Objects in the Workspace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Zooming the Workspace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Working with Repository Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Viewing Object Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Entering Descriptions for Repository Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Renaming Repository Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Checking In and Out Versioned Repository Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Checking In Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Viewing and Comparing Versioned Repository Objects . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Searching for Versioned Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Copying Repository Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Copying Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Copying Workflow Segments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Comparing Repository Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Steps for Comparing Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Working with Metadata Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

Creating a Metadata Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Editing a Metadata Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Deleting a Metadata Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Keyboard Shortcuts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

Chapter 2: Managing Connection Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Working with Connection Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Connection Object Code Pages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Creating Connection Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Editing a Connection Object . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Deleting a Connection Object . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Configuring Session Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Setting the Connection Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Setting the $Source and $Target Connection Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Overriding Connection Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Overriding Connection Attributes in the Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Overriding Connection Attributes in the Parameter File . . . . . . . . . . . . . . . . . . . . . . . . . 26
Configuring Connection Object Permissions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Relational Database Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Database User Names and Passwords . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Database Connect Strings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Configuring Environment SQL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Database Connection Resilience . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Configuring a Relational Database Connection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Copying a Relational Database Connection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Replacing a Relational Database Connection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
FTP Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
Using an SFTP Connection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Rules and Guidelines for Mainframes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
External Loader Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
HTTP Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Configuring Certificate Files for SSL Authentication . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
PowerChannel Relational Database Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
PowerExchange for JMS Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Connection Properties for JNDI Application Connections . . . . . . . . . . . . . . . . . . . . . . . 42
Connection Properties for JMS Application Connections . . . . . . . . . . . . . . . . . . . . . . . . 42
Creating JNDI and JMS Application Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
PowerExchange for MSMQ Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
PowerExchange for PeopleSoft Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
PowerExchange for Salesforce Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
PowerExchange for SAP NetWeaver Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Configuring an SAP R/3 Application Connection for ABAP Integration . . . . . . . . . . . . . 47
Configuring an FTP Connection for ABAP Integration . . . . . . . . . . . . . . . . . . . . . . . . . 48
Configuring Application Connections for ALE Integration . . . . . . . . . . . . . . . . . . . . . . . 49
Configuring an Application Connection for BAPI/RFC Integration . . . . . . . . . . . . . . . . . 50

PowerExchange for SAP NetWeaver BI Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
Configuring an SAP BW OHS Application Connection . . . . . . . . . . . . . . . . . . . . . . . . . 51
Configuring an SAP BW Application Connection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
PowerExchange for Siebel Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
PowerExchange for TIBCO Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
Connection Properties for TIB/Rendezvous Application Connections . . . . . . . . . . . . . . . 55
Connection Properties for TIB/Adapter SDK Connections . . . . . . . . . . . . . . . . . . . . . . . 56
Configuring TIBCO Application Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
PowerExchange for Web Services Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
PowerExchange for webMethods Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
PowerExchange for WebSphere MQ Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
Test Queue Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

Chapter 3: Working with Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Creating a Workflow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
Creating a Workflow Manually . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
Creating a Workflow Automatically . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
Adding Tasks to Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
Deleting a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
Using the Workflow Wizard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
Step 1. Assign a Name and Integration Service to the Workflow . . . . . . . . . . . . . . . . . . . 67
Step 2. Create a Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
Step 3. Schedule a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Assigning an Integration Service . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Assigning a Service from the Workflow Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Assigning a Service from the Menu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
Working with Links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
Specifying Link Conditions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Viewing Links in a Workflow or Worklet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Deleting Links in a Workflow or Worklet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
Using the Expression Editor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
Adding Comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
Validating Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
Expression Editor Display . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Scheduling a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Unscheduling a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Creating a Reusable Scheduler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Configuring Scheduler Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Editing Scheduler Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
Disabling Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
Scheduling Workflows During Daylight Savings Time . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Validating a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Expression Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Task Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Workflow Properties Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

Running Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
Manually Starting a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Running a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Running a Workflow with Advanced Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Running Part of a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
Running a Task in the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
Suspending the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Configuring Suspension Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Stopping or Aborting the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
How the Integration Service Handles Stop and Abort . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
Stopping or Aborting a Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
Viewing a Workflow Report . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

Chapter 4: Working with Workflow Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
Predefined Workflow Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
Using Predefined Workflow Variables in Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
Evaluating Condition in a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
Evaluating Task Status in a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
Evaluating Previous Task Status in a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
User-Defined Workflow Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
Workflow Variable Start and Current Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
Datatype Default Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
Creating User-Defined Workflow Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

Chapter 5: Working with Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
Creating a Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
Creating a Task in the Task Developer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
Creating a Task in the Workflow or Worklet Designer . . . . . . . . . . . . . . . . . . . . . . . . . . 97
Configuring Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
Reusable Workflow Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
AND or OR Input Links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
Disabling Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
Failing Parent Workflow or Worklet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
Validating Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
Working with the Assignment Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
Working with the Command Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
Using Parameters and Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
Assigning Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
Creating a Command Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
Executing Commands in the Command Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
Working with the Control Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
Working with the Decision Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
Using the Decision Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
Creating a Decision Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

Working with Event Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
Example of User-Defined Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
Working with Event-Raise Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
Working with Event-Wait Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
Working with the Timer Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109

Chapter 6: Working with Worklets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
Suspending Worklets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
Developing a Worklet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
Creating a Reusable Worklet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
Creating a Non-Reusable Worklet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
Configuring Worklet Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
Adding Tasks in Worklets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
Nesting Worklets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
Using Worklet Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
Persistent Worklet Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
Overriding the Initial Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
Rules and Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
Assigning Variable Values in a Worklet. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
Passing Variable Values between Worklets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
Configuring Variable Assignments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
Validating Worklets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116

Chapter 7: Working with Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Creating a Session Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Steps to Create a Session Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Editing a Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Applying Attributes to All Instances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
Understanding Buffer Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
Configuring Automatic Memory Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
Configuring Buffer Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
Configuring Session Cache Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
Configuring Performance Details. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
Using Pre- and Post-Session SQL Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
Guidelines for Entering Pre- and Post-Session SQL Commands . . . . . . . . . . . . . . . . . . . 127
Error Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
Using Pre- and Post-Session Shell Commands. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
Using Parameters and Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
Configuring Non-Reusable Shell Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
Configuring Reusable Shell Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Pre-Session Shell Command Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Validating a Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Validating Multiple Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
Stopping and Aborting a Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130

Threshold Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
Fatal Error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
ABORT Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
User Command . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
Integration Service Handling for Session Failure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
Working with Session Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
Changing the Session Log Name . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
Changing the Target File and Directory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
Changing Source Parameters in a File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
Changing Connection Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
Getting Run-Time Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
Rules and Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
Mapping Parameters and Variables in Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
Assigning Parameter and Variable Values in a Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
Passing Parameter and Variable Values between Sessions . . . . . . . . . . . . . . . . . . . . . . . . 137
Configuring Parameter and Variable Assignments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
Handling High Precision Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
Bigint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
Decimal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140

Chapter 8: Session Configuration Object . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
Configuration Object and Config Object Tab Settings . . . . . . . . . . . . . . . . . . . . . . . . . 141
Advanced Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
Log Options Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
Error Handling Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
Partitioning Options Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
Session on Grid Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
Working with a Session Configuration Object . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146

Chapter 9: Working with Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
Globalization Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
Source Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
Allocating Buffer Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
Partitioning Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
Configuring Sources in a Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
Configuring Readers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
Configuring Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
Configuring Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
Working with Relational Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
Selecting the Source Database Connection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
Defining the Treat Source Rows As Property . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
Overriding the SQL Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
Configuring the Table Owner Name . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
Overriding the Source Table Name . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151

Working with File Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
Configuring Source Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
Configuring Commands for File Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
Configuring Fixed-Width File Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
Configuring Delimited File Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
Configuring Line Sequential Buffer Length . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
Integration Service Handling for File Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
Character Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
Multibyte Character Error Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
Null Character Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
Row Length Handling for Fixed-Width Flat Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
Numeric Data Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
Using a File List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
Creating the File List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
Configuring a Session to Use a File List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
Using FastExport . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
Creating a FastExport Connection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
Changing the Reader . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
Changing the Source Connection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
Overriding the Control File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
Rules and Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162

Chapter 10: Working with Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
Globalization Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
Target Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
Partitioning Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
Configuring Targets in a Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
Configuring Writers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
Configuring Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
Configuring Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
Working with Relational Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
Target Database Connection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
Target Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
Truncating Target Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
Deadlock Retry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
Dropping and Recreating Indexes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
Constraint-Based Loading. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
Bulk Loading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
Table Name Prefix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
Target Table Name . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
Reserved Words . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
Working with Target Connection Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
Working with Active Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
Working with File Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
Configuring Target Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179

Configuring Commands for File Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
Configuring Test Load Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
Configuring Fixed-Width Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
Configuring Delimited Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
Integration Service Handling for File Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
Writing to Fixed-Width Flat Files with Relational Target Definitions . . . . . . . . . . . . . . 183
Writing to Fixed-Width Files with Flat File Target Definitions . . . . . . . . . . . . . . . . . . . 184
Generating Flat File Targets By Transaction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
Writing Multibyte Data to Fixed-Width Flat Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
Null Characters in Fixed-Width Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
Character Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
Writing Metadata to Flat File Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
Working with Heterogeneous Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
Reject Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
Locating Reject Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
Reading Reject Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188

Chapter 11: Real-time Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
Understanding Real-time Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
Messages and Message Queues. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
Web Service Messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
Change Data from PowerExchange Change Data Capture Sources . . . . . . . . . . . . . . . . . 193
Configuring Real-time Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
Terminating Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
Idle Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
Message Count . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
Reader Time Limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
Flush Latency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
Commit Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
Message Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
Prerequisites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
Steps to Enable Message Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
Recovery File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
Message Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
Message Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
Session Recovery Data Flush . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
Recovery Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
PM_REC_STATE Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
Message Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
Message Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
Recovery Queue and Recovery Topic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
Message Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
Message Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
Recovery Ignore List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
Stopping Real-time Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202

Restarting and Recovering Real-time Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
Restarting Real-time Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
Recovering Real-time Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
Restart and Recover Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
Rules and Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
Rules and Guidelines for Real-time Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
Rules and Guidelines for Message Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
Real-time Processing Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
Informatica Real-time Products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206

Chapter 12: Understanding Commit Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
Target-Based Commits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
Source-Based Commits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
Determining the Commit Source . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
Switching from Source-Based to Target-Based Commit . . . . . . . . . . . . . . . . . . . . . . . . . 210
User-Defined Commits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
Rolling Back Transactions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
Understanding Transaction Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
Transformation Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
Understanding Transaction Control Units . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
Rules and Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
Creating Target Files by Transaction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
Setting Commit Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218

Chapter 13: Recovering Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
State of Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
Workflow State of Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
Session State of Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
Target Recovery Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
Recovery Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
Configuring Workflow Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
Recovering Stopped, Aborted, and Terminated Workflows . . . . . . . . . . . . . . . . . . . . . . 227
Recovering Suspended Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
Configuring Task Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
Task Recovery Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
Automatically Recovering Terminated Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
Resuming Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
Working with Repeatable Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
Source Repeatability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
Transformation Repeatability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
Configuring a Mapping for Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
Steps to Recover Workflows and Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
Recovering a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
Recovering a Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235

Recovering a Workflow From a Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
Rules and Guidelines for Session Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
Configuring Recovery to Resume from the Last Checkpoint . . . . . . . . . . . . . . . . . . . . . 235
Unrecoverable Workflows or Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236

Chapter 14: Sending Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
Configuring Email on UNIX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
Configuring Email on Windows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
Step 1. Configure a Microsoft Outlook User . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
Step 2. Configure Logon Network Security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
Step 3. Create Distribution Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
Step 4. Verify the Integration Service Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
Working with Email Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
Using Email Tasks in a Workflow or Worklet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
Email Address Tips and Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
Steps to Create an Email Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
Working with Post-Session Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
Email Variables and Format Tags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
Configuring Post-Session Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
Sample Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
Working with Suspension Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
Using Service Variables to Address Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
Tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245

Chapter 15: Understanding Pipeline Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . 247


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
Partitioning Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
Partition Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
Number of Partitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
Partition Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
Dynamic Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
Configuring Dynamic Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
Rules and Guidelines for Dynamic Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
Using Dynamic Partitioning with Partition Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
Configuring Partition-Level Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
Cache Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
Mapping Variables in Partitioned Pipelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
Partitioning Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
Partition Restrictions for Editing Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
Partition Restrictions for PowerExchange . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
Configuring Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
Configuring a Partition Point . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
Steps for Adding Partition Points to a Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257



Chapter 16: Working with Partition Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
Adding and Deleting Partition Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
Rules and Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
Partitioning Relational Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
Entering an SQL Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
Entering a Filter Condition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
Partitioning File Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
Rules and Guidelines for Partitioning File Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
Using One Thread to Read a File Source . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
Using Multiple Threads to Read a File Source . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
Configuring for File Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
Partitioning Relational Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
Database Compatibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268
Partitioning File Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268
Configuring Connection Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268
Configuring File Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
Partitioning Custom Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
Working with Multiple Partitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
Creating Partition Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
Working with Threads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
Partitioning Joiner Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
Partitioning Sorted Joiner Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
Using Sorted Flat Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
Using Sorted Relational Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
Using Sorter Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
Optimizing Sorted Joiner Transformations with Partitions . . . . . . . . . . . . . . . . . . . . . . 277
Partitioning Lookup Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
Cache Partitioning Lookup Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
Partitioning Pipeline Lookup Transformation Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . 278
Partitioning Sequence Generator Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
Partitioning Sorter Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
Configuring Sorter Transformation Work Directories . . . . . . . . . . . . . . . . . . . . . . . . . . 280
Restrictions for Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280

Chapter 17: Working with Partition Types. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
Setting Partition Types in the Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282
Setting Partition Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283
Database Partitioning Partition Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
Partitioning Database Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
Target Database Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
Rules and Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
Hash Auto-Keys Partition Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
Hash User Keys Partition Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287



Key Range Partition Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
Adding a Partition Key . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
Adding Key Ranges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290
Pass-Through Partition Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
Round-Robin Partition Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292

Chapter 18: Using Pushdown Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
Running Pushdown Optimization Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294
Running Source-Side Pushdown Optimization Sessions . . . . . . . . . . . . . . . . . . . . . . . . 294
Running Target-Side Pushdown Optimization Sessions . . . . . . . . . . . . . . . . . . . . . . . . . 294
Running Full Pushdown Optimization Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
Working with Databases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
Using ODBC Drivers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
Working with Dates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
Working with Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298
Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298
Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
Error Handling, Logging, and Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
Error Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
Logging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
Working with Slowly Changing Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304
Working with Sequences and Views . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304
Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305
Views . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305
Troubleshooting Orphaned Sequences and Views . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
Using the $$PushdownConfig Mapping Parameter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
Configuring Sessions for Pushdown Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
Pushdown Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310
Target Load Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312
Viewing Pushdown Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312

Chapter 19: Pushdown Optimization and Transformations . . . . . . . . . . . . . . . . . . . 315


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
Rules and Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
Aggregator Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
Expression Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
Filter Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
Joiner Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 318
Lookup Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 318
Unconnected Lookup Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
Lookup Transformation with an SQL Override . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
Router Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 320



Sequence Generator Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 320
Sorter Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
Source Qualifier Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
Source Qualifier Transformation with an SQL Override . . . . . . . . . . . . . . . . . . . . . . . . 322
Target . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 322
Union Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
Update Strategy Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324

Chapter 20: Running Concurrent Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
Configuring Unique Workflow Instances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326
Recovering Workflow Instances by Instance Name . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326
Rules and Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326
Configuring Concurrent Workflows of the Same Name . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
Running Concurrent Web Service Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
Configuring Workflow Instances of the Same Name . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
Recovering Workflow Instances of the Same Name . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
Rules and Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
Using Parameters and Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
Accessing the Run Instance Name or Run ID . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
Steps to Configure Concurrent Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
Starting and Stopping Concurrent Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
Starting Workflow Instances from Workflow Designer . . . . . . . . . . . . . . . . . . . . . . . . . 329
Starting One Concurrent Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330
Starting Concurrent Workflows from the Command Line . . . . . . . . . . . . . . . . . . . . . . . 330
Stopping or Aborting Concurrent Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330
Monitoring Concurrent Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
Viewing Session and Workflow Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
Log Files for Unique Workflow Instances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
Log Files for Workflow Instances of the Same Name . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
Rules and Guidelines for Concurrent Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332

Chapter 21: Monitoring Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335
Using the Workflow Monitor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336
Opening the Workflow Monitor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
Connecting to a Repository . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
Connecting to an Integration Service . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
Filtering Tasks and Integration Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
Opening and Closing Folders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
Viewing Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
Viewing Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
Customizing Workflow Monitor Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
Configuring General Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
Configuring Gantt Chart View Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
Configuring Task View Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341

Configuring Advanced Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
Using Workflow Monitor Toolbars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
Working with Tasks and Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
Opening and Displaying Previous Workflow Runs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
Running a Task, Workflow, or Worklet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343
Recovering a Workflow or Worklet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
Restarting a Task or Workflow Without Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
Stopping or Aborting Tasks and Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
Scheduling and Unscheduling Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345
Viewing Session Logs and Workflow Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345
Viewing History Names . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346
Workflow and Task Status . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346
Using the Gantt Chart View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
Listing Tasks and Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
Navigating the Time Window in Gantt Chart View . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
Zooming the Gantt Chart View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
Performing a Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
Opening All Folders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349
Using the Task View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349
Filtering in Task View. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349
Opening All Folders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350
Tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350

Chapter 22: Viewing Workflow Monitor Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353
Viewing Repository Service Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354
Viewing Integration Service Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354
Viewing Integration Service Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354
Viewing Integration Service Monitor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
Viewing Repository Folder Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
Viewing Workflow Run Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356
Viewing Workflow Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356
Viewing Task Progress Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
Viewing Session Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
Viewing Worklet Run Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
Viewing Worklet Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
Viewing Command Task Run Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358
Viewing Session Task Run Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
Viewing Failure Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
Viewing Session Task Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
Viewing Source and Target Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360
Viewing Partition Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361
Viewing Performance Details. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361
Understanding Performance Counters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 362



Chapter 23: Running Workflows and Sessions on a Grid . . . . . . . . . . . . . . . . . . . . 365
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365
Running Workflows on a Grid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
Running Sessions on a Grid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
Working with Partition Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367
Forming Partition Groups Without Resource Requirements . . . . . . . . . . . . . . . . . . . . . 367
Forming Partition Groups With Resource Requirements . . . . . . . . . . . . . . . . . . . . . . . . 368
Rules and Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368
Working with Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368
Grid Connectivity and Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369
Configuring a Workflow or Session to Run on a Grid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369
Rules and Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 370

Chapter 24: Working with the Load Balancer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371
Assigning Service Levels to Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371
Assigning Resources to Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372

Chapter 25: Session and Workflow Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 375


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 375
Log Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
Log Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
Message Severity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
Writing Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
Passing Session Events to an External Library . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 377
Log Events Window . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 377
Searching for Log Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 378
Working with Log Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 378
Writing to Log Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379
Archiving Log Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379
Configuring Workflow Log File Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 380
Configuring Session Log File Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 381
Workflow Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382
Workflow Log Events Window . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382
Workflow Log Sample . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382
Session Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383
Log Events Window . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383
Session Log File Sample . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383
Setting Tracing Levels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 384
Viewing Log Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 384

Chapter 26: Working with the Session Log Interface . . . . . . . . . . . . . . . . . . . . . . . . 387


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387
Implementing the Session Log Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387
The Integration Service and the Session Log Interface . . . . . . . . . . . . . . . . . . . . . . . . . 387



Rules and Guidelines for Implementing the Session Log Interface . . . . . . . . . . . . . . . . . 388
Functions in the Session Log Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 388
INFA_InitSessionLog . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 389
INFA_OutputSessionLogMsg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 389
INFA_OutputSessionLogFatalMsg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 390
INFA_EndSessionLog . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 390
INFA_AbnormalSessionTermination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 391
Session Log Interface Example. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 391
Building the External Session Log Library . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 391
Using the External Session Log Library . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 392

Chapter 27: Row Error Logging. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 393


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 393
Error Log Code Pages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 394
Understanding the Error Log Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 394
PMERR_DATA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 394
PMERR_MSG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 396
PMERR_SESS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 397
PMERR_TRANS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 398
Understanding the Error Log File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 399
Configuring Error Log Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 401

Chapter 28: Parameter Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403
Parameter and Variable Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 404
Where to Use Parameters and Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405
Parameter File Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 410
Parameter File Sections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 410
Comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 411
Null Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412
Sample Parameter File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412
Configuring the Parameter File Name and Location . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412
Using a Parameter File with Workflows or Sessions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413
Using a Parameter File with pmcmd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414
Parameter File Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415
Guidelines for Creating Parameter Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415
Troubleshooting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 416
Tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 417

Chapter 29: External Loading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 419


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 419
Before You Begin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 419
External Loader Behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 420
Loading Data to a Named Pipe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 420
Staging Data to a Flat File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 420



Partitioning Sessions with External Loaders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 421
Loading to IBM DB2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 421
Rules and Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 422
Setting Operation Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 422
Configuring Authorities, Privileges, and Permissions . . . . . . . . . . . . . . . . . . . . . . . . . . 422
Configuring IBM DB2 EE External Loader Attributes . . . . . . . . . . . . . . . . . . . . . . . . . 423
Configuring IBM DB2 EEE External Loader Attributes . . . . . . . . . . . . . . . . . . . . . . . . 424
Loading to Oracle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 425
Rules and Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 425
Loading Multibyte Data to Oracle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 426
Configuring Oracle External Loader Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 426
Loading to Sybase IQ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 427
Rules and Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 427
Loading Multibyte Data to Sybase IQ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 427
Configuring Sybase IQ External Loader Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 428
Loading to Teradata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 429
Rules and Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 429
Overriding the Control File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 429
Creating User Variables in the Control File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 430
Configuring Teradata MultiLoad External Loader Attributes . . . . . . . . . . . . . . . . . . . . . 431
Configuring Teradata TPump External Loader Attributes . . . . . . . . . . . . . . . . . . . . . . . 433
Configuring Teradata FastLoad External Loader Attributes . . . . . . . . . . . . . . . . . . . . . . 435
Configuring Teradata Warehouse Builder Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . 437
Configuring External Loading in a Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439
Configuring a Session to Write to a File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 440
Configuring File Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 440
Selecting an External Loader Connection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 441
Troubleshooting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 441

Chapter 30: Using FTP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 443


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 443
Rules and Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 443
Integration Service Behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 444
Using FTP with Source Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 444
Using FTP with Target Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 444
Configuring FTP in a Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 445
Configuring SFTP in a Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 445
Selecting an FTP Connection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 445
Configuring Source File Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446
Configuring Target File Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447

Chapter 31: Using Incremental Aggregation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 449


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 449
Integration Service Processing for Incremental Aggregation . . . . . . . . . . . . . . . . . . . . . . . . . 450
Reinitializing the Aggregate Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
Moving or Deleting the Aggregate Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451



Finding Index and Data Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451
Partitioning Guidelines with Incremental Aggregation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451
Preparing for Incremental Aggregation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452
Configuring the Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452
Configuring the Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452

Chapter 32: Session Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455


Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
Cache Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 456
Cache Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457
Naming Convention for Cache Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457
Cache File Directory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458
Configuring the Cache Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 459
Calculating the Cache Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 459
Using Auto Memory Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 460
Configuring a Numeric Cache Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 460
Steps to Configure the Cache Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 461
Cache Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 461
Configuring the Cache Size for Cache Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . 462
Aggregator Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 462
Incremental Aggregation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463
Configuring the Cache Sizes for an Aggregator Transformation . . . . . . . . . . . . . . . . . . . 463
Troubleshooting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463
Joiner Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 464
1:n Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 465
n:n Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 465
Configuring the Cache Sizes for a Joiner Transformation . . . . . . . . . . . . . . . . . . . . . . . 465
Troubleshooting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 466
Lookup Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467
Sharing Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467
Configuring the Cache Sizes for a Lookup Transformation . . . . . . . . . . . . . . . . . . . . . . 468
Rank Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 468
Configuring the Cache Sizes for a Rank Transformation . . . . . . . . . . . . . . . . . . . . . . . . 469
Troubleshooting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 469
Sorter Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 469
Configuring the Cache Size for a Sorter Transformation . . . . . . . . . . . . . . . . . . . . . . . . 470
XML Target Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 470
Configuring the Cache Size for an XML Target . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 470
Optimizing the Cache Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 471

Appendix A: Session Properties Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473


General Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 474
Properties Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 475
General Options Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 475
Performance Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 478
Mapping Tab (Transformations View) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 479

Connections Node . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 480
Sources Node . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 480
Targets Node . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 488
Mapping Tab (Partitions View) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 496
Partition Properties Node . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 496
KeyRange Node . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 497
HashKeys Node . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 497
Partition Points Node . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 497
Non-Partition Points Node . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 500
Components Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 501
Reusable Pre- or Post-Session Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 502
Non-Reusable Pre- or Post-Session Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503
Reusable Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 504
Non-Reusable Email . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 505
Email Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 505
Metadata Extensions Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 506

Appendix B: Workflow Properties Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 509


General Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 510
Properties Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 511
Scheduler Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 513
Edit Scheduler Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 514
Variables Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 516
Events Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 517
Metadata Extensions Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 518

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 519

Preface
The PowerCenter Workflow Administration Guide is written for developers and administrators who are
responsible for creating workflows and sessions, and running workflows. This guide assumes you have
knowledge of your operating systems, relational database concepts, and the database engines, flat files, or
mainframe systems in your environment. This guide also assumes you are familiar with the interface
requirements for your supporting applications.

Informatica Resources
Informatica Customer Portal
As an Informatica customer, you can access the Informatica Customer Portal site at http://my.informatica.com.
The site contains product information, user group information, newsletters, access to the Informatica customer
support case management system (ATLAS), the Informatica Knowledge Base, Informatica Documentation
Center, and access to the Informatica user community.

Informatica Documentation
The Informatica Documentation team makes every effort to create accurate, usable documentation. If you have
questions, comments, or ideas about this documentation, contact the Informatica Documentation team
through email at infa_documentation@informatica.com. We will use your feedback to improve our
documentation. Let us know if we can contact you regarding your comments.

Informatica Web Site


You can access the Informatica corporate web site at http://www.informatica.com. The site contains
information about Informatica, its background, upcoming events, and sales offices. You will also find product
and partner information. The services area of the site includes important information about technical support,
training and education, and implementation services.

Informatica Knowledge Base


As an Informatica customer, you can access the Informatica Knowledge Base at http://my.informatica.com. Use
the Knowledge Base to search for documented solutions to known technical issues about Informatica products.
You can also find answers to frequently asked questions, technical white papers, and technical tips.

Informatica Global Customer Support
There are many ways to access Informatica Global Customer Support. You can contact a Customer Support
Center by telephone, email, or the WebSupport Service.
Use the following email addresses to contact Informatica Global Customer Support:
♦ support@informatica.com for technical inquiries
♦ support_admin@informatica.com for general customer service requests
WebSupport requires a user name and password. You can request a user name and password at
http://my.informatica.com.
Use the following telephone numbers to contact Informatica Global Customer Support:

North America / South America
  Informatica Corporation Headquarters
  100 Cardinal Way
  Redwood City, California 94063
  United States
  Toll Free: +1 877 463 2435
  Standard Rate:
  Brazil: +55 11 3523 7761
  Mexico: +52 55 1168 9763
  United States: +1 650 385 5800

Europe / Middle East / Africa
  Informatica Software Ltd.
  6 Waltham Park
  Waltham Road, White Waltham
  Maidenhead, Berkshire SL6 3TN
  United Kingdom
  Toll Free: 00 800 4632 4357
  Standard Rate:
  Belgium: +32 15 281 702
  France: +33 1 41 38 92 26
  Germany: +49 1805 702 702
  Netherlands: +31 306 022 797
  United Kingdom: +44 1628 511 445

Asia / Australia
  Informatica Business Solutions Pvt. Ltd.
  Diamond District, Tower B, 3rd Floor
  150 Airport Road
  Bangalore 560 008
  India
  Toll Free:
  Australia: 1 800 151 830
  Singapore: 001 800 4632 4357
  Standard Rate:
  India: +91 80 4112 5738

CHAPTER 1

Using the Workflow Manager


This chapter includes the following topics:
♦ Overview, 1
♦ Customizing Workflow Manager Options, 4
♦ Navigating the Workspace, 7
♦ Working with Repository Objects, 9
♦ Checking In and Out Versioned Repository Objects, 10
♦ Searching for Versioned Objects, 11
♦ Copying Repository Objects, 12
♦ Comparing Repository Objects, 13
♦ Working with Metadata Extensions, 15
♦ Keyboard Shortcuts, 17

Overview
In the Workflow Manager, you define a set of instructions called a workflow to execute mappings you build in
the Designer. Generally, a workflow contains a session and any other tasks you want to perform when you
run the session. Tasks can include a session, email notification, or scheduling information. You connect each task
with links in the workflow.
You can also create a worklet in the Workflow Manager. A worklet is an object that groups a set of tasks. A
worklet is similar to a workflow, but without scheduling information. You can run a batch of worklets inside a
workflow.
After you create a workflow, you run the workflow in the Workflow Manager and monitor it in the Workflow
Monitor.
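You can also start a workflow outside the PowerCenter Client with the pmcmd command line program. The
following command is a minimal sketch only; the domain, Integration Service, user, folder, and workflow names
are placeholder values that you would replace with names from your environment:

   pmcmd startworkflow -sv IS_Dev -d Domain_Dev -u Administrator -p password -f DevFolder -wait wf_LoadOrders

With the -wait option, pmcmd returns after the workflow completes, so a script or external scheduler can act on
the return code.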

Workflow Manager Options


You can customize the Workflow Manager default options to control the behavior and look of the Workflow
Manager tools. You can also configure options, such as grouping sessions or docking and undocking windows.

RELATED TOPICS:
♦ “Customizing Workflow Manager Options” on page 4

Workflow Manager Tools
To create a workflow, you first create tasks such as a session, which contains the mapping you build in the
Designer. You then connect tasks with conditional links to specify the order of execution for the tasks you
created. The Workflow Manager consists of three tools to help you develop a workflow:
♦ Task Developer. Use the Task Developer to create tasks you want to run in the workflow.
♦ Workflow Designer. Use the Workflow Designer to create a workflow by connecting tasks with links. You
can also create tasks in the Workflow Designer as you develop the workflow.
♦ Worklet Designer. Use the Worklet Designer to create a worklet.
Figure 1-1 shows what a workflow might look like if you want to run a session, perform a shell command after
the session completes, and then stop the workflow:

Figure 1-1. Sample Workflow
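To make the shell command step in this sample concrete, the Command task might run a short post-session
command such as the one below. The file names and paths are hypothetical placeholders, not values defined in
this guide:

   cp /data/tgtfiles/orders.out /data/archive/orders_`date +%Y%m%d`.out

A command like this archives the target file that the session produced before the Control task stops the
workflow.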

Workflow Tasks
You can create the following types of tasks in the Workflow Manager:
♦ Assignment. Assigns a value to a workflow variable. For more information, see “Working with the
Assignment Task” on page 100.
♦ Command. Specifies a shell command to run during the workflow. For more information, see “Working
with the Command Task” on page 100.
♦ Control. Stops or aborts the workflow. For more information about the Control task, see “Stopping or
Aborting the Workflow” on page 82.
♦ Decision. Specifies a condition to evaluate. For more information, see “Working with the Decision Task” on
page 103.
♦ Email. Sends email during the workflow. For more information about the Email task, see “Sending Email”
on page 237.
♦ Event-Raise. Notifies the Event-Wait task that an event has occurred. For more information, see “Working
with Event Tasks” on page 105.
♦ Event-Wait. Waits for an event to occur before executing the next task. For more information, see
“Working with Event Tasks” on page 105.
♦ Session. Runs a mapping you create in the Designer. For more information about the Session task, see
“Working with Sessions” on page 119.
♦ Timer. Waits for a timed event to trigger. For more information, see “Scheduling a Workflow” on page 73.

Workflow Manager Windows


The Workflow Manager displays the following windows to help you create and organize workflows:
♦ Navigator. You can connect to and work in multiple repositories and folders. In the Navigator, the Workflow
Manager displays a red icon over invalid objects.
♦ Workspace. You can create, edit, and view tasks, workflows, and worklets.
♦ Output. Contains tabs to display different types of output messages. The Output window contains the
following tabs:
− Save. Displays messages when you save a workflow, worklet, or task. The Save tab displays a validation
summary when you save a workflow or a worklet.
− Fetch Log. Displays messages when the Workflow Manager fetches objects from the repository.



− Validate. Displays messages when you validate a workflow, worklet, or task.
− Copy. Displays messages when you copy repository objects.
− Server. Displays messages from the Integration Service.
− Notifications. Displays messages from the Repository Service.
♦ Overview. An optional window that lets you easily view large workflows in the workspace. Outlines the
visible area in the workspace and highlights selected objects in color. Click View > Overview Window to
display this window.
You can view a list of open windows and switch from one window to another in the Workflow Manager. To
view the list of open windows, click Window > Windows.
The Workflow Manager also displays a status bar that shows the status of the operation you perform.
Figure 1-2 shows the Workflow Manager windows:

Figure 1-2. Workflow Manager Windows (the figure labels the Status Bar, Output window, Navigator, Workspace, and Overview window)

Setting the Date/Time Display Format


The Workflow Manager displays the date and time formats configured in the Windows Control Panel of the
PowerCenter Client machine. To modify the date and time formats, display the Control Panel and open
Regional Settings. Set the date and time formats on the Date and Time tabs.
Note: For the Timer task and schedule settings, the Workflow Manager displays date in short date format and
the time in 24-hour format (HH:mm).

Removing an Integration Service from the Workflow Manager


You can remove an Integration Service from the Navigator. Remove an Integration Service if the Integration
Service no longer exists or if you no longer use that Integration Service. When you remove an Integration
Service with associated workflows, assign another Integration Service to the workflows.

To delete an Integration Service:

1. In the Navigator, right-click on the Integration Service you want to delete.


2. Click Delete.

Customizing Workflow Manager Options
You can customize the Workflow Manager default options to control the behavior and look of the Workflow
Manager tools.
To configure Workflow Manager options, click Tools > Options. You can configure the following options:
♦ General. You can configure workspace options, display options, and other general options on the General
tab. For more information about the General tab, see “Configuring General Options” on page 4.
♦ Format. You can configure font, color, and other format options on the Format tab. For more information
about the Format tab, see “Configuring Format Options” on page 5.
♦ Miscellaneous. You can configure Copy Wizard and Versioning options on the Miscellaneous tab. For more
information about the Miscellaneous tab, see “Configuring Miscellaneous Options” on page 6.
♦ Advanced. You can configure enhanced security for connection objects in the Advanced tab. For more
information about the Advanced tab, see “Enabling Enhanced Security” on page 7.
You can also configure the workspace layout for printing. For more information, see “Configuring Page Setup
Options” on page 7.

Configuring General Options


General options control tool behavior, such as whether or not a tool retains its view when you close it, how the
Overview window behaves, and where the Workflow Manager stores workspace files.
Table 1-1 describes general options you can configure in the Workflow Manager:

Table 1-1. Workflow Manager General Options

♦ Reload Tasks/Workflows When Opening a Folder. Reloads the last view of a tool when you open it. For example, if you have a workflow open when you disconnect from a repository, select this option so that the same workflow appears the next time you open the folder and Workflow Designer. Default is enabled.
♦ Ask Whether to Reload the Tasks/Workflows. Appears when you select Reload Tasks/Workflows When Opening a Folder. Select this option if you want the Workflow Manager to prompt you to reload tasks, workflows, and worklets each time you open a folder. Default is disabled.
♦ Delay Overview Window Pans. By default, when you drag the focus of the Overview window, the focus of the workbook moves concurrently. When you select this option, the focus of the workspace does not change until you release the mouse button. Default is disabled.
♦ Arrange Workflows/Worklets Vertically By Default. Arranges tasks in workflows vertically by default. Default is disabled.
♦ Allow Invoking In-Place Editing Using the Mouse. By default, you can press F2 to edit objects directly in the workspace instead of opening the Edit Task dialog box. Select this option so you can also click the object name in the workspace to edit the object. Default is disabled.
♦ Open Editor When a Task Is Created. Opens the Edit Task dialog box when you create a task. By default, the Workflow Manager creates the task in the workspace. If you do not enable this option, double-click the task to open the Edit Task dialog box. Default is disabled.
♦ Workspace File Directory. Directory for workspace files created by the Workflow Manager. Workspace files maintain the last task or workflow you saved. This directory should be local to the PowerCenter Client to prevent file corruption or overwrites by multiple users. By default, the Workflow Manager creates files in the PowerCenter Client installation directory.
♦ Display Tool Names on Views. Displays the name of the tool in the upper left corner of the workspace or workbook. Default is enabled.
♦ Always Show the Full Name of Tasks. Shows the full name of a task when you select it. By default, the Workflow Manager abbreviates the task name in the workspace. Default is disabled.



♦ Show the Expression on a Link. Shows the link condition in the workspace. If you do not enable this option, the Workflow Manager abbreviates the link condition in the workspace. Default is enabled.
♦ Show Background in Partition Editor and Pushdown Optimization. Displays background color for objects in iconic view. Disable this option to remove background color from objects in iconic view. Default is disabled.
♦ Launch Workflow Monitor when Workflow Is Started. Launches the Workflow Monitor when you start a workflow or a task. Default is enabled.
♦ Receive Notifications from Repository Service. You can receive notification messages in the Workflow Manager and view them in the Output window. Notification messages include information about objects that another user creates, modifies, or deletes. You receive notifications about sessions, tasks, workflows, and worklets. The Repository Service notifies you of the changes so you know that objects you are working with may be out of date. For the Workflow Manager to receive a notification, the folder containing the object must be open in the Navigator, and the object must be open in the workspace. You also receive user-created notifications posted by the user who manages the Repository Service. Default is enabled.
♦ Reset All. Resets all general options to the default values.

Configuring Format Options


Format options control workspace colors and fonts. You can configure format options for each Workflow
Manager tool.
Table 1-2 describes the format options for the Workflow Manager:

Table 1-2. Workflow Manager Format Options

♦ Current Theme. Currently selected color theme for the Workflow Manager tools. This field is display-only.
♦ Select Theme. Apply a color theme to the Workflow Manager tools. For more information, see “Using Color Themes” on page 5.
♦ Tools. Workflow Manager tool that you want to configure. When you select a tool, the configurable workspace elements appear in the list below the Tools menu.
♦ Color. Color of the selected workspace element.
♦ Orthogonal Links. Link lines run horizontally and vertically but not diagonally in the workspace.
♦ Solid Lines for Links. Links appear as solid lines. By default, the Workflow Manager displays orthogonal links as dotted lines.
♦ Categories. Component of the Workflow Manager that you want to customize.
♦ Change. Change the display font and language script for the selected category.
♦ Current Font. Font of the Workflow Manager component that is currently selected in the Categories menu. This field is display-only.
♦ Reset All. Resets all format options to the default values.

Using Color Themes


Use color themes to quickly select the colors of the workspace elements in all the Workflow Manager tools. You
can choose from the following standard color themes:
♦ Informatica Classic. This is the standard color scheme for workspace elements. The workspace background
is gray, the workspace text is white, and the link colors are blue, red, blue-gray, dark green, and black.



♦ High Contrast Black. Bright link colors stand out against the black background.The workspace background
is black, the workspace text is white, and the link colors are purple, red, light blue, bright green, and white.
♦ Colored Backgrounds. Each Designer tool has a different pastel-colored workspace background. The
workspace text is black, and the link colors are the same as in the Informatica Classic color theme.
Note: You can also apply a color theme to the Designer tools.

After you select a color theme for the Workflow Manager tools, you can modify the color of individual
workspace elements.

To select a color theme for a Workflow Manager tool:

1. In the Workflow Manager, click Tools > Options.


2. Click the Format tab.
3. In the Color Themes section of the Format tab, click Select Theme.
The Theme Selector dialog box appears.
4. Select a theme from the Theme menu.
5. Click the tabs in the Preview section to see how the workspace elements appear in each of the Workflow
Manager tools.
6. Click OK to apply the color theme.
Note: After you select the workspace colors using a color theme, you can change the color of individual
workspace elements. Changes that you make to individual elements do not appear in the Preview section of the
Theme Selector dialog box.

RELATED TOPICS:
♦ “Configuring Format Options” on page 5

Configuring Miscellaneous Options


Miscellaneous options control the display settings and available functions of the Copy Wizard, versioning, and
target load options. Target options control how the Integration Service loads targets. To configure the Copy
Wizard, Versioning, and Target Load Type options, click Tools > Options and select the Miscellaneous tab.
Table 1-3 describes the miscellaneous options:

Table 1-3. Workflow Manager Miscellaneous Options

♦ Validate Copied Objects. Validates the copied object. Enabled by default.
♦ Generate Unique Name When Resolved to “Rename”. Generates unique names for copied objects if you select the Rename option. For example, if the workflow wf_Sales has the same name as a workflow in the destination folder, the Rename option generates the unique name wf_Sales1. Default is enabled.
♦ Get Default Object When Resolved to “Choose”. Uses the object with the same name in the destination folder if you select the Choose option. Default is disabled.
♦ Show Check Out Image in Navigator. Displays the Check Out icon when an object has been checked out. Default is enabled.
♦ Allow Delete Without Checkout. You can delete versioned repository objects without first checking them out. You cannot, however, delete an object that another user has checked out. When you select this option, the Repository Service checks out an object to you when you delete it. Default is disabled.
♦ Check In Deleted Objects Automatically After They Are Saved. Checks in deleted objects after you save the changes to the repository. When you clear this option, the deleted object remains checked out and you must check it in from the results view. Default is disabled.



♦ Target Load Type. Sets the default load type for sessions. You can choose normal or bulk loading. Any change you make takes effect after you restart the Workflow Manager. You can override this setting in the session properties. Default is Bulk. For more information about normal and bulk loading, see Table A-12 on page 491.
♦ Reset All. Resets all Miscellaneous options to the default values.

Enabling Enhanced Security


The Workflow Manager has an enhanced security option to specify a default set of permissions for connection
objects.
When you enable enhanced security, the Workflow Manager assigns default permissions on connection objects
for users, groups, and others.
When you disable enhanced security, the Workflow Manager assigns read, write, and execute permissions
to all users that would otherwise receive permissions of the default group.
If you delete the owner from the repository, the Workflow Manager assigns ownership of the object to the
administrator.

To enable enhanced security for connection objects:

1. Click Tools > Options.


2. Click the Advanced Tab.
3. Select Enable Enhanced Security.
4. Click OK.

Configuring Page Setup Options


Page Setup options allow you to control the layout of the workspace you are printing.
Table 1-4 describes the page setup options:

Table 1-4. Workflow Manager Page Setup Options

♦ Header and Footer. Displays the window title, page number, number of pages, current date, and current time in the printout of the workspace. You can also indicate the alignment of the header and footer.
♦ Options. Adds a frame or corner to the page and shows the full name of tasks. You can also choose to print in color or black and white.

Navigating the Workspace


The Workflow Manager lets you perform the following operations to navigate the workspace:
♦ Customize windows.
♦ Customize toolbars.
♦ Search for tasks, links, events and variables.



♦ Arrange objects in the workspace.
♦ Zoom and pan the workspace.

Customizing Workflow Manager Windows


You can customize the following options for the Workflow Manager windows:
♦ Display a window. From the menu, select View. Then select the window you want to open.
♦ Close a window. Click the small x in the upper right corner of the window.
♦ Dock or undock a window. Double-click the title bar or drag the title bar toward or away from the
workspace.

Using Toolbars
The Workflow Manager can display the following toolbars to help you select tools and perform operations
quickly:
♦ Standard. Contains buttons to connect to and disconnect from repositories and folders, toggle windows,
zoom in and out, pan the workspace, and find objects.
♦ Connections. Contains buttons to create and edit connections, and assign Integration Services.
♦ Repository. Contains buttons to connect to and disconnect from repositories and folders, export and import
objects, save changes, and print the workspace.
♦ View. Contains buttons to customize toolbars, toggle the status bar and windows, toggle full-screen view,
create a new workbook, and view the properties of objects.
♦ Layout. Contains buttons to arrange and restore objects in the workspace, find objects, zoom in and out,
and pan the workspace.
♦ Tasks. Contains buttons to create tasks.
♦ Workflow. Contains buttons to edit workflow properties.
♦ Run. Contains buttons to schedule the workflow, start the workflow, or start a task.
♦ Versioning. Contains buttons to check in objects, undo checkouts, compare versions, list checked-out
objects, and list repository queries.
♦ Tools. Contains buttons to connect to the other PowerCenter Client applications. When you use a Tools
button to open another PowerCenter Client application, PowerCenter uses the same repository connection
to connect to the repository and opens the same folders.
You can perform the following operations with toolbars:
♦ Display or hide a toolbar.
♦ Create a new toolbar.
♦ Add or remove buttons.

Searching for Items


The Workflow Manager includes search features to help you find tasks, links, variables, events in the workspace,
and text in the Output window. You can search for items in any Workflow Manager tool or Output window.
There are two ways to search for items in the workspace:
♦ Find in Workspace. Searches multiple items at once and returns a list of all task names, link conditions,
event names, or variable names that contain the search string.
♦ Find Next. Searches through items one at a time and highlights the first task, link, event, variable, or text
string that contains the search string. If you repeat the search, the Workflow Manager highlights the next
item that contains the search string.



To find a task, link, event, or variable in the workspace:

1. In any Workflow Manager tool, click the Find in Workspace toolbar button or click Edit > Find in
Workspace.
The Find in Workspace dialog box appears.
2. Choose search for tasks, links, variables, or events.
3. Enter a search string, or select a string from the list.
The Workflow Manager saves the last 10 search strings in the list.
4. Specify whether or not to match whole words and whether or not to perform a case-sensitive search.
5. Click Find Now.
The Workflow Manager lists task names, link conditions, event names, or variable names that match the
search string at the bottom of the dialog box.
6. Click Close.

To find a single object:

1. To search for a task, link, event, or variable, open the appropriate Workflow Manager tool and click a task,
link, or event. To search for text in the Output window, click the appropriate tab in the Output window.
2. Enter a search string in the Find field on the standard toolbar.
The search is not case sensitive.
3. Click Edit > Find Next, click the Find Next button on the toolbar, or press Enter or F3 to search for the
string.
The Workflow Manager highlights the first task name, link condition, event name, or variable name that
contains the search string, or the first string in the Output window that matches the search string.
4. To search for the next item, press Enter or F3 again.
The Workflow Manager alerts you when you have searched through all items in the workspace or Output
window before it highlights the same objects a second time.

Arranging Objects in the Workspace


The Workflow Manager can arrange objects in the workspace horizontally or vertically. In the Task Developer,
you can also arrange tasks evenly in the workspace by choosing Tile. To arrange objects in the workspace, click
Layout > Arrange and choose Horizontal, Vertical, or Tile. To display the links as horizontal and vertical lines,
click Layout > Orthogonal Links.

Zooming the Workspace


You can zoom and pan the workspace to adjust the view. Use the toolbar or Layout menu options to set zoom
levels. To maximize the size of the workspace window, click View > Full Screen. To go back to normal view,
click the Close Full Screen button or press Esc.
To pan the workspace, click Layout > Pan or click the Pan button on the toolbar. Drag the focus of the
workspace window and release the mouse button when it is in the appropriate position. Double-click the
workspace to stop panning.

Working with Repository Objects


Use the Workflow Manager to perform the following general operations with repository objects:



♦ View properties for each object.
♦ Enter descriptions for each object.
♦ Rename an object.
To edit any repository object, you must first add a repository in the Navigator so you can access the repository
object. To add a repository in the Navigator, click Repository > Add. Use the Add Repositories dialog box to
add the repository. For more information about adding repositories, see the PowerCenter Repository Guide.

Viewing Object Properties


To view properties of a repository object, first select the repository object in the Navigator. Click View >
Properties to view object properties. Or, right-click the repository object and choose Properties.
You can view properties of a folder, task, worklet, or workflow. For folders, the Workflow Manager displays
folder name and whether the folder is shared. Object properties are read-only.
You can also view dependencies for repository objects. For more information about viewing object
dependencies, see the PowerCenter Repository Guide.

Entering Descriptions for Repository Objects


When you edit an object in the Workflow Manager, you can enter descriptions and comments for that object.
The maximum number of characters you can enter is 2,000 bytes/K, where K is the maximum number of bytes
a character contains in the selected repository code page. For example, if the repository code page is a Japanese
code page where each character can contain up to two bytes (K=2), each description and comment field can
contain up to 1,000 characters.

Renaming Repository Objects


You can rename repository objects by clicking the Rename button in the Edit Tasks dialog box or the Edit
Workflow dialog box. You can also rename repository objects by clicking the object name in the workspace and
typing in the new name.

Checking In and Out Versioned Repository Objects


When you work with versioned objects, you must check out an object if you want to change it, and save it when
you want to commit the changes to the repository. You must check in the object to allow other users to make
changes to it. Checking in an object adds a new numbered version to the object history.

Checking In Objects
You commit changes to the repository by checking in objects. When you check in an object, the repository
creates a new version of the object and assigns it a version number. The repository increments the version
number by one each time it creates a new version.
To check in an object from the Workflow Manager workspace, select the object or objects and click Versioning
> Check in. If you are checking in multiple objects, you can choose to apply the comment to all objects.
If you want to check out or check in scheduler objects in the Workflow Manager, you can run an object query
to search for them. You can also check out a scheduler object in the Scheduler Browser window when you edit
the object. However, you must run an object query to check in the object.
If you want to check out or check in session configuration objects in the Workflow Manager, you can run an
object query to search for them. You can also check out objects from the Session Config Browser window when
you edit them.



You also can check out and check in session configuration and scheduler objects from the Repository Manager.

RELATED TOPICS:
♦ “Searching for Versioned Objects” on page 11

Viewing and Comparing Versioned Repository Objects


You can view and compare versions of objects in the Workflow Manager. If an object has multiple versions, you
can find the versions of the object in the View History window. In addition to comparing versions of an object
in a window, you can view the various versions of an object in the workspace to graphically compare them.
Use the following rules and guidelines when you view older versions of objects in the workspace:
♦ You cannot simultaneously view multiple versions of composite objects, such as workflows and worklets.
♦ Older versions of a composite object might not include the child objects that were used when the composite
object was checked in. If you open a composite object that includes a child object version that is purged
from the repository, the preceding version of the child object appears in the workspace as part of the
composite object. For example, you might want to view version 5 of a workflow that originally included
version 3 of a session, but version 3 of the session is purged from the repository. When you view version 5 of
the workflow, version 2 of the session appears as part of the workflow.
♦ You cannot view older versions of sessions if they reference deleted or invalid mappings, or if they do not
have a session configuration.

To open an older version of an object in the workspace:

1. In the workspace or Navigator, select the object and click Versioning > View History.
2. Select the version you want to view in the workspace and click Tools > Open in Workspace.
Note: An older version of an object is read-only, and the version number appears as a prefix before the object
name. You can simultaneously view multiple versions of a non-composite object in the workspace.

To compare two versions of an object:

1. In the workspace or Navigator, select an object and click Versioning > View History.
2. Select the versions you want to compare and then click Compare > Selected Versions.
-or-
Select a version and click Compare > Previous Version to compare a version of the object with the previous
version.
The Diff Tool appears. For more information about comparing objects in the Diff Tool, see the
PowerCenter Repository Guide.
Note: You can also access the View History window from the Query Results window when you run a query.
For more information about object queries, see the PowerCenter Repository Guide.

Searching for Versioned Objects


Use an object query to search for versioned objects in the repository that meet specified conditions. When you
run a query, the repository returns results based on those conditions. You may want to create an object query to
perform the following tasks:
♦ Track repository objects during development. You can add Label, User, Last saved, or Comments
parameters to queries to track objects during development.
♦ Associate a query with a deployment group. When you create a dynamic deployment group, you can
associate an object query with it.



To create an object query, click Tools > Queries to open the Query Browser.
From the Query Browser, you can create, edit, and delete queries. You can also configure permissions for each
query from the Query Browser. You can run any queries for which you have read permissions from the Query
Browser.
For more information about object queries and deployment groups, see the PowerCenter Repository Guide.

Copying Repository Objects


You can copy repository objects, such as workflows, worklets, or tasks within the same folder, to a different
folder, or to a different repository. If you want to copy the object to another folder, you must open the
destination folder before you copy the object into the folder.
Use the Copy Wizard in the Workflow Manager to copy objects. When you copy a workflow or a worklet, the
Copy Wizard copies all of the worklets, sessions, and tasks in the workflow. You must resolve all conflicts that
occur. Conflicts occur when the Copy Wizard finds a workflow or worklet with the same name in the target
folder or when the connection object does not exist in the target repository. If a connection object does not
exist, you can skip the conflict and choose a connection object after you copy the workflow. You cannot copy
connection objects. Conflicts may also occur when you copy Session tasks.
You can configure display settings and functions of the Copy Wizard by choosing Tools > Options.
For more information about the Copy Wizard, see the PowerCenter Repository Guide.
Note: Use the Import Wizard in the Workflow Manager to import objects from an XML file. The Import
Wizard provides the same options to resolve conflicts as the Copy Wizard.

Copying Sessions
When you copy a Session task, the Copy Wizard looks for the database connection and associated mapping in
the destination folder. If the mapping or connection does not exist in the destination folder, you can select a
new mapping or connection. If the destination folder does not contain any mapping, you must first copy a
mapping to the destination folder in the Designer before you can copy the session.
When you copy a session that has mapping variable values saved in the repository, the Workflow Manager either
copies or retains the saved variable values.

Copying Workflow Segments


You can copy segments of workflows and worklets when you want to reuse a portion of workflow or worklet
logic. A segment consists of one or more tasks, the links between the tasks, and any condition in the links. You
can copy reusable and non-reusable objects when copying and pasting segments. You can copy segments of
workflows or worklets into workflows and worklets within the same folder, within another folder, or within a
folder in a different repository. You can also paste segments of workflows or worklets into an empty Workflow
Designer or Worklet Designer workspace.

To copy a segment from a workflow or worklet:

1. Open the workflow or worklet.


2. To select a segment, highlight each task you want to copy. You can select multiple reusable or non-reusable
objects. You can also select segments by dragging the pointer in a rectangle around objects in the
workspace.
3. Click Edit > Copy or press Ctrl+C to copy the segment to the clipboard.
4. Open the workflow or worklet into which you want to paste the segment. You can also copy the object into
the Workflow or Worklet Designer workspace.



5. Click Edit > Paste or press Ctrl+V.
The Copy Wizard opens, and notifies you if it finds copy conflicts.
Note: You can copy individual non-reusable tasks by selecting the individual task and following the instructions
for copying and pasting segments.

Comparing Repository Objects


Use the Workflow Manager to compare two repository objects of the same type to identify differences between
the objects. For example, if you have two similar Email tasks in a folder, you can compare them to see which
one contains the attributes you need. When you compare two objects, the Workflow Manager displays their
attributes in detail.
You can compare objects across folders and repositories. You must open both folders to compare the objects.
You can compare a reusable object with a non-reusable object. You can also compare two versions of the same
object. For more information about versioned objects, see the PowerCenter Repository Guide.
You can compare the following types of objects:
♦ Tasks
♦ Sessions
♦ Worklets
♦ Workflows
You can also compare instances of the same type. For example, if the workflows you compare contain worklet
instances with the same name, you can compare the instances to see if they differ. Use the Workflow Manager to
compare the following instances and attributes:
♦ Instances of sessions and tasks in a workflow or worklet comparison. For example, when you compare
workflows, you can compare task instances that have the same name.
♦ Instances of mappings and transformations in a session comparison. For example, when you compare
sessions, you can compare mapping instances.
♦ The attributes of instances of the same type within a mapping comparison. For example, when you
compare flat file sources, you can compare attributes, such as file type (delimited or fixed), delimiters, escape
characters, and optional quotes.
You can compare schedulers and session configuration objects in the Repository Manager. You cannot compare
objects of different types. For example, you cannot compare an Email task with a Session task.
When you compare objects, the Workflow Manager displays the results in the Diff Tool window. The Diff Tool
output contains different nodes for different types of objects.
When you import Workflow Manager objects, you can compare object conflicts. For more information about
importing objects, see the PowerCenter Repository Guide.

Steps for Comparing Objects


Use the following procedure to compare objects.

To compare two objects:

1. Open the folders that contain the objects you want to compare.
2. Open the appropriate Workflow Manager tool.
3. Click Tasks > Compare, Worklets > Compare, or Workflow > Compare.



A dialog box similar to the following one appears.

4. Click Browse to select an object.


5. Click Compare.
Tip: You can also compare objects from the Navigator or workspace. In the Navigator, select the objects,
right-click and select Compare Objects. In the workspace, select the objects, right-click and select Compare
Objects.
Figure 1-3 shows the result of comparing two objects:

Figure 1-3. Diff Tool Window. In the Diff Tool window, you can filter nodes that have the same attribute values and drill down to further compare objects. Differences between objects are highlighted and the nodes are flagged, differences between object properties are marked, and the window displays the properties of the node you select.



6. To view more differences between object properties, click the Compare Further icon or right-click the
differences.

7. If you want to save the comparison as a text or HTML file, click File > Save to File.

Working with Metadata Extensions


You can extend the metadata stored in the repository by associating information with individual repository
objects. For example, you may want to store your name with the worklets you create. If you create a session, you
can store your telephone extension with that session. You associate information with repository objects using
metadata extensions.
Repository objects can contain both vendor-defined and user-defined metadata extensions. You can view and
change the values of vendor-defined metadata extensions, but you cannot create, delete, or redefine them. You
can create, edit, delete, and view user-defined metadata extensions, and change their values.
You can create metadata extensions for the following objects in the Workflow Manager:
♦ Sessions
♦ Workflows
♦ Worklets
You can create both reusable and non-reusable metadata extensions. You associate reusable metadata extensions
with all repository objects of a certain type such as all sessions or all worklets. You associate non-reusable
metadata extensions with a single repository object such as one workflow. For more information about metadata
extensions, see the PowerCenter Repository Guide.

Creating a Metadata Extension


You can create user-defined, reusable, and non-reusable metadata extensions for repository objects using the
Workflow Manager. To create a metadata extension, you edit the object for which you want to create the
metadata extension and then add the metadata extension to the Metadata Extensions tab.
If you need to create multiple reusable metadata extensions, it is easier to create them using the Repository
Manager.



To create a metadata extension:

1. Open the appropriate Workflow Manager tool.


2. Drag the appropriate object into the workspace.
3. Double-click the title bar of the object to edit it.
4. Click the Metadata Extensions tab.
This tab lists the existing user-defined and vendor-defined metadata extensions. User-defined metadata
extensions appear in the User Defined Metadata Domain. If they exist, vendor-defined metadata extensions
appear in their own domains.
5. Click the Add button.
A new row appears in the User Defined Metadata Extension Domain.
6. Enter the information in Table 1-5:

Table 1-5. Metadata Extension Attributes in the Workflow Manager

♦ Extension Name. Name of the metadata extension. Metadata extension names must be unique for each type of object in a domain. Metadata extension names cannot contain any special characters except underscores and cannot begin with numbers.
♦ Datatype. Datatype: numeric (integer), string, or boolean.
♦ Precision. Maximum length for string metadata extensions.
♦ Value. Optional value. For a numeric metadata extension, the value must be an integer between -2,147,483,647 and 2,147,483,647. For a boolean metadata extension, choose true or false. For a string metadata extension, click the Open button in the Value field to enter a value of more than one line, up to 2,147,483,647 bytes.
♦ Reusable. Makes the metadata extension reusable or non-reusable. Check to apply the metadata extension to all objects of this type (reusable). Clear to make the metadata extension apply to this object only (non-reusable). Note: If you make a metadata extension reusable, you cannot change it back to non-reusable. The Workflow Manager makes the extension reusable as soon as you confirm the action.
♦ UnOverride. Restores the default value of the metadata extension when you click Revert. This column appears if the value of one of the metadata extensions was changed.
♦ Description. Description of the metadata extension.

7. Click OK.

Editing a Metadata Extension


You can edit user-defined, reusable, and non-reusable metadata extensions for repository objects using the
Workflow Manager. To edit a metadata extension, you edit the repository object, and then make changes to the
Metadata Extensions tab.
What you can edit depends on whether the metadata extension is reusable or non-reusable. You can promote a
non-reusable metadata extension to reusable, but you cannot change a reusable metadata extension to non-
reusable.

Editing Reusable Metadata Extensions


If the metadata extension you want to edit is reusable and editable, you can change the value of the metadata
extension, but not any of its properties. However, if the vendor or user who created the metadata extension did
not make it editable, you cannot edit the metadata extension or its value.



To edit the value of a reusable metadata extension, click the Metadata Extensions tab and modify the Value
field. To restore the default value for a metadata extension, click Revert in the UnOverride column.

Editing Non-Reusable Metadata Extensions


If the metadata extension you want to edit is non-reusable, you can change the value of the metadata extension
and its properties. You can also promote the metadata extension to a reusable metadata extension.
To edit a non-reusable metadata extension, click the Metadata Extensions tab. You can update the Datatype,
Value, Precision, and Description fields. For a description of these fields, see Table 1-5 on page 16.
To make the metadata extension reusable, select Reusable. If you make a metadata extension reusable, you
cannot change it back to non-reusable. The Workflow Manager makes the extension reusable as soon as you
confirm the action.
To restore the default value for a metadata extension, click Revert in the UnOverride column.

Deleting a Metadata Extension


You can delete metadata extensions for repository objects. You delete reusable metadata extensions using the
Repository Manager. Use the Workflow Manager to delete non-reusable metadata extensions. Edit the
repository object and then delete the metadata extension from the Metadata Extensions tab.

Keyboard Shortcuts
When editing a repository object or maneuvering around the Workflow Manager, use the following keyboard
shortcuts to help you complete different operations quickly.
Table 1-6 lists the Workflow Manager keyboard shortcuts for editing a repository object:

Table 1-6. Workflow Manager Keyboard Shortcuts

♦ Cancel editing in a cell: Esc
♦ Select and clear a check box: Space Bar
♦ Copy text from a cell onto the clipboard: Ctrl+C
♦ Cut text from a cell onto the clipboard: Ctrl+X
♦ Edit the text of a cell: F2
♦ Find all combination and list boxes: Type the first letter on the list.
♦ Find tables or fields in the workspace: Ctrl+F
♦ Move around cells in a dialog box: Ctrl+directional arrows
♦ Paste copied or cut text from the clipboard into a cell: Ctrl+V
♦ Select the text of a cell: F2

Table 1-7 lists the Workflow Manager keyboard shortcuts for navigating in the workspace:

Table 1-7. Keyboard Shortcuts for Navigating the Workspace

♦ Create links: Ctrl+F2. Press Ctrl+F2 to select the first task you want to link, press Tab to select the rest of the tasks you want to link, and then press Ctrl+F2 again to link all the tasks you selected.
♦ Edit task name in the workspace: F2
♦ Expand the selected node and all its children: Shift + * (use the asterisk on the numeric keypad)
♦ Move across and select tasks in the workspace: Tab
♦ Select multiple tasks: Ctrl+mouse click


CHAPTER 2

Managing Connection Objects


This chapter includes the following topics:
♦ Overview, 19
♦ Working with Connection Objects, 20
♦ Configuring Session Connections, 22
♦ Overriding Connection Attributes, 24
♦ Configuring Connection Object Permissions, 27
♦ Relational Database Connections, 28
♦ FTP Connections, 34
♦ External Loader Connections, 36
♦ HTTP Connections, 37
♦ PowerChannel Relational Database Connections, 40
♦ PowerExchange Connections

Overview
Before using the Workflow Manager to create workflows and sessions, you must configure connections in the
Workflow Manager.
You can configure the following connection information in the Workflow Manager:
♦ Create relational database connections. Create connections to each source, target, Lookup transformation,
and Stored Procedure transformation database. You must create connections to a database before you can
create a session that accesses the database. For more information, see “Relational Database Connections” on
page 28.
♦ Create FTP connections. Create a File Transfer Protocol (FTP) connection object in the Workflow Manager
and configure the connection properties. For more information, see “FTP Connections” on page 34.
♦ Create external loader connections. Create an external loader connection to load information directly from
a file or pipe rather than running the SQL commands to insert the same data into the database. For more
information, see “External Loader Connections” on page 36.
♦ Create queue connections. Create database connections for message queues. For more information, see
“PowerExchange for WebSphere MQ Connections” on page 60 and “PowerExchange for MSMQ
Connections” on page 43.
♦ Create source and target application connections. Create connections to source and target applications.
When you create or modify a session that reads from or writes to an application, you can select configured
source and target application connections. When you create a connection, the connection properties you
need depends on the application. For more information, see PowerExchange Interfaces for PowerCenter or one
of the following PowerExchange sections in this chapter:

− PowerExchange for JMS Connections, 42
− PowerExchange for MSMQ Connections, 43
− PowerExchange for PeopleSoft Connections, 44
− PowerExchange for Salesforce Connections, 45
− PowerExchange for SAP NetWeaver BI Connections, 51
− PowerExchange for SAP NetWeaver Connections, 46
− PowerExchange for Siebel Connections, 53
− PowerExchange for TIBCO Connections, 54
− PowerExchange for Web Services Connections, 57
− PowerExchange for webMethods Connections, 59
− PowerExchange for WebSphere MQ Connections, 60

Working with Connection Objects


A connection object is a global object that defines a connection in the repository. You create and modify
connection objects in the Workflow Manager. When you create a connection object, you define values for the
connection properties. The properties vary depending on the type of connection you create.
You can create, edit, and delete connection objects. You can also assign permissions to connection objects. For
relational database connections, you can also copy and replace connection objects.

Connection Object Code Pages


Code pages must be compatible for accurate data movement. You must select a code page for most types of
connection objects. The code page of a database connection must be compatible with the database client code
page. If the code pages are not compatible, sessions may hang, data may become inconsistent, or you might
receive a database error, such as:
ORA-00911: Invalid character specified.

The Workflow Manager filters the list of code pages for connections to ensure that the code page for the
connection is a subset of the code page for the repository. It lists the five code pages you have most recently
selected. Then it lists all remaining code pages in alphabetical order.
If you configure the Integration Service for data code page validation, the Integration Service enforces code
page compatibility at session runtime. The Integration Service ensures that the target database code page is a
superset of the source database code page.
When you change the code page in a connection object, you must choose one that is compatible with the
previous code page. If the code pages are incompatible, the Workflow Manager invalidates all sessions using that
connection.
If you configure the PowerCenter Client and Integration Service for relaxed code page validation, you can select
any supported code page for source and target connections. If you are familiar with the data and are confident
that it will convert safely from one code page to another, you can run sessions with incompatible source and
target data code pages. It is your responsibility to ensure your data will convert properly.

Creating Connection Objects


You use the Connection Browser to create connection objects. Open the Connection Browser dialog box for the
connection object. For example, click Connections > Relational to open the Connection Browser dialog box for
a relational database connection.



To create a connection object:

1. In the Workflow Manager, click Connections and select the type of connection you want to create.
Select one of the following connection types:
♦ Relational. For information about creating relational database connections, see “Relational Database
Connections” on page 28 or PowerExchange Interfaces for PowerCenter.
♦ FTP. For information about creating FTP connections, see “FTP Connections” on page 34.
♦ Loader. For information about creating loader connections, see “External Loader Connections” on
page 36.
♦ Queue. For information about creating queue connections, see “PowerExchange for WebSphere MQ
Connections” on page 60 and “PowerExchange for MSMQ Connections” on page 43.
♦ Application. For information about creating application connections, see the appropriate
PowerExchange application section in this chapter or PowerExchange Interfaces for PowerCenter.
The Connection Browser dialog box appears, listing all the source and target connections available for the
selected connection type.
2. Click New.
If you selected FTP as the connection type, the Connection Object dialog box appears. Go to step 5.
If you selected Relational, Queue, Application, or Loader connection type, the Select Subtype dialog box
appears.
3. In the Select Subtype dialog box, select the type of database connection you want to create.
4. Click OK.
5. Enter the properties for the type of connection object you want to create.
The Connection Object Definition dialog box displays different properties depending on the type of
connection object you create. For more information about connection object properties, see the section for
each specific connection type in this chapter.
6. Click OK.
The new database connection appears in the Connection Browser list.
7. To add more database connections, repeat steps 3 to 6.
8. Click OK to save all changes.

Editing a Connection Object


You can change connection information at any time. If you edit a connection object used by a workflow, the
Integration Service uses the updated connection information the next time the workflow runs. You might use
this functionality when moving from test to production.

To edit a connection object:

1. Open the Connection Browser dialog box for the connection object. For example, click Connections >
Relational to open the Connection Browser dialog box for a relational database connection.
2. Click Edit.
The Connection Object Definition dialog box appears.
3. Enter the values for the properties you want to modify.
The connection properties vary depending on the type of connection you select. For more information
about connection properties, see the section for each specific connection type in this chapter.
4. Click OK.



Deleting a Connection Object
When you delete a connection object, the Workflow Manager invalidates all sessions that use these connections.
To make the sessions valid, you must edit them and replace the missing connections.

To delete a connection object:

1. Open the Connection Browser dialog box for the connection object. For example, click Connections >
Relational to open the Connection Browser dialog box for a relational database connection.
2. Select the connection object you want to delete in the Connection Browser dialog box.
Tip: Hold the shift key to select more than one connection to delete.

3. Click Delete, and then click Yes.

Configuring Session Connections


When you configure a session, you can choose the connection type and select a connection to use. You can
also override the connection attributes for the session or create a connection.

Setting the Connection Type


You can set the connection type from the Mapping tab for each object. Connection types include application,
queue, external loader, and FTP.
Note: A relational connection type appears as the connection type for all relational sources, targets, and
transformations such as Lookup and Stored Procedure transformations. You cannot change the relational
connection type.

Application
Select this connection type to access PowerExchange and Teradata FastExport sources and targets. You can also
access transformations such as HTTP, Salesforce Lookup, and BAPI/RFC transformations.

Queue
Select this connection type to access an MSMQ or WebSphere MQ source, or if you want to write messages to
a WebSphere MQ message queue. If you select this option, select a configured MQ connection in the value
column. For static WebSphere MQ targets, set the connection type to FTP or Queue. For dynamic MQSeries
targets, set the connection type to Queue.

External Loader
Select this connection type to use the External Loader to load output files to Teradata, Oracle, DB2, or Sybase
IQ databases. If you select this option, select a configured loader connection in the Value column.
To use this option, you must use a mapping with a relational target definition and choose File as the writer type
on the Writers tab for the relational target instance. As the Integration Service completes the session, it uses an
external loader to load target files to the Oracle, Sybase IQ, IBM DB2, or Teradata database. You cannot choose
external loader for flat file or XML target definitions in the mapping.

RELATED TOPICS:
♦ “External Loading” on page 419



FTP
Select this connection type to use FTP to access the source or target directory for flat file and XML sources or
targets. If you want to extract or load data from or to a flat file or XML source or target using FTP, you must
specify an FTP connection when you configure source or target options. If you select this option, select a
configured FTP connection in the Value column. FTP connections must be defined in the Workflow Manager
prior to configuring sessions.

RELATED TOPICS:
♦ “Using FTP” on page 443

None
Select None when you want to read from or write to a local flat file or XML file, or if you use an associated source for a WebSphere MQ session.

Setting the $Source and $Target Connection Values


Enter the database connection you want the Integration Service to use for the $Source and $Target connection
variables. You can select a connection object, or you can use the $DBConnectionName or
$AppConnectionName session parameter if you want to define the connection value in a parameter file.
Use the $Source or $Target variable in Lookup and Stored Procedure transformations to specify the database
location for the lookup table or stored procedure. For more information about specifying connection
information for Lookup or Stored Procedure transformations, see the PowerCenter Transformation Guide. You
can also use the $Source variable to specify the source connection for relational sources and the $Target variable
to specify the target connection for relational targets in the session properties.
If you use $Source or $Target in a Lookup or Stored Procedure transformation, you can specify the database
location in the $Source Connection Value or $Target Connection Value field to ensure the Integration Service
uses the correct database connection to run the session. If you use $Source or $Target in a mapping, but do not
specify a database connection in one of these fields, the Integration Service determines which database
connection to use when it runs the session.
The following list describes how the Integration Service determines the value of $Source or $Target when you
do not specify the $Source Connection Value or $Target Connection Value in the session properties:
♦ When you use $Source and the pipeline contains one source, the Integration Service uses the database
connection you specify for the source.
♦ When you use $Source and the pipeline contains multiple sources joined by a Joiner transformation, the
Integration Service determines the database connections based on the location of the Lookup or Stored
Procedure transformation in the pipeline:
− When the transformation is after the Joiner transformation, the Integration Service uses the database
connection for the detail table.
− When the transformation is before the Joiner transformation, the Integration Service uses the database
connection for the source connected to the transformation.
♦ When you use $Target and the pipeline contains one target, the Integration Service uses the database
connection you specify for the target.
♦ When you use $Target and the pipeline contains multiple relational targets, the session fails.
♦ When you use $Source or $Target in an unconnected Lookup or Stored Procedure transformation, the
session fails.
You can enter values for the $Source and $Target connection variables on the Properties tab of the session
properties, or on the Mapping tab, Connections node.
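For example, suppose you set the $Source Connection Value to a session parameter named $DBConnectionSource, a hypothetical name that follows the $DBConnectionName convention. You would then define the parameter in the parameter file, as in the following sketch. The section heading follows the “[s_MySession]” format described later in this chapter, and the value must be the name of a relational or application connection object that exists in the repository:

[s_MySession]
$DBConnectionSource=<relational connection object name>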

To enter the database connection for the $Source and $Target connection variables:

1. In the session properties, select the Properties tab or the Mapping tab, Connections node.
2. Click the Open button in $Source Connection Value or $Target Connection Value field.



The Connection Browser dialog box appears.
3. Select a connection object or variable:
♦ Use Connection Object. Select a relational or application connection object. You can also create a
connection object and select it.
♦ Use Connection Variable. Use a connection variable or session parameter. You can enter either the
$Source or $Target connection variable, or the $DBConnectionName or $AppConnectionName session
parameter. If you enter a session parameter, define the parameter in the parameter file. If you do not
define a value for the session parameter, the Integration Service determines which database connection
to use when it runs the session.
4. Click OK.

Overriding Connection Attributes


Before you can configure a session to use a connection, you must create a connection object in the Workflow
Manager and configure the connection attributes. For example, to configure a session to use FTP, create an FTP
connection object and configure the connection attributes.
After you create a connection object, you configure source and target instances in a session to use the object.
When you configure the source and target instances, you can override connection attributes and define some
attributes that are not in the connection object. You can override connection attributes based on how you
configure the source or target instances.
You can override connection attributes when you configure the source or target session properties in the
following ways:
♦ You use an FTP, queue, external loader, or application connection for a non-relational source or target
instance.
♦ You use an FTP, queue, or external loader connection for a relational target instance.
♦ You use an application connection for a relational source instance.
For example, for an FTP target connection, you can override the remote file name, whether the target file is
staged, and the transfer mode. For a Teradata FastExport source connection, you can override attributes such as
TDPID, tenacity, and block size.
When you configure a session, you can override connection attributes for a connection object or you can use a
session parameter and override the attributes in the parameter file.



You configure connections in the Connections settings on the Mapping tab. In these settings, you select a connection object and override its attributes, or you use a session parameter that defines the connection.

You can override connection attributes in the session or in the parameter file:
♦ Session. Select the connection object and override attributes in the session.
♦ Parameter file. Use a session parameter to define the connection and override connection attributes in the
parameter file.

RELATED TOPICS:
♦ “Working with Connection Objects” on page 20

♦ “Overriding Connection Attributes in the Session” on page 25


♦ “Overriding Connection Attributes in the Parameter File” on page 26

Overriding Connection Attributes in the Session


You can override the source and target connection attributes in the session properties.

To override connection attributes in the session:

1. On the Mapping tab, select the source or target instance in the Connections node.
2. Select the connection type.
3. Click the Open button in the value field to select a connection object.
4. Choose the connection object.
5. Click Override.



The Connection Object Definition dialog box appears.


6. Update the attributes you want to change.


7. Click OK.

Overriding Connection Attributes in the Parameter File


If you use a session parameter to define a connection for a source or target, you can override the connection
attributes in the parameter file. Use the $FTPConnectionName, $QueueConnectionName,
$LoaderConnectionName, or $AppConnectionName session parameter.
When you define a connection in the parameter file, the Integration Service searches for specific, user-defined
session parameters that define the connection attributes. For example, you create a Message Queue connection
parameter called $QueueConnectionMyMQ and define it in the “[s_MySession]” section in the parameter file.
The Integration Service searches this section of the parameter file for the “rows per message” parameter,
$Param_QueueConnectionMyMQ_Rows_Per_Message.
When you install PowerCenter, the installation program creates a template file named ConnectionParam.prm
that lists the connection attributes you can override for FTP, queue, loader, and application connections. The
ConnectionParam.prm file is located in the following directory:
<PowerCenter Installation Directory>/server/bin

When you define a connection in the parameter file, copy the template for the appropriate connection type and
paste it into the parameter file. Then supply the parameter values.
For example, to override connection attributes for an FTP connection in the parameter file, perform the
following steps:
1. Configure the session or workflow to run with a parameter file. For more information, see “Configuring
the Parameter File Name and Location” on page 412.
2. In the session properties Mapping tab, select the source or target instance in the Connections node.
3. Click the Open button in the value field and configure the connection to use a session parameter. For
example, use $FTPConnectionMyFTPConn for an FTP connection.
4. Open the ConnectionParam.prm template file in a text editor and scroll down to the section for the
connection type whose attributes you want to override. For example, for an FTP connection, locate the
“Connection Type: FTP” section:



Connection Type : FTP
---------------------

...

Template
====================
$FTPConnection<VariableName>=
$Param_FTPConnection<VariableName>_Remote_Filename=
$Param_FTPConnection<VariableName>_Is_Staged=
$Param_FTPConnection<VariableName>_Is_Transfer_Mode_ASCII=

5. Copy the template text for the connection attributes you want to override. For example, to override the
“Remote File Name” and “Is Staged” attributes, copy the following lines:
$FTPConnection<VariableName>=
$Param_FTPConnection<VariableName>_Remote_Filename=
$Param_FTPConnection<VariableName>_Is_Staged=

6. Paste the text into the parameter file. Replace <VariableName> with the connection name, and supply the
parameter values. For example:
[MyFolder.WF:wf_MyWorkflow.ST:s_MySession]

$FTPConnectionMyFTPConn=FTP_Conn1
$Param_FTPConnectionMyFTPConn_Remote_Filename=ftp_src.txt
$Param_FTPConnectionMyFTPConn_Is_Staged=YES

Note: The Integration Service interprets spaces or quotation marks before or after the equals sign as part of
the parameter name or value.
If you do not define a value for an attribute, the Integration Service uses the value defined for the
connection object.

RELATED TOPICS:
♦ “Parameter Files” on page 403

Configuring Connection Object Permissions


You can access global connection objects from all folders in the repository and use them in any session. The
Workflow Manager assigns owner permissions to the user who creates the connection object. The owner has all
permissions. You can change the owner but you cannot change the owner permissions. You can assign
permissions on a connection object to users, groups, and all others for that object.
The Workflow Manager assigns default permissions for connection objects to users, groups, and all others if you
enable enhanced security.
You can specify read, write, and execute permissions for each user and group. You can perform the following
types of tasks with different connection object permissions in combination with user privileges and folder
permissions:
♦ Read. View the connection object in the Workflow Manager and Repository Manager. When you have read
permission, you can perform tasks in which you view, copy, or edit repository objects associated with the
connection object.
♦ Write. Edit the connection object.
♦ Execute. Run sessions that use the connection object.
To view a connection object you must have read permission on the object. To delete a connection object, you
must be the owner of the connection object or have the Administrator role for the Repository Service. To edit a
connection object, you must have read and write permissions on the object.
To assign or edit permissions on a connection object, select an object from the Connection Object Browser, and
click Permissions.



You can perform the following tasks to manage permissions on a connection object:
♦ Change connection object permissions for users and groups.
♦ Add users and groups and assign permissions to users and groups on the connection object.
♦ List all users to see all users that have permissions on the connection object.
♦ List all groups to see all groups that have permissions on the connection object.
♦ List all to see all users, groups, and others that have permissions on the connection object.
♦ Remove each user or group that has permissions on the connection object.
♦ Remove all users and groups that have permissions on the connection object.
♦ Change the owner of the connection object.
If you change the permissions assigned to a user that is currently connected to a repository in a PowerCenter
Client tool, the changed permissions take effect the next time the user reconnects to the repository.

RELATED TOPICS:
♦ “Enabling Enhanced Security” on page 7

Relational Database Connections


Before the Integration Service can access a source or target database in a session, you must configure the
database connections in the Workflow Manager. When you create or modify a session that reads from or writes
to a relational database, you can select configured source and target database connections.
When you create a connection, you must have the following information available:
♦ Database name. Name for the connection.
♦ Database type. Type of the source or target database.
♦ Database user name. Name of a user who has the appropriate database permissions to read from and write to
the database. You can enter a session parameter as the database user name and define it in a parameter file.
For more information about database user names and passwords, see “Database User Names and Passwords”
on page 28.
To use an SQL override with pushdown optimization, the user must also have permission to create views on
the source or target database.
♦ Password. Database password in 7-bit ASCII. You can enter a session parameter as the database password
and define it in a parameter file. For more information about database user names and passwords, see
“Database User Names and Passwords” on page 28.
♦ Connect string. Connect string used to communicate with the database. For more information about
connect strings, see “Database Connect Strings” on page 29.
♦ Database code page. Code page associated with the database. For more information about database code
pages, see “Connection Object Code Pages” on page 20.
For relational databases, you might need to run special SQL commands in the database environment. For
example, you might need to set the quoted identifier parameter. You can set up environment SQL to run SQL
commands at each connection to the database or at each database transaction. For more information about
configuring environment SQL commands, see “Configuring Environment SQL” on page 29.

Database User Names and Passwords


The Workflow Manager requires a database user name and password. You can enter session parameter
$ParamName as the database user name and password, and define the user name and password in a parameter
file. For example, you can use a session parameter, $ParamMyDBUser, as the database user name, and set
$ParamMyDBUser to the user name in the parameter file.



To use a session parameter for the database password, enable the Use Parameter in Password option and encrypt
the password using the pmpasswd command line program. Encrypt the password using the CRYPT_DATA
encryption type. For example, to encrypt the database password “monday,” enter the following command:
pmpasswd monday -e CRYPT_DATA
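For example, if you use $ParamMyDBUser as the database user name and a session parameter for the password, you might define both in the parameter file as follows. This is a minimal sketch; the section heading and parameter names are placeholders, and the password value must be the encrypted string that pmpasswd returns:
[MyFolder.WF:wf_MyWorkflow.ST:s_MySession]
$ParamMyDBUser=etl_user
$ParamMyDBPassword=<encrypted string from pmpasswd>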

Some database drivers, such as ISG Navigator, do not allow user names and passwords. Since the Workflow
Manager requires a database user name and password, PowerCenter provides two reserved words to register
databases that do not allow user names and passwords:
♦ PmNullUser
♦ PmNullPasswd
Use the PmNullUser user name if you use one of the following authentication methods:
♦ Oracle OS Authentication. Oracle OS Authentication lets you log in to an Oracle database if you have a
login name and password for the operating system. You do not need to know a database user name and
password. PowerCenter uses Oracle OS Authentication when the connection user name is PmNullUser and
the connection is for an Oracle database.
♦ IBM DB2 client authentication. IBM DB2 client authentication lets you log in to an IBM DB2 database
without specifying a database user name or password if the IBM DB2 server is configured for external
authentication or if the IBM DB2 server is on the same machine hosting the Integration Service process.
PowerCenter uses IBM DB2 client authentication when the connection user name is PmNullUser and the
connection is for an IBM DB2 database.

Database Connect Strings


When you create a database connection, specify a connect string for that connection. The Integration Service
uses connect strings to communicate with a database.
Table 2-1 lists the native connect string syntax for each supported database when you create or update
connections:

Table 2-1. Native Connect String Syntax

Database               Connect String Syntax                     Example

IBM DB2                dbname                                    mydatabase

Informix               dbname@servername                         mydatabase@informix

Microsoft SQL Server   servername@dbname                         sqlserver@mydatabase

Oracle                 dbname.world (same as TNSNAMES entry)     oracle.world

Sybase ASE             servername@dbname                         sambrown@mydatabase

Teradata*              ODBC_data_source_name or                  TeradataODBC
                       ODBC_data_source_name@db_name or          TeradataODBC@mydatabase
                       ODBC_data_source_name@db_user_name        TeradataODBC@jsmith

*Use Teradata ODBC drivers to connect to source and target databases.

Configuring Environment SQL


The Integration Service runs environment SQL in auto-commit mode and closes the transaction after it issues
the SQL. Use SQL commands that do not depend on a transaction being open during the entire read or write
process. For example, if a source database is set to read only mode and you create an environment SQL
statement in the source connection to set the transaction to read only, the Integration Service issues a commit
after it runs the SQL and cannot read the source in read only mode.
You configure environment SQL in the database connection. Use environment SQL for source, target, lookup,
and stored procedure connections. If the SQL syntax is not valid, the Integration Service does not connect to
the database, and the session fails.
You can configure the following types of environment SQL:



♦ Connection environment SQL
♦ Transaction environment SQL

Connection Environment SQL


This custom SQL string sets up the environment for subsequent transactions. The Integration Service runs the
connection environment SQL each time it connects to the database. If you configure connection environment
SQL in a target connection, and you configure three partitions for the pipeline, the Integration Service runs the
SQL three times, once for each connection to the target database. Use SQL commands that do not depend on a
transaction being open during the entire read or write process.
For example, use the following SQL statement to set the quoted identifier parameter for the duration of the
connection:
SET QUOTED_IDENTIFIER ON

Use the SQL statement in the following situations:


♦ You want to set up the connection environment so that double quotation marks are object identifiers.
♦ You configure the target load type to Normal and the Microsoft SQL Server target name includes spaces.

Transaction Environment SQL


This custom SQL string also sets up the environment, but the Integration Service runs the transaction
environment SQL at the beginning of each transaction.
Use SQL commands that depend on a transaction being open during the entire read or write process. For
example, you might use the following statement as transaction environment SQL to modify how the session
handles characters:
ALTER SESSION SET NLS_LENGTH_SEMANTICS=CHAR

This command must be run before each transaction. The command is not appropriate for connection
environment SQL because setting the parameter once for each connection is not sufficient.

Guidelines for Entering Environment SQL


Consider the following guidelines when creating the SQL statements:
♦ You can enter any SQL command that is valid in the database associated with the connection object. The
Integration Service does not allow nested comments, even though the database might.
♦ When you enter SQL in the SQL Editor, you type the SQL statements.
♦ Use a semicolon (;) to separate multiple statements, as shown in the example after this list.
♦ The Integration Service ignores semicolons within /*...*/.
♦ If you need to use a semicolon outside of comments, you can escape it with a backslash (\).
♦ You can use parameters and variables in the environment SQL. Use any parameter or variable type that you
can define in the parameter file. You can enter a parameter or variable within the SQL statement, or you can
use a parameter or variable as the environment SQL. For example, you can use a session parameter,
$ParamMyEnvSQL, as the connection or transaction environment SQL, and set $ParamMyEnvSQL to the
SQL statement in a parameter file.
♦ You can configure the table owner name using sqlid in the connection environment SQL for a DB2
connection. However, the table owner name in the target instance overrides the SET sqlid statement in
environment SQL. To use the table owner name specified in the SET sqlid statement, do not enter a name in
the target name prefix.
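For example, the following connection environment SQL is a minimal sketch for an Oracle connection; the date format and the $$SchemaName mapping parameter are placeholders, and $$SchemaName must be defined in the parameter file:
ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD'; ALTER SESSION SET CURRENT_SCHEMA = $$SchemaName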

Database Connection Resilience


Database connection resilience is the ability of the Integration Service to tolerate temporary network failures
when connecting to a relational database or when the database becomes unavailable. The Integration Service is
resilient to failures when it initializes the connection to the source or target database and when it reads data
from or writes data to a database.
You configure the resilience retry period in the connection object. You can configure the retry period for source,
target, SQL, and Lookup transformation database connections. When a network failure occurs or the source or
target database becomes unavailable, the Integration Service attempts to reconnect for the amount of time
configured for the connection retry period. If the Integration Service cannot reconnect to the database in the
amount of time for the retry period, the session fails.
The Integration Service will not attempt to reconnect to a database in the following situations:
♦ The database connection object is for an ODBC or Informix connection.
♦ The transformation associated with the connection object does not have the output is deterministic option
enabled.
♦ The value for the DTM buffer size is less than what the session requires.
♦ The truncate the target table option is enabled for a target and the connection fails during execution of the
truncate query.
♦ The database connection fails during a commit or rollback.
For more information about resilience, see the PowerCenter Administrator Guide.
Note: For a database connection to be resilient, the database must be a highly available database and you must
have the high availability option or the real-time option.

Configuring a Relational Database Connection


Use the following procedure to configure a relational database connection.

To create a relational database connection:

1. In the Workflow Manager, connect to a repository.


2. Click Connections > Relational.
The Relational Connection Browser dialog box appears, listing all the source and target database
connections.
3. Click New.
The Select Subtype dialog box appears.
4. In the Select Subtype dialog box, select the type of database connection you want to create.
5. Click OK.
The Connection Object Definition dialog box appears.
Table 2-2 describes the properties that you configure for a relational database connection:

Table 2-2. Relational Database Connection Information

Property Description

Name Connection name used by the Workflow Manager. Connection name cannot
contain spaces or other special characters, except for the underscore.

Type Type of database.

User Name Database user name with the appropriate read and write database
permissions to access the database. If you use Oracle OS Authentication,
IBM DB2 client authentication, or databases such as ISG Navigator that do
not allow user names, enter PmNullUser. For Teradata connections, this
overrides the default database user name in the ODBC entry.
To define the user name in the parameter file, enter session parameter
$ParamName as the user name, and define the value in the session or
workflow parameter file. The Integration Service interprets user names that
start with $Param as session parameters.

Use Parameter in Indicates the password for the database user name is a session parameter,
Password $ParamName. If you enable this option, define the password in the workflow
or session parameter file, and encrypt it using the pmpasswd CRYPT_DATA
option. Default is disabled.

Password Password for the database user name. For Oracle OS Authentication, IBM
DB2 client authentication, or databases such as ISG Navigator that do not
allow passwords, enter PmNullPassword. For Teradata connections, this
overrides the database password in the ODBC entry.
Passwords must be in 7-bit ASCII.

Connect String Connect string used to communicate with the database. For syntax, see
“Database Connect Strings” on page 29.
Required for all databases, except Microsoft SQL Server and Sybase ASE.

Code Page Code page the Integration Service uses to read from a source database or
write to a target database or file.

Connection Runs an SQL command with each database connection. Default is disabled.
Environment SQL

Transaction Runs an SQL command before the initiation of each transaction. Default is
Environment SQL disabled.

Enable Parallel Mode Enables parallel processing when loading data into a table in bulk mode.
Default is enabled.

Database Name Name of the database. For Teradata connections, this overrides the default
database name in the ODBC entry. Also, if you do not enter a database name
for a Teradata or Sybase ASE connection, the Integration Service uses the
default database name in the ODBC entry. If you do not enter a database
name, connection-related messages do not show a database name when the
default database is used.

Data Source Name Name of the Teradata ODBC data source.

Server Name Database server name. Use to configure workflows.

Packet Size Use to optimize the native drivers for Sybase ASE and Microsoft SQL Server.

Domain Name The name of the domain. Used for Microsoft SQL Server on Windows.

Use Trusted If selected, the Integration Service uses Windows authentication to access
Connection the Microsoft SQL Server database. The user name that starts the Integration
Service must be a valid Windows user with access to the Microsoft SQL
Server database.

Connection Retry Number of seconds the Integration Service attempts to reconnect to the
Period database if the connection fails. If the Integration Service cannot connect to
the database in the retry period, the session fails. Default value is 0.

6. Click OK.
The new database connection appears in the Connection Browser list.
7. To add more database connections, repeat steps 3 to 6.
8. Click OK.

Copying a Relational Database Connection


After you set up a relational database connection, you can make a copy of it by clicking the Copy As button.
The Workflow Manager lets you choose the relational database type when you make a copy of a relational
database connection.



When you make a copy of a relational database connection, the Workflow Manager retains the connection
properties that apply to the relational database type you select. The copy of the connection is invalid if a
required connection property is missing. Edit the connection properties manually to validate the connection.
The Workflow Manager appends an underscore and the first three letters of the relational database type to the
name of the new database connection. For example, you have a lookup table in the same database as your source
definition. You make a copy of the Microsoft SQL Server database connection called Dev_Source. The
Workflow Manager names the new database connection Dev_Source_Mic. You can edit the copied connection
to use a different name.

To copy a relational database connection:

1. Click Connections > Relational.


The Relational Connection Browser appears.
2. Select the connection you want to copy.
Tip: Hold the shift key to select more than one connection to copy.

3. Click Copy As.


The Select Subtype dialog box appears.
4. Select a relational database type for the copy of the connection.
If you copy one database connection object as a different type of database connection, you must
reconfigure the connection properties for the copied connection.
5. Click OK.
The Workflow Manager retains connection properties that apply to the database type. If a required
connection property does not exist, the Workflow Manager displays a warning message. This happens when
you copy a connection object as a different database type or copy a connection object that is already
invalid.
6. Click OK to close the warning dialog box.
The copy of the connection appears in the Relational Connection Browser.
7. If the copied connection is invalid, click the Edit button to enter required connection properties.
8. Click Close to close the Relational Connection Browser dialog box.

Replacing a Relational Database Connection


You can replace a relational database connection with another relational database connection. For example, you
might want several sessions to write to a different target database. Instead of editing the properties
for each session, you can replace the relational database connection for all sessions in the repository that use the
connection.
When you replace database connections, the Workflow Manager replaces the relational database connections in
the following locations for all sessions using the connection:
♦ Source connection
♦ Target connection
♦ Connection Information property in Lookup and Stored Procedure transformations
♦ $Source Connection Value session property
♦ $Target Connection Value session property
When the repository contains both relational and application connections with the same name, the Workflow
Manager replaces the relational connections only if you specified the connection type as relational in all
locations.
For example, you have a relational and an application source, each called ITEMS. In one session, you specified
the name ITEMS for a source connection instead of Relational:ITEMS.



When you replace the relational connection ITEMS with another relational connection, the Workflow Manager
does not replace any relational connection in the repository because it cannot determine the connection type for
the source connection entered as ITEMS.
The Integration Service uses the updated connection information the next time the workflow runs.
You must first close all folders before replacing a relational database connection.

To replace a connection object:

1. Close all folders in the repository.


2. Click Connections > Replace.
The Replace Connections dialog box appears.
3. Click the Add button to replace a connection.
4. In the From list, select a relational database connection you want to replace.
5. In the To list, select the replacement relational database connection.
6. Click Replace.
All sessions in the repository that use the From connection now use the connection you select in the To list.

FTP Connections
Before you can configure a session to use FTP or SFTP, you must create and configure the FTP connection
properties in the Workflow Manager. The Integration Service uses the FTP connection properties to create an
SFTP or FTP connection.

To create an FTP connection:

1. In the Workflow Manager, connect to a repository.


2. Click Connections > FTP.
The FTP Connection Browser appears.
3. Click New.
4. Enter the properties for the FTP connection.
Table 2-3 describes the properties that you configure for an FTP connection:

Table 2-3. FTP Connection Properties

Property Description

Name Connection name used by the Workflow Manager.

User Name User name necessary to access the host machine. Must be in 7-bit ASCII only.
Required to connect to an SFTP server with password based authentication.
To define the user name in the parameter file, enter session parameter
$ParamName as the user name, and define the value in the session or workflow
parameter file. The Integration Service interprets user names that start with $Param
as session parameters.

Use Parameter Indicates the password for the user name is a session parameter, $ParamName. If
in Password you enable this option, define the password in the workflow or session parameter
file, and encrypt it using the pmpasswd CRYPT_DATA option. Default is disabled.

Password Password for the user name. Must be in 7-bit ASCII only. Required to connect to an
SFTP server with password based authentication.

Host Name Host name or dotted IP address of the FTP connection.


Optionally, you can specify a port number between 1 and 65535, inclusive. If you do
not specify a port number, the Integration Service uses 21 by default. Use the
following syntax to specify the host name:
hostname:port_number
-or-
IP address:port_number
When you specify a port number, enable that port number for FTP on the host
machine.
If you enable SFTP, specify a host name or port number for an SFTP server. If you
do not specify a port number, the Integration Service uses 22 by default.

Default Remote Default directory on the FTP host used by the Integration Service. Do not enclose
Directory the directory in quotation marks.
You can enter a parameter or variable for the directory. Use any parameter or
variable type that you can define in the parameter file.
Depending on the FTP server you use, you may have limited options to enter FTP
directories. For details, see the FTP server documentation.
In the session, when you enter a file name without a directory, the Integration
Service appends the file name to this directory. This path must contain the
appropriate trailing delimiter. For example, if you enter c:\staging\ and specify
data.out in the session, the Integration Service reads the path and file name as
c:\staging\data.out.
For SAP, you can leave this value blank. SAP sessions use the Source File
Directory session property for the FTP remote directory. If you enter a value, the
Source File Directory session property overrides it.

Retry Period Number of seconds the Integration Service attempts to reconnect to the FTP host if
the connection fails. If the Integration Service cannot reconnect to the FTP host in
the retry period, the session fails.
Default value is 0 and indicates an infinite retry period.

Use SFTP Enables SFTP.

Public Key File Public key file path and file name. Required if the SFTP server uses publickey
Name authentication. Enabled for SFTP.

Private Key File Private key file path and file name. Required if the SFTP server uses publickey
Name authentication. Enabled for SFTP.

Private Key File Private key file password used to decrypt the private key file. Required if the SFTP
Password server uses public key authentication and the private key is encrypted. Enabled for
SFTP.

5. Click OK.

Using an SFTP Connection


To connect to an SFTP server, create an FTP connection and enable SFTP. SFTP uses the SSH2 authentication
protocol. Configure the authentication properties to use the SFTP connection. You can configure publickey or
password authentication. The Integration Service connects to the SFTP server with the authentication
properties you configure. If the authentication does not succeed, the session fails. For more information about
SFTP authentication properties, see Table 2-3 on page 34.

Rules and Guidelines for Mainframes


Use the following guidelines when creating an FTP connection to a mainframe machine:
♦ If you enter a mainframe file name in the default directory for a source or target, enter the closing quote. For
example, if the file is located in the following default remote directory:
‘staging.

To access the file data from the default mainframe directory, enter the following in the Remote file name
field:
data’

When the Integration Service begins or ends the session, it connects to the mainframe host and looks for the
following directory and file name:
‘staging.data’

♦ If you want to use a file in a different directory, enter the directory and file name in the Remote file name
field. For example, you might enter the following file name and directory:
‘overridedir.filename’

External Loader Connections


You configure external loader properties in the Workflow Manager when you create an external loader
connection. You can also override the external loader connection in the session properties. The Integration
Service uses external loader properties to create an external loader connection.
When you configure external loader settings, you might need to consult the database documentation for more
information.

To create an external loader connection:

1. Click Connections > Loader in the Workflow Manager.


The Loader Connection Browser dialog box appears.
2. Click New.
3. Select an external loader type, and then click OK.
The Loader Connection Editor dialog box appears.
4. Enter the properties for the external loader connection:

Property Description

Name Connection name used by the Workflow Manager. Connection name cannot
contain spaces or other special characters, except for the underscore.

User Name Database user name with the appropriate read and write database
permissions to access the database. If you use Oracle OS Authentication or
IBM DB2 client authentication, enter PmNullUser. PowerCenter uses Oracle
OS Authentication when the connection user name is PmNullUser and the
connection is to an Oracle database. PowerCenter uses IBM DB2 client
authentication when the connection user name is PmNullUser and the
connection is to an IBM DB2 database.
To define the user name in the parameter file, enter session parameter
$ParamName as the user name, and define the value in the session or
workflow parameter file. The Integration Service interprets user names that
start with $Param as session parameters.

Use Parameter in Indicates the password for the database user name is a session parameter,
Password $ParamName. If you enable this option, define the password in the workflow
or session parameter file, and encrypt it using the pmpasswd CRYPT_DATA
option. Default is disabled.

Password Password for the database user name. For Oracle OS Authentication or IBM
DB2 client authentication, enter PmNullPassword. For Teradata connections,
you can enter PmNullPasswd to prevent the password from appearing in the
control file. Instead, the Integration Service writes an empty string for the
password in the control file.
Passwords must be in 7-bit ASCII.

Connect String Connect string used to communicate with the database. For syntax, see
“Database Connect Strings” on page 29.

5. Enter the loader connection attributes.


6. Click OK.

RELATED TOPICS:
♦ “External Loading” on page 419

HTTP Connections
You configure connection information for an HTTP transformation in an HTTP application connection. The
Integration Service can use HTTP application connections to connect to HTTP servers. HTTP application
connections enable you to control connection attributes, including the base URL and other parameters.
If you want to connect to an HTTP proxy server, configure the HTTP proxy server settings in the Integration
Service. For more information about configuring the Integration Service to use an HTTP proxy server, see the
PowerCenter Administrator Guide.
Configure an HTTP application connection in the following circumstances:
♦ The HTTP server requires authentication.
♦ You want to configure the connection timeout.
♦ You want to override the base URL in the HTTP transformation.

To configure an HTTP application connection:

1. In the Workflow Manager, connect to a PowerCenter repository.


2. Select Connections > Applications.
The Application Connection Browser dialog box appears.
3. From Select Type, select Http Transformation.
4. Click New.
The Connection Object Definition dialog box appears.
5. Enter a name for the HTTP connection.

6. Enter values for the connection attributes.

HTTP Connection Attribute   Description

User Name Authenticated user name for the HTTP server. If the HTTP server does not
require authentication, enter PmNullUser.
To define the user name in the parameter file, enter session parameter
$ParamName as the user name, and define the value in the session or
workflow parameter file. The Integration Service interprets user names that
start with $Param as session parameters.

Use Parameter in Indicates the password for the authenticated user is a session parameter,
Password $ParamName. If you enable this option, define the password in the
workflow or session parameter file, and encrypt it using the pmpasswd
CRYPT_DATA option. Default is disabled.

Password Password for the authenticated user. If the HTTP server does not require
authentication, enter PmNullPasswd.

Base URL URL of the HTTP server. This value overrides the base URL defined in the
HTTP transformation.
You can use a session parameter to configure the base URL. For example,
enter the session parameter $ParamBaseURL in the Base URL field, and
then define $ParamBaseURL in the parameter file, as shown in the sketch after this procedure.

Timeout Number of seconds the Integration Service waits for a connection to the
HTTP server before it closes the connection.

Domain Authentication domain for the HTTP server. This is required for NTLM
authentication. For more information about authentication with an HTTP
transformation, see the PowerCenter Transformation Guide.

Trust Certificates File File containing the bundle of trusted certificates that the client uses when
authenticating the SSL certificate of a server. You specify the trust
certificates file to have the Integration Service authenticate the HTTP
server. By default, the name of the trust certificates file is ca-bundle.crt. For
information about adding certificates to the trust certificates file, see
“Configuring Certificate Files for SSL Authentication” on page 39.

Certificate File Client certificate that an HTTP server uses when authenticating a client.
You specify the client certificate file if the HTTP server needs to
authenticate the Integration Service.

Certificate File Password Password for the client certificate. You specify the certificate file password
if the HTTP server needs to authenticate the Integration Service.

Certificate File Type File type of the client certificate. You specify the certificate file type if the
HTTP server needs to authenticate the Integration Service. The file type
can be PEM or DER. For information about converting certificate file types
to PEM or DER, see “Configuring Certificate Files for SSL Authentication”
on page 39. Default is PEM.

Private Key File Private key file for the client certificate. You specify the private key file if the
HTTP server needs to authenticate the Integration Service.

Key Password Password for the private key of the client certificate. You specify the key
password if the web service provider needs to authenticate the Integration
Service.

Key File Type File type of the private key of the client certificate. You specify the key file
type if the HTTP server needs to authenticate the Integration Service. The
HTTP transformation uses the PEM file type for SSL authentication.

Authentication Type Select one of the following authentication types to use when the HTTP
server does not return an authentication type to the Integration Service:
- Auto. The Integration Service attempts to determine the authentication
type of the HTTP server.
- Basic. Based on a non-encrypted user name and password.
- Digest. Based on an encrypted user name and password.
- NTLM. Based on encrypted user name, password, and domain.
Default is Auto.



7. Click OK.
The HTTP connection appears in the Application Connection Browser.
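For example, if you enter $ParamBaseURL in the Base URL field, you might define it in the parameter file as follows. This is a minimal sketch; the folder, workflow, and session names and the URL are placeholders:
[MyFolder.WF:wf_MyWorkflow.ST:s_MySession]
$ParamBaseURL=http://www.mycompany.com:8080/services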

Configuring Certificate Files for SSL Authentication


Before you configure an HTTP application connection to use SSL authentication, you may need to configure
certificate files. If the Integration Service authenticates the HTTP server, you configure the trust certificates
file. If the HTTP server authenticates the Integration Service, you configure the client certificate file and the
corresponding private key file, passwords, and file types.
The trust certificates file (ca-bundle.crt) contains certificate files from major, trusted certificate authorities. If
the certificate bundle does not contain a certificate from a certificate authority that the HTTP transformation
uses, you can convert the certificate of the HTTP server to PEM format and append it to the ca-bundle.crt file.
The private key for a client certificate must be in PEM format.

Converting Certificate Files from Other Formats


Certificate files have the following formats:
♦ DER. Files with the .cer or .der extension.
♦ PEM. Files with the .pem extension.
♦ PKCS12. Files with the .pfx or .P12 extension.
When you append HTTP server certificates to the ca-bundle.crt file, the HTTP server certificate files must use
the PEM format. Use the OpenSSL utility to convert certificates from one format to another. You can get
OpenSSL at http://www.openssl.org.
For example, to convert the DER file named server.der to PEM format, use the following command:
openssl x509 -in server.der -inform DER -out server.pem -outform PEM

If you want to convert the PKCS12 file named server.pfx to PEM format, use the following command:
openssl pkcs12 -in server.pfx -out server.pem

To convert a private key named key.der from DER to PEM format, use the following command:
openssl rsa -in key.der -inform DER -outform PEM -out keyout.pem

For more information, refer to the OpenSSL documentation. After you convert certificate files to the PEM
format, you can append them to the trust certificates file. Also, you can use PEM format private key files with
HTTP transformation.
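For example, on UNIX you might append a converted certificate to the trust certificates file with a command similar to the following; the file name is a placeholder, and it is a good idea to back up ca-bundle.crt before you modify it:
cat server.pem >> <PowerCenter Installation Directory>/server/bin/ca-bundle.crt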

Adding Certificates to the Trust Certificates File


If the HTTP server uses a certificate that is not included in the ca-bundle.crt file, you can add the certificate to
the ca-bundle.crt file.

To add a certificate to the trust certificates file:

1. Use Internet Explorer to locate the certificate and create a copy:


♦ Access the HTTP server using HTTPS.
♦ Double-click the padlock icon in the status bar of Internet Explorer.
♦ In the Certificate dialog box, click the Details tab.
♦ Select the Authority Information Access field.
♦ Click Copy to File.
♦ Use the Certificate Export Wizard to copy the certificate in DER format.
2. Convert the certificate from DER to PEM format.
3. Append the PEM certificate file to the certificate bundle, ca-bundle.crt.

The ca-bundle.crt file is located in the following directory: <PowerCenter Installation
Directory>/server/bin
For more information about adding certificates to the ca-bundle.crt file, see the curl documentation at
http://curl.haxx.se/docs/sslcerts.html.

PowerChannel Relational Database Connections


Configure a relational database connection for each database you want to access through PowerChannel. If you
have configured a relational database connection, and you want to create a PowerChannel connection, you can
copy the connection.

To configure a PowerChannel relational database connection:

1. In the Workflow Manager, connect to a repository.


2. Click Connections > Relational.
The Relational Connection Browser dialog box appears, listing all the source and target database
connections.
3. Click New.
The Select Subtype dialog box appears.
4. In the Select Subtype dialog box, select the type of PowerChannel database connection you want to create.
5. Click OK.
The Connection Object Definition dialog box appears.
Table 2-4 describes the properties that you configure for a PowerChannel relational database connection:

Table 2-4. PowerChannel Relational Database Connection Information

Property Description

Name Connection name used by the Workflow Manager. Connection name cannot
contain spaces or other special characters, except for the underscore.

Type Type of database.

User Name Database user name with the appropriate read and write database
permissions to access the database. If you use Oracle OS Authentication,
IBM DB2 client authentication, or databases such as ISG Navigator that do
not allow user names, enter PmNullUser.
To define the user name in the parameter file, enter session parameter
$ParamName as the user name, and define the value in the session or
workflow parameter file. The Integration Service interprets user names that
start with $Param as session parameters.

Use Parameter in Indicates the password for the database user name is a session parameter,
Password $ParamName. If you enable this option, define the password in the workflow
or session parameter file, and encrypt it using the pmpasswd CRYPT_DATA
option. Default is disabled.

Password Password for the database user name. For Oracle OS Authentication, IBM
DB2 client authentication, or databases such as ISG Navigator that do not
allow passwords, enter PmNullPassword. For Teradata connections, this
overrides the database password in the ODBC entry.
Passwords must be in 7-bit ASCII.

Connect String Connect string used to communicate with the database. For syntax, see
“Database Connect Strings” on page 29.
Required for all databases except Microsoft SQL Server.

Code Page Code page the Integration Service uses to read from a source database or
write to a target database or file.

Database Name Name of the database. If you do not enter a database name, connection-
related messages do not show a database name when the default database
is used.

Environment SQL Runs an SQL command with each database connection. Default is disabled.

Rollback Segment Name of the rollback segment.

Server Name Database server name. Use to configure workflows.

Packet Size Use to optimize the native drivers for Sybase ASE and Microsoft SQL Server.

Domain Name The name of the domain. Used for Microsoft SQL Server on Windows.

Use Trusted If selected, the Integration Service uses Windows authentication to access
Connection the Microsoft SQL Server database. The user name that starts the Integration
Service must be a valid Windows user with access to the Microsoft SQL
Server database.

Remote PowerChannel Host name or IP address for the remote PowerChannel Server that can
Host Name access the database data.

Remote PowerChannel Port number for the remote PowerChannel Server. Make sure the PORT
Port Number attribute of the ACTIVE_LISTENERS property in the
PowerChannel.properties file uses a value that other applications on the
PowerChannel Server do not use.

Use Local Select to use compression or encryption while extracting or loading data.
PowerChannel When you select this option, you need to specify the local PowerChannel
Server address and port number. The Integration Service uses the local
PowerChannel Server as a client to connect to the remote PowerChannel
Server and access the remote database.

Local PowerChannel Host name or IP address for the local PowerChannel Server. Enter this
Host Name option when you select the Use Local PowerChannel option.

Local PowerChannel Port number for the local PowerChannel Server. Specify this option when you
Port Number select the Use Local PowerChannel option. Make sure the PORT attribute of
the ACTIVE_LISTENERS property in the PowerChannel.properties file uses
a value that other applications on the PowerChannel Server do not use.

Encryption Level Encryption level for the data transfer. Encryption levels range from 0 to 3. 0
indicates no encryption and 3 is the highest encryption level. Default is 0.
Use this option only if you have selected the Use Local PowerChannel
option.

Compression Level Compression level for the data transfer. Compression levels range from 0 to
9. 0 indicates no compression and 9 is the highest compression level. Default
is 2.
Use this option only if you have selected the Use Local PowerChannel
option.

Certificate Account Certificate account to authenticate the local PowerChannel Server to the
remote PowerChannel Server. Use this option only if you have selected the
Use Local PowerChannel option.
If you use the sample PowerChannel repository that the installation program
set up, and you want to use the default certificate account in the repository,
you can enter “default” as the certificate account.

6. Click OK.
The new PowerChannel database connection appears in the Connection Browser list.
7. To add more database connections, repeat steps 3 to 6.
8. Click OK to save all changes.



PowerExchange for JMS Connections
Before the Integration Service can extract data from JMS sources or write data to JMS targets, you must
configure application connections for JMS sources and targets in the Workflow Manager. The Integration
Service uses application connections to connect to a JMS provider during a session to read and write JMS
messages.
You must configure two types of JMS application connections:
♦ JNDI application connection
♦ JMS application connection
For more information about PowerCenter and JNDI integration, see the PowerExchange for JMS User Guide.
For more information about JMS and JNDI, see the JMS documentation.

Connection Properties for JNDI Application Connections


Configure a JNDI application connection to connect to a JNDI server during a session. When the Integration
Service connects to the JNDI server, it retrieves information from JNDI about the JMS provider during the
session. When you configure a JNDI application connection, you must specify connection properties in the
Connection Object Definition dialog box.
Table 2-5 describes the properties that you configure for a JNDI application connection:

Table 2-5. JNDI Application Connection Properties

Property Description

JNDI Context Factory Name of the context factory that you specified when you defined the context
factory for your JMS provider.

JNDI Provider URL Provider URL that you specified when you defined the provider URL for your JMS
provider.

JNDI UserName User name.

JNDI Password Password.

Connection Properties for JMS Application Connections


Configure a JMS application connection to connect to JMS providers during a session to read source messages
or write target messages. When you configure a JMS application connection, you specify connection properties
the Integration Service uses to connect to JMS providers during a session. Specify the JMS application
connection properties in the Connection Object Definition dialog box.
Table 2-6 describes the properties that you configure for a JMS application connection:

Table 2-6. JMS Application Connection Properties

Property Description

JMS Destination Type Select QUEUE or TOPIC for the JMS Destination Type. Select QUEUE if
you want to read source messages from a JMS provider queue or write
target messages to a JMS provider queue. Select TOPIC if you want to read
source messages based on the message topic or write target messages
with a particular message topic.

JMS Connection Factory Name Name of the connection factory. The name of the connection factory must
be the same as the connection factory name you configured in JNDI. The
Integration Service uses the connection factory to create a connection with
the JMS provider.

JMS Destination Name of the destination. The destination name must match the name you
configured in JNDI. Optionally, you can use the $ParamName session
parameter for the destination name.

JMS UserName User name.

JMS Password Password.

JMS Recovery Destination Recovery queue or recovery topic name, based on what you configure for
the JMS Destination Type. Configure this option when you enable recovery
for a real-time session that reads from a JMS or WebSphere MQ source and
writes to a JMS target.
Note: The session fails if the recovery destination does not match a
recovery queue or topic name in the JMS provider.

Creating JNDI and JMS Application Connections


Use the following procedure to configure JNDI and JMS application connections.

To create JNDI and JMS application connections:

1. In the Workflow Manager, connect to a PowerCenter repository.


2. Click Connections > Application.
The Application Connection Browser dialog box appears.
3. From Select Type, select JNDI Connection or JMS Connection.
4. Click New.
The Connection Object Definition dialog box appears.
5. Enter a name for the application connection.
6. Enter the connection information.
7. Click OK.
The new application connection appears in the Application Connection Browser.

PowerExchange for MSMQ Connections


Before the Integration Service can extract data from an MSMQ source or load data to an MSMQ target, you
must configure the queue connection for the source or target queue in the Workflow Manager. The queue
connection you set in the Workflow Manager is saved in the repository.

To create an MSMQ queue connection:

1. In the Workflow Manager, connect to a PowerCenter repository.


2. Click Connections > Queue.
The Queue Connection Browser dialog box displays.
3. From Select Type, select MSMQ.
4. Click New.
The Connection Object Definition dialog box appears.
5. Enter a name for the queue connection.
6. Select a code page. The code page must match the code page of the source.



7. Enter the following connection information:

Property Description

Queue Name Name of the MSMQ queue.

Machine Name Name of the MSMQ machine. If MSMQ is running on the same machine as the
Integration Service, you can enter a period (.).

Queue Type Select public if the MSMQ queue is a public queue. Select private if the MSMQ
queue is a private queue.

8. Click OK.
The new queue connection appears in the Message Queue connection list.

RELATED TOPICS:
♦ “Connection Object Code Pages” on page 20

PowerExchange for PeopleSoft Connections


Before the Integration Service can access PeopleSoft sources, configure an application connection for the
PeopleSoft source database in the Workflow Manager.
The application connection defines how the Integration Service accesses the underlying database for the
PeopleSoft system. When you run a session using PeopleSoft sources, the Integration Service extracts data from
the underlying physical database tables of the PeopleSoft system.

To create a PeopleSoft application connection:

1. In the Workflow Manager, connect to a repository.


2. Click Connections > Application.
The Application Connection Browser appears.
3. Select the PeopleSoft application type and click New.
The Connection Object Definition dialog box appears.
4. Enter the following connection information:

Property Description

Name Name you want to use for this connection.

User Name Database user name with SELECT permission on physical database tables in the
PeopleSoft source system.
To define the user name in the parameter file, enter session parameter $ParamName as
the user name, and define the value in the session or workflow parameter file. The
Integration Service interprets user names that start with $Param as session parameters.

Use Indicates the password for the database user name is a session parameter,
Parameter in $ParamName. If you enable this option, define the password in the workflow or session
Password parameter file, and encrypt it using the pmpasswd CRYPT_DATA option. Default is
disabled.

Password Password for the database user name. Must be in US-ASCII.

Connect Connect string for the underlying database of the PeopleSoft system. For more
String information, see Table 2-1 on page 29. This option appears for DB2, Oracle, and
Informix.

Code Page Code page the Integration Service uses to extract data from the source database. When
using relaxed code page validation, select compatible code pages for the source and
target data to prevent data inconsistencies.
For more information about PeopleSoft code pages, see the PowerExchange for
PeopleSoft User Guide.

Language PeopleSoft language code. Enter a language code for language-sensitive data. When
Code you enter a language code, the Integration Service extracts language-sensitive data from
related language tables. If no data exists for the language code, the Integration Service
extracts data from the base table.
When you do not enter a language code, the Integration Service extracts all data from
the base table. For a list of PeopleSoft language codes, see the PowerExchange for
PeopleSoft User Guide.

Database Name of the underlying database of the PeopleSoft system. This option appears for
Name Sybase ASE and Microsoft SQL Server.

Server Name of the server for the underlying database of the PeopleSoft system. This option
Name appears for Sybase ASE and Microsoft SQL Server.

Domain Name Domain name for Microsoft SQL Server on Windows.

Packet Size Packet size used to transmit data. This option appears for Sybase ASE and Microsoft
SQL Server.

Use Trusted If selected, the Integration Service uses Windows authentication to access the Microsoft
Connection SQL Server database. The user name that starts the Integration Service must be a
valid Windows user with access to the Microsoft SQL Server database. This option
appears for Microsoft SQL Server.

Rollback Name of the rollback segment for the underlying database of the PeopleSoft system.
Segment This option appears for Oracle.

Environment SQL commands used to set the environment for the underlying database of the
SQL PeopleSoft system.

5. Click OK.
The new application connection appears in the Application Object Browser.
6. To add more application connections, repeat steps 3 to 5.
7. Click OK to save all changes.
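
Several of the properties above reference session parameters and the pmpasswd command line program. As a hedged sketch (the pmpasswd syntax, folder, workflow, session, and parameter names shown here are illustrative and should be verified against the Command Reference for your PowerCenter version), you might encrypt the password and define both values in a parameter file as follows:

   pmpasswd MySecretPwd -e CRYPT_DATA

   [MyFolder.WF:wf_LoadPeopleSoft.ST:s_ExtractPS]
   $ParamPSUser=psftuser
   $ParamPSPwd=<encrypted value returned by pmpasswd>

In the connection object, enter $ParamPSUser as the user name, enable Use Parameter in Password, and enter $ParamPSPwd as the password.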

PowerExchange for Salesforce Connections


Before the Integration Service can extract data from Salesforce sources or write data to Salesforce targets, you
must configure an application connection for Salesforce sources and targets. The application connection stores
the Salesforce user ID, password, and end point URL information for the run-time connection. Each Salesforce
source or target in a mapping references a Salesforce application connection object. You can use multiple
Salesforce application connections in a mapping to access different sets of Salesforce data for the sources and
targets. You use the Workflow Manager to assign the Salesforce application connections to the sources and
targets.

To configure Salesforce application connections:

1. In the Workflow Manager, connect to a repository.


2. Click Connections > Application.
The Application Connection Browser dialog box appears.
3. From Select Type, select Salesforce Connection.



4. Click New.
The Connection Object Definition dialog box appears.
5. Enter a name for the application connection.
6. Enter the Salesforce user name for the application connection.
The Integration Service uses this user name to log in to Salesforce.com.
7. Enter the password for the Salesforce user name.
8. Enter a value for the connection attribute.
Table 2-7 describes the connection attribute for a Salesforce application connection:

Table 2-7. Salesforce Application Connection Attribute

Connection Attribute Description

Service URL URL of the Salesforce service you want to access.
Default is https://www.salesforce.com/services/Soap/u/7.0.
In a test or development environment, you might want to access the Salesforce
Sandbox testing environment. For more information about the Salesforce Sandbox,
see the Salesforce documentation.

9. Click OK.
The new application connection appears in the Application Object Browser.

PowerExchange for SAP NetWeaver Connections


Before the Integration Service can read data from SAP or write data to SAP, you must configure connections in
the Workflow Manager. When you configure a connection, you specify properties that the Integration Service
uses to connect to a source or target during a session.
Depending on the method of integration with mySAP applications, configure the following types of
connections:
♦ SAP R/3 application connection. Configure application connections to access the SAP system when you run
a stream or file mode session. For more information about configuring SAP R/3 connections, see
“Configuring an SAP R/3 Application Connection for ABAP Integration” on page 47.
♦ FTP connection. Configure FTP connections to access the staging file through FTP. For more information
about configuring FTP connections, see “Configuring an FTP Connection for ABAP Integration” on
page 48.
♦ SAP_ALE _IDoc_Reader and SAP_ALE _IDoc_Writer connections. Configure SAP_ALE_IDoc_Reader
connections to receive IDocs and business content integration documents using ALE. Configure
SAP_ALE_IDoc_Writer connections to send IDocs using ALE. For more information about configuring
SAP_ALE_IDoc_Writer connections, see “Configuring Application Connections for ALE Integration” on
page 49.
♦ SAP RFC/BAPI interface connection. Configure SAP RFC/BAPI Interface connections if you want to
process data in SAP using BAPI/RFC transformations. For more information about configuring SAP
RFC/BAPI connections, see “Configuring an Application Connection for BAPI/RFC Integration” on
page 50.



Table 2-8 describes the type of connection you need depending on the method of integration with mySAP
applications:

Table 2-8. Types of Connections for SAP Sessions

Connection Type Integration Method

SAP R/3 application connection ABAP integration with stream and file mode sessions.

FTP connection ABAP integration with file mode sessions.

SAP_ALE _IDoc_Reader connection IDoc ALE and business content integration.

SAP_ALE _IDoc_Writer connection IDoc ALE and business content integration.

SAP RFC/BAPI interface connection BAPI/RFC integration.

Configuring an SAP R/3 Application Connection for ABAP Integration


Before you can configure a session, you need to create a source connection to the SAP system. The application
connections for SAP sources use one of the following connections:
♦ CPI-C. Use a CPI-C connection when you extract data through stream mode. The connection information
for CPI-C is stored in the sideinfo file.
♦ RFC. Use an RFC connection when you extract data through file mode. The connection information for
RFC is stored in saprfc.ini. You must also have authorizations on the SAP system to read SAP tables and to
run file mode and stream mode sessions.

Configuring an Application Connection for a Stream Mode Session


When you configure an application connection for a stream mode session, the connect string you use in the
application connection must match the connect string in the sideinfo file. For example, if the connect string in
the sideinfo file is defined in lowercase, use lowercase to enter the connect string parameter in the application
connection configuration.

Configuring one Application Connection for Stream and File Mode Sessions
You can create separate application connections for file and stream mode, or you can create one connection for
both file and stream mode. Create separate entries if the SAP administrator creates separate authorization
profiles.
To create one connection for both modes, the following conditions must be true:
♦ The saprfc.ini and sideinfo files must have the same entries for connection string and client.
♦ The SAP administrator must have created a single profile with authorizations for both file and stream mode
sessions. For more information about creating a profile, see the PowerExchange for SAP NetWeaver User
Guide.
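
As a hedged illustration of the file mode configuration described above (the DEST name, host name, and system number are hypothetical, and the required parameters depend on your SAP landscape), a Type A entry in saprfc.ini might look like the following:

   DEST=SAPR3DEV
   TYPE=A
   ASHOST=sapapp01.example.com
   SYSNR=00

The connect string you enter in the application connection must match the DEST value. For stream mode, the connect string must match the DEST entry in the sideinfo file in the same way, including case.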

Steps to Configure an SAP R/3 Application Connection


Complete the following procedure to configure an SAP R/3 application connection.

To create an SAP R/3 application connection:

1. In the Workflow Manager, connect to a repository.


2. Click Connections > Application.
The Application Connection Browser dialog box appears.
3. From the Select Type list, select SAP R3 as the application connection type.
4. Click New.
The Connection Object Definition dialog box appears.



5. Enter the following information for the application connection, depending on whether you are configuring
an application connection for a stream or file mode session.
Table 2-9 describes the properties that you configure for an SAP R/3 connection:

Table 2-9. SAP R/3 Application Connection Properties

Property Description (applies to both CPI-C stream mode and RFC file mode unless noted)

Name Connection name used by the Workflow Manager.

User Name For CPI-C (stream mode), an SAP user name with authorization on S_CPIC and
S_TABU_DIS objects. For RFC (file mode), an SAP user name with authorization on
S_DATASET, S_TABU_DIS, S_PROGRAM, and B_BTCH_JOB objects.
To define the user name in the parameter file, enter session parameter $ParamName as
the user name, and define the value in the session or workflow parameter file. The
Integration Service interprets user names that start with $Param as session parameters.

Use Parameter in Password Indicates the password for the SAP user name is a session parameter,
$ParamName. If you enable this option, define the password in the workflow or session
parameter file, and encrypt it using the pmpasswd CRYPT_DATA option. Default is
disabled.

Password Password for the SAP user name.

Connect String For CPI-C (stream mode), the DEST entry in the sideinfo file. For RFC (file mode), the
Type A DEST entry in saprfc.ini.

Code Page* Code page compatible with the SAP server. The code page must correspond to the
Language Code.

Client Code SAP client number.

Language Code* Language code that corresponds to the SAP language.
* For more information about code pages, see the PowerCenter Administrator Guide. For more information about
selecting a language code, see the PowerExchange for SAP NetWeaver User Guide.

6. Click OK.
The new application connection appears in the Application Connection Browser list.
7. To add more application connections, repeat steps 3 to 6.
8. Click Close.

Configuring an FTP Connection for ABAP Integration


When you run a file mode session, you can configure the session to access the staging file on the SAP system
through FTP. Before you can access a file using FTP, you must configure an FTP connection in the Workflow
Manager.

RELATED TOPICS:
♦ “FTP Connections” on page 34



Configuring Application Connections for ALE Integration
To receive outbound IDocs and business content integration documents from SAP using ALE, create an
SAP_ALE_IDoc_Reader application connection in the Workflow Manager. To send inbound IDocs to SAP
using ALE, create an SAP_ALE_IDoc_Writer application connection in the Workflow Manager.

Configuring an SAP_ALE_IDoc_Reader Application Connection


Configure the SAP_ALE_IDoc_Reader connection properties with the Type R destination entry in saprfc.ini.
Make sure the Program ID for this destination entry matches the Program ID for the logical system you defined
in SAP to receive IDocs or consume business content data. For business content integration, set the
Program ID to INFACONTNT. For more information, see the PowerExchange for SAP NetWeaver User Guide.
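
As a hedged sketch (the DEST name, gateway host, and gateway service are hypothetical), a Type R destination entry in saprfc.ini for this connection might look like the following, where PROGID matches the Program ID of the logical system defined in SAP:

   DEST=INFA_IDOC_READER
   TYPE=R
   PROGID=INFACONTNT
   GWHOST=sapgw01.example.com
   GWSERV=sapgw00

Verify the entry against the PowerExchange for SAP NetWeaver User Guide for your release before you use it.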

To create an SAP_ALE_IDoc_Reader application connection:

1. In the Workflow Manager, connect to a repository.


2. Click Connections > Application.
The Application Connection Browser dialog box appears.
3. From the Select Type list, select SAP_ALE_IDoc_Reader as the application connection type for outbound
IDoc or business content integration sessions.
4. Click New.
The Connection Object Definition dialog box appears.
5. Enter the connection information.
Table 2-10 describes the properties that you configure for an SAP_ALE_IDoc_Reader application
connection:

Table 2-10. SAP_ALE_IDoc_Reader Application Connection Properties

Property Description

Name Connection name used by the Workflow Manager.

Code Page Code page compatible with the SAP server. For more information about code
pages, see “Connection Object Code Pages” on page 20.

Destination Entry Type R DEST entry in saprfc.ini. The Program ID for this destination entry must be
the same as the Program ID for the logical system you defined in SAP to receive
IDocs or consume business content data. For business content integration, set to
INFACONTNT. For more information, see the PowerExchange for SAP NetWeaver
User Guide.

6. Click OK.
The new application connection appears in the Application Connection Browser list.

Configuring an SAP_ALE_IDoc_Writer Application Connection


Configure the SAP_ALE_IDoc_Writer connection properties with the Type A destination entry in saprfc.ini.

To create an SAP_ALE_IDoc_Writer application connection:

1. In the Workflow Manager, connect to a repository.


2. Click Connections > Application.
The Application Connection Browser dialog box appears.
3. From the Select Type list, select SAP_ALE_IDoc_Writer as the application connection type for inbound
IDoc sessions.
4. Click New.



The Connection Object Definition dialog box appears.
5. Enter the connection information.
Table 2-11 describes the properties that you configure for an SAP_ALE_IDoc_Writer application
connection:

Table 2-11. SAP_ALE_IDoc_Writer Application Connection Properties

Property Description

Name Connection name used by the Workflow Manager.

User Name SAP user name with authorization on S_DATASET, S_TABU_DIS, S_PROGRAM,
and B_BTCH_JOB objects.
To define the user name in the parameter file, enter session parameter
$ParamName as the user name, and define the value in the session or workflow
parameter file. The Integration Service interprets user names that start with
$Param as session parameters.

Use Parameter in Indicates the password for the SAP user name is a session parameter,
Password $ParamName. If you enable this option, define the password in the workflow or
session parameter file, and encrypt it using the pmpasswd CRYPT_DATA option.
Default is disabled.

Password Password for the SAP user name.


Note: If you want to run a session on Linux 32-bit for an IDoc mapping from
PowerCenter Connect for SAP R/3 6.x, and you want to connect to SAP 4.60, enter
the password in upper case. The SAP system must also use upper case
passwords.

Connect String Type A DEST entry in saprfc.ini.

Code Page* Code page compatible with the SAP server. Must also correspond to the Language
Code.

Language Code* Language code that corresponds to the SAP language.

Client Code SAP client number.


* For more information about code pages, see the PowerCenter Administrator Guide. For more information about
selecting a language code, see the PowerExchange for SAP NetWeaver User Guide.

6. Click OK.
The new application connection appears in the Application Connection Browser list.
7. Click Close.

Configuring an Application Connection for BAPI/RFC Integration


If you want to process BAPI/RFC data in SAP, you must create an application connection of the SAP
RFC/BAPI Interface type in the Workflow Manager. The Integration Service uses this connection to connect to
SAP and make BAPI/RFC function calls to extract, transform, or load data.

To create an application connection of the SAP RFC/BAPI Interface type:

1. In the Workflow Manager, connect to a repository.


2. Click Connections > Application.
The Application Connection Browser dialog box appears.
3. From the Select Type list, select SAP RFC/BAPI Interface.
4. Click New.
The Connection Object Definition dialog box appears.
5. Enter the connection properties for the connection.



Table 2-12 describes the properties that you configure for an SAP RFC/BAPI application connection:

Table 2-12. SAP RFC/BAPI Application Connection Properties

Property Description

Name Connection name used by the Workflow Manager.

User Name SAP user name with authorization on S_DATASET, S_TABU_DIS, S_PROGRAM,
and B_BTCH_JOB objects.
To define the user name in the parameter file, enter session parameter
$ParamName as the user name, and define the value in the session or workflow
parameter file. The Integration Service interprets user names that start with $Param
as session parameters.

Use Parameter in Indicates the password for the SAP user name is a session parameter,
Password $ParamName. If you enable this option, define the password in the workflow or
session parameter file, and encrypt it using the pmpasswd CRYPT_DATA option.
Default is disabled.

Password Password for the SAP user name.


Note: If you want to run a session on 32-bit Linux, and you want to connect to SAP
4.60, enter the password in upper case. The SAP system must also use upper case
passwords.

Connect String Type A DEST entry in saprfc.ini.

Code Page* Code page compatible with the SAP server. Must also correspond to the Language
Code.

Language Code* Language code that corresponds to the SAP language.

Client Code SAP client number.


* For more information about code pages, see the PowerCenter Administrator Guide. For more information about
selecting a language code, see the PowerExchange for SAP NetWeaver User Guide.

6. Click OK.
The new application connection appears in the Application Connection Browser list.
7. Click Close.

PowerExchange for SAP NetWeaver BI Connections


You can configure the following types of connection objects to connect to SAP NetWeaver BI:
♦ SAP BW OHS application connection to connect to SAP NetWeaver BI sources.
♦ SAP BW application connection to connect to SAP NetWeaver BI targets.

Configuring an SAP BW OHS Application Connection


Complete the following procedure to configure an SAP BW OHS application connection.

To create an SAP BW OHS application connection:

1. In the Workflow Manager, connect to a repository.


2. Click Connections > Application.
The Application Connection Browser dialog box appears.
3. From the Select Type list, select BW_OHS_READER to create a connection object for SAP NetWeaver BI
sources.
4. Click New.



The Connection Object Definition dialog box appears.
5. Enter the connection information.
Table 2-13 describes the properties that you configure for an SAP BW OHS application connection:

Table 2-13. SAP BW OHS Application Connection Properties

Property Description

Name Connection name used by the Workflow Manager.

User Name SAP NetWeaver BI user name.
To define the user name in the parameter file, enter session parameter
$ParamName as the user name, and define the value in the session or workflow
parameter file. The Integration Service interprets user names that start with $Param
as session parameters.

Use Parameter Indicates the SAP NetWeaver BI password is a session parameter, $ParamName. If
in Password you enable this option, define the password in the workflow or session parameter
file, and encrypt it using the pmpasswd CRYPT_DATA option. Default is disabled.

Password SAP NetWeaver BI password.

Connect String Type A DEST entry in saprfc.ini. The Integration Service uses saprfc.ini to connect
to the SAP NetWeaver BI system.

Code Page* Code page compatible with the SAP NetWeaver BI server.

Client Code SAP NetWeaver BI client. Must match the client you use to log on to the SAP
NetWeaver BI server.

Language Code* Language code that corresponds to the code page.


*For more information about code pages, see the PowerExchange for SAP NetWeaver User Guide.

6. Click OK.
The new application connection appears in the Application Connection Browser list.
7. Click Close.

Configuring an SAP BW Application Connection


Complete the following procedure to configure an SAP BW application connection.

To create an SAP BW application:

1. In the Workflow Manager, connect to a repository.


2. Click Connections > Application.
The Application Connection Browser dialog box appears.
3. Select SAP BW to create a connection object for SAP NetWeaver BI targets.
4. Click New.
The Connection Object Definition dialog box appears.
5. Enter the connection information.



Table 2-14 describes the properties that you configure for an SAP BW application connection:

Table 2-14. SAP BW Application Connection Properties

Property Description

Name Connection name used by the Workflow Manager.

User Name SAP NetWeaver BI user name.
To define the user name in the parameter file, enter session parameter
$ParamName as the user name, and define the value in the session or workflow
parameter file. The Integration Service interprets user names that start with $Param
as session parameters.

Use Parameter Indicates the SAP NetWeaver BI password is a session parameter, $ParamName. If
in Password you enable this option, define the password in the workflow or session parameter
file, and encrypt it using the pmpasswd CRYPT_DATA option. Default is disabled.

Password SAP NetWeaver BI password.

Connect String Type A DEST entry in saprfc.ini. The Integration Service uses saprfc.ini to connect
to the SAP NetWeaver BI system. If you do not enter a connection string, the
Integration Service obtains the connection parameters from the SAP BW Service.

Code Page* Code page compatible with the SAP NetWeaver BI server.

Client Code SAP NetWeaver BI client. Must match the client you use to log in to the SAP
NetWeaver BI server.

Language Code* Language code that corresponds to the code page.


*For more information about code pages, see the PowerExchange for SAP NetWeaver User Guide.

6. Click OK.
The new application connection appears in the Application Connection Browser list.
7. Click Close.

PowerExchange for Siebel Connections


Before the Integration Service can access Siebel sources in a session, you must create an application connection
in the Workflow Manager. The Workflow Manager saves application connections in the repository.
The application connection defines how the Integration Service accesses the underlying database for the Siebel
system. When you run a workflow using Siebel sources, the Integration Service extracts data from the
underlying physical database tables of the Siebel system.

To create a Siebel application connection:

1. In the Workflow Manager, connect to a repository.


2. Click Connections > Application.
The Application Connection Browser appears.
3. Select a Siebel application type and click New.
The Connection Object Definition dialog box appears.
4. Enter the connection information.



Table 2-15 describes the properties that you configure for a Siebel application connection:

Table 2-15. Siebel Application Connection Properties

Property Description

Name Name you want to use for this connection.

User Name Database user name with SELECT permission on physical database tables in the
Siebel source system.
To define the user name in the parameter file, enter session parameter
$ParamName as the user name, and define the value in the session or workflow
parameter file. The Integration Service interprets user names that start with
$Param as session parameters.

Use Parameter in Indicates the password for the database user name is a session parameter,
Password $ParamName. If you enable this option, define the password in the workflow or
session parameter file, and encrypt it using the pmpasswd CRYPT_DATA option.
Default is disabled.

Password Password for the database user name. Must be in US-ASCII.

Connect String Connect string for the underlying database of the Siebel system. For more
information, see Table 2-1 on page 29. This option appears for DB2, Oracle, and
Informix.

Code Page Specifies the code page the Integration Service uses to extract data from the
source database. When using relaxed code page validation, select compatible code
pages for the source and target data to prevent data inconsistencies.

Database Name Name of the underlying database of the Siebel system. This option appears for
Sybase ASE and Microsoft SQL Server.

Server Name Host name for the underlying database of the Siebel system. This option appears
for Sybase ASE and Microsoft SQL Server.

Domain Name Domain name for Microsoft SQL Server on Windows.

Packet Size Packet size used to transmit data. This option appears for Sybase ASE and
Microsoft SQL Server.

Use Trusted If selected, the Integration Service uses Windows authentication to access the
Connection Microsoft SQL Server database. The user name that starts the Integration Service
must be a valid Windows user with access to the Microsoft SQL Server database.
This option appears for Microsoft SQL Server.

Rollback Segment Name of the rollback segment for the underlying database of the Siebel system.
This option appears for Oracle.

Environment SQL SQL commands used to set the environment for the underlying database of the
Siebel system.

5. Click OK.
The new application connection appears in the Application Connection Browser.
6. To add more application connections, repeat steps 3 to 5.
7. Click OK to save all changes.
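
The Environment SQL property above accepts SQL that the Integration Service runs with each connection to the underlying database. As a hedged example for an Oracle-based Siebel system (the format mask is only illustrative), you might enter:

   ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD HH24:MI:SS'

Separate multiple statements with semicolons if you need more than one command.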

PowerExchange for TIBCO Connections


Before the Integration Service can extract data from TIBCO sources or write data to TIBCO targets, you must
configure an application connection for TIBCO sources and targets in the Workflow Manager. The Integration
Service uses the application connection to connect to TIBCO during a session to read and write TIBCO
messages.



You can configure the following TIBCO application connection types:
♦ TIB/Rendezvous. Configure to read or write messages in TIB/Rendezvous format.
♦ TIB/Adapter SDK. Configure to read or write messages in AE format.

Connection Properties for TIB/Rendezvous Application Connections


Configure a TIB/Rendezvous application connection to connect to TIBCO during a session to read source
messages or write target messages in TIB/Rendezvous format. When you configure a TIB/Rendezvous
application connection, you specify connection properties for the Integration Service to connect to a TIBCO
daemon. Enter connection properties in the connection object.
Table 2-16 describes the properties you configure for a TIB/Rendezvous application connection:

Table 2-16. TIB/Rendezvous Application Connection Properties

Connection Attributes Description

Name Name you want to use for this connection.

Code Page Code page the Integration Service uses to extract data from TIBCO. When using relaxed
code page validation, select compatible code pages for the source and target data to prevent
data inconsistencies.

Subject Default subject for source and target messages. During a session, the Integration Service
reads messages with this subject from TIBCO sources. It also writes messages with this
subject to TIBCO targets.
You can overwrite the default subject for TIBCO targets when you link the SendSubject port in
a TIBCO target definition in a mapping. For more information about the SendSubject field in
TIBCO target definitions, see the PowerExchange for TIBCO User Guide. For more
information about subject names, see the TIBCO documentation.

Service Service attribute value. Enter a value if you want to include a service name, service number,
or port number. For more information about the service attribute, see the TIBCO
documentation.

Network Network attribute value. Enter a value if your machine contains more than one network card.
For more information about specifying a network attribute, see the TIBCO documentation.

Daemon TIBCO daemon you want to connect to during a session. If you leave this option blank, the
Integration Service connects to the local daemon during a session.
If you want to specify a remote daemon, which resides on a different host than the Integration
Service, enter the following values:
<remote hostname>:<port number>
For example, you can enter host2:7501 to specify a remote daemon. For more information
about the TIBCO daemon, see the TIBCO documentation.

Certified Select if you want the Integration Service to read or write certified messages. For more
information about reading and writing certified messages in a session, see the
PowerExchange for TIBCO User Guide.

CmName Unique CM name for the CM transport when you choose certified messaging. For more
information about CM transports, see the TIBCO documentation.

Relay Agent Enter a relay agent when you choose certified messaging and the machine running the
Integration Service is not constantly connected to a network. The Relay Agent name must be
fewer than 127 characters. For more information about relay agents, see the TIBCO
documentation.

Ledger File Enter a unique ledger file name when you want the Integration Service to read or write
certified messages. The ledger file records the status of each certified message.
Configure a file-based ledger when you want the TIBCO daemon to send unconfirmed
certified messages to TIBCO targets. You also configure a file-based ledger with Request Old
when you want the Integration Service to receive unconfirmed certified messages from
TIBCO sources. For more information about file-based ledgers, see the TIBCO
documentation.

Synchronized Select if you want PowerCenter to wait until it writes the status of each certified message to
Ledger the ledger file before continuing message delivery or receipt.


Request Old Select if you want the Integration Service to receive certified messages that it did not confirm
with the source during a previous session run. When you select Request Old, you should also
specify a file-based ledger for the Ledger File attribute. For more information about Request
Old, see the TIBCO documentation.

User Register the user certificate with a private key when you want to connect to a secure
Certificate TIB/Rendezvous daemon during the session. The text of the user certificate must be in PEM
encoding or PKCS #12 binary format.

Username Enter a user name for the secure TIB/Rendezvous daemon.

Password Enter a password for the secure TIB/Rendezvous daemon.

Connection Properties for TIB/Adapter SDK Connections


Configure a TIB/Adapter SDK application connection to connect to TIB/Adapter SDK during a session to read
source messages or write target messages in AE format. When you configure a TIB/Adapter SDK connection,
you specify properties for the TIBCO adapter instance through which you want to connect to TIBCO during a
session. Specify connection properties in the Connection Object Definition dialog box.
Note: The adapter instances you specify in TIB/Adapter SDK connections should only contain one session.

Table 2-17 describes the connection properties you configure for a TIB/Adapter SDK application connection:

Table 2-17. TIB/Adapter SDK Application Connection Properties

Connection Attributes Description

Name Name you want to use for this connection.

Code Page Code page the Integration Service uses to extract data from TIBCO. When using
relaxed code page validation, select compatible code pages for the source and target
data to prevent data inconsistencies.

Subject Default subject for source and target messages. During a workflow, the Integration
Service reads messages with this subject from TIBCO sources. It also writes messages
with this subject to TIBCO targets.
You can overwrite the default subject for TIBCO targets when you link the SendSubject
port in a TIBCO target definition in a mapping. For more information about the
SendSubject field in TIBCO target definitions, see the PowerExchange for TIBCO User
Guide.
For more information about subject names, see the TIBCO documentation.

Application Name Name of an adapter instance.

Repository URL URL for the TIB/Repository instance you want to connect to. You can enter the server
process variable $PMSourceFileDir for the Repository URL.

Configuration URL URL for the adapter instance.

Session Name Name of the TIBCO session associated with the adapter instance.

Validate Messages Select Validate Messages when you want the Integration Service to read and write
messages in AE format.



Configuring TIBCO Application Connections
Use the following procedure to configure TIB/Rendezvous or TIB/Adapter SDK application connections.

To create a TIBCO application connection:


1. In the Workflow Manager, connect to a PowerCenter repository.
2. Click Connections > Application.
The Application Object Browser dialog box appears.
3. From Select Type, select one of the following types:
♦ TIB/Rendezvous
♦ TIB/Adapter SDK
4. Click New.
The Connection Object Definition dialog box appears.
5. Enter a name for the application connection and verify the code page.
6. Enter the connection information.
7. Click OK.
The new TIBCO application connection appears in the Application Object Browser.

RELATED TOPICS:
♦ “Connection Object Code Pages” on page 20

PowerExchange for Web Services Connections


The Integration Service can use Web Services Consumer application connections to extract data from web
service sources, write data to web service targets, or transform data using Web Services Consumer
transformations. Web Services Consumer application connections allow you to control connection properties,
including the endpoint URL and authentication parameters.
To connect to a web service, the Integration Service requires an endpoint URL. If you do not configure a Web
Services Consumer application connection or if you configure one without providing an endpoint URL, the
Integration Service uses the endpoint URL contained in the WSDL file on which the source, target, or Web
Services Consumer transformation is based.
Use the following guidelines to determine when to configure a Web Services Consumer application connection:
♦ Configure a Web Services Consumer application connection with an endpoint URL if the web service you
connect to requires authentication or if you want to use an endpoint URL that differs from the one
contained in the WSDL file.
♦ Configure a Web Services Consumer application connection without an endpoint URL if the web service
you connect to requires authentication but you want to use the endpoint URL contained in the WSDL file.
♦ You do not need to configure a Web Services Consumer application connection if the web service you
connect to does not require authentication and you want to use the endpoint URL contained in the WSDL
file.
If you need to configure SSL authentication, enter values for the SSL authentication-related properties in the
Web Services application connection. For information about SSL authentication, see the PowerExchange for Web
Services User Guide.



To create a Web Services Consumer application connection:

1. In the Workflow Manager, connect to a PowerCenter repository.


2. Click Connections > Application.
The Application Connection Browser dialog box appears.
3. From Select Type, select Web Services Consumer.
4. Click New.
The Connection Object Definition dialog box appears.
5. Enter a name for the application connection.
6. Enter the following connection information:

Connection Attributes Description

User Name User name that the web service requires. If the web service does not require a
user name, enter PmNullUser.
To define the user name in the parameter file, enter session parameter
$ParamName as the user name, and define the value in the session or workflow
parameter file. The Integration Service interprets user names that start with
$Param as session parameters.

Use Parameter in Indicates the web service password is a session parameter, $ParamName. If you
Password enable this option, define the password in the workflow or session parameter file,
and encrypt it using the pmpasswd CRYPT_DATA option. Default is disabled.

Password Password that the web service requires. If the web service does not require a
password, enter PmNullPasswd.

Code Page Connection code page. The Repository Service uses the character set encoded
in the repository code page when writing data to the repository.

End Point URL Endpoint URL for the web service that you want to access. The WSDL file
specifies this URL in the location element.
You can use session parameter $ParamName, a mapping parameter, or a
mapping variable as the endpoint URL. For example, you can use a session
parameter, $ParamMyURL, as the endpoint URL, and set $ParamMyURL to the
URL in the parameter file.

Domain Domain for authentication.

Timeout Number of seconds the Integration Service waits for a connection to the web
service provider before it closes the connection and fails the session. Also, the
number of seconds the Integration Service waits for a SOAP response after
sending a SOAP request before it fails the session.

Trust Certificates File containing the bundle of trusted certificates that the Integration Service uses
File when authenticating the SSL certificate of the web services provider. Default is
ca-bundle.crt.

Certificate File Client certificate that a web service provider uses when authenticating a client.
You specify the client certificate file if the web service provider needs to
authenticate the Integration Service.

Certificate File Password for the client certificate. You specify the certificate file password if the
Password web service provider needs to authenticate the Integration Service.

Certificate File File type of the client certificate. You specify the certificate file type if the web
Type service provider needs to authenticate the Integration Service. The file type can
be either PEM or DER. For information about converting the file type of certificate
files, see the PowerExchange for Web Services User Guide.

Private Key File Private key file for the client certificate. You specify the private key file if the web
service provider needs to authenticate the Integration Service.

Key Password Password for the private key of the client certificate. You specify the key
password if the web service provider needs to authenticate the Integration
Service.


Key File Type File type of the private key of the client certificate. You specify the key file type if
the web service provider needs to authenticate the Integration Service.
PowerExchange for Web Services requires the PEM file type for SSL
authentication. For information, see the PowerExchange for Web Services User
Guide.

Authentication Select one of the following authentication types to use when the web service
Type provider does not return an authentication type to the Integration Service:
- Auto. The Integration Service attempts to determine the authentication type of
the web service provider.
- Basic. Based on a non-encrypted user name and password.
- Digest. Based on an encrypted user name and password.
- NTLM. Based on encrypted user name, password, and domain.
Default is Auto.

7. Click OK.
The application connection appears in the Application Connection Browser.
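
The End Point URL attribute above can also be supplied through a session parameter. As a hedged sketch (the folder, workflow, session, parameter name, and URL are hypothetical), you might enter $ParamMyURL as the endpoint URL in the connection object and define it in the parameter file:

   [MyFolder.WF:wf_WebServices.ST:s_CallQuoteService]
   $ParamMyURL=https://ws.example.com/services/QuoteService

This lets you switch between test and production endpoints by editing only the parameter file.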

PowerExchange for webMethods Connections


Before the Integration Service can extract data from webMethods sources or write data to webMethods targets,
you must configure an application connection for webMethods sources and targets in the Workflow Manager.
When you configure a webMethods application connection, you specify connection properties the Integration
Service uses to connect to a webMethods Broker during a session.
The Integration Service uses webMethods application connections to connect to a webMethods Broker when it
reads webMethods source documents and writes webMethods target documents.

To create a webMethods application connection:

1. In the Workflow Manager, connect to a PowerCenter repository.


2. Click Connections > Application.
The Application Connection Browser dialog box appears.
3. From Select Type, select webMethods Broker.
4. Click New.
The Connection Object Definition dialog box appears.
5. Enter the connection information.
Table 2-18 describes the properties that you configure for a webMethods application connection:

Table 2-18. webMethods Application Connection Properties

Property Description

Name Name you want to use for this connection.

Broker Host Enter the host name of the Broker you want the Integration Service to connect to. If the
port number for the Broker is not the default port number, also enter the port number.
Default port number is 6849.
Enter the host name and port number in the following format:
<host name:port>

Broker Name Enter the name of the Broker. If you do not enter a Broker name, the Integration
Service uses the default Broker.


Client ID Enter a client ID for the Integration Service to use when it connects to the Broker
during the session. If you do not enter a client ID, the Broker generates a random client
ID.
If you select Preserve Client State, enter a client ID.

Client Group Enter the name of the group to which the client belongs.

Application Name Enter the name of the application that will run the Broker Client. Default is “Informatica
PowerCenter Connect for webMethods.”

Automatic Select this option to enable the Integration Service to reconnect to the Broker if the
Reconnection connection to the Broker is lost.

Preserve Select this option to maintain the client state across sessions. The client state is the
Client State information the Broker keeps about the client, such as the client ID, application name,
and client group.
Preserving the client state enables the webMethods Broker to retain documents it
sends when a subscribing client application, such as the Integration Service, is not
listening for documents. Preserving the client state also allows the Broker to maintain
the publication ID sequence across sessions when writing documents to webMethods
targets.
If you select this option, configure a Client ID in the application connection. You should
also configure guaranteed storage for your webMethods Broker. For more information
about storage types, see the webMethods documentation.
If you do not select this option, the Integration Service destroys the client state when it
disconnects from the Broker.

6. Click OK.
The application connection appears in the Application Object Browser.

PowerExchange for WebSphere MQ Connections


Before the Integration Service can read data from a WebSphere MQ source or write data to WebSphere MQ
target, configure a Message Queue queue connection for source and target queues.

To configure a Message Queue queue connection:

1. In the Workflow Manager, connect to a repository.


2. Click Connections > Queue.
The Queue Connection Browser appears.
3. Click New.
4. Select Message Queue as the subtype and click OK.
The Connection Object Definition dialog box appears.
5. Enter the connection information.
Table 2-19 describes the properties that you configure for a Message Queue queue connection:

Table 2-19. Message Queue Connection Properties

Property Description

Name Name you want to use for this connection.

Code Page Code page that is the same as or a subset of the code page of the queue
manager coded character set identifier (CCSID).


Queue Manager Name of the queue manager for the message queue.

Queue Name Name of the message queue.

Connection Retry Period Number of seconds the Integration Service attempts to reconnect to the
WebSphere MQ queue if the connection fails. If the Integration Service
cannot reconnect to the WebSphere MQ queue in the retry period, the
session fails. Default is 0.

6. Click OK.
The queue connection appears in the Message Queue connection list.
7. To add more connections, repeat steps 3 to 6.
To edit or delete a queue connection, select the queue connection from the list and click the appropriate button.

Test Queue Connections


Before you use PowerExchange for WebSphere MQ to extract data from message queues or load data to message
queues, you can test the queue connections configured in the Workflow Manager. If the connections are not
valid, the Integration Service cannot connect to the source or target message queues at run time. You can test
queue connections with WebSphere MQ tools provided with the WebSphere MQ Server.

Testing a Queue Connection on Windows


Use the following procedure to test a queue connection on Windows.

To test a queue connection on Windows:

1. From the command prompt of the WebSphere MQ server machine, go to the <WebSphere MQ>\bin
directory.
2. Use one of the following commands to test the connection for the queue:
♦ amqsputc. Use if you installed the WebSphere MQ client on the Integration Service machine.
♦ amqsput. Use if you installed the WebSphere MQ server on the Integration Service machine.
The amqsputc and amqsput commands put a new message on the queue. If you test the connection to a
queue in a production environment, terminate the command to avoid writing a message to a production
queue.
For example, to test the connection to the queue “production,” which is administered by the queue
manager “QM_s153664.informatica.com,” enter one of the following commands:
amqsputc production QM_s153664.informatica.com

-or-
amqsput production QM_s153664.informatica.com

If the connection is valid, the command returns a connection acknowledgment. If the connection is not
valid, it returns a WebSphere MQ error message.
3. If the connection is successful, press Ctrl+C at the prompt to terminate the connection and the command.



Testing a Queue Connection on UNIX
Use the following procedure to test a queue connection on UNIX.

To test a queue connection on UNIX:

1. On the WebSphere MQ Server system, go to the <WebSphere MQ>/samp/bin directory.


2. Use one of the following commands to test the connection for the queue:
♦ amqsputc. Use if you installed the WebSphere MQ client on the Integration Service machine.
♦ amqsput. Use if you installed the WebSphere MQ server on the Integration Service machine.
The amqsputc and amqsput commands put a new message on the queue. If you test the connection to a
queue in a production environment, make sure you terminate the command to avoid writing a message to a
production queue.
For example, to test the connection to the queue “production,” which is administered by the queue
manager “QM_s153664.informatica.com,” enter one of the following commands:
amqsputc production QM_s153664.informatica.com

-or-
amqsput production QM_s153664.informatica.com

If the connection is valid, the command returns a connection acknowledgment. If the connection is not
valid, it returns a WebSphere MQ error message.
3. If the connection is successful, press Ctrl+C at the prompt to terminate the connection and the command.



CHAPTER 3

Working with Workflows


This chapter includes the following topics:
♦ Overview, 63
♦ Creating a Workflow, 65
♦ Using the Workflow Wizard, 66
♦ Assigning an Integration Service, 68
♦ Working with Links, 70
♦ Using the Expression Editor, 72
♦ Scheduling a Workflow, 73
♦ Validating a Workflow, 77
♦ Manually Starting a Workflow, 79
♦ Suspending the Workflow, 81
♦ Stopping or Aborting the Workflow, 82
♦ Viewing a Workflow Report, 83

Overview
A workflow is a set of instructions that tells the Integration Service how to run tasks such as sessions, email
notifications, and shell commands. After you create tasks in the Task Developer and Workflow Designer, you
connect the tasks with links to create a workflow.
In the Workflow Designer, you can specify conditional links and use workflow variables to create branches in
the workflow. The Workflow Manager also provides Event-Wait and Event-Raise tasks to control the sequence
of task execution in the workflow. You can also create worklets and nest them inside the workflow.
Every workflow contains a Start task, which represents the beginning of the workflow.

Figure 3-1 shows a sample workflow in which a Start task, a Session task, an Assignment task, and a
Command task are connected by links:

Figure 3-1. Sample Workflow

You can create workflows with branches to run tasks concurrently.


Figure 3-2 shows a sample workflow with two branches:

Figure 3-2. Sample Workflow With Two Branches

When you create a workflow, select an Integration Service to run the workflow. You can start the workflow
using the Workflow Manager, Workflow Monitor, or pmcmd.
Use the Workflow Monitor to see the progress of a workflow during its run. The Workflow Monitor can also
show the history of a workflow.
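
If you prefer the command line, pmcmd can start a workflow as well. As a hedged sketch (the service, domain, user, folder, and workflow names are hypothetical; verify the options against the Command Reference for your PowerCenter version), a command-line invocation might look like the following:

   pmcmd startworkflow -sv IS_Dev -d Domain_Dev -u Administrator -p password -f MyFolder wf_LoadOrders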
Use the following guidelines when you develop a workflow:
1. Create a workflow. Create a workflow in the Workflow Designer or by using the Workflow Generation
Wizard in the PowerCenter Designer. For more information about creating a new workflow, see “Creating
a Workflow Manually” on page 65.
2. Add tasks to the workflow. You might have already created tasks in the Task Developer. Or, you can add
tasks to the workflow as you develop the workflow in the Workflow Designer. For more information about
workflow tasks, see “Working with Tasks” on page 95.
3. Connect tasks with links. After you add tasks to the workflow, connect them with links to specify the order
of execution in the workflow. For more information about links, see “Working with Links” on page 70.
4. Specify conditions for each link. You can specify conditions on the links to create branches and
dependencies. For more information, see “Working with Links” on page 70.



5. Validate workflow. Validate the workflow in the Workflow Designer to identify errors. For more
information about validation rules, see “Validating a Workflow” on page 77.
6. Save workflow. When you save the workflow, the Workflow Manager validates the workflow and updates
the repository.
7. Run workflow. In the workflow properties, select an Integration Service to run the workflow. Run the
workflow from the Workflow Manager, Workflow Monitor, or pmcmd. You can monitor the workflow in
the Workflow Monitor.

RELATED TOPICS:
♦ “Manually Starting a Workflow” on page 79

♦ “Monitoring Workflows” on page 335


♦ “Workflow Properties Reference” on page 509

Creating a Workflow
A workflow must contain a Start task. The Start task represents the beginning of a workflow. When you create a
workflow, the Workflow Designer creates a Start task and adds it to the workflow. You cannot delete the Start
task.
After you create a workflow, you can add tasks to the workflow. The Workflow Manager includes tasks such as
the Session, Command, and Email tasks.
Finally, you connect workflow tasks with links to specify the order of execution in the workflow. You can add
conditions to links.
When you edit a workflow, the Repository Service updates the workflow information when you save the
workflow. If a workflow is running when you make edits, the Integration Service uses the updated information
the next time you run the workflow.
You can also create a workflow through the Workflow Wizard in the Workflow Manager or the Workflow
Generation Wizard in the PowerCenter Designer.

Creating a Workflow Manually


Use the following procedure to create a workflow manually.

To create a workflow manually:

1. Open the Workflow Designer.


2. Click Workflows > Create.
3. Enter a name for the new workflow.
4. Click OK.
The Workflow Designer creates a Start task in the workflow.

Creating a Workflow Automatically


Use the following procedure to create a workflow automatically.

To create a workflow automatically:

1. Open the Workflow Designer. Close any open workflow.


2. Click the session button on the Tasks toolbar.

3. Click in the Workflow Designer workspace.
The Mappings dialog box appears.
4. Select a mapping to associate with the session and click OK.
The Create Workflow dialog box appears. The Workflow Designer names the workflow wf_SessionName by
default. You can rename the workflow or change other workflow properties.
5. Click OK.
The Workflow Designer creates a workflow for the session.

Adding Tasks to Workflows


After you create a workflow, you add tasks you want to run in the workflow. You may already have created tasks
in the Task Developer. Or, you may want to create tasks in the Workflow Designer as you develop the workflow.
If you have already created tasks in the Task Developer, add them to the workflow by dragging the tasks from
the Navigator to the Workflow Designer workspace.
To create and add tasks as you develop the workflow, click Tasks > Create in the Workflow Designer. Or, use the
Tasks toolbar to create and add tasks to the workflow. Click the button on the Tasks toolbar for the task you
want to create. Click again in the Workflow Designer workspace to create and add the task.
Tasks you create in the Workflow Designer are non-reusable. Tasks you create in the Task Developer are
reusable.

RELATED TOPICS:
♦ “Reusable Workflow Tasks” on page 97

Deleting a Workflow
You may decide to delete a workflow that you no longer use. When you delete a workflow, you delete all non-
reusable tasks and reusable task instances associated with the workflow. Reusable tasks used in the workflow
remain in the folder when you delete the workflow.
If you delete a workflow that is running, the Integration Service aborts the workflow. If you delete a workflow
that is scheduled to run, the Integration Service removes the workflow from the schedule.
You can delete a workflow in the Navigator window, or you can delete the workflow currently displayed in the
Workflow Designer workspace:
♦ To delete a workflow from the Navigator window, open the folder, select the workflow and press the Delete
key.
♦ To delete a workflow currently displayed in the Workflow Designer workspace, click Workflows > Delete.

Using the Workflow Wizard


Use the Workflow Wizard to automate the process of creating sessions, adding sessions to a workflow, and
linking sessions to create a workflow. The Workflow Wizard creates sessions from mappings and adds them to
the workflow. It also creates a Start task and lets you schedule the workflow. You can add tasks and edit other
workflow properties after the Workflow Wizard completes. If you want to create concurrent sessions, use the
Workflow Designer to manually build a workflow.
Before you create a workflow, verify that the folder contains a valid mapping for the Session task.
Complete the following steps to build a workflow using the Workflow Wizard:



1. Assign a name and Integration Service to the workflow.
2. Create a session.
3. Schedule the workflow.
You can also use the Workflow Generation Wizard in the PowerCenter Designer to generate sessions and
workflows.

Step 1. Assign a Name and Integration Service to the Workflow


In the first step of the Workflow Wizard, you add the name and description of the workflow and choose the
Integration Service to run the workflow.

To create the workflow:

1. In the Workflow Manager, open the folder containing the mapping you want to use in the workflow.
2. Open the Workflow Designer.
3. Click Workflows > Wizard.
The Workflow Wizard appears.
4. Enter a name for the workflow.
The convention for naming workflows is wf_WorkflowName. For more information about naming
conventions for repository objects, see PowerCenter Getting Started.
5. Enter a description for the workflow.
6. Select the Integration Service to run the workflow and click Next.

Step 2. Create a Session


In the second step of the Workflow Wizard, you create a session based on a mapping. You can add tasks later in
the Workflow Designer workspace.

To create a session:

1. In the second step of the Workflow Wizard, select a valid mapping and click the right arrow button.
The Workflow Wizard creates a Session task in the right pane using the selected mapping and names it
s_MappingName by default.
2. You can select additional mappings to create more Session tasks in the workflow.
When you add multiple mappings to the list, the Workflow Wizard creates sequential sessions in the order
you add them.
3. Use the arrow buttons to change the session order.
4. Specify whether the session should be reusable.
When you create a reusable session, you can use the session in other workflows.
5. Specify how you want the Integration Service to run the workflow.
You can specify that the Integration Service runs sessions only if previous sessions complete, or you can
specify that the Integration Service always runs each session. When you select this option, it applies to all
sessions you create using the Workflow Wizard.



Step 3. Schedule a Workflow
In the third step of the Workflow Wizard, you can schedule a workflow to run continuously, repeat at a given
time or interval, or start manually. The Integration Service runs a workflow unless the prior workflow run fails.
When a workflow fails, the Integration Service removes the workflow from the schedule, and you must
reschedule it. You can reschedule the workflow in the Workflow Manager or using pmcmd.
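For example, you might reschedule a workflow from the command line with the pmcmd scheduleworkflow command. The service, domain, user, folder, and workflow names below are placeholders, and the options shown follow the typical pmcmd syntax; verify them against the PowerCenter Command Reference for your version:
pmcmd scheduleworkflow -sv MyIntService -d MyDomain -u Administrator -p MyPassword -f MyFolder wf_OrdersLoad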

To schedule a workflow:

1. In the third step of the Workflow Wizard, configure the scheduling and run options.
2. Click Next.
The Workflow Wizard displays the settings for the workflow.
3. Verify the workflow settings and click Finish. To edit settings, click Back.
The completed workflow opens in the Workflow Designer workspace. From the workspace, you can add
tasks, create concurrent sessions, add conditions to links, or modify properties.

RELATED TOPICS:
♦ “Scheduling a Workflow” on page 73

Assigning an Integration Service


Before you can run a workflow, you must assign an Integration Service to run it. You can choose an Integration
Service to run a workflow by editing the workflow properties. You can also assign an Integration Service from
the menu. When you assign a service from the menu, you can assign multiple workflows without editing each
workflow.

Assigning a Service from the Workflow Properties


Use the following procedure to assign a service within the workflow properties.

To select an Integration Service to run a workflow:

1. In the Workflow Designer, open the workflow.


2. Click Workflows > Edit.
The Edit Workflow dialog box appears.



3. On the General tab, click the Browse Integration Services button.

A list of Integration Services appears.


4. Select the Integration Service that you want to run the workflow.
5. Click OK twice to select the Integration Service for the workflow.

Assigning a Service from the Menu


When you assign an Integration Service to a workflow you overwrite the service selected in the workflow
properties.

To assign an Integration Service to a workflow:

1. Close all folders in the repository.


2. Click Service > Assign Integration Service.
-or-
Right-click the Integration Service name in the Navigator and choose Assign to Workflows.



The Assign Integration Service dialog box appears.

Select a service
to assign.

Select a folder.

Assign an Integration
Service to a workflow.

Select all workflows


and sessions.

3. From the Choose Integration Service list, select the service you want to assign.
4. From the Show Folder list, select the folder you want to view. Or, click All to view workflows in all folders
in the repository.
5. Click the Selected check box for each workflow you want the Integration Service to run.
6. Click Assign.

Working with Links


Use links to connect each workflow task. You can specify conditions with links to create branches in the
workflow. The Workflow Manager does not allow you to use links to create loops in the workflow. Each link in
the workflow can run only once.

To link two tasks:

1. In the Tasks toolbar, click the Link Tasks button.

2. In the workspace, click the first task you want to connect and drag it to the second task.
A link appears between the two tasks.
If you want to link multiple tasks concurrently, you may not want to connect each link manually.

To link tasks concurrently:

1. In the workspace, click the first task you want to connect.


2. Ctrl-click all other tasks you want to connect.
Note: Do not use Ctrl+A or Edit > Select All to choose tasks.

3. Click Tasks > Link Concurrent.


A link appears between the first task you selected and each task you added. The first task you selected links
to each task concurrently.



If you have a number of tasks that you want to link sequentially, you may not wish to connect each link
manually.

To link tasks sequentially:

1. In the workspace, click the first task you want to connect.


2. Ctrl-click the next task you want to connect. Continue to add tasks in the order you want them to run.
3. Click Tasks > Link Sequential.
Links appear in sequential order between the first task and each subsequent task you added.

Specifying Link Conditions


After you create links between tasks, you can specify conditions for each link to determine the order of
execution in the workflow. If you do not specify conditions for each link, the Integration Service runs the next
task in the workflow by default.
Use predefined or user-defined workflow variables in the link condition. If the link condition evaluates to True,
the Integration Service runs the next task in the workflow. If the link condition evaluates to False, the
Integration Service does not run the next task in the workflow.
You can view results of link evaluation during workflow runs in the workflow log file.

Example of Link Conditions


Use link conditions to specify the order of execution in the workflow or to create branches in the workflow. For
example, you have two Session tasks in the workflow, s_STORES_CA and s_STORES_AZ. You want the
Integration Service to run the second Session task only if the first Session task has no target failed rows.
To accomplish this, you can set the following link condition between the sessions so that the s_STORES_AZ
runs only if the number of failed target rows for s_STORES_CA is zero:
$s_STORES_CA.TgtFailedRows = 0

After you specify the link condition in the Expression Editor, the Workflow Manager validates the link
condition and displays it next to the link in the workflow.
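You can also combine tests in a single link condition. For example, the following variation requires that s_STORES_CA both succeeded and had no failed target rows before s_STORES_AZ runs. The Status and TgtFailedRows variables and the AND keyword are described in “Predefined Workflow Variables” on page 86:
$s_STORES_CA.Status = SUCCEEDED AND $s_STORES_CA.TgtFailedRows = 0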

To specify a condition for a link:

1. In the Workflow Designer workspace, double-click the link you want to specify.
-or-
Right-click the link and choose Edit. The Expression Editor appears.
2. In the Expression Editor, enter the link condition.
The Expression Editor provides predefined workflow variables, user-defined workflow variables, variable
functions, and boolean and arithmetic operators.
3. Validate the expression using the Validate button.
The Workflow Manager displays validation results in the Output window.
Tip: Drag the end point of a link to move it from one task to another without losing the link condition.

Viewing Links in a Workflow or Worklet


When you edit a workflow or worklet, you can view the forward or backward link paths to other tasks. You can
highlight paths to see links in the workflow branch from the Start task to the last task in the branch.
Note: You can configure the color the Workflow Manager uses to display links. To configure the color for links,
click Tools > Options > Format, and choose the Link Selection option.



To view link paths:

1. In the Worklet Designer or Workflow Designer, right-click a task and choose Highlight Path.
2. Select Forward Path, Backward Path, or Both.
The Workflow Manager highlights all links in the branch you select.

Deleting Links in a Workflow or Worklet


When you edit a workflow or worklet, you can delete multiple links at once without deleting the connected
tasks.

To delete multiple links:

1. In the Worklet Designer or Workflow Designer, select all links you want to delete.
Tip: Use the mouse to drag the selection, or you can Ctrl-click the tasks and links.
2. Click Edit > Delete Links.
The Workflow Manager removes all selected links.

Using the Expression Editor


The Workflow Manager provides an Expression Editor for any expression in the workflow. You can enter
expressions using the Expression Editor for Link conditions, Decision tasks, and Assignment tasks.
The Expression Editor displays built-in variables, user-defined workflow variables, and predefined workflow
variables such as $Session.status. For more information about workflow variables, see “Working with Workflow
Variables” on page 85.
The Expression Editor also displays the following functions:
♦ Transformation language functions. SQL-like functions designed to handle common expressions.
♦ User-defined functions. Functions you create in PowerCenter based on transformation language functions.
♦ Custom functions. Functions you create with the Custom Function API.
For more information about the transformation language and custom functions, see the PowerCenter
Transformation Language Reference. For more information about user-defined functions, see the Designer Guide.

Adding Comments
You can add comments using -- or // comment indicators with the Expression Editor. Use comments to give
descriptive information about the expression, or you can specify a valid URL to access business documentation
about the expression.
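For example, either comment indicator can annotate the link condition from the earlier example; text after the indicator is ignored when the expression is evaluated:
-- Run the AZ load only when the CA load has no failed target rows
$s_STORES_CA.TgtFailedRows = 0 // target failed row check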

Validating Expressions
Use the Validate button to validate an expression. If you do not validate an expression, the Workflow Manager
validates it when you close the Expression Editor. You cannot run a workflow with invalid expressions.
Expressions in link conditions and Decision task conditions must evaluate to a numeric value. Workflow
variables used in expressions must exist in the workflow.



Expression Editor Display
The Expression Editor can display syntax expressions in different colors for better readability. If you have the
latest Rich Edit control, riched20.dll, installed on the system, the Expression Editor displays expression
functions in blue, comments in grey, and quoted strings in green.
You can resize the Expression Editor. Expand the dialog box by dragging from the borders. The Workflow
Manager saves the new size for the dialog box as a client setting.

Scheduling a Workflow
You can schedule a workflow to run continuously, repeat at a given time or interval, or you can manually start a
workflow. Each workflow has an associated scheduler. A scheduler is a repository object that contains a set of
schedule settings. You can create a non-reusable scheduler for the workflow. Or, you can create a reusable
scheduler to use the same set of schedule settings for workflows in the folder.
You can change the schedule settings by editing the scheduler. By default, the workflow runs on demand. If you
change schedule settings, the Integration Service reschedules the workflow according to the new settings. The
Integration Service runs a scheduled workflow as configured. The Workflow Manager marks a workflow invalid
if you delete the scheduler associated with the workflow.
If you configure multiple instances of a workflow, and you schedule the workflow run time, the Integration
Service runs all instances at the scheduled time. You cannot schedule workflow instances to run at different
times.
If you choose a different Integration Service for the workflow or restart the Integration Service, it reschedules all
workflows. This includes workflows that are scheduled to run continuously but whose start time has passed and
workflows that are scheduled to run continuously but were unscheduled. You must manually reschedule
workflows whose start time has passed if they are not scheduled to run continuously.
If you delete a folder, the Integration Service removes workflows from the schedule when it receives notification
from the Repository Service. If you copy a folder into a repository, the Integration Service reschedules all
workflows in the folder when it receives the notification.
The Integration Service does not run the workflow in the following situations:
♦ The prior workflow run fails. When a workflow fails, the Integration Service removes the workflow from
the schedule, and you must manually reschedule it. You can reschedule the workflow in the Workflow
Manager or using pmcmd. In the Workflow Manager Navigator window, right-click the workflow and select
Schedule Workflow.
♦ You remove the workflow from the schedule. You can remove the workflow from the schedule in the
Workflow Manager or using pmcmd. In the Workflow Manager Navigator window, right-click the workflow
and select Unschedule Workflow. For more information about removing workflows from the schedule, see
“Unscheduling a Workflow” on page 74.
♦ The Integration Service is running in safe mode. In safe mode, the Integration Service does not run
scheduled workflows, including workflows scheduled to run continuously or run on service initialization.
When you enable the Integration Service in normal mode, the Integration Service runs the scheduled
workflows.
Note: The Integration Service schedules the workflow in the time zone of the Integration Service machine. For
example, the PowerCenter Client is in the current time zone and the Integration Service is in a time zone two
hours later. If you schedule the workflow to start at 9 a.m., it starts at 9 a.m. in the time zone of the Integration
Service machine and 7 a.m. current time.

To schedule a workflow:

1. In the Workflow Designer, open the workflow.


2. Click Workflows > Edit.

3. In the Scheduler tab, choose Non-reusable if you want to create a non-reusable set of schedule settings for
the workflow.
Select Reusable if you want to select an existing reusable scheduler for the workflow.
Note: If you do not have a reusable scheduler in the folder, you must create one before you choose Reusable.
The Workflow Manager displays a warning message if you do not have an existing reusable scheduler.
4. Click the right side of the Scheduler field to edit scheduling settings for the scheduler.

5. If you select Reusable, choose a reusable scheduler from the Scheduler Browser dialog box.
6. Click OK.
To reschedule a workflow on its original schedule, right-click the workflow in the Navigator window and choose
Schedule Workflow.

RELATED TOPICS:
♦ “Configuring Scheduler Settings” on page 75

Unscheduling a Workflow
To remove a workflow from its schedule, right-click the workflow in the Navigator and choose Unschedule
Workflow.
To permanently remove a workflow from a schedule, configure the workflow schedule to run on demand.
Note: When the Integration Service restarts, it reschedules all unscheduled workflows that are scheduled to run
continuously.

RELATED TOPICS:
♦ “Configuring Scheduler Settings” on page 75

Creating a Reusable Scheduler


For each folder, the Workflow Manager lets you create reusable schedulers so you can reuse the same set of
scheduling settings for workflows in the folder. Use a reusable scheduler so you do not need to configure the
same set of scheduling settings in each workflow.
When you delete a reusable scheduler, all workflows that use the deleted scheduler become invalid. To make
the workflows valid, you must edit them and replace the missing scheduler.



To create a reusable scheduler:

1. In the Workflow Designer, click Workflows > Schedulers.


2. Click Add to add a new scheduler.
3. In the General tab, enter a name for the scheduler.
4. Configure the scheduler settings in the Scheduler tab.

Configuring Scheduler Settings


Configure the Schedule tab of the scheduler to set run options, schedule options, start options, and end options
for the schedule.
Table 3-1 describes the settings on the Schedule tab:

Table 3-1. Schedule Tab Settings

Run Options: Run On Integration Service Initialization/Run On Demand/Run Continuously
Indicates the workflow schedule type.
If you select Run On Integration Service Initialization, the Integration Service runs the workflow as soon as the service is initialized. The Integration Service then starts the next run of the workflow according to the settings in Schedule Options.
If you select Run On Demand, the Integration Service runs the workflow when you start the workflow manually.
If you select Run Continuously, the Integration Service runs the workflow as soon as the service initializes. The Integration Service then starts the next run of the workflow as soon as it finishes the previous run. If you edit a workflow that is set to run continuously, you must stop or unschedule the workflow, save the workflow, and then restart or reschedule the workflow.

Schedule Options: Run Once/Run Every/Customized Repeat
Required if you select Run On Integration Service Initialization, or if you do not choose any setting in Run Options.
If you select Run Once, the Integration Service runs the workflow once, as scheduled in the scheduler.
If you select Run Every, the Integration Service runs the workflow at regular intervals, as configured.
If you select Customized Repeat, the Integration Service runs the workflow on the dates and times specified in the Repeat dialog box. When you select Customized Repeat, click Edit to open the Repeat dialog box. The Repeat dialog box lets you schedule specific dates and times for the workflow run. The selected scheduler appears at the bottom of the page.

Start Options: Start Date/Start Time
Start Date indicates the date on which the Integration Service begins the workflow schedule.
Start Time indicates the time at which the Integration Service begins the workflow schedule.

End Options: End On/End After/Forever
Required if the workflow schedule is Run Every or Customized Repeat.
If you select End On, the Integration Service stops scheduling the workflow on the selected date.
If you select End After, the Integration Service stops scheduling the workflow after the set number of workflow runs.
If you select Forever, the Integration Service schedules the workflow as long as the workflow does not fail.

Customizing Repeat Option


You can schedule the workflow to run once or at an interval. You can customize the repeat option. Click the
Edit button to open the Customized Repeat dialog box.

Table 3-2 describes options in the Customized Repeat dialog box:

Table 3-2. Repeat Dialog Box Options

Repeat Every
Enter the numeric interval at which you would like the Integration Service to schedule the workflow, and then select Days, Weeks, or Months, as appropriate.
If you select Days, select the appropriate Daily Frequency settings.
If you select Weeks, select the appropriate Weekly and Daily Frequency settings.
If you select Months, select the appropriate Monthly and Daily Frequency settings.

Weekly
Required to enter a weekly schedule. Select the day or days of the week on which you would like the Integration Service to run the workflow.

Monthly
Required to enter a monthly schedule.
If you select Run On Day, select the dates on which you want the workflow scheduled on a monthly basis. The Integration Service schedules the workflow to run on the selected dates. If you select a numeric date that exceeds the number of days within a given month, the Integration Service schedules the workflow for the last day of the month, including leap years. For example, if you schedule the workflow to run on the 31st of every month, the Integration Service schedules the workflow on the 30th of the following months: April, June, September, and November.
If you select Run On The, select the week(s) of the month, then the day of the week on which you want the workflow to run. For example, if you select Second and Last, then select Wednesday, the Integration Service schedules the workflow to run on the second and last Wednesday of every month.

Daily Frequency
Enter the number of times you would like the Integration Service to run the workflow on any day the workflow is scheduled.
If you select Run Once, the Integration Service schedules the workflow once on the selected day, at the time entered on the Start Time setting on the Time tab.
If you select Run Every, enter Hours and Minutes to define the interval at which the Integration Service runs the workflow. The Integration Service then schedules the workflow at regular intervals on the selected day. The Integration Service uses the Start Time setting for the first scheduled workflow of the day.

Editing Scheduler Settings


You can edit scheduler settings for both non-reusable and reusable schedulers.
♦ Non-reusable schedulers. When you configure or edit a non-reusable scheduler, check in the workflow to
allow the schedule to take effect.
You can update the schedule manually with the workflow checked out. Right-click the workflow in the
Navigator, and select Schedule Workflow. Note that the changes are applied to the latest checked-in version
of the workflow.
♦ Reusable schedulers. When you edit settings for a reusable scheduler, the repository creates a new version of
the scheduler and increments the version number by one. To update a workflow with the latest schedule,
check in the scheduler after you edit it.
When you configure a reusable scheduler for a new workflow, you must check in both the workflow and the
scheduler to enable the schedule to take effect. Thereafter, when you check in the scheduler after revising it,
the workflow schedule is updated even if it is checked out.
You need to update the workflow schedule manually if you do not check in the scheduler. To update a
workflow schedule manually, right-click the workflow in the Navigator and select Schedule Workflow. The
new schedule is implemented for the latest version of the workflow that is checked in. Workflows that are
checked out are not updated with the new schedule.

Disabling Workflows
You may want to disable the workflow while you edit it. This prevents the Integration Service from running the
workflow on its schedule. Select the Disable Workflows option on the General tab of the workflow properties.



The Integration Service does not run disabled workflows until you clear the Disable Workflows option. Once
you clear the Disable Workflows option, the Integration Service reschedules the workflow.

Scheduling Workflows During Daylight Savings Time


On Windows, the Integration Service does not run a scheduled workflow during the last hour of Daylight
Saving Time (DST). If a workflow is scheduled to run between 1:00 a.m. and 1:59 a.m. DST, the Integration
Service resumes the workflow after 1:00 a.m. Standard Time (ST). If you try to schedule a workflow during the
last hour of DST or the first hour of ST, you receive an error. Wait until 2:00 a.m. to create a scheduler.

Validating a Workflow
Before you can run a workflow, you must validate it. When you validate the workflow, you validate all task
instances in the workflow, including nested worklets.
The Workflow Manager validates the following properties:
♦ Expressions. Expressions in the workflow must be valid.
♦ Tasks. Non-reusable task and Reusable task instances in the workflow must follow validation rules.
♦ Scheduler. If the workflow uses a reusable scheduler, the Workflow Manager verifies that the scheduler
exists.
The Workflow Manager also verifies that you linked each task properly.
Note: The Workflow Manager validates Session tasks separately. If a session is invalid, the workflow may still be
valid.

RELATED TOPICS:
♦ “Validating a Session” on page 129

Expression Validation
The Workflow Manager validates all expressions in the workflow. You can enter expressions in the Assignment
task, Decision task, and link conditions. The Workflow Manager writes any error message to the Output
window.
Expressions in link conditions and Decision task conditions must evaluate to a numerical value. Workflow
variables used in expressions must exist in the workflow.
The Workflow Manager marks the workflow invalid if a link condition is invalid.

Task Validation
The Workflow Manager validates each task in the workflow as you create it. When you save or validate the
workflow, the Workflow Manager validates all tasks in the workflow except Session tasks. It marks the workflow
invalid if it detects any invalid task in the workflow.
The Workflow Manager verifies that attributes in the tasks follow validation rules. For example, the user-
defined event you specify in an Event task must exist in the workflow. The Workflow Manager also verifies that
you linked each task properly. For example, you must link the Start task to at least one task in the workflow.
When you delete a reusable task, the Workflow Manager removes the instance of the deleted task from
workflows. The Workflow Manager also marks the workflow invalid when you delete a reusable task used in a
workflow.
The Workflow Manager verifies that there are no duplicate task names in a folder, and that there are no
duplicate task instances in the workflow.

RELATED TOPICS:
♦ “Validating Tasks” on page 99

Workflow Properties Validation


The Workflow Manager marks the workflow invalid if the scheduler you specify for the workflow does not exist
in the folder.

Running Validation
When you validate a workflow, you validate worklet instances, worklet objects, and all other nested worklets in
the workflow. You validate task instances and worklets, regardless of whether you have edited them.
The Workflow Manager validates the worklet object using the same validation rules for workflows. The
Workflow Manager validates the worklet instance by verifying attributes in the Parameter tab of the worklet
instance.
If the workflow contains nested worklets, you can select a worklet to validate the worklet and all other worklets
nested under it. To validate a worklet and its nested worklets, right-click the worklet and choose Validate.

RELATED TOPICS:
♦ “Validating Worklets” on page 116

Example
For example, you have a workflow that contains a non-reusable worklet called Worklet_1. Worklet_1 contains a
nested worklet called Worklet_a. The workflow also contains a reusable worklet instance called Worklet_2.
Worklet_2 contains a nested worklet called Worklet_b.
In the example workflow in Figure 3-3, the Workflow Manager validates links, conditions, and tasks in the
workflow. The Workflow Manager validates all tasks in the workflow, including tasks in Worklet_1,
Worklet_2, Worklet_a, and Worklet_b.
You can validate a part of the workflow. Right-click Worklet_1 and choose Validate. The Workflow Manager
validates all tasks in Worklet_1 and Worklet_a.
Figure 3-3 shows the example workflow:

Figure 3-3. Example Workflow - Validation


Worklet_1: Non-reusable worklet. Contains a nested worklet called Worklet_a.
Worklet_2: Reusable worklet. Contains a nested worklet called Worklet_b.

Validating Multiple Workflows


You can validate multiple workflows or worklets without fetching them into the workspace. To validate multiple
workflows, you must select and validate the workflows from a query results view or a view dependencies list.
When you validate multiple workflows, the validation does not include sessions, nested worklets, or reusable
worklet objects in the workflows.
Note: You can also select and validate multiple workflows from the Navigator in the Repository Manager.

You can save and optionally check in workflows that change from invalid to valid status. For more information
about validating multiple objects, see the PowerCenter Repository Guide.



To validate multiple workflows:

1. Select workflows from either a query list or a view dependencies list.


2. Check out the objects you want to validate.
3. Right-click one of the selected workflows and choose Validate.
The Validate Objects dialog box appears.
4. Choose to save objects and check in objects that you validate.

Manually Starting a Workflow


Before you can run a workflow, you must select an Integration Service to run the workflow. You can select an
Integration Service when you edit a workflow or from the Assign Integration Service dialog box. If you select an
Integration Service from the Assign Integration Service dialog box, the Workflow Manager overwrites the
Integration Service assigned in the workflow properties.
You can manually start a workflow configured to run on demand or to run on a schedule. Use the Workflow
Manager, Workflow Monitor, or pmcmd to run a workflow. You can choose to run the entire workflow, part of a
workflow, or a task in the workflow.
You can also use advanced options to override the Integration Service or operating system profile assigned to the
workflow and select concurrent workflow run instances.

Running a Workflow
When you click Workflows > Start Workflow, the Integration Service runs the entire workflow. To run a
workflow from pmcmd, use the startworkflow command.
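For example, a typical startworkflow command line looks like the following. The service, domain, user, folder, and workflow names are placeholders; see the PowerCenter Command Reference for the complete option list:
pmcmd startworkflow -sv MyIntService -d MyDomain -u Administrator -p MyPassword -f MyFolder -wait wf_OrdersLoad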

To run a workflow from the Workflow Manager:

1. Open the folder containing the workflow.


2. From the Navigator, select the workflow that you want to start.
3. Right-click the workflow in the Navigator and choose Start Workflow.
The Integration Service runs the entire workflow.
After the Workflow Manager sends a request to the Integration Service, the Output window displays the
Integration Service response. If an error appears, check the workflow log or session log for error messages.
You can also manually start a workflow by right-clicking in the Workflow Designer workspace and choosing
Start Workflow.

Running a Workflow with Advanced Options


Use the advanced options to override the Integration Service or operating system profile assigned to the
workflow and select concurrent workflow run instances.

To run a workflow using advanced options from the Workflow Manager:

1. Open the folder containing the workflow.


2. From the Navigator, select the workflow that you want to start.
3. Right-click the workflow in the Navigator and click Start Workflow Advanced.
The Start Workflow - Advanced options dialog box appears.



4. Configure the following options:

♦ Integration Service. Overrides the Integration Service configured for the workflow.
♦ Operating System Profile. Overrides the operating system profile assigned to the folder.
♦ Workflow Run Instances. The workflow instances you want to run. Appears if the workflow is configured for concurrent execution. For more information about workflow run instances, see “Configuring Unique Workflow Instances” on page 326.

5. Click OK.

Running Part of a Workflow


To run part of the workflow, right-click the task that you want the Integration Service to run and choose Start
Workflow From Task. The Integration Service runs the workflow from the selected task to the end of the
workflow.
For example, you have a workflow with multiple tasks and multiple branches, as shown in Figure 3-4. If you
want to run the tasks commandtask2, e_email2, and command3, start the workflow from commandtask2. All
subsequent tasks in the branch run.

Figure 3-4. Running Part of a Workflow - Example

When you start the workflow from commandtask2, the Integration Service runs this portion of the workflow.

To run a part of a workflow from pmcmd, use the startfrom flag of the startworkflow command.
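For example, to start the workflow from commandtask2 as in Figure 3-4, you might use a command similar to the following. The connection options and names are placeholders; confirm the syntax in the PowerCenter Command Reference:
pmcmd startworkflow -sv MyIntService -d MyDomain -u Administrator -p MyPassword -f MyFolder -startfrom commandtask2 wf_OrdersLoad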

To run a part of a workflow:

1. Connect to the folder containing the workflow.


2. In the Navigator, drill down the Workflow node to show the tasks in the workflow.
-or-
In the Workflow Designer workspace, select the task from which you want the Integration Service to begin
running the workflow.
3. Right-click the task for which you want the Integration Service to begin running the workflow.
4. Click Start Workflow From Task.

Running a Task in the Workflow


When you start a task in the workflow, the Workflow Manager locks the entire workflow so another user cannot
start the workflow. The Integration Service runs the selected task. It does not run the rest of the workflow.
To run a task using the Workflow Manager, select the task in the Workflow Designer workspace. Right-click the
task and choose Start Task.



You can also use menu commands in the Workflow Manager to start a task. In the Navigator, drill down the
Workflow node to locate the task. Right-click the task you want to start and choose Start Task.
To start a task in a workflow from pmcmd, use the starttask command.
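For example, a starttask command typically names the workflow that contains the task with the -w option. The object names below are placeholders; verify the options in the PowerCenter Command Reference:
pmcmd starttask -sv MyIntService -d MyDomain -u Administrator -p MyPassword -f MyFolder -w wf_OrdersLoad s_OrdersLoad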

Suspending the Workflow


When a task in the workflow fails, you might want to suspend the workflow, fix the error, and recover the
workflow. The Integration Service suspends the workflow when you enable the Suspend on Error option in the
workflow properties. Optionally, you can set a suspension email so the Integration Service sends an email when
it suspends a workflow.
When you enable the workflow to suspend on error, the Integration Service suspends the workflow when one of
the following tasks fail:
♦ Session
♦ Command
♦ Worklet
♦ Email
When a task fails in the workflow, the Integration Service stops running tasks in the path. The Integration
Service does not evaluate the output link of the failed task. If no other task is running in the workflow, the
Workflow Monitor displays the status of the workflow as “Suspended.”
If you have the high availability option, the Integration Service suspends the workflow depending on how
automatic task recovery is set. If you configure the workflow to suspend on error and do not enable automatic
task recovery, the workflow suspends when a task fails. If you enable automatic task recovery, the Integration
Service first attempts to restart the task up to the specified recovery limit, and then suspends the workflow if it
cannot restart the failed task.
If one or more tasks are still running in the workflow when a task fails, the Integration Service stops running the
failed task and continues running tasks in other paths. The Workflow Monitor displays the status of the
workflow as “Suspending.”
When the status of the workflow is “Suspended” or “Suspending,” you can fix the error, such as a target
database error, and recover the workflow in the Workflow Monitor. When you recover a workflow, the
Integration Service restarts the failed tasks and continues evaluating the rest of the tasks in the workflow. The
Integration Service does not run any task that already completed successfully.
Note: Editing a suspended workflow or tasks inside a suspended workflow can cause repository inconsistencies.

To suspend a workflow:

1. In the Workflow Designer, open the workflow.


2. Click Workflows > Edit.
3. In the General tab, enable Suspend on Error.
4. Click OK.

RELATED TOPICS:
♦ “Configuring Task Recovery” on page 227

Configuring Suspension Email


You can configure the workflow so that the Integration Service sends an email when it suspends a workflow.
Select an existing reusable email task for the suspension email. When a task fails, the Integration Service starts
suspending the workflow and sends the suspension email. If another task fails while the Integration Service is
suspending the workflow, you do not receive the suspension email again.
The Integration Service sends a suspension email if another task fails after you resume the workflow.

RELATED TOPICS:
♦ “Working with Suspension Email” on page 244

Stopping or Aborting the Workflow


You can specify when and how you want the Integration Service to stop or abort a workflow by using the
Control task in the workflow. After you start a workflow, you can stop or abort it through the Workflow
Monitor or pmcmd. You can issue the stop or abort command at any time during the execution of a workflow.
You can stop or abort a workflow by performing one of the following actions:
♦ Use a Control task in the workflow. For more information, see “Working with the Control Task” on
page 103.
♦ Issue a stop or abort command in the Workflow Monitor. For more information, see “Monitoring
Workflows” on page 335.
♦ Issue a stop or abort command in pmcmd. For more information, see the PowerCenter Command Reference.
You can also stop or abort a task within a workflow. For more information about stopping the Session task, see
“Stopping and Aborting a Session” on page 130.
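For example, the following pmcmd commands stop and abort a workflow from the command line. The names are placeholders, and the options follow the typical pmcmd syntax; see the PowerCenter Command Reference for details:
pmcmd stopworkflow -sv MyIntService -d MyDomain -u Administrator -p MyPassword -f MyFolder wf_OrdersLoad
pmcmd abortworkflow -sv MyIntService -d MyDomain -u Administrator -p MyPassword -f MyFolder wf_OrdersLoad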

How the Integration Service Handles Stop and Abort


When you stop a workflow, the Integration Service tries to stop all the tasks that are currently running in the
workflow. If the workflow contains a worklet, the Integration Service also tries to stop all the tasks that are
currently running in the worklet. If it cannot stop the workflow, you need to abort the workflow.
The Integration Service can stop the following tasks completely:
♦ Session
♦ Command
♦ Timer
♦ Event-Wait
♦ Worklet
When you stop a Command task that contains multiple commands, the Integration Service finishes executing
the current command and does not run the rest of the commands. The Integration Service cannot stop tasks
such as the Email task. For example, if the Integration Service has already started sending an email when you
issue the stop command, the Integration Service finishes sending the email before it stops running the
workflow.
The Integration Service aborts the workflow if the Repository Service process shuts down.

Stopping or Aborting a Task


You can stop or abort a task within a workflow from the Workflow Monitor. When you stop or abort a task, the
Integration Service stops processing the task. The Integration Service does not process other tasks in the path of
the stopped or aborted task. The Integration Service continues processing concurrent tasks in the workflow. If
the Integration Service cannot stop the task, you can abort the task.
When you abort a task, the Integration Service kills the process on the task. The Integration Service continues
processing concurrent tasks in the workflow when you abort a task.



You can also stop or abort a worklet. The Integration Service stops and aborts a worklet similar to stopping and
aborting a task. The Integration Service stops the worklet while executing concurrent tasks in the workflow. You
can also stop or abort tasks within a worklet.

Stopping or Aborting a Session Task


If the Integration Service is executing a Session task when you issue the stop command, the Integration Service
stops reading data. It continues processing and writing data and committing data to targets. If the Integration
Service cannot finish processing and committing data, you can issue the abort command.
The Integration Service handles the abort command for the Session task like the stop command, except it has a
timeout period of 60 seconds. If the Integration Service cannot finish processing and committing data within
the timeout period, it kills the DTM process and terminates the session.

RELATED TOPICS:
♦ “Stopping and Aborting a Session” on page 130

Viewing a Workflow Report


You can view PowerCenter Repository Reports for workflows in the Workflow Manager. View the Workflow
Composite Report to get more information about the workflow tasks, events, and variables in a workflow.
When you view the report, the Workflow Manager launches the Data Analyzer application in a browser window and
displays the report.
Before you run the report in the Workflow Manager, create a Reporting Service in the PowerCenter domain that
contains the PowerCenter repository. Use the PowerCenter repository that contains the workflow you want to
report on as the data source when you create the Reporting Service. When you create a Reporting Service for a
PowerCenter repository, Data Analyzer imports the PowerCenter Repository Reports.
The Workflow Composite Report includes information about the following components in a workflow:
♦ Tasks. Tasks contained in the workflow.
♦ Link conditions. Links between objects in the workflow.
♦ Events. User-defined and built-in events in the workflow.
♦ Variables. User-defined and built-in variables in the workflow.

To view a Workflow Composite Report:

1. In the Workflow Manager, open a workflow.


2. Right-click in the workspace and choose View Workflow Report.
The Workflow Manager launches Data Analyzer in the default browser for the client machine and runs the
Workflow Composite Report.

CHAPTER 4

Working with Workflow Variables


This chapter includes the following topics:
♦ Overview, 85
♦ Predefined Workflow Variables, 86
♦ User-Defined Workflow Variables, 90

Overview
You can create and use variables in a workflow to reference values and record information. For example, use a
variable in a Decision task to determine whether the previous task ran properly. If it did, you can run the next
task. If not, you can stop the workflow.
Use the following types of workflow variables:
♦ Predefined workflow variables. The Workflow Manager provides predefined workflow variables for tasks
within a workflow. For more information, see “Predefined Workflow Variables” on page 86.
♦ User-defined workflow variables. You create user-defined workflow variables when you create a workflow.
For more information, see “User-Defined Workflow Variables” on page 90.
Use workflow variables when you configure the following types of tasks:
♦ Assignment tasks. Use an Assignment task to assign a value to a user-defined workflow variable. For
example, you can increment a user-defined counter variable by setting the variable to its current value plus 1.
For more information about using workflow variables in Assignment tasks, see “Working with the
Assignment Task” on page 100.
♦ Decision tasks. Decision tasks determine how the Integration Service runs a workflow. For example, use the
Status variable to run a second session only if the first session completes successfully. For more information
about using workflow variables in Decision tasks, see “Working with the Decision Task” on page 103.
♦ Links. Links connect each workflow task. Use workflow variables in links to create branches in the workflow.
For example, after a Decision task, you can create one link to follow when the decision condition evaluates
to true, and another link to follow when the decision condition evaluates to false. For more information
about using workflow variables in Link tasks, see “Working with Links” on page 70.
♦ Timer tasks. Timer tasks specify when the Integration Service begins to run the next task in the workflow.
Use a user-defined date/time variable to specify the time the Integration Service starts to run the next task.
For more information about using workflow variables in Timer tasks, see “Working with the Timer Task”
on page 109.
Use the Expression Editor to create an expression that uses variables. When you build an expression, you can
select predefined variables on the Predefined tab. You can select user-defined variables on the User-Defined tab.

The Functions tab contains functions that you use with workflow variables. Use the point-and-click method to
enter an expression using a variable.
Use the following keywords to write expressions for user-defined and predefined workflow variables:
♦ AND
♦ OR
♦ NOT
♦ TRUE
♦ FALSE
♦ NULL
♦ SYSDATE

RELATED TOPICS:
♦ “Using the Expression Editor” on page 72

Predefined Workflow Variables


Each workflow contains a set of predefined variables that you use to evaluate workflow and task conditions. Use
the following types of predefined variables:
♦ Task-specific variables. The Workflow Manager provides a set of task-specific variables for each task in the
workflow. Use task-specific variables in a link condition to control the path the Integration Service takes
when running the workflow. The Workflow Manager lists task-specific variables under the task name in the
Expression Editor.
♦ Built-in variables. Use built-in variables in a workflow to return run-time or system information such as
folder name, Integration Service Name, system date, or workflow start time. The Workflow Manager lists
built-in variables under the Built-in node in the Expression Editor. For more information about built-in
variables, see the PowerCenter Transformation Language Reference.
Tip: When you set the error severity level for log files to Tracing in the PowerCenter Server setup, the workflow
log displays the values of workflow variables. Use this logging level for troubleshooting only.
Table 4-1 lists the task-specific workflow variables available in the Workflow Manager:

Table 4-1. Task-Specific Workflow Variables

Condition (Task Types: Decision; Datatype: Integer)
Evaluation result of decision condition expression. If the task fails, the Workflow Manager keeps the condition set to null.
Sample syntax:
$Dec_TaskStatus.Condition = <TRUE | FALSE | NULL | any integer>

EndTime (Task Types: All tasks; Datatype: Date/Time)
Date and time the associated task ended. Precision is to the second.
Sample syntax:
$s_item_summary.EndTime > TO_DATE('11/10/2004 08:13:25')

ErrorCode (Task Types: All tasks; Datatype: Integer)
Last error code for the associated task. If there is no error, the Integration Service sets ErrorCode to 0 when the task completes.
Sample syntax:
$s_item_summary.ErrorCode = 24013
Note: You might use this variable when a task consistently fails with this final error message.

ErrorMsg (Task Types: All tasks; Datatype: Nstring*)
Last error message for the associated task. If there is no error, the Integration Service sets ErrorMsg to an empty string when the task completes.
Sample syntax:
$s_item_summary.ErrorMsg = 'PETL_24013 Session run completed with failure'
Note: You might use this variable when a task consistently fails with this final error message.

FirstErrorCode (Task Types: Session; Datatype: Integer)
Error code for the first error message in the session. If there is no error, the Integration Service sets FirstErrorCode to 0 when the session completes.
Sample syntax:
$s_item_summary.FirstErrorCode = 7086

FirstErrorMsg (Task Types: Session; Datatype: Nstring*)
First error message in the session. If there is no error, the Integration Service sets FirstErrorMsg to an empty string when the task completes.
Sample syntax:
$s_item_summary.FirstErrorMsg = 'TE_7086 Tscrubber: Debug info… Failed to evalWrapUp'

PrevTaskStatus (Task Types: All tasks; Datatype: Integer)
Status of the previous task in the workflow that the Integration Service ran. Statuses include:
- ABORTED
- FAILED
- STOPPED
- SUCCEEDED
Use these keywords when writing expressions to evaluate the status of the previous task.
Sample syntax:
$Dec_TaskStatus.PrevTaskStatus = FAILED

SrcFailedRows (Task Types: Session; Datatype: Integer)
Total number of rows the Integration Service failed to read from the source.
Sample syntax:
$s_dist_loc.SrcFailedRows = 0

SrcSuccessRows (Task Types: Session; Datatype: Integer)
Total number of rows successfully read from the sources.
Sample syntax:
$s_dist_loc.SrcSuccessRows > 2500

StartTime (Task Types: All tasks; Datatype: Date/Time)
Date and time the associated task started. Precision is to the second.
Sample syntax:
$s_item_summary.StartTime > TO_DATE('11/10/2004 08:13:25')

Status (Task Types: All tasks; Datatype: Integer)
Status of the previous task in the workflow. Statuses include:
- ABORTED
- DISABLED
- FAILED
- NOTSTARTED
- STARTED
- STOPPED
- SUCCEEDED
Use these keywords when writing expressions to evaluate the status of the current task.
Sample syntax:
$s_dist_loc.Status = SUCCEEDED

TgtFailedRows (Task Types: Session; Datatype: Integer)
Total number of rows the Integration Service failed to write to the target.
Sample syntax:
$s_dist_loc.TgtFailedRows = 0

TgtSuccessRows (Task Types: Session; Datatype: Integer)
Total number of rows successfully written to the target.
Sample syntax:
$s_dist_loc.TgtSuccessRows > 0

TotalTransErrors (Task Types: Session; Datatype: Integer)
Total number of transformation errors.
Sample syntax:
$s_dist_loc.TotalTransErrors = 5

* Variables of type Nstring can have a maximum length of 600 characters.

All predefined workflow variables except Status have a default value of null. The Integration Service uses the
default value of null when it encounters a predefined variable from a task that has not yet run in the workflow.
Therefore, expressions and link conditions that depend upon tasks not yet run are valid. The default value of
Status is NOTSTARTED.

RELATED TOPICS:
♦ “Evaluating Task Status in a Workflow” on page 89

Using Predefined Workflow Variables in Expressions


When you use a workflow variable in an expression, the Integration Service evaluates the expression and returns
True or False. If the condition evaluates to true, the Integration Service runs the next task. The Integration
Service writes an entry in the workflow log similar to the following message:
INFO : LM_36506 : (1980|1040) Link [Session2 --> Session3]: condition is TRUE for the
expression [$Session2.PrevTaskStatus = SUCCEEDED].

The Expression Editor displays the predefined workflow variables on the Predefined tab. The Workflow
Manager groups task-specific variables by task and lists built-in variables under the Built-in node. To use a
variable in an expression, double-click the variable. The Expression Editor displays task-specific variables in the
Expression field in the following format:
$<TaskName>.<predefinedVariable>
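For example, a link condition that references two task-specific variables of a session named s_item_summary follows this format (an illustrative condition built from the variables in Table 4-1):
$s_item_summary.SrcSuccessRows > 0 AND $s_item_summary.TgtFailedRows = 0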

Evaluating Condition in a Workflow


Use Condition in link conditions to evaluate the result of a decision condition expression.



Figure 4-1 shows a workflow with link conditions using Condition:

Figure 4-1. Condition Variable Example


The workflow contains a Command task, Check_for_File, followed by a Decision task, FileExists, with the decision condition $Check_for_File.Status = SUCCEEDED. Two links branch from the Decision task, one to an Email task and one to a Command task, with the following link conditions:
$FileExists.Condition = TRUE
$FileExists.Condition = FALSE
The Integration Service returns a value based on the decision task, FileExists.

When you run the workflow, the Integration Service evaluates the link condition and returns the value based on
the decision condition expression of the FileExists task. The Integration Service triggers either the email task or
the command task depending on the Check_for_File task outcome.

Evaluating Task Status in a Workflow


Use Status in link conditions to test the status of the previous task in the workflow.
Figure 4-2 shows a workflow with link conditions using Status:

Figure 4-2. Status Variable Example

Link condition:
$Session2.Status = SUCCEEDED
The Integration Service returns a value based on the previous task in the workflow, Session2.

When you run the workflow, the Integration Service evaluates the link condition and returns the value based on
the status of Session2.

Evaluating Previous Task Status in a Workflow


Use PrevTaskStatus in link conditions to test the status of the previous task in the workflow that the Integration
Service ran.
Use PrevTaskStatus if you disable a task in the workflow. Status and PrevTaskStatus return the same value unless
the condition uses a disabled task.



Figure 4-3 shows a workflow with link conditions using PrevTaskStatus:

Figure 4-3. PrevTaskStatus Variable Example


In the workflow, Session1 is the previous task that ran and Session2 is a disabled task. Link condition:
$Session2.PrevTaskStatus = SUCCEEDED
The Integration Service returns a value based on the previous task run, Session1.

When you run the workflow, the Integration Service skips Session2 because the session is disabled. When the
Integration Service evaluates the link condition, it returns the value based on the status of Session1.
Tip: If you do not disable Session2, the Integration Service returns the value based on the status of Session2. You
do not need to change the link condition when you enable and disable Session2.

User-Defined Workflow Variables


You can create variables within a workflow. When you create a variable in a workflow, it is valid only in that
workflow. Use the variable in tasks within that workflow. You can edit and delete user-defined workflow
variables.
Use user-defined variables when you need to make a workflow decision based on criteria you specify. For
example, you create a workflow to load data to an orders database nightly. You also need to load a subset of this
data to headquarters periodically, every tenth time you update the local orders database. Create separate sessions
to update the local database and the one at headquarters. The workflow looks like Figure 4-4:

Figure 4-4. Sample Workflow Using Workflow Variable

Use a user-defined variable to determine when to run the session that updates the orders database at
headquarters.
To configure user-defined workflow variables, complete the following steps:
1. Create a persistent workflow variable, $$WorkflowCount, to represent the number of times the workflow
has run.
2. Add a Start task and both sessions to the workflow.
3. Place a Decision task after the session that updates the local orders database.



Set up the decision condition to check to see if the number of workflow runs is evenly divisible by 10. Use
the modulus (MOD) function to do this.
4. Create an Assignment task to increment the $$WorkflowCount variable by one.
5. Link the Decision task to the session that updates the database at headquarters when the decision condition
evaluates to true. Link it to the Assignment task when the decision condition evaluates to false.
When you configure workflow variables using conditions, the session that updates the local database runs every
time the workflow runs. The session that updates the database at headquarters runs every 10th time the
workflow runs.
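A minimal sketch of the two expressions this design needs, assuming the counter variable is named $$WorkflowCount as in step 1:
Decision task condition:
MOD($$WorkflowCount, 10) = 0
Assignment task: assign the expression $$WorkflowCount + 1 to the $$WorkflowCount variable.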

Workflow Variable Start and Current Values


Conceptually, the Integration Service holds two different values for a workflow variable during a workflow run:
♦ Start value of a workflow variable
♦ Current value of a workflow variable
The start value is the value of the variable at the start of the workflow. The start value could be a value defined
in the parameter file for the variable, a value saved in the repository from the previous run of the workflow, a
user-defined initial value for the variable, or the default value based on the variable datatype.
The Integration Service looks for the start value of a variable in the following order:
1. Value in parameter file
2. Value saved in the repository (if the variable is persistent)
3. User-specified default value
4. Datatype default value
For more information about datatype default values, see Table 4-2 on page 92.
For example, you create a workflow variable in a workflow and enter a default value, but you do not define a
value for the variable in a parameter file. The first time the Integration Service runs the workflow, it evaluates
the start value of the variable to the user-defined default value.
If you declare the variable as persistent, the Integration Service saves the value of the variable to the repository at
the end of the workflow run. The next time the workflow runs, the Integration Service evaluates the start value
of the variable as the value saved in the repository.
If the variable is non-persistent, the Integration Service does not save the value of the variable. The next time the
workflow runs, the Integration Service evaluates the start value of the variable as the user-specified default
value.
If you want to override the value saved in the repository before running a workflow, you need to define a value
for the variable in a parameter file. When you define a workflow variable in the parameter file, the Integration
Service uses this value instead of the value saved in the repository or the configured initial value for the variable.
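For example, a parameter file entry that overrides a workflow variable typically uses the folder and workflow name as the section heading. The folder, workflow, and value below are placeholders; check the parameter file documentation for the exact heading format in your version:
[MyFolder.WF:wf_OrdersLoad]
$$WorkflowCount=0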
The current value is the value of the variable as the workflow progresses. When a workflow starts, the current
value of a variable is the same as the start value. The value of the variable can change as the workflow progresses
if you create an Assignment task that updates the value of the variable.
If the variable is persistent, the Integration Service saves the current value of the variable to the repository at the
end of a successful workflow run. If the workflow fails to complete, the Integration Service does not update the
value of the variable in the repository.
The Integration Service states the value saved to the repository for each workflow variable in the workflow log.

Datatype Default Values


If the Integration Service cannot determine the start value of a variable by any other means, it uses a default
value for the variable based on its datatype.



Table 4-2 lists the datatype default values for user-defined workflow variables:

Table 4-2. Datatype Default Values for User-Defined Workflow Variables

Datatype Workflow Manager Default Value

Date/Time 1/1/1753 00:00:00.000000000 A.D.

Double 0

Integer 0

Nstring Empty string

RELATED TOPICS:
♦ “Workflow Variable Start and Current Values” on page 91

Creating User-Defined Workflow Variables


You can create workflow variables for a workflow in the workflow properties.

To create a workflow variable:

1. In the Workflow Designer, create a new workflow or edit an existing one.


2. Select the Variables tab.
3. Click Add.
4. Enter the information in Table 4-3 and click OK:

Table 4-3. Options for Declaring User-Defined Workflow Variables

Name
Variable name. The correct format is $$VariableName. Workflow variable names are not case sensitive.
Do not use a single dollar sign ($) for a user-defined workflow variable. The single dollar sign is reserved for predefined workflow variables.

Datatype
Datatype of the variable. You can select from the following datatypes:
- Date/Time
- Double
- Integer
- Nstring

Persistent
Whether the variable is persistent. Enable this option if you want the value of the variable retained from one execution of the workflow to the next.

Default Value
Default value of the variable. The Integration Service uses this value for the variable during sessions if you do not set a value for the variable in the parameter file and there is no value stored in the repository.
Variables of type Date/Time can have the following formats:
- MM/DD/RR
- MM/DD/YYYY
- MM/DD/RR HH24:MI
- MM/DD/YYYY HH24:MI
- MM/DD/RR HH24:MI:SS
- MM/DD/YYYY HH24:MI:SS
- MM/DD/RR HH24:MI:SS.MS
- MM/DD/YYYY HH24:MI:SS.MS
- MM/DD/RR HH24:MI:SS.US
- MM/DD/YYYY HH24:MI:SS.US
- MM/DD/RR HH24:MI:SS.NS
- MM/DD/YYYY HH24:MI:SS.NS
You can use the following separators: dash (-), slash (/), backslash (\), colon (:), period (.), and space. The Integration Service ignores extra spaces. You cannot use one- or three-digit values for year or the “HH12” format for hour.
Variables of type Nstring can have a maximum length of 600 characters.

Is Null
Whether the default value of the variable is null. If the default value is null, enable this option.

Description
Description associated with the variable.

5. To validate the default value of the new workflow variable, click the Validate button.
6. Click Apply to save the new workflow variable.
7. Click OK.

RELATED TOPICS:
♦ “Workflow Variable Start and Current Values” on page 91

CHAPTER 5

Working with Tasks


This chapter includes the following topics:
♦ Overview, 95
♦ Creating a Task, 96
♦ Configuring Tasks, 97
♦ Validating Tasks, 99
♦ Working with the Assignment Task, 100
♦ Working with the Command Task, 100
♦ Working with the Control Task, 103
♦ Working with the Decision Task, 103
♦ Working with Event Tasks, 105
♦ Working with the Timer Task, 109

Overview
The Workflow Manager contains many types of tasks to help you build workflows and worklets. You can create
reusable tasks in the Task Developer. Or, create and add tasks in the Workflow or Worklet Designer as you
develop the workflow.
Table 5-1 summarizes the workflow tasks available in the Workflow Manager:

Table 5-1. Workflow Tasks

♦ Assignment (Workflow Designer, Worklet Designer; reusable: no). Assigns a value to a workflow variable. For
more information, see “Working with the Assignment Task” on page 100.
♦ Command (Task Developer, Workflow Designer, Worklet Designer; reusable: yes). Specifies shell commands to
run during the workflow. You can choose to run the Command task if the previous task in the workflow
completes. For more information, see “Working with the Command Task” on page 100.
♦ Control (Workflow Designer, Worklet Designer; reusable: no). Stops or aborts the workflow. For more
information, see “Working with the Control Task” on page 103.
♦ Decision (Workflow Designer, Worklet Designer; reusable: no). Specifies a condition to evaluate in the
workflow. Use the Decision task to create branches in a workflow. For more information, see “Working with the
Decision Task” on page 103.
♦ Email (Task Developer, Workflow Designer, Worklet Designer; reusable: yes). Sends email during the workflow.
For more information, see “Sending Email” on page 237.
♦ Event-Raise (Workflow Designer, Worklet Designer; reusable: no). Represents the location of a user-defined
event. The Event-Raise task triggers the user-defined event when the Integration Service runs the Event-Raise
task. For more information, see “Working with Event Tasks” on page 105.
♦ Event-Wait (Workflow Designer, Worklet Designer; reusable: no). Waits for a user-defined or a predefined event
to occur. Once the event occurs, the Integration Service completes the rest of the workflow. For more
information, see “Working with Event Tasks” on page 105.
♦ Session (Task Developer, Workflow Designer, Worklet Designer; reusable: yes). Set of instructions to run a
mapping. For more information, see “Working with Sessions” on page 119.
♦ Timer (Workflow Designer, Worklet Designer; reusable: no). Waits for a specified period of time to run the next
task. For more information, see “Working with the Timer Task” on page 109.

The Workflow Manager validates task attributes and links. If a task is invalid, the workflow becomes invalid.
However, workflows that contain invalid sessions may still be valid. For more information about validating tasks,
see “Validating Tasks” on page 99.

Creating a Task
You can create tasks in the Task Developer, or you can create them in the Workflow Designer or the Worklet
Designer as you develop the workflow or worklet. Tasks you create in the Task Developer are reusable. Tasks
you create in the Workflow Designer and Worklet Designer are non-reusable by default.

RELATED TOPICS:
♦ “Reusable Workflow Tasks” on page 97

Creating a Task in the Task Developer


You can create the following types of tasks in the Task Developer:
♦ Command
♦ Session
♦ Email

To create a task in the Task Developer:

1. In the Task Developer, click Tasks > Create.


2. Select the task type you want to create, Command, Session, or Email.
3. Enter a name for the task.
4. For session tasks, select the mapping you want to associate with the session.



5. Click Create.
The Task Developer creates the workflow task.
6. Click Done to close the Create Task dialog box.

Creating a Task in the Workflow or Worklet Designer


You can create and add tasks in the Workflow Designer or Worklet Designer as you develop the workflow or
worklet. You can create any type of task in the Workflow Designer or Worklet Designer. Tasks you create in the
Workflow Designer or Worklet Designer are non-reusable. Edit the General tab of the task properties to
promote a non-reusable task to a reusable task.

To create tasks in the Workflow Designer or Worklet Designer:

1. In the Workflow Designer or Worklet Designer, open a workflow or worklet.


2. Click Tasks > Create.
3. Select the type of task you want to create.
4. Enter a name for the task.
5. Click Create.
The Workflow Designer or Worklet Designer creates the task and adds it to the workspace.
6. Click Done.
You can also use the Tasks toolbar to create and add tasks to the workflow. Click the button on the Tasks
toolbar for the task you want to create. Click again in the Workflow Designer or Worklet Designer workspace to
create and add the task. The Workflow Designer or Worklet Designer creates the task with a default task name
when you use the Tasks toolbar.

Configuring Tasks
After you create the task, you can configure general task options on the General tab. For each task instance in
the workflow, you can configure how the Integration Service runs the task and the other objects associated with
the selected task. You can also disable the task so you can run the rest of the workflow without the selected task.
When you use a task in the workflow, you can edit the task in the Workflow Designer and configure the
following task options in the General tab:
♦ Fail parent if this task fails. Choose to fail the workflow or worklet containing the task if the task fails.
♦ Fail parent if this task does not run. Choose to fail the workflow or worklet containing the task if the task
does not run.
♦ Disable this task. Choose to disable the task so you can run the rest of the workflow without the task.
♦ Treat input link as AND or OR. Choose to have the Integration Service run the task when all or one of the
input link conditions evaluates to True.

Reusable Workflow Tasks


Workflows can contain reusable task instances and non-reusable tasks. Non-reusable tasks exist within a single
workflow. Reusable tasks can be used in multiple workflows in the same folder.
You can create any task as non-reusable or reusable. Tasks you create in the Task Developer are reusable. Tasks
you create in the Workflow Designer are non-reusable by default. However, you can edit the general properties
of a task to promote it to a reusable task.

The Workflow Manager stores each reusable task separate from the workflows that use the task. You can view a
list of reusable tasks in the Tasks node in the Navigator window. You can see a list of all reusable Session tasks in
the Sessions node in the Navigator window.

Promoting a Non-Reusable Workflow Task


You can promote a non-reusable workflow task to a reusable task. Reusable tasks must have unique names
within the repository. When you promote a non-reusable task, the repository checks for naming conflicts. If a
reusable task with the same name already exists, the repository appends a number to the reusable task name to
make it unique. The repository applies the appended name to the checked-out version and to the latest
checked-in version of the reusable task.

To promote a non-reusable workflow task:

1. In the Workflow Designer, double-click the task you want to make reusable.
2. In the General tab of the Edit Task dialog box, select the Make Reusable option.
3. When prompted whether you are sure you want to promote the task, click Yes.
4. Click OK.
The newly promoted task appears in the list of reusable tasks in the Tasks node in the Navigator window.

Instances and Inherited Changes


When you add a reusable task to a workflow, you add an instance of the task. The definition of the task exists
outside the workflow, while an instance of the task exists in the workflow.
You can edit the task instance in the Workflow Designer. Changes you make in the task instance exist only in
the workflow. The task definition remains unchanged in the Task Developer.
When you make changes to a reusable task definition in the Task Developer, the changes reflect in the instance
of the task in the workflow if you have not edited the instance.

Reverting Changes in Reusable Task Instances


When you edit an instance of a reusable task in the workflow, you can revert to the settings in the task
definition. The Revert button appears after you override task properties in the task instance. You cannot use the
Revert button for settings that are read-only or locked by another user.

AND or OR Input Links


For each task, you can choose to treat the input link as an AND link or an OR link. When a task has one input
link, the Integration Service processes the task when the previous object completes and the link condition
evaluates to True. If you have multiple links going into one task, you can choose to have an AND input link so
that the Integration Service runs the task when all the link conditions evaluate to True. Or, you can choose to
have an OR input link so that the Integration Service runs the task as soon as any link condition evaluates to
True.
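For example, suppose a task has input links from two sessions and you treat the input as AND. Using the link condition syntax shown in the Decision task example later in this chapter, and hypothetical session names, the two link conditions might be:

$s_LoadDim.status = SUCCEEDED
$s_LoadFact.status = SUCCEEDED

With an AND input link, the Integration Service runs the task only when both conditions evaluate to True. With an OR input link, either condition alone is enough.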
To set the type of input links, double-click the task to open the Edit Tasks dialog box. Select AND or OR for
the input link type.

RELATED TOPICS:
♦ “Working with Links” on page 70



Disabling Tasks
In the Workflow Designer, you can disable a workflow task so that the Integration Service runs the workflow
without the disabled task. The status of a disabled task is DISABLED. Disable a task in the workflow by
selecting the Disable This Task option in the Edit Tasks dialog box.

Failing Parent Workflow or Worklet


You can choose to fail the workflow or worklet if a task fails or does not run. The workflow or worklet that
contains the task instance is called the parent. A task might not run when the input condition for the task
evaluates to False.
To fail the parent workflow or worklet if the task fails, double-click the task and select the Fail Parent If This
Task Fails option in the General tab. When you select this option and a task fails, it does not prevent the other
tasks in the workflow or worklet from running. Instead, the Integration Service marks the status of the
workflow or worklet as failed. If you have a session nested within multiple worklets, you must select the Fail
Parent If This Task Fails option for each worklet instance to see the failure at the workflow level.
To fail the parent workflow or worklet if the task does not run, double-click the task and select the Fail Parent If
This Task Does Not Run option in the General tab. When you choose this option, the Integration Service fails
the parent workflow if a task did not run.
Note: The Integration Service does not fail the parent workflow if you disable a task.

Validating Tasks
You can validate reusable tasks in the Task Developer. Or, you can validate task instances in the Workflow
Designer. When you validate a task, the Workflow Manager validates task attributes and links. For example, the
user-defined event you specify in an Event task must exist in the workflow.
The Workflow Manager uses the following rules to validate tasks:
♦ Assignment. The Workflow Manager validates the expression you enter for the Assignment task. For
example, the Workflow Manager verifies that you assigned a matching datatype value to the workflow
variable in the assignment expression.
♦ Command. The Workflow Manager does not validate the shell command you enter for the Command task.
♦ Event-Wait. If you choose to wait for a predefined event, the Workflow Manager verifies that you specified
a file to watch. If you choose to use the Event-Wait task to wait for a user-defined event, the Workflow
Manager verifies that you specified an event.
♦ Event-Raise. The Workflow Manager verifies that you specified a user-defined event for the Event-Raise
task.
♦ Timer. The Workflow Manager verifies that the variable you specified for the Absolute Time setting has the
Date/Time datatype.
♦ Start. The Workflow Manager verifies that you linked the Start task to at least one task in the workflow.
When a task instance is invalid, the workflow using the task instance becomes invalid. When a reusable task is
invalid, it does not affect the validity of the task instance used in the workflow. However, if a Session task
instance is invalid, the workflow may still be valid. The Workflow Manager validates sessions differently.
To validate a task, select the task in the workspace and click Tasks > Validate. Or, right-click the task in the
workspace and choose Validate.

RELATED TOPICS:
♦ “Validating a Session” on page 129

Working with the Assignment Task
You can assign a value to a user-defined workflow variable with the Assignment task. To use an Assignment task
in the workflow, first create and add the Assignment task to the workflow. Then configure the Assignment task
to assign values or expressions to user-defined variables. After you assign a value to a variable using the
Assignment task, the Integration Service uses the assigned value for the variable during the remainder of the
workflow.
You must create a variable before you can assign values to it. You cannot assign values to predefined workflow
variables.

To create an Assignment task:

1. In the Workflow Designer, click Tasks > Create.


2. Select Assignment Task for the task type.
3. Enter a name for the Assignment task. Click Create. Then click Done.
The Workflow Designer creates and adds the Assignment task to the workflow.
4. Double-click the Assignment task to open the Edit Task dialog box.
5. On the Expressions tab, click Add to add an assignment.
6. Click the Open button in the User Defined Variables field.
7. Select the variable for which you want to assign a value. Click OK.
8. Click the Edit button in the Expression field to open the Expression Editor.
The Expression Editor shows predefined workflow variables, user-defined workflow variables, variable
functions, and boolean and arithmetic operators.
9. Enter the value or expression you want to assign.
For example, if you want to assign the value 500 to the user-defined variable $$custno1, enter the number
500 in the Expression Editor.
10. Click Validate.
Validate the expression before you close the Expression Editor.
11. Repeat steps 5 to 10 to add more variable assignments.
Use the up and down arrows in the Expressions tab to change the order of the variable assignments.
12. Click OK.
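For example, to increment a user-defined counter variable each time the workflow reaches the Assignment task, you might select the variable and enter an expression such as the following. The variable name $$WorkflowCount is a placeholder; any user-defined Integer workflow variable works the same way:

$$WorkflowCount + 1

The Integration Service evaluates the expression and assigns the result to the selected variable for the remainder of the workflow.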

Working with the Command Task


You can specify one or more shell commands to run during the workflow with the Command task. For example,
you can specify shell commands in the Command task to delete reject files, copy a file, or archive target files.
Use a Command task in the following ways:
♦ Standalone Command task. Use a Command task anywhere in the workflow or worklet to run shell
commands.
♦ Pre- and post-session shell command. You can call a Command task as the pre- or post-session shell
command for a Session task.
Use any valid UNIX command or shell script for UNIX servers, or any valid DOS command or batch file for Windows
servers. For example, you might use a shell command to copy a file from one directory to another. For a



Windows server you would use the following shell command to copy the sales_adj file from the source
directory, L, to the target, H:
copy L:\sales\sales_adj H:\marketing\

For a UNIX server, you would use the following command to perform a similar operation:
cp sales/sales_adj marketing/

Each shell command runs in the same environment as the Integration Service. Environment settings in one shell
command script do not carry over to other scripts. To run all shell commands in the same environment, call a
single shell script that invokes other scripts.
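For example, on UNIX you might call one wrapper script from the Command task so that the environment is set once for all of the work. The script name, directory paths, and environment settings below are illustrative only:

#!/bin/sh
# nightly_maintenance.sh - set the environment once, then run the dependent scripts
ORACLE_HOME=/opt/oracle
export ORACLE_HOME
/scripts/archive_targets.sh
/scripts/cleanup_rejects.sh

In the Command task, enter a single command that runs nightly_maintenance.sh.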

RELATED TOPICS:
♦ “Using Pre- and Post-Session Shell Commands” on page 127

Using Parameters and Variables


You can use parameters and variables in standalone Command tasks and pre- and post-session shell commands.
For example, you might use a service process variable instead of hard-coding a directory name.
You can use the following parameters and variables in commands:
♦ Standalone Command tasks. You can use service, service process, workflow, and worklet variables in
standalone Command tasks. You cannot use session parameters, mapping parameters, or mapping variables
in standalone Command tasks. The Integration Service does not expand these types of parameters and
variables in standalone Command tasks.
♦ Pre- and post-session shell commands. You can use any parameter or variable type that you can define in the
parameter file.
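For example, instead of hard-coding a target file directory in a standalone Command task, a command might reference the $PMTargetFileDir service process variable. The file and directory names here are placeholders:

cp $PMTargetFileDir/sales_adj.out /archive/sales/

The Integration Service expands the service process variable to the directory configured for the Integration Service process before it runs the command.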

RELATED TOPICS:
♦ “Where to Use Parameters and Variables” on page 405

Assigning Resources
You can assign resources to Command task instances in the Worklet or Workflow Designer. You might want to
assign resources to a Command task if you assign the workflow to an Integration Service associated with a grid.
When you assign a resource to a Command task and the Integration Service is configured to check resources,
the Load Balancer dispatches the task to a node that has the resource available. A task fails if the Load Balancer
cannot find a node where the required resource is available.
For information about assigning resources to Command and Session tasks, see “Assigning Resources to Tasks”
on page 372. For information about configuring the Integration Service to check resources, see the PowerCenter
Administrator Guide.

Creating a Command Task


Complete the following steps to create a Command task.

To create a Command task:

1. In the Workflow Designer or the Task Developer, click Tasks > Create.
2. Select Command Task for the task type.
3. Enter a name for the Command task. Click Create. Then click Done.
4. Double-click the Command task in the workspace to open the Edit Tasks dialog box.
5. In the Commands tab, click the Add button to add a command.
6. In the Name field, enter a name for the new command.
7. In the Command field, click the Edit button to open the Command Editor.



8. Enter the command you want to run. Enter one command in the Command Editor. You can use service,
service process, workflow, and worklet variables in the command.
9. Click OK to close the Command Editor.
10. Repeat steps 5 to 9 to add more commands in the task.
11. Optionally, click the General tab in the Edit Tasks dialog to assign resources to the Command task.
12. Click OK.
If you specify non-reusable shell commands for a session, you can promote the non-reusable shell commands to
a reusable Command task.

RELATED TOPICS:
♦ “Assigning Resources to Tasks” on page 372

♦ “Creating a Reusable Command Task from Pre- or Post-Session Commands” on page 129

Executing Commands in the Command Task


The Integration Service runs shell commands in the order you specify them. If the Load Balancer has more
Command tasks to dispatch than the Integration Service can run at the time, the Load Balancer places the tasks
it cannot run in a queue. When the Integration Service becomes available, the Load Balancer dispatches tasks
from the queue in the order determined by the workflow service level. For more information about how the
Load Balancer uses service levels, see the PowerCenter Administrator Guide.
You can choose to run a command only if the previous command completed successfully. Or, you can choose to
run all commands in the Command task, regardless of the result of the previous command. If you configure
multiple commands in a Command task to run on UNIX, each command runs in a separate shell.
If you choose to run a command only if the previous command completes successfully, the Integration Service
stops running the rest of the commands and fails the task when one of the commands in the Command task
fails. If you do not choose this option, the Integration Service runs all the commands in the Command task and
treats the task as completed, even if a command fails. If you want the Integration Service to perform the next
command only if the previous command completes successfully, select Fail Task if Any Command Fails in the
Properties tab of the Command task.
Figure 5-1 shows the Fail Task if Any Command Fails option:

Figure 5-1. Fail Task if Any Command Fails Option

The figure shows the Properties tab options to set the task to fail if any command in the task fails and to set the
command recovery strategy to restart the task or to fail the task and continue the workflow.



You can choose a recovery strategy for the task. The recovery strategy determines how the Integration Service
recovers the task when you configure workflow recovery and the task fails. You can configure the task to restart
or you can configure the task to fail and continue running the workflow.

RELATED TOPICS:
♦ “Configuring Task Recovery” on page 227

Working with the Control Task


Use the Control task to stop, abort, or fail the top-level workflow or the parent workflow based on an input link
condition. A parent workflow or worklet is the workflow or worklet that contains the Control task.

To create a Control task:

1. In the Workflow Designer, click Tasks > Create.


2. Select Control Task for the task type.
3. Enter a name for the Control task. Click Create. Then click Done.
The Workflow Manager creates and adds the Control task to the workflow.
4. Double-click the Control task in the workspace to open it.
5. Configure the following control options on the Properties tab:

♦ Fail Me. Marks the Control task as “Failed.” The Integration Service fails the Control task if you choose this
option. If you choose Fail Me in the Properties tab and choose Fail Parent If This Task Fails in the General tab,
the Integration Service fails the parent workflow.
♦ Fail Parent. Marks the status of the workflow or worklet that contains the Control task as failed after the
workflow or worklet completes.
♦ Stop Parent. Stops the workflow or worklet that contains the Control task.
♦ Abort Parent. Aborts the workflow or worklet that contains the Control task.
♦ Fail Top-Level Workflow. Fails the workflow that is running.
♦ Stop Top-Level Workflow. Stops the workflow that is running.
♦ Abort Top-Level Workflow. Aborts the workflow that is running.

Working with the Decision Task


With the Decision task, you can enter a condition that determines the execution of the workflow, similar to a link
condition. The Decision task has a predefined variable called $Decision_task_name.condition that represents
the result of the decision condition. The Integration Service evaluates the condition in the Decision task and
sets the predefined condition variable to True (1) or False (0).
You can specify one decision condition per Decision task. After the Integration Service evaluates the Decision
task, use the predefined condition variable in other expressions in the workflow to help you develop the
workflow.



Depending on the workflow, you might use link conditions instead of a Decision task. However, the Decision
task simplifies the workflow. If you do not specify a condition in the Decision task, the Integration Service
evaluates the Decision task to True.

RELATED TOPICS:
♦ “Working with Links” on page 70

Using the Decision Task


Use the Decision task instead of multiple link conditions in a workflow. Instead of specifying multiple link
conditions, use the predefined condition variable in a Decision task to simplify link conditions.

Example
For example, you have a Command task that depends on the status of the three sessions in the workflow. You
want the Integration Service to run the Command task when any of the three sessions fails. To accomplish this,
use a Decision task with the following decision condition:
$Q1_session.status = FAILED OR $Q2_session.status = FAILED OR $Q3_session.status = FAILED

You can then use the predefined condition variable in the input link condition of the Command task.
Configure the input link with the following link condition:
$Decision.condition = True

Figure 5-2 shows the sample workflow using a Decision task:

Figure 5-2. Sample Workflow Using a Decision Task

You can configure the same logic in the workflow without the Decision task. Without the Decision task, you
need to use three link conditions and treat the input links to the Command task as OR links.
Figure 5-3 shows the sample workflow without the Decision task:

Figure 5-3. Sample Workflow Without a Decision Task



You can further expand the sample workflow in Figure 5-2. In Figure 5-2, the Integration Service runs the
Command task if any of the three Session tasks fails. Suppose now you want the Integration Service to also run
an Email task if all three Session tasks succeed.
To do this, add an Email task and use the decision condition variable in the link condition.
Figure 5-4 shows the expanded sample workflow using a Decision task:

Figure 5-4. Expanded Sample Workflow Using a Decision Task
In the figure, the link to the Command task uses the condition $Decision.condition = True, and the link to the
Email task uses the condition $Decision.condition = False.

Creating a Decision Task


Complete the following steps to create a Decision task.

To create a Decision task:

1. In the Workflow Designer, click Tasks > Create.


2. Select Decision Task for the task type.
3. Enter a name for the Decision task. Click Create. Then click Done.
The Workflow Designer creates and adds the Decision task to the workspace.
4. Double-click the Decision task to open it.
5. Click the Open button in the Value field to open the Expression Editor.
6. In the Expression Editor, enter the condition you want the Integration Service to evaluate.
Validate the expression before you close the Expression Editor.
7. Click OK.

Working with Event Tasks


You can define events in the workflow to specify the sequence of task execution. The event is triggered based on
the completion of the sequence of tasks. Use the following tasks to help you use events in the workflow:
♦ Event-Raise task. Event-Raise task represents a user-defined event. When the Integration Service runs the
Event-Raise task, the Event-Raise task triggers the event. Use the Event-Raise task with the Event-Wait task
to define events.
♦ Event-Wait task. The Event-Wait task waits for an event to occur. Once the event triggers, the Integration
Service continues executing the rest of the workflow.
To coordinate the execution of the workflow, you may specify the following types of events for the Event-Wait
and Event-Raise tasks:
♦ Predefined event. A predefined event is a file-watch event. For predefined events, use an Event-Wait task to
instruct the Integration Service to wait for the specified indicator file to appear before continuing with the



rest of the workflow. When the Integration Service locates the indicator file, it starts the next task in the
workflow.
♦ User-defined event. A user-defined event is a sequence of tasks in the workflow. Use an Event-Raise task to
specify the location of the user-defined event in the workflow. A user-defined event is the sequence of tasks in
the branch from the Start task leading to the Event-Raise task.
When all the tasks in the branch from the Start task to the Event-Raise task complete, the Event-Raise task
triggers the event. The Event-Wait task waits for the Event-Raise task to trigger the event before continuing
with the rest of the tasks in its branch.

Example of User-Defined Events


Say you have four sessions you want to run in a workflow. You want Q1_session and Q2_session to run
concurrently to save time. You also want to run Q3_session after Q1_session completes. You want to run
Q4_session only when Q1_session, Q2_session, and Q3_session complete.
Figure 5-5 shows how to accomplish this using the Event-Raise and Event-Wait tasks:

Figure 5-5. Example of User-Defined Events (user-defined event: Q1Q3_Complete)

To configure the workflow shown in Figure 5-5, complete the following steps:
1. Link Q1_session and Q2_session concurrently.
2. Add Q3_session after Q1_session.
3. Declare an event called Q1Q3_Complete in the Events tab of the workflow properties.
4. In the workspace, add an Event-Raise task after Q3_session.
5. Specify the Q1Q3_Complete event in the Event-Raise task properties. This allows the Event-Raise task to
trigger the event when Q1_session and Q3_session complete.
6. Add an Event-Wait task after Q2_session.
7. Specify the Q1Q3_Complete event for the Event-Wait task.
8. Add Q4_session after the Event-Wait task. When the Integration Service processes the Event-Wait task, it
waits until the Event-Raise task triggers Q1Q3_Complete before it runs Q4_session.
The Integration Service runs the workflow shown in Figure 5-5 in the following order:
1. The Integration Service runs Q1_session and Q2_session concurrently.
2. When Q1_session completes, the Integration Service runs Q3_session.
3. The Integration Service finishes executing Q2_session.
4. The Event-Wait task waits for the Event-Raise task to trigger the event.
5. The Integration Service completes Q3_session.
6. The Event-Raise task triggers the event, Q1Q3_complete.
7. The Integration Service runs Q4_session because the event, Q1Q3_Complete, has been triggered.
8. The Integration Service runs the Email task.



Working with Event-Raise Tasks
The Event-Raise task represents the location of a user-defined event. A user-defined event is the sequence of
tasks in the branch from the Start task to the Event-Raise task. When the Integration Service runs the Event-
Raise task, the Event-Raise task triggers the user-defined event.
To use an Event-Raise task, you must first declare the user-defined event. Then, create an Event-Raise task in
the workflow to represent the location of the user-defined event you just declared. In the Event-Raise task
properties, specify the name of a user-defined event.

Declaring a User-Defined Event


Complete the following steps to declare a name for a user-defined event.

To declare a user-defined event:

1. In the Workflow Designer, click Workflow > Edit.


2. Select the Events tab in the Edit Workflow dialog box.
3. Click the Add button to add an event name.
Event names are not case sensitive.
4. Click OK.

Using the Event-Raise Task for a User-Defined Event


After you declare a user-defined event, use the Event-Raise task to represent the location of the event and to
trigger the event.

To use an Event-Raise task:

1. In the Workflow Designer workspace, create an Event-Raise task and place it in the workflow to represent
the user-defined event you want to trigger.
A user-defined event is the sequence of tasks in the branch from the Start task to the Event-Raise task.
2. Double-click the Event-Raise task to open it.
3. On the Properties tab, click the Open button in the Value field to open the Events Browser for user-defined
events.
4. Choose an event in the Events Browser.
5. Click OK twice.

Working with Event-Wait Tasks


The Event-Wait task waits for a predefined event or a user-defined event. A predefined event is a file-watch
event. When you use the Event-Wait task to wait for a predefined event, you specify an indicator file for the
Integration Service to watch. The Integration Service waits for the indicator file to appear. Once the indicator
file appears, the Integration Service continues running tasks after the Event-Wait task.
You can assign resources to Event-Wait tasks that wait for predefined events. You may want to assign a resource
to a predefined Event-Wait task if you are running on a grid and the indicator file appears on a specific node or
in a specific directory. When you assign a resource to a predefined Event-Wait task and the Integration Service
is configured to check resources, the Load Balancer distributes the task to a node where the required resource is
available. For more information about assigning resources to tasks, see “Assigning Resources to Tasks” on
page 372. For more information about configuring the Integration Service to check resources, see the
PowerCenter Administrator Guide.
Note: If you use the Event-Raise task to trigger the event when you wait for a predefined event, you may not be
able to successfully recover the workflow.



You can also use the Event-Wait task to wait for a user-defined event. To use the Event-Wait task for a user-
defined event, specify the name of the user-defined event in the Event-Wait task properties. The Integration
Service waits for the Event-Raise task to trigger the user-defined event. Once the user-defined event is triggered,
the Integration Service continues running tasks after the Event-Wait task.

Waiting for User-Defined Events


Use the Event-Wait task to wait for a user-defined event. A user-defined event is triggered by the Event-Raise
task. To wait for a user-defined event, you must first use an Event-Raise task to trigger the user-defined event.

To wait for a user-defined event:

1. In the workflow, create an Event-Wait task and double-click the Event-Wait task to open it.
2. In the Events tab of the task, select User-Defined.
3. Click the Event button to open the Events Browser dialog box.
4. Select a user-defined event for the Integration Service to wait for.
5. Click OK twice.

Waiting for Predefined Events


To use a predefined event, you need a shell command, script, or batch file to create an indicator file. The file
must be created or sent to a directory that the Integration Service can access. The file can be any format
recognized by the Integration Service operating system. You can choose to have the Integration Service delete
the indicator file after it detects the file, or you can manually delete the indicator file. The Integration Service
marks the status of the Event-Wait task as failed if it cannot delete the indicator file.
When you specify the indicator file in the Event-Wait task, enter the directory in which the file appears and the
name of the indicator file. You must provide the absolute path for the file. If you specify the file name and not
the directory, the Integration Service looks for the indicator file in the following directory:
♦ On Windows, the Integration Service looks for the file in the system directory. For example, on Windows
2000, the system directory is c:\winnt\system32.
♦ On UNIX, the Integration Service looks for the indicator file in the current working directory for the
Integration Service process. On UNIX this directory is /server/bin.
You can enter the actual name of the file or use process variables to specify the location of the file. You can also
use user-defined workflow and worklet variables to specify the file name and location. For example, create a
workflow variable, $$MyFileWatchFile, for the indicator file name and location, and set $$MyFileWatchFile to
the file name and location in the parameter file.
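A minimal sketch of the parameter file entry for this approach might look like the following. The folder name, workflow name, and file path are placeholders:

[MyFolder.WF:wf_LoadOrders]
$$MyFileWatchFile=/data/indicator/orders_ready.ind

Enter $$MyFileWatchFile as the indicator file in the Event-Wait task. An upstream process can then signal the workflow by creating the file, for example with a command such as:

touch /data/indicator/orders_ready.ind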
The Integration Service writes the time the file appears in the workflow log.
Note: Do not use a source or target file name as the indicator file name because you may accidentally delete a
source or target file. Or, the Integration Service may try to delete the file before the session finishes writing to
the target.

To wait for a predefined event in the workflow:

1. Create an Event-Wait task and double-click the Event-Wait task to open it.
2. In the Events tab of the task, select Predefined.
3. Enter the path of the indicator file.
4. If you want the Integration Service to delete the indicator file after it detects the file, select the Delete
Filewatch File option in the Properties tab.
5. Click OK.



Enabling Past Events
By default, the Event-Wait task waits for the Event-Raise task to trigger the event. By default, the Event-Wait
task does not check if the event already occurred. You can select the Enable Past Events option so that the
Integration Service verifies that the event has already occurred.
When you select Enable Past Events, the Integration Service continues executing the next tasks if the event
already occurred.
Select the Enable Past Events option in the Properties tab of the Event-Wait task.

Working with the Timer Task


You can specify the period of time to wait before the Integration Service runs the next task in the workflow with
the Timer task. You can choose to start the next task in the workflow at a specified time and date. You can also
choose to wait a period of time after the start time of another task, workflow, or worklet before starting the next
task.
The Timer task has two types of settings:
♦ Absolute time. You specify the time that the Integration Service starts running the next task in the workflow.
You may specify the date and time, or you can choose a user-defined workflow variable to specify the time.
♦ Relative time. You instruct the Integration Service to wait for a specified period of time after the Timer task,
the parent workflow, or the top-level workflow starts.
For example, a workflow contains two sessions. You want the Integration Service to wait 10 minutes after the first
session completes before it runs the second session. Use a Timer task after the first session. In the Relative Time
setting of the Timer task, specify ten minutes from the start time of the Timer task. Use a Timer task anywhere
in the workflow after the Start task.

To create a Timer task:

1. In the Workflow Designer, click Tasks > Create.


2. Select Timer Task for the task type.
3. Double-click the Timer task to open it.
4. On the General tab, enter a name for the Timer task.
5. Click the Timer tab to specify when the Integration Service starts the next task in the workflow.
6. Specify attributes for Absolute Time or Relative Time described in Table 5-2:

Table 5-2. Timer Task Attributes

♦ Absolute Time: Specify the exact time to start. The Integration Service starts the next task in the workflow at
the date and time you specify.
♦ Absolute Time: Use this workflow date-time variable to calculate the wait. Specify a user-defined date-time
workflow variable. The Integration Service starts the next task in the workflow at the time you choose. The
Workflow Manager verifies that the variable you specify has the Date/Time datatype. If the variable precision
includes subseconds, the Integration Service ignores the subsecond portion of the time value. The Timer task
fails if the date-time workflow variable evaluates to NULL.
♦ Relative time: Start after. Specify the period of time the Integration Service waits to start executing the next
task in the workflow.
♦ Relative time: from the start time of this task. Select this option to wait a specified period of time after the
start time of the Timer task to run the next task.
♦ Relative time: from the start time of the parent workflow/worklet. Select this option to wait a specified period
of time after the start time of the parent workflow/worklet to run the next task.
♦ Relative time: from the start time of the top-level workflow. Choose this option to wait a specified period of
time after the start time of the top-level workflow to run the next task.



CHAPTER 6

Working with Worklets


This chapter includes the following topics:
♦ Overview, 111
♦ Developing a Worklet, 111
♦ Using Worklet Variables, 114
♦ Assigning Variable Values in a Worklet, 114
♦ Validating Worklets, 116

Overview
A worklet is an object representing a set of tasks created to reuse a set of workflow logic in multiple workflows.
You can create a worklet in the Worklet Designer.
To run a worklet, include the worklet in a workflow. The workflow that contains the worklet is called the
parent workflow. When the Integration Service runs a worklet, it expands the worklet to run tasks and evaluate
links within the worklet. It writes information about worklet execution in the workflow log.

Suspending Worklets
When you choose Suspend on Error for the parent workflow, the Integration Service also suspends the worklet
if a task in the worklet fails. When a task in the worklet fails, the Integration Service stops executing the failed
task and other tasks in its path. If no other task is running in the worklet, the worklet status is “Suspended.” If
one or more tasks are still running in the worklet, the worklet status is “Suspending.” The Integration Service
suspends the parent workflow when the status of the worklet is “Suspended” or “Suspending.”

RELATED TOPICS:
♦ “Suspending the Workflow” on page 81

Developing a Worklet
To develop a worklet, you must first create a worklet. After you create a worklet, configure worklet properties
and add tasks to the worklet. You can create reusable worklets in the Worklet Designer. You can also create
non-reusable worklets in the Workflow Designer as you develop the workflow.

Creating a Reusable Worklet
Create reusable worklets in the Worklet Designer. You can view a list of reusable worklets in the Navigator
Worklets node.

To create a reusable worklet:

1. In the Worklet Designer, click Worklet > Create.


The Create Worklet dialog box appears.
2. Enter a name for the worklet.
3. If you are adding the worklet to a workflow that is enabled for concurrent execution, enable the worklet for
concurrent execution.
4. Click OK.
The Worklet Designer creates a Start task in the worklet.

RELATED TOPICS:
♦ “Starting and Stopping Concurrent Workflows” on page 329

Creating a Non-Reusable Worklet


You can create a non-reusable worklet in the Workflow Designer as you develop the workflow. Non-reusable
worklets only exist in the workflow. You cannot use a non-reusable worklet in another workflow. After you
create the worklet in the Workflow Designer, open the worklet to edit it in the Worklet Designer.
You can promote non-reusable worklets to reusable worklets by selecting the Make Reusable option in the
worklet properties. To rename a non-reusable worklet, open the worklet properties in the Workflow Designer.

To create a non-reusable worklet:

1. In the Workflow Designer, open a workflow.


2. Click Tasks > Create.
3. For the Task type, select Worklet.
4. Enter a name for the task.
5. Click Create.
The Workflow Designer creates the worklet and adds it to the workspace.
6. Click Done.

Configuring Worklet Properties


When you use a worklet in a workflow, you can configure the same set of general task settings on the General
tab as any other task. For example, you can make a worklet reusable, disable a worklet, configure the input link
to the worklet, or fail the parent workflow based on the worklet.
In addition to general task settings, you can configure the following worklet properties:
♦ Worklet variables. Use worklet variables to reference values and record information. You use worklet
variables the same way you use workflow variables. You can assign a workflow variable to a worklet variable
to override its initial value.
♦ Events. To use the Event-Wait and Event-Raise tasks in the worklet, you must first declare an event in the
worklet properties.
♦ Metadata extension. Extend the metadata stored in the repository by associating information with
repository objects.



RELATED TOPICS:
♦ “Using Worklet Variables” on page 114
♦ “Working with Event Tasks” on page 105
♦ “Working with Metadata Extensions” on page 15

Adding Tasks in Worklets


After you create a worklet, add tasks by opening the worklet in the Worklet Designer. A worklet must contain a
Start task. The Start task represents the beginning of a worklet. When you create a worklet, the Worklet
Designer creates a Start task for you.

To add tasks to a non-reusable worklet:

1. Create a non-reusable worklet in the Workflow Designer workspace.


2. Right-click the worklet and choose Open Worklet.
The Worklet Designer opens so you can add tasks in the worklet.
3. Add tasks in the worklet by using the Tasks toolbar or click Tasks > Create in the Worklet Designer.
4. Connect tasks with links.

Declaring Events in Worklets


Use Event-Wait and Event-Raise tasks in a worklet the same way you use them in workflows. To use the Event-Raise task,
you first declare a user-defined event in the worklet. Events in one instance of a worklet do not affect events in
other instances of the worklet. You cannot specify worklet events in the Event tasks in the parent workflow.

RELATED TOPICS:
♦ “Working with Event Tasks” on page 105

Viewing Links in a Worklet


When you edit a workflow or worklet, you can view the forward or backward link paths to other tasks. You can
highlight paths to see links in the workflow branch from the Start task to the last task in the branch.

RELATED TOPICS:
♦ “Creating a Workflow” on page 65

Nesting Worklets
You can nest a worklet within another worklet. When you run a workflow containing nested worklets, the
Integration Service runs the nested worklet from within the parent worklet. You can group several worklets
together by function or simplify the design of a complex workflow when you nest worklets.
You might choose to nest worklets to load data to fact and dimension tables. Create a nested worklet to load
fact and dimension data into a staging area. Then, create a nested worklet to load the fact and dimension data
from the staging area to the data warehouse.
You might choose to nest worklets to simplify the design of a complex workflow. Nest worklets that can be
grouped together within one worklet. To nest an existing reusable worklet, click Tasks > Insert Worklet. To
create a non-reusable nested worklet, click Tasks > Create, and select worklet.



Using Worklet Variables
Worklet variables are similar to workflow variables. A worklet has the same set of predefined variables as any
task. You can also create user-defined worklet variables. Like user-defined workflow variables, user-defined
worklet variables can be persistent or non-persistent.

RELATED TOPICS:
♦ “Working with Workflow Variables” on page 85

Persistent Worklet Variables


User-defined worklet variables can be persistent or non-persistent. To create a persistent worklet variable, select
Persistent when you create the variable. When you create a persistent worklet variable, the worklet variable
retains its value the next time the Integration Service runs the worklet in the parent workflow.
For example, you have a worklet with a persistent variable. Use two instances of the worklet in a workflow to
run the worklet twice. You name the first instance of the worklet Worklet1 and the second instance Worklet2.
When you run the workflow, the persistent worklet variable retains its value from Worklet1 and becomes the
initial value in Worklet2. After the Integration Service runs Worklet2, it retains the value of the persistent
variable in the repository and uses the value the next time you run the workflow.
Worklet variables only persist when you run the same workflow. A worklet variable does not retain its value
when you use instances of the worklet in different workflows.

Overriding the Initial Value


For each worklet instance, you can override the initial value of the worklet variable by assigning a workflow
variable to it.

To override the initial value of a worklet variable:

1. Double-click the worklet instance in the Workflow Designer workspace.


2. On the Variables tab, click the Add button in the pre-worklet variable assignment.
3. Click the open button in the User-Defined Worklet Variables field to select a worklet variable.
4. Click Apply.
The worklet variable in this worklet instance now has the selected workflow variable as its initial value.

Rules and Guidelines


Use the following rules and guidelines when you work with worklet variables:
♦ You cannot use variables from the parent workflow in the worklet.
♦ You cannot use user-defined worklet variables in the parent workflow.
♦ You can use predefined worklet variables in the parent workflow, just as you use predefined variables for
other tasks in the workflow.

Assigning Variable Values in a Worklet


You can update the values of variables before or after a worklet runs. This allows you to pass information from
one worklet to another within the same workflow or parent worklet. For example, you have a workflow that
contains two worklets that need to increment the same counter. You can increment the counter in the first



worklet, pass the updated counter value to the second worklet, and increment the counter again in the second
worklet.
You can also pass information from a worklet to a non-reusable session or from a non-reusable session to a
worklet as long as the worklet and session are in the same workflow or parent worklet. For more information
about passing information between sessions, see “Assigning Parameter and Variable Values in a Session” on
page 137. You can assign variables in reusable and non-reusable worklets.
You can update the values of different variables depending on whether you assign them before or after a worklet
runs. You can update the following types of variables before or after a worklet runs:
♦ Pre-worklet variable assignment. You can update user-defined worklet variables before a worklet runs. You
can assign these variables the values of parent workflow or worklet variables or the values of mapping
variables from other tasks in the workflow or parent worklet.
You can update worklet variables with values from the parent of the worklet. Therefore, if a worklet is in
another worklet within a workflow, you can assign values from the parent worklet variables, but not the
workflow variables.
♦ Post-worklet variable assignment. You can update parent workflow or worklet variables after the worklet
completes. You can assign these variables the values of user-defined worklet variables.
You assign variables on the Variables tab when you edit a worklet.

Passing Variable Values between Worklets


You can assign variable values in a worklet to pass values from one worklet to any subsequent worklet in the
same workflow or parent worklet. For example, a workflow contains two worklets wklt_CreateCustList and
wklt_UpdateCustOrders. Worklet wklt_UpdateCustOrders needs to use the value of a worklet variable updated
in wklt_CreateCustList.
The following figure shows the workflow:

To pass the worklet variable value from wklt_CreateCustList to wklt_UpdateCustOrders, complete the
following steps:
1. Configure worklet wklt_CreateCustList to use a worklet variable, for example, $$URLString1.
2. Configure worklet wklt_UpdateCustOrders to use a worklet variable, for example, $$URLString2.
3. Configure the workflow to use a workflow variable, for example, $$PassURLString.
4. Configure worklet wklt_CreateCustList to assign the value of worklet variable $$URLString1 to workflow
variable $$PassURLString after the worklet completes.
5. Configure worklet wklt_UpdateCustOrders to assign the value of workflow variable $$PassURLString to
worklet variable $$URLString2 before the worklet starts.
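Assuming the assignment statement syntax described in “Configuring Variable Assignments” below, where the Workflow Manager assigns the value on the right side of the statement to the variable on the left side, the two statements for this example might look like the following:

Post-worklet assignment in wklt_CreateCustList:
$$PassURLString = $$URLString1

Pre-worklet assignment in wklt_UpdateCustOrders:
$$URLString2 = $$PassURLString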

RELATED TOPICS:
♦ “Using Worklet Variables” on page 114

♦ “User-Defined Workflow Variables” on page 90

Configuring Variable Assignments


Assign variables on the Variables tab when you edit a worklet. Assign values to the following types of variables
before or after a worklet runs:
♦ Pre-worklet variable assignment. Update user-defined worklet variables with the values of parent workflow
or worklet variables or the values of mapping variables from other tasks in the workflow or parent worklet
that run before this worklet.



♦ Post-worklet variable assignment. Update parent workflow and worklet variables with the values of user-
defined worklet variables.

To assign variables in a worklet:

1. Edit the worklet for which you want to assign variables.


2. Click the Variables tab.
3. Select the variable assignment type:
♦ Pre-worklet variable assignment. Assign values to user-defined worklet variables before a worklet runs.
♦ Post-worklet variable assignment. Assign values to parent workflow and worklet variables after a
worklet completes.
4. Click the edit button in the variable assignment field.
5. In the pre- or post-worklet variable assignment area, click the add button to add a variable assignment
statement.
6. Click the open button in the User-Defined Worklet Variables and Parent Workflow/Worklet Variables
fields to select the variables whose values you wish to read or assign. For pre-worklet variable assignment,
you may enter parameter and variable names into these fields. The Workflow Manager does not validate
parameter and variable names.
The Workflow Manager assigns values from the right side of the assignment statement to variables on the
left side of the statement. So, if the variable assignment statement is
“$$SiteURL_WFVar=$$SiteURL_WkltVar,” the Workflow Manager assigns the value of
$$SiteURL_WkltVar to $$SiteURL_WFVar.
7. Repeat steps 5 to 6 to add more variable assignment statements.
To delete a variable assignment statement, click one of the fields in the assignment statement, and click the
cut button.
8. Click OK.

Validating Worklets
The Workflow Manager validates worklets when you save the worklet in the Worklet Designer. In addition,
when you use worklets in a workflow, the Integration Service validates the workflow according to the following
validation rules at run time:
♦ You cannot run two instances of the same worklet concurrently in the same workflow.
♦ You cannot run two instances of the same worklet concurrently across two different workflows.
♦ Each worklet instance in the workflow can run once.
When a worklet instance is invalid, the workflow using the worklet instance remains valid.
The Workflow Manager displays a red invalid icon if the worklet object is invalid. The Workflow Manager
validates the worklet object using the same validation rules for workflows. The Workflow Manager displays a
blue invalid icon if the worklet instance in the workflow is invalid. The worklet instance may be invalid when
any of the following conditions occurs:
♦ The parent workflow or worklet variable you assign to the user-defined worklet variable does not have a
matching datatype.
♦ The user-defined worklet variable you used in the worklet properties does not exist.
♦ You do not specify the parent workflow or worklet variable you want to assign.
For non-reusable worklets, you may see both red and blue invalid icons displayed over the worklet icon in the
Navigator.



RELATED TOPICS:
♦ “Validating a Workflow” on page 77



CHAPTER 7

Working with Sessions


This chapter includes the following topics:
♦ Overview, 119
♦ Creating a Session Task, 120
♦ Editing a Session, 120
♦ Understanding Buffer Memory, 123
♦ Configuring Automatic Memory Settings, 124
♦ Configuring Performance Details, 126
♦ Using Pre- and Post-Session SQL Commands, 127
♦ Using Pre- and Post-Session Shell Commands, 127
♦ Validating a Session, 129
♦ Stopping and Aborting a Session, 130
♦ Working with Session Parameters, 132
♦ Mapping Parameters and Variables in Sessions, 136
♦ Assigning Parameter and Variable Values in a Session, 137
♦ Handling High Precision Data, 139

Overview
A session is a set of instructions that tells the Integration Service how and when to move data from sources to
targets. A session is a type of task, similar to other tasks available in the Workflow Manager. In the Workflow
Manager, you configure a session by creating a Session task. To run a session, you must first create a workflow to
contain the Session task.
When you create a Session task, enter general information such as the session name, session schedule, and the
Integration Service to run the session. You can select options to run pre-session shell commands, send
On-Success or On-Failure email, and use FTP to transfer source and target files.
Configure the session to override parameters established in the mapping, such as source and target location,
source and target type, error tracing levels, and transformation attributes. You can also configure the session to
collect performance details for the session and store them in the PowerCenter repository. You might view
performance details for a session to tune the session.
You can run as many sessions in a workflow as you need. You can run the Session tasks sequentially or
concurrently, depending on the requirement.

The Integration Service creates several files and in-memory caches depending on the transformations and
options used in the session.

Creating a Session Task


You create a Session task for each mapping you want the Integration Service to run. The Integration Service
uses the instructions configured in the session to move data from sources to targets.
You can create a reusable Session task in the Task Developer. You can also create non-reusable Session tasks in
the Workflow Designer as you develop the workflow. After you create the session, you can edit the session
properties at any time.
Note: Before you create a Session task, you must configure the Workflow Manager to communicate with
databases and the Integration Service. You must assign appropriate permissions for any database, FTP, or
external loader connections you configure.

RELATED TOPICS:
♦ “Working with Connection Objects” on page 20

Steps to Create a Session Task


Create the Session task in the Task Developer or the Workflow Designer. Session tasks created in the Task
Developer are reusable.

To create a Session task:

1. In the Workflow Designer, click Tasks > Create.


2. Select Session Task for the task type.
3. Enter a name for the Session task.
4. Click Create.
5. Select the mapping you want to use in the Session task and click OK.
6. Click Done.

RELATED TOPICS:
♦ “Reusable Workflow Tasks” on page 97

Editing a Session
After you create a session, you can edit it. For example, you might need to adjust the buffer and cache sizes,
modify the update strategy, or clear a variable value saved in the repository.
Double-click the Session task to open the session properties. The session has the following tabs, and each of
those tabs has multiple settings:
♦ General tab. Enter session name, mapping name, and description for the Session task, assign resources, and
configure additional task options.
♦ Properties tab. Enter session log information, test load settings, and performance configuration.
♦ Config Object tab. Enter advanced settings, log options, and error handling configuration.
♦ Mapping tab. Enter source and target information, override transformation properties, and configure the
session for partitioning.
♦ Components tab. Configure pre- or post-session shell commands and emails.
♦ Metadata Extension tab. Configure metadata extension options.
You can edit session properties at any time. The repository updates the session properties immediately.
If the session is running when you edit the session, the repository updates the session when the session
completes. If the mapping changes, the Workflow Manager might issue a warning that the session is invalid.
The Workflow Manager then lets you continue editing the session properties. After you edit the session
properties, the Integration Service validates the session and reschedules the session.

RELATED TOPICS:
♦ “Validating a Session” on page 129

♦ “Session Properties Reference” on page 473

Applying Attributes to All Instances


When you edit the session properties, you can apply source, target, and transformation settings to all instances
of the same type in the session. You can also apply settings to all partitions in a pipeline. You can apply reader or
writer settings, connection settings, and properties settings.
For example, you might need to change a relational connection from a test to a production database for all the
target instances in a session. You can change the connection value for one target in a session and apply the
connection to the other relational target objects.
Figure 7-1 shows the writers, connections, and properties settings for a target instance in a session:

Figure 7-1. Session Target Object Settings

For a target instance, you can change the writers, connections, and properties settings.



Table 7-1 shows the options you can use to apply attributes to objects in a session. You can apply different
options depending on whether the setting is a reader or writer, connection, or an object property.

Table 7-1. Apply All Options

Setting - Option: Description
Reader, Writer - Apply Type to All Instances: Applies a reader or writer type to all instances of the same object type in the session. For example, you can apply a relational reader type to all the other readers in the session.
Reader, Writer - Apply Type to All Partitions: Applies a reader or writer type to all the partitions in a pipeline. For example, if you have four partitions, you can change the writer type in one partition for a target instance. Use this option to apply the change to the other three partitions.
Connections - Apply Connection Type: Applies the same type of connection to all instances. Connection types are relational, FTP, queue, application, or external loader.
Connections - Apply Connection Value: Applies a connection value to all instances or partitions. The connection value defines a specific connection that you can view in the connection browser. You can apply a connection value that is valid for the existing connection type.
Connections - Apply Connection Attributes: Applies only the connection attribute values to all instances or partitions. Each type of connection has different attributes. You can apply connection attributes separately from connection values.
Connections - Apply Connection Data: Applies the connection value and its connection attributes to all the other instances that have the same connection type. This option combines the connection option and the connection attribute option.
Connections - Apply All Connection Information: Applies the connection value and its attributes to all the other instances even if they do not have the same connection type. This option is similar to Apply Connection Data, but it lets you change the connection type.
Properties - Apply Attribute to all Instances: Applies an attribute value to all instances of the same object type in the session. For example, if you have a relational target, you can choose to truncate a table before you load data. You can apply the attribute value to all the relational targets in the session.
Properties - Apply Attribute to all Partitions: Applies an attribute value to all partitions in a pipeline. For example, you can change the reject file name in one partition for a target instance, then apply the file name change to the other partitions.

Applying Connection Settings


When you apply connection settings you can apply the connection type, connection value, and connection
attributes. You can only apply a connection value that is valid for a connection type unless you choose the Apply
All Connection Information option. For example, if a target instance uses an FTP connection, you can only
choose an FTP connection value to apply to it. The Apply All Connection Information option lets you apply a
new connection type, connection value, and connection attributes.

Applying Attributes to Partitions or Instances


When you apply attributes to all instances or partitions in a session, you must open the session and edit one of
the session objects. You apply attributes or properties to other instances by choosing an attribute in that object
and selecting to apply its value to the other instances or partitions.



To apply attributes to all instances or partitions:

1. Open a session in the workspace.


2. Click the Mappings tab.
3. Choose a source, target, or transformation instance from the Navigator. Settings for properties,
connections, and readers or writers might display, depending on the object you choose.
4. Right-click a reader, writer, property, or connection value.
A list of options appears.

5. Select an option from the list and choose to apply it to all instances or all partitions.
6. Click OK to apply the attribute or property.

Understanding Buffer Memory


When you run a session, the Integration Service process starts the Data Transformation Manager (DTM). The
DTM allocates buffer memory to the session at run time based on the DTM Buffer Size setting in the session
properties.
The DTM divides the memory into buffer blocks as configured in the Default Buffer Block Size setting in the
session properties. The reader, transformation, and writer threads use buffer blocks to move data from sources
to targets. The buffer block size should be larger than the precision for the largest row of data in a source or
target.
The Integration Service allocates at least two buffer blocks for each source and target partition. Use the
following calculation to determine buffer block requirements:
[(total number of sources + total number of targets)* 2] = (session buffer blocks)

For example, a session that contains a single partition using a mapping that contains 50 sources and 50 targets
requires a minimum of 200 buffer blocks.
[(50 + 50)* 2] = 200

You configure buffer memory settings by adjusting the following session properties:
♦ DTM Buffer Size. The DTM buffer size specifies the amount of buffer memory the Integration Service uses
when the DTM processes a session. Configure the DTM buffer size on the Properties tab in the session
properties.
♦ Default Buffer Block Size. The buffer block size specifies the amount of buffer memory used to move a
block of data from the source to the target. Configure the buffer block size on the Config Object tab in the
session properties.
The Integration Service specifies a minimum memory allocation for the buffer memory and buffer blocks. By
default, the Integration Service allocates 12,000,000 bytes of memory to the buffer memory and 64,000 bytes
per block.
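As a rough, illustrative check of these defaults against the earlier example (the arithmetic ignores internal overhead), 200 buffer blocks at the default block size require the following amount of buffer memory:
[200 blocks] * [64,000 bytes per block] = 12,800,000 bytes
Because this is larger than the default 12,000,000-byte allocation, a session of that size typically needs a larger DTM buffer size or automatic memory settings.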
If the DTM cannot allocate the configured amount of buffer memory for the session, the session cannot
initialize. Usually, you do not need more than 1 GB for the buffer memory.
You can configure a numeric value for the buffer size, or you can configure the session to determine the buffer
memory size at run time.

RELATED TOPICS:
♦ “Configuring Automatic Memory Settings” on page 124

Configuring Automatic Memory Settings


You can configure the Integration Service to determine buffer memory size and session cache size at run time.
When you run a session, the Integration Service allocates buffer memory to the session to move the data from
the source to the first transformation and from the last transformation to the target. It also creates session
caches in memory. Session caches include index and data caches for the Aggregator, Rank, Joiner, and Lookup
transformations, as well as Sorter and XML target caches. The values stored in the data and index caches depend
on the requirements of the transformation. For example, the Aggregator index cache stores group values as
configured in the group by ports, and the data cache stores calculations based on the group by ports. When the
Integration Service processes a Sorter transformation or writes data to an XML target, it also creates a cache.
You configure buffer memory and cache memory settings in the transformation and session properties. When
you configure buffer memory and cache memory settings, consider the overall memory usage for best
performance.
You enable automatic memory settings by configuring a value for the Maximum Memory Allowed for Auto
Memory Attributes and the Maximum Percentage of Total Memory Allowed for Auto Memory Attributes. If
you set Maximum Memory Allowed for Auto Memory Attributes or the Maximum Percentage of Total Memory
Allowed for Auto Memory Attributes to zero, the Integration Service uses default values for memory attributes
that you set to auto.

RELATED TOPICS:
♦ “Session Caches” on page 455

Configuring Buffer Memory


The Integration Service can determine the memory requirements for the following buffer memory:
♦ DTM Buffer Size
♦ Default Buffer Block Size
You can configure DTM buffer size and the default buffer block size in the session properties. If you specify a
numeric value that is less than 12MB for the DTM buffer size, the Integration Service updates the DTM buffer
size to 12MB. When the session requires more memory than the value you configure for the DTM buffer size,
session performance decreases and the session can fail.
If the session is configured to retry on deadlock and the value for the DTM buffer size is less than what the
session requires, the Integration Service writes the following message in the session log:
WRT_8193 Deadlock retry will not be used. The free buffer pool must be at least [number
of bytes] bytes. The current size of the free buffer pool is [number of bytes] bytes.



To configure automatic memory settings for the DTM buffer size:

1. Open the session, and click the Config Object tab.


2. Enter a value for the Default Buffer Block Size.
You can specify auto or a numeric value. If you enter 2000, the Integration Service interprets the number as
2,000 bytes. Append KB, MB, or GB to the value to specify other units. For example, specify 512MB.
3. Click the Properties tab.
4. Enter a value for the DTM buffer size.
You can specify auto or a numeric value. If you enter 2000, the Integration Service interprets the number as
2,000 bytes. Append KB, MB, or GB to the value to specify other units. For example, specify 512MB.
Note: If you specify auto for the DTM buffer size or the default Buffer Block Size, enable automatic memory
settings by configuring a non-zero value for the Maximum Memory Allowed for Auto Memory Attributes and
the Maximum Percentage of Total Memory Allowed for Auto Memory Attributes. If you do not enable
automatic memory settings after you specify auto for the DTM buffer size or the default Buffer Block Size, the
Integration Service uses default values.
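For example, a session that relies on automatic buffer memory sizing might use settings such as the following. The values shown are illustrative, not recommendations:
Config Object tab: Default Buffer Block Size = auto
Config Object tab: Maximum Memory Allowed for Auto Memory Attributes = 512MB
Config Object tab: Maximum Percentage of Total Memory Allowed for Auto Memory Attributes = 5
Properties tab: DTM Buffer Size = auto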

Configuring Session Cache Memory


The Integration Service can determine memory requirements for the following session caches:
♦ Lookup transformation index and data caches
♦ Aggregator transformation index and data caches
♦ Rank transformation index and data caches
♦ Joiner transformation index and data caches
♦ Sorter transformation cache
♦ XML target cache
You can configure auto for the index and data cache size in the transformation properties or on the mappings
tab of the session properties.

Configuring Maximum Memory Limits


When you configure automatic memory settings for session caches, configure the maximum memory limits.
Configuring memory limits allows you to ensure that you reserve a designated amount or percentage of memory
for other processes. You can configure the memory limit as a numeric value and as a percent of total memory.
Because available memory varies, the Integration Service bases the percentage value on the total memory on the
Integration Service process machine.
For example, configure automatic caching for three Lookup transformations in a session. Then, configure a
maximum memory limit of 500 MB for the session. When you run the session, the Integration Service divides
the 500 MB of allocated memory among the index and data caches for the Lookup transformations. The
maximum memory limit for the session does not apply to transformations that you did not configure for
automatic caching.
When you configure a maximum memory value, the Integration Service divides memory among transformation
caches based on the transformation type.
When you configure a maximum memory, you must specify the value as both a numeric value and a percentage.
When you configure a numeric value and a percent, the Integration Service compares the values and determines
which value is lower. The Integration Service uses the lesser of these values as the maximum memory limit.
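For example, suppose you set the maximum numeric value to 512MB and the maximum percentage to 5% on a machine with 8 GB of memory. These figures are illustrative:
5% of 8 GB = 409.6 MB
The Integration Service compares 512 MB with 409.6 MB and uses the lower value, 409.6 MB, as the maximum memory limit for the session caches.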
When you configure automatic memory settings, the Integration Service specifies a minimum memory
allocation for the index and data caches. By default, the Integration Service allocates 1 megabyte to the index
cache and 2 megabytes to the data cache for each transformation instance. If you configure a maximum memory
limit that is less than the minimum value for an index or data cache, the Integration Service overrides the value
based on the transformation metadata.



When you run a session on a grid and you configure Maximum Memory Allowed For Auto Memory Attributes,
the Integration Service divides the allocated memory among all the nodes in the grid. When you configure
Maximum Percentage of Total Memory Allowed For Auto Memory Attributes, the Integration Service allocates
the specified percentage of memory on each node in the grid.

Configuring Automatic Memory Settings for Session Caches


To use automatic memory settings for session caches, configure the caches for auto and configure the maximum
memory size.

To configure automatic memory settings for session caches:

1. Open the transformation in the Transformation Developer or the Mappings tab of the session properties.
2. In the transformation properties, select or enter auto for the following cache size settings:
♦ Index and data cache
♦ Sorter cache
♦ XML cache
3. Open the session in the Task Developer or Workflow Designer, and click the Config Object tab.
4. Enter a value for the Maximum Memory Allowed for Auto Memory Attributes.
If you enter 2000, the Integration Service interprets the number as 2,000 bytes. Append KB, MB, or GB to
the value to specify other units. For example, specify 512MB. This value specifies the maximum amount of
memory to use for session caches. If you set the value to zero, the Integration Service uses default values for
memory attributes that you set to auto.
5. Enter a value for the Maximum Percentage of Total Memory Allowed for Auto Memory Attributes.
This value specifies the maximum percentage of total memory the session caches may use. If the value is set
to zero, the Integration Service uses default values for memory attributes that you set to auto.

Configuring Performance Details


You can configure a session to collect performance details and store them in the PowerCenter repository. Collect
performance data for a session to view performance details while the session runs. Write performance data for a
session in the PowerCenter repository to store and view performance details for previous session runs.
If you want to write performance data to the repository you must perform the following tasks:
♦ Configure the session to collect performance data.
♦ Configure the session to write performance data to the repository.
♦ Configure the Integration Service to persist run-time statistics to the repository at the verbose level.
The Workflow Monitor displays performance details for each session that is configured to collect or write
performance details.

To configure performance details:

1. In the Workflow Manager, open the session properties and select the Properties tab.
2. Select Collect performance data to view performance details while the session runs.
3. Select Write Performance Data to Repository to store and view performance details for previous session
runs.
You must also configure the Integration Service to store the run-time information at the verbose level.
4. Click OK.



RELATED TOPICS:
♦ “Viewing Performance Details” on page 361
♦ “Performance Settings” on page 478

Using Pre- and Post-Session SQL Commands


You can specify pre- and post-session SQL in the Source Qualifier transformation and the target instance when
you create a mapping. When you create a Session task in the Workflow Manager you can override the SQL
commands on the Mapping tab. You might want to use these commands to drop indexes on the target before
the session runs, and then recreate them when the session completes.
The Integration Service runs pre-session SQL commands before it reads the source. It runs post-session SQL
commands after it writes to the target.
You can use parameters and variables in SQL executed against the source and target. Use any parameter or
variable type that you can define in the parameter file. You can enter a parameter or variable within the SQL
statement, or you can use a parameter or variable as the command. For example, you can use a session
parameter, $ParamMyPreSQL, as the source pre-session SQL command, and set $ParamMyPreSQL to the SQL
statement in the parameter file.

Guidelines for Entering Pre- and Post-Session SQL Commands


Use the following guidelines when creating the SQL statements:
♦ Use any command that is valid for the database type. However, the Integration Service does not allow nested
comments, even though the database might.
♦ Use a semicolon (;) to separate multiple statements. The Integration Service issues a commit after each
statement.
♦ The Integration Service ignores semicolons within /* ...*/.
♦ If you need to use a semicolon outside of comments, you can escape it with a backslash (\).
♦ The Workflow Manager does not validate the SQL.
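For example, the following pre-session SQL command, written for a hypothetical Oracle target, drops an index and deletes stale staging rows. The semicolon inside the comment is ignored, and the semicolon between the statements causes the Integration Service to issue a commit after each statement:
/* Drop the index; it is recreated in the post-session SQL. */
DROP INDEX IDX_T_ORDERS_CUST;
DELETE FROM T_ORDERS_STAGE WHERE LOAD_DATE < SYSDATE - 7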

Error Handling
You can configure error handling on the Config Object tab. You can choose to stop or continue the session if
the Integration Service encounters an error issuing the pre- or post- session SQL command.

Using Pre- and Post-Session Shell Commands


The Integration Service can perform shell commands at the beginning of the session or at the end of the session.
Shell commands are operating system commands. Use pre- or post-session shell commands, for example, to
delete a reject file or session log, or to archive target files before the session begins.
The Workflow Manager provides the following types of shell commands for each Session task:
♦ Pre-session command. The Integration Service performs pre-session shell commands at the beginning of a
session. You can configure a session to stop or continue if a pre-session shell command fails.
♦ Post-session success command. The Integration Service performs post-session success commands only if the
session completed successfully.
♦ Post-session failure command. The Integration Service performs post-session failure commands only if the
session failed to complete.
Use the following guidelines to call a shell command:
♦ Use any valid UNIX command or shell script for UNIX nodes, or any valid DOS or batch file for Windows
nodes.
♦ Configure the session to run the pre- or post-session shell commands.
The Workflow Manager provides a task called the Command task that lets you configure shell commands
anywhere in the workflow. You can choose a reusable Command task for the pre- or post-session shell
command. Or, you can create non-reusable shell commands for the pre- or post-session shell commands.
If you create a non-reusable pre- or post-session shell command, you can make it into a reusable Command
task.
The Workflow Manager lets you choose from the following options when you configure shell commands:
♦ Create non-reusable shell commands. Create a non-reusable set of shell commands for the session. Other
sessions in the folder cannot use this set of shell commands.
♦ Use an existing reusable Command task. Select an existing Command task to run as the pre- or post-session
shell command.
Configure pre- and post-session shell commands in the Components tab of the session properties.

RELATED TOPICS:
♦ “Working with the Command Task” on page 100

Using Parameters and Variables


You can use parameters and variables in pre- and post-session commands. Use any parameter or variable type
that you can define in the parameter file. You can enter a parameter or variable within the command, or you can
use a parameter or variable as the command. For example, you can include service process variable
$PMTargetFileDir in the command text in pre- and post-session commands. When you use a service process
variable instead of entering a specific directory, you can run the same workflow on different Integration Services
without changing session properties. You can also use a session parameter, $ParamMyCommand, as the pre- or
post-session shell command, and set $ParamMyCommand to the command in a parameter file.
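For example, a pre-session shell command on a UNIX node might use $PMTargetFileDir to remove the previous run's output files, or you can supply the entire command through a session parameter in the parameter file. The file pattern and parameter value shown here are hypothetical:
Pre-session command:  rm -f $PMTargetFileDir/orders_*.out
Parameter file entry: $ParamMyCommand=rm -f /data/archive/orders_*.out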

RELATED TOPICS:
♦ “Where to Use Parameters and Variables” on page 405

Configuring Non-Reusable Shell Commands


When you create non-reusable pre- or post-session shell commands, the commands are only visible in the
session properties. The Workflow Manager does not create Command tasks from these non-reusable
commands. You can convert a non-reusable shell command to a reusable Command task.

To create non-reusable pre- or post-session shell commands:

1. In the Components tab of the session properties, select Non-reusable for pre- or post-session shell
command.
2. Click the Edit button in the Value field to open the Edit Pre- or Post-Session Command dialog box.
3. Enter a name for the command in the General tab.
4. If you want the Integration Service to perform the next command only if the previous command completed
successfully, select Fail Task if Any Command Fails in the Properties tab.
5. In the Commands tab, click the Add button to add shell commands.
Enter one command for each line.
6. Click OK.



Creating a Reusable Command Task from Pre- or Post-Session Commands
If you create non-reusable pre- or post-session shell commands, you can make them into a reusable Command
task. After you make the pre- or post-session shell commands into a reusable Command task, you cannot revert
back.
To create a Command Task from non-reusable pre- or post-session shell commands, click the Edit button to
open the Edit dialog box for the shell commands. In the General tab, select the Make Reusable check box.
After you select the Make Reusable check box and click OK, a new Command task appears in the Tasks folder
in the Navigator window. Use this Command task in other workflows, just as you do with any other reusable
workflow tasks.

Configuring Reusable Shell Commands


Use the following procedure to call an existing reusable Command task as the pre- or post-session shell
command for the Session task.

To select an existing Command task as the pre-session shell command:

1. In the Components tab of the session properties, click Reusable for the pre- or post-session shell command.
2. Click the Edit button in the Value field to open the Task Browser dialog box.
3. Select the Command task you want to run as the pre- or post-session shell command.
4. Click the Override button in the Task Browser dialog box if you want to change the order of the
commands, or if you want to specify whether to run the next command when the previous command fails.
Changes you make to the Command task from the session properties only apply to the session. In the
session properties, you cannot edit the commands in the Command task.
5. Click OK to select the Command task for the pre- or post-session shell command.
The name of the Command task you select appears in the Value field for the shell command.

Pre-Session Shell Command Errors


You can configure the session to stop or continue if a pre-session shell command fails. If you select stop, the
Integration Service stops the session, but continues with the rest of the workflow. If you select Continue, the
Integration Service ignores the errors and continues the session. By default the Integration Service stops the
session upon shell command errors.
Configure the session to stop or continue if a pre-session shell command fails in the Error Handling settings on
the Config Object tab.

Validating a Session
The Workflow Manager validates a Session task when you save it. You can also manually validate Session tasks
and session instances. Validate reusable Session tasks in the Task Developer. Validate non-reusable sessions and
reusable session instances in the Workflow Designer.
The Workflow Manager marks a reusable session or session instance invalid if you perform one of the following
tasks:
♦ Edit the mapping in a way that might invalidate the session. You can edit the mapping used by a session at
any time. When you edit and save a mapping, the repository might invalidate sessions that already use the
mapping. The Integration Service does not run invalid sessions.
You must reconnect to the folder to see the effect of mapping changes on Session tasks.



When you edit a session based on an invalid mapping, the Workflow Manager displays a warning message:
The mapping [mapping_name] associated with the session [session_name] is invalid.

♦ Delete a database, FTP, or external loader connection used by the session.


♦ Leave session attributes blank. For example, the session is invalid if you do not specify the source file name.
♦ Change the code page of a session database connection to an incompatible code page.
If you delete objects associated with a Session task such as session configuration object, Email, or Command
task, the Workflow Manager marks a reusable session invalid. However, the Workflow Manager does not mark a
non-reusable session invalid if you delete an object associated with the session.
If you delete a shortcut to a source or target from the mapping, the Workflow Manager does not mark the
session invalid.
The Workflow Manager does not validate SQL overrides or filter conditions entered in the session properties
when you validate a session. You must validate SQL override and filter conditions in the SQL Editor.
If a reusable session task is invalid, the Workflow Manager displays an invalid icon over the session task in the
Navigator and in the Task Developer workspace. This does not affect the validity of the session instance and the
workflows using the session instance.
If a reusable or non-reusable session instance is invalid, the Workflow Manager marks it invalid in the Navigator
and in the Workflow Designer workspace. Workflows using the session instance remain valid.
To validate a session, select the session in the workspace and click Tasks > Validate. Or, right-click the session
instance in the workspace and choose Validate.

Validating Multiple Sessions


You can validate multiple sessions without fetching them into the workspace. You must select and validate the
sessions from a query results view or a view dependencies list. You can save and optionally check in sessions that
change from invalid to valid status.
Note: If you use the Repository Manager, you can select and validate multiple sessions from the Navigator.

To validate multiple sessions:

1. Select sessions from either a query list or a view dependencies list.


2. Right-click one of the selected sessions and choose Validate.
The Validate Objects dialog box appears.
3. Choose whether to save objects and check in objects that you validate.

Stopping and Aborting a Session


You can stop or abort a session just as you can stop or abort any task. You can also abort a session by using the
ABORT() function in the mapping logic. Session errors can cause the Integration Service to stop a session early.
You can control the stopping point by setting an error threshold in a session, using the ABORT function in
mappings, or requesting the Integration Service to stop the session. You cannot control the stopping point when
the Integration Service encounters fatal errors, such as loss of connection to the target database.
If a session fails as a result of error, you can recover the workflow to recover the session.

RELATED TOPICS:
♦ “Recovery Options” on page 225

♦ “Row Error Logging” on page 393



Threshold Errors
You can choose to stop a session on a designated number of non-fatal errors. A non-fatal error is an error that
does not force the session to stop on its first occurrence. Establish the error threshold in the session properties
with the Stop on Errors option. When you enable this option, the Integration Service counts non-fatal errors
that occur in the reader, writer, and transformation threads.
The Integration Service maintains an independent error count when reading sources, transforming data, and
writing to targets. The Integration Service counts the following non-fatal errors when you set the Stop on Errors
option in the session properties:
♦ Reader errors. Errors encountered by the Integration Service while reading the source database or source
files. Reader threshold errors can include alignment errors while running a session in Unicode mode.
♦ Writer errors. Errors encountered by the Integration Service while writing to the target database or target
files. Writer threshold errors can include key constraint violations, loading nulls into a not null field, and
database trigger responses.
♦ Transformation errors. Errors encountered by the Integration Service while transforming data.
Transformation threshold errors can include conversion errors, and any condition set up as an ERROR, such
as null input.
When you create multiple partitions in a pipeline, the Integration Service maintains a separate error threshold
for each partition. When the Integration Service reaches the error threshold for any partition, it stops the
session. The writer may continue writing data from one or more partitions, but it does not affect the ability to
perform a successful recovery.
Note: If alignment errors occur in a non line-sequential VSAM file, the Integration Service sets the error
threshold to 1 and stops the session.

Fatal Error
A fatal error occurs when the Integration Service cannot access the source, target, or repository. This can
include loss of connection or target database errors, such as lack of database space to load data. If the session
uses a Normalizer or Sequence Generator transformation, the Integration Service cannot update the sequence
values in the repository, and a fatal error occurs.
If the session does not use a Normalizer or Sequence Generator transformation, and the Integration Service
loses connection to the repository, the Integration Service does not stop the session. The session completes, but
the Integration Service cannot log session statistics into the repository.

ABORT Function
Use the ABORT function in the mapping logic to abort a session when the Integration Service encounters a
designated transformation error.

User Command
You can stop or abort the session from the Workflow Manager. You can also stop the session using pmcmd.

Integration Service Handling for Session Failure


The Integration Service handles session errors in different ways, depending on the error or event that causes the
session to fail.



Table 7-2 describes the Integration Service behavior when a session fails:

Table 7-2. Integration Service Behavior for Failed Sessions

Cause for session errors: Error threshold met due to reader errors, or Stop command issued from the Workflow Manager or pmcmd.
Integration Service behavior:
- Stops reading.
- Continues processing data.
- Continues writing and committing data to targets.
If the Integration Service cannot finish processing and committing data, you need to issue the Abort command to stop the session.

Cause for session errors: Abort command issued from the Workflow Manager.
Integration Service behavior:
- Stops reading.
- Continues processing data.
- Continues writing and committing data to targets.
If the Integration Service cannot finish processing and committing data within 60 seconds, it kills the DTM process and terminates the session.

Cause for session errors: Fatal error from database, or error threshold met due to writer errors.
Integration Service behavior:
- Stops reading and writing.
- Rolls back all data not committed to the target database.
If the session stops due to fatal error, the commit or rollback may or may not be successful.

Cause for session errors: Error threshold met due to transformation errors, ABORT( ), or invalid evaluation of transaction control expression.
Integration Service behavior:
- Stops reading.
- Flags the row as an abort row and continues processing data.
- Continues to write to the target database until it hits the abort row.
- Issues commits based on commit intervals.
- Rolls back all data not committed to the target database.

Working with Session Parameters


Session parameters represent values that can change between session runs, such as database connections or
source and target files.
Session parameters are either user-defined or built-in. Use user-defined session parameters in session or
workflow properties and define the values in a parameter file. When you run a session, the Integration Service
matches parameters in the parameter file with the parameters in the session. It uses the value in the parameter
file for the session property value. In the parameter file, folder and session names are case sensitive.
For example, you can write session logs to a log file. In the session properties, use $PMSessionLogFile as the
session log file name, and set $PMSessionLogFile to TestRun.txt in the parameter file. When you run the
session, the Integration Service creates a session log named TestRun.txt.
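A minimal parameter file entry for this example might look like the following, assuming a folder named MyFolder, a workflow named wf_LoadOrders, and a session named s_LoadOrders. Substitute your own folder, workflow, and session names in the section heading:
[MyFolder.WF:wf_LoadOrders.ST:s_LoadOrders]
$PMSessionLogFile=TestRun.txt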
User-defined session parameters do not have default values, so you must define them in a parameter file. If the
Integration Service cannot find a value for a user-defined session parameter, it fails the session, takes an empty
string as the default value, or fails to expand the parameter at run time.
You can run a session with different parameter files when you use pmcmd to start a session. The parameter file
you set with pmcmd overrides the parameter file in the session or workflow properties.
Use built-in session parameters to get run-time information such as folder name, service names, or session run
statistics. You can use built-in session parameters in post-session shell commands, SQL commands, and email
messages. You can also use them in input fields in the Designer and Workflow Manager that accept session
parameters. The Integration Service sets the values of built-in session parameters. You cannot define built-in
session parameter values in the parameter file. The Integration Service expands these parameters when the
session runs.



Table 7-3 describes the user-defined session parameters:

Table 7-3. User-Defined Session Parameters

Parameter Type - Naming Convention: Description
Session Log File - $PMSessionLogFile: Defines the name of the session log between session runs.
Number of Partitions - $DynamicPartitionCount: Defines the number of partitions for a session.
Source File* - $InputFileName: Defines a source file name.
Lookup File* - $LookupFileName: Defines a lookup file name.
Target File* - $OutputFileName: Defines a target file name.
Reject File* - $BadFileName: Defines a reject file name.
Database Connection* - $DBConnectionName: Defines a relational database connection for a source, target, lookup, or stored procedure.
External Loader Connection* - $LoaderConnectionName: Defines external loader connections.
FTP Connection* - $FTPConnectionName: Defines FTP connections.
Queue Connection* - $QueueConnectionName: Defines database connections for message queues.
Source or Target Application Connection* - $AppConnectionName: Defines connections to source and target applications.
General Session Parameter* - $ParamName: Defines any other session property. For example, you can use this parameter to define a table owner name, table name prefix, FTP file or directory name, lookup cache file name prefix, or email address. You can use this parameter to define source, lookup, target, and reject file names, but not the session log file name or database connections.
* Define the parameter name using the appropriate prefix, for example, $InputFileMySource.

Table 7-4 describes the built-in session parameters:

Table 7-4. Built-in Session Parameters

Parameter Type - Naming Convention: Description
Folder name - $PMFolderName: Returns the folder name.
Integration Service name - $PMIntegrationServiceName: Returns the Integration Service name.
Mapping name - $PMMappingName: Returns the mapping name.
Repository Service name - $PMRepositoryServiceName: Returns the Repository Service name.
Repository user name - $PMRepositoryUserName: Returns the repository user name.
Session name - $PMSessionName: Returns the session name.
Session run mode - $PMSessionRunMode: Returns the session run mode (normal or recovery).
Source number of affected rows* - $PMSourceQualifierName@numAffectedRows: Returns the number of rows the Integration Service successfully read from the named Source Qualifier.
Source number of applied rows* - $PMSourceQualifierName@numAppliedRows: Returns the number of rows the Integration Service successfully read from the named Source Qualifier.
Source number of rejected rows* - $PMSourceQualifierName@numRejectedRows: Returns the number of rows the Integration Service dropped when reading from the named Source Qualifier.
Source table name* - $PMSourceName@TableName: Returns the table name for the named source instance.
Target number of affected rows* - $PMTargetName@numAffectedRows: Returns the number of rows affected by the specified operation for the named target instance.
Target number of applied rows* - $PMTargetName@numAppliedRows: Returns the number of rows the Integration Service successfully applied to the named target instance.
Target number of rejected rows* - $PMTargetName@numRejectedRows: Returns the number of rows the Integration Service rejected when writing to the named target instance.
Target table name* - $PMTargetName@TableName: Returns the table name for the named target instance.
Workflow name - $PMWorkflowName: Returns the workflow name.
Workflow run ID - $PMWorkflowRunId: Returns the workflow run ID.
Workflow run instance name - $PMWorkflowRunInstanceName: Returns the workflow run instance name.
* Define the parameter name using the appropriate prefix and suffix. For example, for a source instance named “Customers,” the parameter for source table name is $PMCustomers@TableName. If the Source Qualifier is named “SQ_Customers,” the parameter for source number of affected rows is $PMSQ_Customers@numAffectedRows.

Changing the Session Log Name


You can configure a session to write log events to a file. In the session properties, the Session Log File Directory
defaults to the service process variable, $PMSessionLogDir. The Session Log File Name defaults to
$PMSessionLogFile.
For example, in a parameter file, you set $PMSessionLogFile to TestRun.txt. In the Administration Console, you
defined $PMSessionLogDir as \\server\infa_shared\SessLogs. When the Integration Service
runs the session, it creates a session log file named TestRun.txt in the \\server\infa_shared\SessLogs
directory.

Changing the Target File and Directory


Use a target file parameter in the session properties to change the target file and directory for a session. You can
enter a path that includes the directory and file name in the Output Filename field. If you include the directory
in the Output Filename field, you must clear the Output File Directory field. The Integration Service concatenates
the Output File Directory and the Output Filename to determine the target file location.
For example, a session uses a file parameter to read internal and external weblogs. You want to write the results
of the internal weblog session to one location and the external weblog session to another location.
In the session properties, you name the target file $OutputFileName and clear the Output File Directory field.
In the parameter file, set $OutputFileName to E:/internal_weblogs/November_int.txt to create a target file
for the internal weblog session. After the session completes, you change $OutputFileName to
F:/external_weblogs/November_ex.txt for the external weblog session.

You can create a different parameter file for each target and use pmcmd to start a session with a specific
parameter file. This parameter file overrides the parameter file name in the session properties.
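For example, you might start the workflow with the external weblog parameter file by using the pmcmd startworkflow command. The domain, service, user, folder, file, and workflow names shown here are hypothetical:
pmcmd startworkflow -sv IS_Dev -d Domain_Dev -u Administrator -p password -f WebLogs -paramfile F:/external_weblogs/external.txt wf_weblogs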



Changing Source Parameters in a File
You can define multiple parameters for a session property in a parameter file and use one of the parameters in a
session. You can change the parameter name in the session properties and run the session again with a different
parameter value.
For example, you create a session parameter named $InputFile_Products in a parameter file. You set the
parameter value to “products.txt.” In the same parameter file, you create another parameter called
$InputFile_Items. You set the parameter value to “items.txt.”
When you set the source file name to $InputFile_Products in the session properties, the Integration Service
reads products.txt. When you change the source file name to $InputFile_Items, the Integration Service reads
items.txt.
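In the parameter file, the two parameters from this example are defined as follows:
$InputFile_Products=products.txt
$InputFile_Items=items.txt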

Changing Connection Parameters


Use connection parameters to rerun sessions with different sources, targets, lookup tables, or stored procedures.
You create a connection parameter in the session properties of any session. You can reference any connection in
a parameter. Name all connection session parameters with the appropriate prefix, followed by any alphanumeric
and underscore characters.
For example, you run a session that reads from two relational sources. You access one source with a database
connection named “Marketing” and the other with a connection named “Sales.” In the session properties, you
create a source database connection parameter named $DBConnection_Source. In the parameter file, you
define $DBConnection_Source as Marketing and run the session. Set $DBConnection_Source to Sales in the
parameter file for the next session run.
If you use a connection parameter to override a connection for a source or target, you can override the
connection attributes in the parameter file. You can override connection attributes when you use a
non-relational connection parameter for a source or target instance. When you define the connection in the
parameter file, the Integration Service searches for specific, user-defined session parameters that define the
connection attributes. For example, you create an FTP connection parameter called
$FTPConnectionMyFTPConn and define it in the parameter file. The Integration Service searches the
parameter file for the following parameters:
♦ $Param_FTPConnectionMyFTPConn_Remote_Filename
♦ $Param_FTPConnectionMyFTPConn_Is_Staged
♦ $Param_FTPConnectionMyFTPConn_Is_Transfer_Mode_ASCII
If you do not define a value for any of these parameters, the Integration Service uses the value defined in the
connection object.
The connection attributes you can override are listed in the following template file:
<PowerCenter Installation Directory>/server/bin/ConnectionParam.prm
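For example, a parameter file fragment for the connection parameters described above might look like the following. The connection object name FTP_Orders and the remote file name are hypothetical:
$DBConnection_Source=Sales
$FTPConnectionMyFTPConn=FTP_Orders
$Param_FTPConnectionMyFTPConn_Remote_Filename=orders.dat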

RELATED TOPICS:
♦ “Overriding Connection Attributes in the Parameter File” on page 26

Getting Run-Time Information


Use built-in session parameters to get run-time information such as folder name, Integration Service name, and
source and target table name. You can use built-in session parameters in post-session shell commands, SQL
commands, and email messages. You can also use them in input fields in the Designer and Workflow Manager
that accept session parameters.
For example, you want to send a post-session email after session “s_UpdateCustInfo” completes that includes
session run statistics for Source Qualifier “SQ_Customers” and target “T_CustInfo.” Enter the following text in
the body of the email message:
Statistics for session $PMSessionName
Integration service: $PMIntegrationServiceName
Source number of affected rows: $PMSQ_Customers@numAffectedRows
Source number of dropped rows: $PMSQ_Customers@numRejectedRows
Target number of affected rows: $PMT_CustInfo@numAffectedRows
Target number of applied rows: $PMT_CustInfo@numAppliedRows
Target number of rejected rows: $PMT_CustInfo@numRejectedRows

You can also use email variables to get the session name, Integration Service name, number of rows loaded, and
number of rows rejected.

RELATED TOPICS:
♦ “Working with Post-Session Email” on page 242

Rules and Guidelines


Session file parameters and database connection parameters provide the flexibility to run sessions against
different files and databases.
Use the following rules and guidelines when you create file parameters:
♦ When you define the parameter file as a resource for a node, verify the Integration Service runs the session
on a node that can access the parameter file. Define the resource for the node, configure the Integration
Service to check resources, and edit the session to require the resource.
♦ When you create a file parameter, use alphanumeric and underscore characters. For example, to name a
source file parameter, use $InputFileName, such as $InputFile_Data.
♦ All session file parameters of a particular type must have distinct names. For example, if you create two
source file parameters, you might name them $SourceFileAccts and $SourceFilePrices.
♦ When you define the parameter in the file, you can reference any directory local to the Integration Service.
♦ Use a parameter to define the location of a file. Clear the entry in the session properties that define the file
location. Enter the full path of the file in the parameter file.
♦ You can change the parameter value in the parameter file between session runs, or you can create multiple
parameter files. If you use multiple parameter files, use the pmcmd Startworkflow command with the
-paramfile or -localparamfile options to specify which parameter file to use.
Use the following rules and guidelines when you create database connection parameters:
♦ You can change connections for relational sources, targets, lookups, and stored procedures.
♦ When you define the parameter, you can reference any database connection in the repository.
♦ Use the same $DBConnection parameter for more than one connection in a session.
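A parameter file fragment that follows these guidelines might look like the following. The directory paths and parameter names are illustrative:
$SourceFileAccts=/data/incoming/accts_20080601.dat
$SourceFilePrices=/data/incoming/prices_20080601.dat
$DBConnection_Target=Marketing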

Mapping Parameters and Variables in Sessions


Use mapping parameters in the session properties to alter certain mapping attributes. For example, use a
mapping parameter in a transformation override to override a filter or user-defined join in a Source Qualifier
transformation.
If you use mapping variables in a session, you can clear any of the variable values saved in the repository by
editing the session. When you clear the variable values, the Integration Service uses the values in the parameter
file the next time you run a session. If the session does not use a parameter file, the Integration Service uses the
values assigned in the pre-session variable assignment. If there are no assigned values, the Integration Service
uses the initial values defined in the mapping.

To view or delete values for mapping variables saved in the repository:

1. In the Navigator window of the Workflow Manager, right-click the Session task and select View Persistent
Values.
You can see the variable name and value.



2. Click Delete Values to delete existing variable values.
3. Click OK.

Assigning Parameter and Variable Values in a Session


You can update the values of certain parameters and variables before or after a non-reusable session runs. This
allows you to pass information from one session to another within the same workflow or worklet. For example,
you have a workflow that contains two sessions that need to increment the same counter. You can increment the
counter in the first session, pass the updated counter value to the second session, and increment the counter
again in the second session. Or, you have a worklet that contains sessions that access the same web site. You can
configure the first session to get a session ID from the web site and then pass the session ID value to subsequent
sessions.
You can also pass information from a session to a worklet or from a worklet to a session as long as the session
and worklet are in the same workflow or parent worklet. For more information about passing information
between worklets, see “Assigning Variable Values in a Worklet” on page 114.
Note: You cannot assign parameters and variables in reusable sessions.

The types of parameters and variables you can update depend on whether you assign them before or after a
session runs. You can update the following types of parameters and variables before or after a session runs:
♦ Pre-session variable assignment. You can update mapping parameters, mapping variables, and session
parameters before a session runs. You can assign these parameters and variables the values of workflow or
worklet variables in the parent workflow or worklet. Therefore, if a session is in a worklet within a workflow,
you can assign values from the worklet variables, but not the workflow variables.
You cannot update mapplet variables in the pre-session variable assignment.
♦ Post-session on success variable assignment. You can update workflow or worklet variables in the parent
workflow or worklet after the session completes successfully. You can assign these variables the values of
mapping parameters and variables.
♦ Post-session on failure variable assignment. You can update workflow or worklet variables in the parent
workflow or worklet when the session fails. You can assign these variables the values of mapping parameters
and variables.
You assign parameters and variables on the Components tab of the session properties.

Passing Parameter and Variable Values between Sessions


You can assign parameter and variable values in a session to pass values from one session to any subsequent
session in the same workflow or worklet. For example, a workflow contains two sessions s_NewCustomers and
s_MergeCustomers. Session s_MergeCustomers needs to use the value of a mapping variable updated in
s_NewCustomers.
The following figure shows the workflow:

To pass the mapping variable value from s_NewCustomers to s_MergeCustomers, complete the following
steps:
1. Configure the mapping associated with session s_NewCustomers to use a mapping variable, for example,
$$Count1.
2. Configure the mapping associated with session s_MergeCustomers to use a mapping variable, for example,
$$Count2.

Assigning Parameter and Variable Values in a Session 137


3. Configure the workflow to use a user-defined workflow variable, for example, $$PassCountValue.
4. Configure session s_NewCustomers to assign the value of mapping variable $$Count1 to workflow variable
$$PassCountValue after the session completes successfully.
5. Configure session s_MergeCustomers to assign the value of workflow variable $$PassCountValue to
mapping variable $$Count2 before the session starts.
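Using the assignment syntax described in “Configuring Parameter and Variable Assignments” below, the two assignment statements for this example read as follows. The value on the right side of each statement is assigned to the parameter or variable on the left side:
Post-session on success variable assignment for s_NewCustomers:  $$PassCountValue=$$Count1
Pre-session variable assignment for s_MergeCustomers:            $$Count2=$$PassCountValue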

RELATED TOPICS:
♦ “User-Defined Workflow Variables” on page 90

Configuring Parameter and Variable Assignments


Assign parameters and variables on the Components tab of the session properties. Assign values to the following
types of parameters and variables before or after a session runs:
♦ Pre-session variable assignment. Assign values to mapping parameters, mapping variables, and session
parameters. You update these parameters and variables with the values of parent workflow or worklet
variables.
♦ Post-session variable assignment. Assign values to workflow and worklet variables. You update these
variables with the values of mapping parameters or variables.
Mapping parameters and variables have different datatypes than workflow and worklet variables. When you
assign parameter and variable values in a session, the Integration Service converts the datatypes as shown in
Table 7-5 and Table 7-6.
Table 7-5 describes datatype conversion when you assign workflow or worklet variable values to mapping
parameters or mapping variables:

Table 7-5. Datatype Conversion for Pre-Session Variable Assignment

Workflow/Worklet Variable Datatype Mapping Parameter/Variable Datatype

Date/Time Date/Time

Double Double

Integer Integer

NString NString

Table 7-6 describes datatype conversion when you assign mapping parameter or variable values to workflow or
worklet variables:

Table 7-6. Datatype Conversion for Post-Session Variable Assignment

Mapping Parameter/Variable Datatype Workflow/Worklet Variable Datatype

Date/Time Date/Time

Decimal NString*

Double Double

Integer Integer

NString NString

NText NString

Real Double

Small integer Integer

String NString

Text NString
* There is no workflow or worklet variable datatype for the Decimal datatype. The Integration Service
converts Decimal values in mapping parameters and variables to NString values in workflow and
worklet variables.

To assign parameters and variables in a session:

1. Edit the session for which you want to assign parameters and variables.
2. Click the Components tab.
3. Select the variable assignment type:
♦ Pre-session variable assignment. Assign values to mapping parameters, mapping variables, and session
parameters before a session runs.
♦ Post-session on success variable assignment. Assign values to parent workflow or worklet variables after
a session completes successfully.
♦ Post-session on failure variable assignment. Assign values to parent workflow or worklet variables after
a session fails.
4. Click the Open button in the variable assignment field. The Pre- or Post-Session Variable Assignment
dialog box appears.
5. In the Pre- or Post-Session Variable Assignment dialog box, click the add button to add a variable
assignment row.
6. Click the open button in the Mapping Variables/Parameters and Parent Workflow/Worklet Variables fields
to select the parameters and variables whose values you want to read or assign. To assign values to a session
parameter, enter the parameter name in the Mapping Variables/Parameters field. The Workflow Manager
does not validate parameter names.
The Workflow Manager assigns values from the right side of the assignment statement to parameters and
variables on the left side of the statement. So, if the variable assignment statement is
“$$Counter_MapVar=$$Counter_WFVar,” the Workflow Manager assigns the value of
$$Counter_WFVar to $$Counter_MapVar.
7. Repeat steps 5 to 6 to add more variable assignment statements. To delete a variable assignment statement,
click one of the fields in the assignment statement, and click the delete button.
8. Click OK.

Handling High Precision Data


High precision data handling determines how large numbers are represented with greater accuracy. The precision
attributed to a number includes the scale of the number. For example, the value 11.47 has a precision of 4 and
a scale of 2. Large numbers can lose accuracy because of rounding when they are used in a calculation that
produces an overflow. Incorrect results may also arise from a failure to truncate the high precision data.
High precision data values have greater accuracy, but enabling high precision processing can increase workflow
run time. Disable high precision if performance is more important.
You enable high precision on the Properties tab of the session. The Integration Service processes high precision
data differently for bigint and decimal values.



Bigint
In calculations that can produce decimal values, the Integration Service processes bigint values as doubles or
decimals. When a session contains a calculation that can produce decimal values and runs without high
precision, the Integration Service converts bigint values to doubles before it performs the calculation. The
transformation Double datatype supports precision of up to 15 digits, while the Bigint datatype supports
precision of up to 19 digits. Therefore, precision loss can occur in calculations that produce bigint values with
precision of more than 15 digits.
For example, an expression transformation contains the following calculation:
POWER( BIGINTVAL, EXPVAL )

Before it performs the calculation, the Integration Service converts the inputs to the POWER function to
double values. If the BIGINTVAL port contains the bigint value 9223372036854775807, the Integration
Service converts this value to 9.22337203685478e+18, losing the last four digits of precision. If the EXPVAL
port contains the value 1.0 and the result port is a bigint, this calculation produces a row error since the result,
9223372036854780000, exceeds the maximum bigint value.
When you use a bigint value in a calculation that can produce decimal values and you run the session with high
precision, the Integration Service converts the bigint values to decimals. The transformation Decimal datatype
supports precision of up to 28 digits. Therefore, precision loss does not occur in a calculation unless the result
produces a value with precision greater than 28 digits. In this case, the Integration Service stores the result as a
double.

Decimal
The Integration Service processes decimal values as doubles or decimals. When you enable high precision, you
choose to enable the Decimal datatype or let the Integration Service process the data as a Double (precision of
15).
To enable high precision data handling:
♦ Use the Decimal datatype with a precision of 16 to 28 in the mapping.
♦ Select Enable High Precision in the session properties.
For example, you have a mapping with Decimal (20,0) that passes the number 40012030304957666903. If you
enable high precision, the Integration Service passes the number as is. If you do not enable high precision, the
Integration Service passes 4.00120303049577 x 10^19.
If you process a Decimal value with a precision greater than 28 digits, the Integration Service treats it as a
Double value. For example, if you process the number 2345678904598383902092.1927658, which has a
precision of 29 digits, the Integration Service treats this number as a Double value of 2.34567890459838 x 10^21.



CHAPTER 8

Session Configuration Object


This chapter includes the following topics:
♦ Overview, 141
♦ Advanced Settings, 142
♦ Log Options Settings, 143
♦ Error Handling Settings, 144
♦ Partitioning Options Settings, 145
♦ Session on Grid Settings, 145
♦ Working with a Session Configuration Object, 146

Overview
Each folder in the repository has a default session configuration object that contains session properties such as
commit and load settings, log options, and error handling settings. You can create multiple configuration
objects if you want to apply different configuration settings to multiple sessions.
When you create a session, the Workflow Manager applies the default configuration object settings to the
Config Object tab on the session. You can also choose a configuration object to use for the session.
When you edit a session configuration object, each session that uses the session configuration object inherits the
changes. When you override the configuration object settings in the Session task, the session configuration
object does not inherit changes.

Configuration Object and Config Object Tab Settings


You can configure the following settings in a session configuration object or on the Config Object tab in session
properties:
♦ Advanced. Advanced settings allow you to configure constraint-based loading, lookup caches, and buffer
sizes.
♦ Log options. Log options allow you to configure how you want to save the session log. By default, the Log
Manager saves only the current session log.
♦ Error handling. Error Handling settings allow you to determine if the session fails or continues when it
encounters pre-session command errors, stored procedure errors, or a specified number of session errors.
♦ Partitioning options. Partitioning options allow the Integration Service to determine the number of
partitions to create at run time.
♦ Session on grid. When Session on Grid is enabled, the Integration Service distributes session threads to the
nodes in a grid to increase performance and scalability.

Advanced Settings
Advanced settings allow you to configure constraint-based loading, lookup caches, and buffer sizes.
Table 8-1 describes the Advanced settings of the Config Object tab:

Table 8-1. Config Object Tab - Advanced Settings

Advanced Settings Description

Constraint Based Load Ordering Integration Service loads targets based on primary key-foreign key
constraints where possible.

Cache Lookup() Function If selected, the Integration Service caches PowerMart 3.5 LOOKUP
functions in the mapping, overriding mapping-level LOOKUP
configurations.
If not selected, the Integration Service performs lookups on a row-by-row
basis, unless otherwise specified in the mapping.

Default Buffer Block Size Size of buffer blocks used to move data and index caches from sources to
targets. By default, the Integration Service determines this value at run
time.
You can specify auto or a numeric value. If you enter 2000, the Integration
Service interprets the number as 2000 bytes. Append KB, MB, or GB to
the value to specify other units. For example, you can specify 512MB.
Note: The session must have enough buffer blocks to initialize. The
minimum number of buffer blocks must be greater than the total number of
sources (Source Qualifiers, Normalizers for COBOL sources) and targets.
The number of buffer blocks in a session = DTM Buffer Size / Buffer Block
Size. Default settings create enough buffer blocks for 83 sources and
targets. If the session contains more than 83, you might need to increase
DTM Buffer Size or decrease Default Buffer Block Size.

Line Sequential Buffer Length Affects the way the Integration Service reads flat files. Increase this setting
from the default of 1024 bytes per line only if source flat file records are
larger than 1024 bytes.

Maximum Memory Allowed for Auto Memory Attributes. Maximum memory allocated for automatic cache when
you configure the Integration Service to determine session cache size at run time.
You enable automatic memory settings by configuring a value for this attribute. If you enter 2000, the
Integration Service interprets the number as 2000 bytes. Append KB, MB, or GB to the value to specify other
units. For example, you can specify 512MB.
If the value is set to zero, the Integration Service uses default values for memory attributes that you set to auto.

Maximum Percentage of Total Memory Allowed for Auto Memory Attributes. Maximum percentage of memory
allocated for automatic cache when you configure the Integration Service to determine session cache size at run
time. If the value is set to zero, the Integration Service uses default values for memory attributes that you set to
auto.

Additional Concurrent Pipelines for Lookup Cache Creation. Enables the Integration Service to create lookup
caches concurrently by creating additional pipelines.
Specify a numeric value or select Auto to enable the Integration Service to determine this value at run time.
By default, the Integration Service creates caches concurrently and determines the number of additional
pipelines to create at run time. If you configure a numeric value, you can configure an additional concurrent
pipeline for each Lookup transformation in the pipeline.
You can also configure the Integration Service to create session caches sequentially. It builds a lookup cache in
memory when it processes the first row of data in a cached Lookup transformation. Set the value to 0 to
configure the Integration Service to process sessions sequentially.

Custom Properties Configure custom properties of the Integration Service for the session.
You can override custom properties that the Integration Service uses after
the DTM process has started. The Integration Service also writes the
override value of the property to the session log.

DateTime Format String Date time format defined in the session configuration object. Default
format specifies microseconds: MM/DD/YYYY HH24:MI:SS.US.
You can specify seconds, milliseconds, or nanoseconds.
MM/DD/YYYY HH24:MI:SS, specifies seconds.
MM/DD/YYYY HH24:MI:SS.MS, specifies milliseconds.
MM/DD/YYYY HH24:MI:SS.US, specifies microseconds.
MM/DD/YYYY HH24:MI:SS.NS, specifies nanoseconds.

Pre 85 Timestamp Compatibility Trims subseconds to maintain compatibility with versions prior to 8.5. The
Integration Service converts the Oracle Timestamp datatype to the Oracle
Date datatype. The Integration Service trims subsecond data for the
following sources, targets, and transformations:
- Relational sources and targets
- XML sources and targets
- SQL transformation
- XML Generator transformation
- XML Parser transformation
Default is disabled.

RELATED TOPICS:
♦ “Configuring Automatic Memory Settings” on page 124

Log Options Settings


Log options allow you to configure how you want to save the session log. By default, the Log Manager saves
only the current session log.
Table 8-2 shows the Log Options settings of the Config Object tab:

Table 8-2. Config Object Tab - Log Options Settings

Log Options Settings Description

Save Session Log By Configure this option when you choose to save session log files.
If you select Save Session Log by Timestamp, the Log Manager saves all session
logs, appending a timestamp to each log.
If you select Save Session Log by Runs, the Log Manager saves a designated
number of session logs. Configure the number of sessions in the Save Session
Log for These Runs option.
You can also use the $PMSessionLogCount service variable to save the
configured number of session logs for the Integration Service.

Save Session Log for These Runs. Number of historical session logs you want the Log Manager to save.
The Log Manager saves the number of historical logs you specify, plus the most
recent session log. When you configure 5 runs, the Log Manager saves the most
recent session log, plus historical logs 0-4, for a total of 6 logs.
You can configure up to 2,147,483,647 historical logs. If you configure 0 logs, the
Log Manager saves only the most recent session log.

RELATED TOPICS:
♦ “Session Logs” on page 383



Error Handling Settings
Error Handling settings allow you to determine if the session fails or continues when it encounters pre-session
command errors, stored procedure errors, or a specified number of session errors.
Table 8-3 describes the Error handling settings of the Config Object tab:

Table 8-3. Config Object Tab - Error Handling Settings

Error Handling Settings Description

Stop On Errors Indicates how many non-fatal errors the Integration Service can encounter before it
stops the session. Non-fatal errors include reader, writer, and DTM errors. Enter the
number of non-fatal errors you want to allow before stopping the session. The
Integration Service maintains an independent error count for each source, target, and
transformation. If you specify 0, non-fatal errors do not cause the session to stop.
Optionally use the $PMSessionErrorThreshold service variable to stop on the
configured number of errors for the Integration Service.

Override Tracing Overrides tracing levels set on a transformation level. Selecting this option enables a
menu from which you choose a tracing level: None, Terse, Normal, Verbose
Initialization, or Verbose Data.

On Stored Procedure Error. Required if the session uses pre- or post-session stored procedures.
If you select Stop Session, the Integration Service stops the session on errors executing a pre-session or
post-session stored procedure.
If you select Continue Session, the Integration Service continues the session regardless of errors executing
pre-session or post-session stored procedures.
By default, the Integration Service stops the session on a stored procedure error and marks the session failed.

On Pre-Session Command Task Error. Required if the session has pre-session shell commands.
If you select Stop Session, the Integration Service stops the session on errors executing pre-session shell
commands.
If you select Continue Session, the Integration Service continues the session regardless of errors executing
pre-session shell commands.
By default, the Integration Service stops the session upon error.

On Pre-Post SQL Error. Required if the session uses pre- or post-session SQL.
If you select Stop Session, the Integration Service stops the session on errors executing pre-session or
post-session SQL.
If you select Continue, the Integration Service continues the session regardless of errors executing pre-session
or post-session SQL.
By default, the Integration Service stops the session upon pre- or post-session SQL error and marks the session
failed.

Error Log Type Specifies the type of error log to create. You can specify relational, file, or no log. By
default, the Error Log Type is set to none.

Error Log DB Connection. Specifies the database connection for a relational error log.

Error Log Table Name Prefix. Specifies the table name prefix for a relational error log. Oracle and Sybase have a
30-character limit for table names. If a table name exceeds 30 characters, the session fails.

Error Log File Directory. Specifies the directory where errors are logged. By default, the error log file directory
is $PMBadFilesDir\.

Error Log File Name Specifies error log file name. By default, the error log file name is PMError.log.

Log Row Data Specifies whether or not to log transformation row data. When you enable error
logging, the Integration Service logs transformation row data by default. If you disable
this property, n/a or -1 appears in transformation row data fields.

Log Source Row Data Specifies whether or not to log source row data. By default, the check box is clear and
source row data is not logged.

Data Column Delimiter. Delimiter for string type source row data and transformation group row data. By
default, the Integration Service uses a pipe ( | ) delimiter. Verify that you do not use the same delimiter for the
row data as the error logging columns. If you use the same delimiter, you may find it difficult to read the error
log file.

RELATED TOPICS:
♦ “Session Logs” on page 383

Partitioning Options Settings


When you configure dynamic partitioning, the Integration Service determines the number of partitions to
create at run time. Configure dynamic partitioning on the Config Object tab of session properties.
Table 8-4 describes the Partitioning Options settings on the Config Objects tab:

Table 8-4. Config Objects Tab - Partitioning Options

Partitioning Options Settings Description

Dynamic Partitioning. Configure dynamic partitioning using one of the following methods:
- Disabled. Do not use dynamic partitioning. Define the number of partitions on the Mapping tab.
- Based on number of partitions. Sets the partitions to a number that you define in the Number of Partitions
attribute. Use the $DynamicPartitionCount session parameter, or enter a number greater than 1.
- Based on number of nodes in grid. Sets the partitions to the number of nodes in the grid running the session.
If you configure this option for sessions that do not run on a grid, the session runs in one partition and logs a
message in the session log.
- Based on source partitioning. Determines the number of partitions using database partition information. The
number of partitions is the maximum of the number of partitions at the source.
- Based on number of CPUs. Sets the number of partitions equal to the number of CPUs on the node that
prepares the session. If the session is configured to run on a grid, dynamic partitioning sets the number of
partitions equal to the number of CPUs on the node that prepares the session multiplied by the number of
nodes in the grid.
Default is disabled.

Number of Partitions. Determines the number of partitions that the Integration Service creates when you
configure dynamic partitioning based on the number of partitions. Enter a value greater than 1 or use the
$DynamicPartitionCount session parameter.
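For example, to set $DynamicPartitionCount to four partitions, you might add an entry to the parameter file. The following sketch assumes hypothetical folder, workflow, and session names:
[MyFolder.WF:wf_LoadSales.ST:s_LoadOrders]
$DynamicPartitionCount=4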

Session on Grid Settings


When Session on Grid is enabled, the Integration Service distributes workflows and session threads to the nodes
in a grid to increase performance and scalability.



Table 8-5 describes the Session on Grid setting on the Config Object tab:

Table 8-5. Config Object Tab - Session on Grid

Session on Grid Setting Description

Is Enabled Specifies whether the session runs on a grid.

Working with a Session Configuration Object


Create a session configuration object when you want to reuse a set of Config Object tab settings. After you
create a session configuration object, you can configure sessions to use it.

To create a session configuration object:

1. In the Workflow Manager, open a folder and click Tasks > Session Configuration.
The Session Configuration Browser appears.
2. Click New to create a new session configuration object.
3. Enter a name for the session configuration object.
4. On the Properties tab, configure the settings.
5. Click OK.

To use a session configuration object in a session:

1. In the Workflow Manager, open the session properties and click the Config Object tab.
2. Click the Open button in the Config Name field.
A list of session configuration objects appears.
3. Select the configuration object you want to use and click OK.
The settings associated with the configuration object appear on the Config Object tab.
4. Click OK.



CHAPTER 9

Working with Sources


This chapter includes the following topics:
♦ Overview, 147
♦ Configuring Sources in a Session, 148
♦ Working with Relational Sources, 149
♦ Working with File Sources, 151
♦ Integration Service Handling for File Sources, 155
♦ Using a File List, 157
♦ Using FastExport, 159

Overview
In the Workflow Manager, you can create sessions with the following sources:
♦ Relational. You can extract data from any relational database that the Integration Service can connect to.
When extracting data from relational sources and Application sources, you must configure the database
connection to the data source prior to configuring the session.
♦ File. You can create a session to extract data from a flat file, COBOL, or XML source. Use an operating
system command to generate source data for a flat file or COBOL source or generate a file list.
If you use a flat file or XML source, the Integration Service can extract data from any local directory or FTP
connection for the source file. If the file source requires an FTP connection, you need to configure the FTP
connection to the host machine before you create the session.
♦ Heterogeneous. You can extract data from multiple sources in the same session. You can extract from
multiple relational sources, such as Oracle and Microsoft SQL Server. Or, you can extract from multiple
source types, such as relational and flat file. When you configure a session with heterogeneous sources,
configure each source instance separately.

Globalization Features
You can choose a code page that you want the Integration Service to use for relational sources and flat files. You
specify code pages for relational sources when you configure database connections in the Workflow Manager.
You can set the code page for file sources in the session properties.

Source Connections
Before you can extract data from a source, you must configure the connection properties the Integration Service
uses to connect to the source file or database. You can configure source database and FTP connections in the
Workflow Manager.

Allocating Buffer Memory


When the Integration Service initializes a session, it allocates blocks of memory to hold source and target data.
The Integration Service allocates at least two blocks for each source and target partition. Sessions that use a
large number of sources or targets might require additional memory blocks. If the Integration Service cannot
allocate enough memory blocks to hold the data, it fails the session.
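For example, a session that reads from five sources and writes to five targets, each configured with two partitions, has 20 source and target partitions and requires at least 40 memory blocks, two for each partition.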

Partitioning Sources
You can create multiple partitions for relational, Application, and file sources. For relational or Application
sources, the Integration Service creates a separate connection to the source database for each partition you set in
the session properties. For file sources, you can configure the session to read the source with one thread or
multiple threads.

RELATED TOPICS:
♦ “Understanding Pipeline Partitioning” on page 247

Configuring Sources in a Session


Configure source properties for sessions in the Sources node of the Mapping tab of the session properties.
When you configure source properties for a session, you define properties for each source instance in the
mapping.
Figure 9-1 shows the Sources node on the Mapping tab:

Figure 9-1. Sources Node of the Session Properties



The Sources node lists the sources used in the session and displays their settings. To view and configure settings
for a source, select the source from the list. You can configure the following settings for a source:
♦ Readers
♦ Connections
♦ Properties

Configuring Readers
You can click the Readers settings on the Sources node to view the reader the Integration Service uses with each
source instance. The Workflow Manager specifies the necessary reader for each source instance in the Readers
settings on the Sources node.

Configuring Connections
Click the Connections settings on the Sources node to define source connection information. For relational
sources, choose a configured database connection in the Value column for each relational source instance. By
default, the Workflow Manager displays the source type for relational sources.
For flat file and XML sources, choose one of the following source connection types in the Type column for each
source instance:
♦ FTP. To read data from a flat file or XML source using FTP, you must specify an FTP connection when you
configure source options. You must define the FTP connection in the Workflow Manager prior to
configuring the session.
♦ None. Choose None to read from a local flat file or XML file.

RELATED TOPICS:
♦ “Selecting the Source Database Connection” on page 150

♦ “Using FTP” on page 443

Configuring Properties
Click the Properties settings in the Sources node to define source property information. The Workflow
Manager displays properties, such as source file name and location for flat file, COBOL, and XML source file
types. You do not need to define any properties on the Properties settings for relational sources.

RELATED TOPICS:
♦ “Working with Relational Sources” on page 149
♦ “Working with File Sources” on page 151

Working with Relational Sources


When you configure a session to read data from a relational source, you can configure the following properties
for sources:
♦ Source database connection. Select the database connection for each relational source. For more
information, see “Selecting the Source Database Connection” on page 150.
♦ Treat source rows as. Define how the Integration Service treats each source row as it reads it from the source
table. For more information, see “Defining the Treat Source Rows As Property” on page 150.
♦ Override SQL query. You can override the default SQL query to extract source data. For more information,
see “Overriding the SQL Query” on page 150.



♦ Table owner name. Define the table owner name for each relational source. For more information, see
“Configuring the Table Owner Name” on page 151.
♦ Source table name. You can override the source table name for each relational source. For more
information, see “Overriding the Source Table Name” on page 151.

Selecting the Source Database Connection


Before you can run a session to read data from a source database, the Integration Service must connect to the
source database. Database connections must exist in the repository to appear on the source database list. You
must define them prior to configuring a session.
On the Connections settings in the Sources node, choose the database connection. You can select a connection
object, use a connection variable, or use a session parameter to define the connection value in a parameter file.

RELATED TOPICS:
♦ “Relational Database Connections” on page 28

♦ “Configuring Session Connections” on page 22

Defining the Treat Source Rows As Property


When the Integration Service reads a source, it marks each row with an indicator to specify which operation to
perform when the row reaches the target. You can define how the Integration Service marks each row using the
Treat Source Rows As property in the General Options settings on the Properties tab.
Table 9-1 describes the options you can choose for the Treat Source Rows As property:

Table 9-1. Treat Source Rows As Options

Treat Source Rows As Option Description

Insert Integration Service marks all rows to insert into the target.

Delete Integration Service marks all rows to delete from the target.

Update Integration Service marks all rows to update the target. You can further define the
update operation in the target options.

Data Driven Integration Service uses the Update Strategy transformations in the mapping to
determine the operation on a row-by-row basis. You define the update operation in the
target options. If the mapping contains an Update Strategy transformation, this option
defaults to Data Driven. You can also use this option when the mapping contains
Custom transformations configured to set the update strategy.

After you determine how to treat all rows in the session, you also need to set update strategy options for
individual targets.

RELATED TOPICS:
♦ “Target Properties” on page 167

Overriding the SQL Query


You can alter or override the default query in the mapping by entering an SQL override in the Properties settings
in the Sources node. You can enter any SQL statement supported by the source database.
The Workflow Manager does not validate the SQL override. The following errors can cause data errors and
session failure:
♦ Fields with incompatible datatypes or unknown fields
♦ Typing mistakes or other errors



To override the default query for a relational source:

1. In the Workflow Manager, open the session properties.


2. Click the Mapping tab and open the Transformations view.
3. Click the Sources node and open the Properties settings.
4. Click the Open button in the SQL Query field to open the SQL Editor.
5. Enter the SQL override.
6. Click OK to return to the session properties.
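For example, an SQL override for a relational source might restrict the rows that the Integration Service extracts. The table and column names in the following sketch are hypothetical:
SELECT CUSTOMERS.CUSTOMER_ID, CUSTOMERS.NAME, CUSTOMERS.REGION
FROM CUSTOMERS
WHERE CUSTOMERS.REGION = 'WEST'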

Configuring the Table Owner Name


You can define the owner name of the source table in the session properties. For some databases such as DB2,
tables can have different owners. If the database user specified in the database connection is not the owner of
the source tables in a session, specify the table owner for each source instance. A session can fail if the database
user is not the owner and you do not specify the table owner name.
Specify the table owner name in the Owner Name field in the Properties settings on the Mapping tab.
You can use a parameter or variable as the table owner name. Use any parameter or variable type that you can
define in the parameter file. For example, you can use a session parameter, $ParamMyTableOwner, as the table
owner name, and set $ParamMyTableOwner to the table owner name in the parameter file.
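For example, the parameter file entry that sets $ParamMyTableOwner might look like the following. The folder, workflow, session, and owner names are hypothetical:
[MyFolder.WF:wf_LoadSales.ST:s_LoadOrders]
$ParamMyTableOwner=SALES_ADMIN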

Overriding the Source Table Name


You can override the source table name in the session properties. Override the source table name when you use
a single session to read data from different source tables. Enter a table name in the source table name field, or enter a
parameter or variable to define the source table name in the parameter file. You can use mapping parameters,
mapping variables, session parameters, workflow variables, or worklet variables in the source table name. For
example, you can use a session parameter, $ParamSrcTable, as the source table name, and set $ParamSrcTable
to the source table name in the parameter file.
Note: If you override the source table name on the Properties tab of the source instance, and you override the
source table name using an SQL query, the Integration Service uses the source table name defined in the SQL
query.

Working with File Sources


You can create a session to extract data from flat file, COBOL, or XML sources. When you create a session to
read data from a file, you can configure the following information in the session properties:
♦ Source properties. You can define source properties on the Properties settings in the Sources node, such as
source file options. For more information, see “Configuring Source Properties” on page 152 and
“Configuring Commands for File Sources” on page 152.
♦ Flat file properties. You can edit fixed-width and delimited source file properties. For more information, see
“Configuring Fixed-Width File Properties” on page 153 and “Configuring Delimited File Properties” on
page 154.
♦ Line sequential buffer length. You can change the buffer length for flat files on the Advanced settings on the
Config Object tab. For more information, see “Configuring Line Sequential Buffer Length” on page 155.
♦ Treat source rows as. You can define how the Integration Service treats each source row as it reads it from
the source. For more information, see “Defining the Treat Source Rows As Property” on page 150.



Configuring Source Properties
You can define session source properties on the Properties settings on the Mapping tab.
Table 9-2 describes the properties you define for flat file source definitions:

Table 9-2. Flat File Source Properties

File Source Options Description

Input Type Type of source input. You can choose the following types of source input:
- File. For flat file, COBOL, or XML sources.
- Command. For source data or a file list generated by a command.
You cannot use a command to generate XML source data.

Source File Directory. Directory name of flat file source. By default, the Integration Service looks in the service
process variable directory, $PMSourceFileDir, for file sources.
If you specify both the directory and file name in the Source Filename field, clear this field. The Integration
Service concatenates this field with the Source Filename field when it runs the session.
You can also use the $InputFileName session parameter to specify the file location.

Source File Name File name, or file name and path of flat file source. Optionally, use the $InputFileName
session parameter for the file name.
The Integration Service concatenates this field with the Source File Directory field when
it runs the session. For example, if you have “C:\data\” in the Source File Directory field,
then enter “filename.dat” in the Source Filename field. When the Integration Service
begins the session, it looks for “C:\data\filename.dat”.
By default, the Workflow Manager enters the file name configured in the source
definition.

Source File Type Indicates whether the source file contains the source data, or whether it contains a list of
files with the same file properties. You can choose the following source file types:
- Direct. For source files that contain the source data.
- Indirect. For source files that contain a list of files. When you select Indirect, the
Integration Service finds the file list and reads each listed file when it runs the session.
For more information about file lists, see “Using a File List” on page 157.

Command Type Type of source data the command generates. You can choose the following command
types:
- Command generating data for commands that generate source data input rows.
- Command generating file list for commands that generate a file list.

Command Command used to generate the source file data.

Set File Properties link. Overrides source file properties. By default, the Workflow Manager displays file
properties as configured in the source definition.
For more information, see “Configuring Fixed-Width File Properties” on page 153 and
“Configuring Delimited File Properties” on page 154.

Configuring Commands for File Sources


Use a command to generate flat file source data input rows or a list of source files for a session. On UNIX, use
any valid UNIX command or shell script. On Windows, use any valid DOS command or batch file. You can
also use service process variables, such as $PMSourceFileDir, in the command.

Generating Flat File Source Data


Use a command to generate the input rows for flat file source data. Use a command to generate or transform
flat file data and send the standard output of the command to the flat file reader when the session runs. The flat
file reader reads the standard output of the command as the flat file source data. Generating source data with a
command eliminates the need to stage a flat file source. Use a command or script to send source data directly to
the Integration Service instead of using a pre-session command to generate a flat file source.



For example, to uncompress a data file and use the uncompressed data as the source data input rows, use the
following command:
uncompress -c $PMSourceFileDir/myCompressedFile.Z

The command uncompresses the file and sends the standard output of the command to the flat file reader. The
flat file reader reads the standard output of the command as the flat file source data.

Generating a File List


Use a command to generate a list of source files. The flat file reader reads each file in the list when the session
runs. Use a command to generate a file list when the list of source files changes often or you want to generate a
file list based on specific conditions. You might want to use a command to generate a file list based on a
directory listing.
For example, to use a directory listing as a file list, use the following command:
cd $PMSourceFileDir; ls -1 sales-records-Sep-*-2005.dat

The command generates a file list from the source file directory listing. When the session runs, the flat file
reader reads each file as it reads the file names from the command.
To use the output of a command as a file list, select Command as the Input Type, Command generating file list
as the Command Type, and enter a command for the Command property.

Configuring Fixed-Width File Properties


When you read data from a fixed-width file, you can edit file properties in the session, such as the null character
or code page. You can configure fixed-width properties for non-reusable sessions in the Workflow Designer and
for reusable sessions in the Task Developer. You cannot configure fixed-width properties for instances of
reusable sessions in the Workflow Designer.
Click Set File Properties to open the Flat Files dialog box. To edit the fixed-width properties, select Fixed
Width and click Advanced. The Fixed Width Properties dialog box appears. By default, the Workflow Manager
displays file properties as configured in the mapping. Edit these settings to override those configured in the
source definition.
Table 9-3 describes options you can define in the Fixed Width Properties dialog box for file sources:

Table 9-3. Fixed-Width File Properties for File Sources

Fixed-Width Properties Options Description

Text/Binary Indicates the character representing a null value in the file. This can be any valid
character in the file code page, or any binary value from 0 to 255. For more
information about specifying null characters, see “Null Character Handling” on
page 156.

Repeat Null Character. If selected, the Integration Service reads repeat null characters in a single field as a
single null value. If you do not select this option, the Integration Service reads a single null character at the
beginning of a field as a null field.
Important: For multibyte code pages, specify a single-byte null character if you use repeating non-binary null
characters. This ensures that repeating null characters fit into the column.

Code Page Code page of the fixed-width file. Select a code page or a variable:
- Code page. Select the code page.
- Use Variable. Enter a user-defined workflow or worklet variable or the session
parameter $ParamName, and define the code page in the parameter file. Use the
code page name. For more information about supported code pages, see the
PowerCenter Administrator Guide.
Default is the PowerCenter Client code page.

Number of Initial Rows to Skip. Integration Service skips the specified number of rows before reading the file.
Use this to skip header rows. One row may contain multiple records. If you select the Line Sequential File
Format option, the Integration Service ignores this option.

Number of Bytes to Skip Between Records. Integration Service skips the specified number of bytes between
records. For example, you have an ASCII file on Windows with one record on each line, and a carriage return
and line feed appear at the end of each line. If you want the Integration Service to skip these two single-byte
characters, enter 2.
If you have an ASCII file on UNIX with one record for each line, ending in a carriage return, skip the single
character by entering 1.

Strip Trailing Blanks If selected, the Integration Service strips trailing blank spaces from records before
passing them to the Source Qualifier transformation.

Line Sequential File Format. Select this option if the file uses a carriage return at the end of each record,
shortening the final column.

Configuring Delimited File Properties


When you read data from a delimited file, you can edit file properties in the session, such as the delimiter or
code page. You can configure delimited properties for non-reusable sessions in the Workflow Designer and for
reusable sessions in the Task Developer. You cannot configure delimited properties for instances of reusable
sessions in the Workflow Designer. Click Set File Properties to open the Flat Files dialog box.
To edit the delimited properties, select Delimited and click Advanced. The Delimited File Properties dialog
box appears. By default, the Workflow Manager displays file properties as configured in the mapping. Edit
these settings to override those configured in the source definition.
Table 9-4 describes options you can define in the Delimited File Properties dialog box for file sources:

Table 9-4. Delimited File Properties for File Sources

Delimited File Properties Options Description

Column Delimiters One or more characters used to separate columns of data. Delimiters can be either
printable or single-byte unprintable characters, and must be different from the escape
character and the quote character (if selected). To enter a single-byte unprintable
character, click the Browse button to the right of this field. In the Delimiters dialog
box, select an unprintable character from the Insert Delimiter list and click Add. You
cannot select unprintable multibyte characters as delimiters.
Maximum number of delimiters is 80.

Treat Consecutive Delimiters as One. By default, the Integration Service treats multiple delimiters separately.
If selected, the Integration Service reads any number of consecutive delimiter characters as one.
For example, a source file uses a comma as the delimiter character and contains the following record:
56, , , Jane Doe. By default, the Integration Service reads that record as four columns separated by three
delimiters: 56, NULL, NULL, Jane Doe. If you select this option, the Integration Service reads the record as two
columns separated by one delimiter: 56, Jane Doe.

Optional Quotes Select No Quotes, Single Quote, or Double Quotes. If you select a quote character,
the Integration Service ignores delimiter characters within the quote characters.
Therefore, the Integration Service uses quote characters to escape the delimiter.
For example, a source file uses a comma as a delimiter and contains the following
row: 342-3849, ‘Smith, Jenna’, ‘Rockville, MD’, 6.
If you select the optional single quote character, the Integration Service ignores the
commas within the quotes and reads the row as four fields.
If you do not select the optional single quote, the Integration Service reads six
separate fields.
When the Integration Service reads two optional quote characters within a quoted
string, it treats them as one quote character. For example, the Integration Service
reads the following quoted string as I’m going tomorrow:
2353, ‘I’’m going tomorrow’, MD
Additionally, if you select an optional quote character, the Integration Service reads a
string as a quoted string if the quote character is the first character of the field.
Note: You can improve session performance if the source file does not contain quotes
or escape characters.

Code Page Code page of the delimited file. Select a code page or a variable:
- Code page. Select the code page.
- Use Variable. Enter a user-defined workflow or worklet variable or the session
parameter $ParamName, and define the code page in the parameter file. Use the
code page name. For more information about supported code pages, see the
PowerCenter Administrator Guide.
Default is the PowerCenter Client code page.

Row Delimiter Specify a line break character. Select from the list or enter a character. Preface an
octal code with a backslash (\). To use a single character, enter the character.
The Integration Service uses only the first character when the entry is not preceded
by a backslash. The character must be a single-byte character, and no other
character in the code page can contain that byte. Default is line-feed, \012 LF (\n).

Escape Character. Character immediately preceding a delimiter character embedded in an unquoted string, or
immediately preceding the quote character in a quoted string. When you specify an escape character, the
Integration Service reads the delimiter character as a regular character (called escaping the delimiter or quote
character).
Note: You can improve session performance for mappings containing Sequence Generator transformations if
the source file does not contain quotes or escape characters.

Remove Escape Character From Data. This option is selected by default. Clear this option to include the escape
character in the output string.

Number of Initial Rows to Skip. Integration Service skips the specified number of rows before reading the file.
Use this to skip title or header rows in the file.

Configuring Line Sequential Buffer Length


You can configure the line buffer length for file sources. By default, the Integration Service reads a file record
into a buffer that holds 1024 bytes. If the source file records are larger than 1024 bytes, increase the Line
Sequential Buffer Length property in the session properties accordingly. Define the line buffer length on the
Config Object tab in the session properties.

Integration Service Handling for File Sources


When you configure a session with file sources, you might take these additional features into account when
creating mappings with file sources:
♦ Character set
♦ Multibyte character error handling
♦ Null character handling
♦ Row length handling for fixed-width flat files
♦ Numeric data handling
♦ Tab handling

Character Set
You can configure the Integration Service to run sessions in either ASCII or Unicode data movement mode.



Table 9-5 describes source file formats supported by each data movement path in PowerCenter:

Table 9-5. Support for ASCII and Unicode Data Movement Modes

Character Set Unicode mode ASCII mode

7-bit ASCII Supported Supported

US-EBCDIC (COBOL sources only) Supported Supported

8-bit ASCII Supported Supported

8-bit EBCDIC (COBOL sources only) Supported Supported

ASCII-based MBCS Supported Integration Service generates a warning message.

EBCDIC-based SBCS Supported Not supported. The Integration Service terminates the
session.

EBCDIC-based MBCS Supported Not supported. The Integration Service terminates the
session.

If you configure a session to run in ASCII data movement mode, delimiters, escape characters, and null
characters must be valid in the ISO Western European Latin 1 code page. Any 8-bit characters you specified in
previous versions of PowerCenter are still valid. In Unicode data movement mode, delimiters, escape characters,
and null characters must be valid in the specified code page of the flat file.

Multibyte Character Error Handling


Misalignment of multibyte data in a file causes session errors. Data becomes misaligned when you place column
breaks incorrectly in a file, resulting in multibyte characters that extend beyond the last byte in a column.
When you import a fixed-width flat file, you can create, move, or delete column breaks using the Flat File
Wizard. Incorrect positioning of column breaks can create alignment errors when you run a session containing
multibyte characters.
The Integration Service handles alignment errors in fixed-width flat files according to the following guidelines:
♦ Non-line sequential file. The Integration Service skips rows containing misaligned data and resumes reading
the next row. The skipped row appears in the session log with a corresponding error message. If an
alignment error occurs at the end of a row, the Integration Service skips both the current row and the next
row, and writes them to the session log.
♦ Line sequential file. The Integration Service skips rows containing misaligned data and resumes reading the
next row. The skipped row appears in the session log with a corresponding error message.
♦ Reader error threshold. You can configure a session to stop after a specified number of non-fatal errors. A
row containing an alignment error increases the error count by 1. The session stops if the number of rows
containing errors reaches the threshold set in the session properties. Errors and corresponding error messages
appear in the session log file.
Fixed-width COBOL sources are always byte-oriented and can be line sequential. The Integration Service
handles COBOL files according to the following guidelines:
♦ Line sequential files. The Integration Service skips rows containing misaligned data and writes the skipped
rows to the session log. The session stops if the number of error rows reaches the error threshold.
♦ Non-line sequential files. The session stops at the first row containing misaligned data.

Null Character Handling


You can specify single-byte or multibyte null characters for fixed-width flat files. The Integration Service uses
these characters to determine if a column is null.



Table 9-6 describes how the Integration Service uses the Null Character and Repeat Null Character properties
to determine if a column is null:

Table 9-6. Null Character Handling

Null Character Repeat Null Character Integration Service Behavior

Binary Disabled A column is null if the first byte in the column is the binary null character. The
Integration Service reads the rest of the column as text data to determine the
column alignment and track the shift state for shift sensitive code pages. If
data in the column is misaligned, the Integration Service skips the row and
writes the skipped row and a corresponding error message to the session log.

Non-binary Disabled A column is null if the first character in the column is the null character. The
Integration Service reads the rest of the column to determine the column
alignment and track the shift state for shift sensitive code pages. If data in the
column is misaligned, the Integration Service skips the row and writes the
skipped row and a corresponding error message to the session log.

Binary Enabled A column is null if it contains the specified binary null character. The next
column inherits the initial shift state of the code page.

Non-binary Enabled A column is null if the repeating null character fits into the column with no
bytes leftover. For example, a five-byte column is not null if you specify a two-
byte repeating null character. In shift-sensitive code pages, shift bytes do not
affect the null value of a column. A column is still null if it contains a shift byte
at the beginning or end of the column.
Specify a single-byte null character if you use repeating non-binary null
characters. This ensures that repeating null characters fit into a column.

Row Length Handling for Fixed-Width Flat Files


For fixed-width flat files, data in a row can be shorter than the row length in the following situations:
♦ The file is fixed-width line-sequential with a carriage return or line feed that appears sooner than expected.
♦ The file is fixed-width non-line sequential, and the last line in the file is shorter than expected.
In these cases, the Integration Service reads the data but does not append any blanks to fill the remaining bytes.
The Integration Service reads subsequent fields as NULL. Fields containing repeating null characters that do
not fill the entire field length are not considered NULL.

Numeric Data Handling


Sometimes, file sources contain non-numeric data in numeric columns. When the Integration Service reads
non-numeric data, it treats the row differently, depending on the source type. When the Integration Service
reads non-numeric data from numeric columns in a flat file source or an XML source, it drops the row and
writes the row to the session log. When the Integration Service reads non-numeric data for numeric columns in
a COBOL source, it reads a null value for the column.

Using a File List


You can create a session to run multiple source files for one source instance in the mapping. You might use this
feature if, for example, the organization collects data at several locations which you then want to move through
the same session. When you create a mapping to use multiple source files for one source instance, the properties
of all files must match the source definition.
To use multiple source files, you create a file containing the names and directories of each source file you want
the Integration Service to use. This file is referred to as a file list.



When you configure the session properties, enter the file name of the file list in the Source Filename field and
enter the location of the file list in the Source File Directory field. When the session starts, the Integration
Service reads the file list, then locates and reads the first file source in the list. After the Integration Service reads
the first file, it locates and reads the next file in the list.
The Integration Service writes the path and name of the file list to the session log. If the Integration Service
encounters an error while accessing a source file, it logs the error in the session log and stops the session.
Note: When you use a file list and the session performs incremental aggregation, the Integration Service
performs incremental aggregation across all listed source files.

Creating the File List


The file list contains the names of all the source files you want the Integration Service to use for the source
instance in the session. Create the file list in an editor appropriate to the Integration Service platform and save
it as a text file. For example, you can create a file list for an Integration Service on Windows with any text editor
then save it as ASCII.
The Integration Service interprets the file list using the Integration Service code page. Each file in the list must
use the user-defined code page configured in the source definition.
Each file in the file list must share the same file properties as configured in the source definition or as entered
for the source instance in the session property sheet. You can enter different paths for each file in the list, but
for the session to complete successfully, the paths must be local to the Integration Service machine. Map the
drives on an Integration Service on Windows or mount the drives on an Integration Service on UNIX. If you
do not specify a path for a file, the Integration Service assumes the file is in the same directory as the file list.
The file list must follow these format guidelines:
♦ Text file
♦ One file name, or path and file name, for each line
The Integration Service skips blank lines and ignores leading blank spaces. Any characters indicating a new line,
such as \n in ASCII files, must be valid in the code page of the Integration Service.
The following example shows a valid file list created for an Integration Service on Windows. Each of the drives
listed is mapped on the Integration Service machine. The western_trans.dat file is located in the same
directory as the file list.
western_trans.dat
d:\data\eastern_trans.dat
e:\data\midwest_trans.dat
f:\data\canada_trans.dat

After you create the file list, place it in a directory local to the Integration Service.

Configuring a Session to Use a File List


After you create a file list for multiple source files, you can configure the session to access those files. A sample configuration follows the steps below.

To use multiple source files for one source instance in a session:

1. In the Workflow Manager, open the session properties.


2. Click the Mapping tab and open the Transformations view.
3. Click the Properties settings in the Sources node.
4. In the Source Filetype field, choose Indirect.
5. In the Source Filename field, replace the file name with the name of the file list.
If necessary, also enter the path in the Source File Directory field.
If you enter a file name in the Source Filename field, and you have specified a path in the Source File
Directory field, the Integration Service looks for the named file in the listed directory.



If you enter a file name in the Source Filename field, and you do not specify a path in the Source File
Directory field, the Integration Service looks for the named file in the directory where the Integration
Service is installed on UNIX or in the system directory on Windows.
6. Click OK.
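
For example, with a file list named trans_filelist.txt stored in the source file directory, the relevant session settings might look like the following sketch. The file list name is hypothetical, and $PMSourceFileDir is assumed to be the service process variable that points to the source file directory:

Source Filetype:       Indirect
Source Filename:       trans_filelist.txt
Source File Directory: $PMSourceFileDir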

Using FastExport
FastExport is a utility that uses multiple Teradata sessions to quickly export large amounts of data from a
Teradata database. You can create a PowerCenter session that uses FastExport to read Teradata sources.
To use FastExport, create a mapping with a Teradata source database. The mapping can include multiple
source definitions from the same Teradata source database joined in a single Source Qualifier transformation.
In the session, use the FastExport reader instead of the Relational reader, and use a FastExport connection to the
Teradata tables you want to export.
FastExport uses a control file that defines what to export. When a session starts, the Integration Service creates
the control file from the FastExport connection attributes. If you create an SQL override for the Teradata tables,
the Integration Service uses the SQL to generate the control file. You can override the control file for a session
by defining a control file in the session properties.
The Integration Service writes FastExport messages in the session log and information about FastExport
performance in the FastExport log. PowerCenter saves the FastExport log in the folder defined by the
Temporary File Name session attribute. The default extension for the FastExport log is .log.
To use FastExport in a session, complete the following steps:
1. Create a FastExport connection in the Workflow Manager and configure the connection attributes.
2. Open the session and change the reader from Relational to Teradata FastExport.
3. Change the connection type and select a FastExport connection for the session.
4. Optionally, create a FastExport control file in a text editor and save it in the repository.

Creating a FastExport Connection


Create a FastExport connection in the Workflow Manager. If you edit a FastExport connection, all sessions
using the connection use the updated connection.

To create a FastExport connection:

1. Click Connections > Application in the Workflow Manager.


The Connection Browser dialog box appears.
2. Click New.
3. Select a Teradata FastExport connection and click OK.
4. Enter a name for the FastExport connection.
5. Enter the database user name and password.
6. Enter the FastExport attributes and click OK.



Table 9-7 describes the attributes that you configure for a Teradata FastExport connection:

Table 9-7. FastExport Connection Attributes
♦ TDPID (default: n/a). Teradata database ID.
♦ Tenacity (default: 4). Number of hours that FastExport tries to log on to the Teradata database. When FastExport tries to log on but the maximum number of Teradata sessions is already running, FastExport waits for the amount of time defined by the SLEEP option. After the SLEEP time, FastExport tries to log on to the Teradata Database again. FastExport repeats this process until it has either logged on for the required number of sessions or exceeded the TENACITY hours time period.
♦ Max Sessions (default: 1). Maximum number of FastExport sessions per FastExport job. Max Sessions must be between 1 and the total number of access module processes (AMPs) on your system.
♦ Sleep (default: 6). Number of minutes FastExport pauses before retrying a login. FastExport attempts a login until the login succeeds or the Tenacity hours elapse.
♦ Block Size (default: 64000). Maximum block size to use for the exported data.
♦ Data Encryption (default: Disabled). Enables data encryption for FastExport. You can use data encryption with the version 8 Teradata client.
♦ Logtable Name (default: FE_<source_table_name>). Restart log table name. The FastExport utility uses the information in the restart log table to restart jobs that halt because of a Teradata database or client system failure. Each FastExport job should use a separate logtable. If you specify a table that does not exist, the FastExport utility creates the table and uses it as the restart log. PowerCenter does not support restarting FastExport, but if you stage the output, you can restart FastExport manually.
♦ Database Name (default: n/a). The name of the Teradata database you want to connect to. The Integration Service generates the SQL statement using the database name as a prefix to the table name.

For more information about the connection attributes, see the Teradata documentation.

Changing the Reader


The default reader for Teradata is relational. To use FastExport, change the reader to Teradata FastExport.

Changing the Source Connection


To use FastExport in the session, change the Teradata source connection to a Teradata FastExport connection.
You can override some session attributes.



Table 9-8 describes the session attributes you can change for FastExport:

Table 9-8. FastExport Session Attributes
♦ Is Staged (default: Disabled). If enabled, FastExport writes data to a stage file. Otherwise, FastExport writes data to a named pipe.
♦ Fractional seconds precision (default: 0). The precision for fractional seconds following the decimal point in a timestamp. You can enter 0 to 6. For example, a timestamp with a precision of 6 is 'hh:mi:ss.ssssss'. The fractional seconds precision must match the setting in the Teradata database.
♦ Temporary File (default: $PMTempDir\). PowerCenter uses the temporary file name to generate the names for the log file, control file, and the staged output file. Enter a complete path for the file.
♦ Control File Override (default: blank). The control file text. Use this attribute to override the control file the Integration Service creates for a session.

Overriding the Control File


By default, the Integration Service generates a FastExport control file based on session and connection
properties when you run a session with FastExport. The Integration Service saves the control file it generates in
the temporary file directory and overwrites it the next time you run the session.
You can override the control file that the Integration Service generates. When you override the control file, the
Workflow Designer saves the control file to the repository. The Integration Service uses the saved control file
when you run the session.
Each FastExport statement must meet the following criteria:
♦ Begin on a new line.
♦ Start with a period (.).
♦ End with a semicolon (;).
For more information about the control file, see the Teradata documentation.
Table 9-9 describes the control file statements you can use with PowerCenter:

Table 9-9. FastExport Control File Statements
♦ .LOGTABLE utillog; (the restart logtable name)
♦ LOGON tdpz/user,pswd; (the database login string, including the database, user name, and password)
♦ BEGIN EXPORT (the first export command)
♦ .SESSIONS 20; (the number of Teradata sessions)
♦ .EXPORT OUTFILE ddname2; (the destination file for the exported data)
♦ SELECT EmpNo, Hours FROM charges WHERE Proj_ID = 20 ORDER BY EmpNo ; (the SQL statement that selects data)
♦ .END EXPORT ; (indicates the end of an export task and initiates the export process)
♦ LOGOFF ; (disconnects from the database)
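
For reference, the statements in Table 9-9 assemble into a complete control file similar to the following sketch. The logtable, login, output file, and table names are the placeholder values shown in the table; replace them with values for your environment:

.LOGTABLE utillog;
LOGON tdpz/user,pswd;
BEGIN EXPORT
.SESSIONS 20;
.EXPORT OUTFILE ddname2;
SELECT EmpNo, Hours FROM charges
WHERE Proj_ID = 20
ORDER BY EmpNo ;
.END EXPORT ;
LOGOFF ;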



To override the control file:

1. Create a control file in a text editor.


2. Copy the control file text to the clipboard.
3. Paste the control file text into the Control File Override field.
The Workflow Manager does not validate the control file syntax. Teradata verifies the control file syntax when
you run a session. If the control file is invalid, the session fails.
Tip: You can make the generated control file read-only so that the Integration Service uses the same control file for
each session run. The Integration Service does not overwrite a read-only control file.

Rules and Guidelines


Use the following rules and guidelines when you use FastExport with PowerCenter:
♦ When you use an SQL override for Teradata, PowerCenter uses it to create the FastExport control file. If
you do not use an SQL override, PowerCenter generates a control file based on the connected ports in the
source qualifier.
♦ FastExport supports a maximum export file size of 2 GB on a UNIX MP-RAS operating system. Other
operating systems have no file size limitation.
♦ You cannot concatenate exported data files.
♦ The session fails if you use a pre-session SQL command and FastExport.



CHAPTER 10

Working with Targets


This chapter includes the following topics:
♦ Overview, 163
♦ Configuring Targets in a Session, 165
♦ Working with Relational Targets, 166
♦ Working with Target Connection Groups, 177
♦ Working with Active Sources, 178
♦ Working with File Targets, 178
♦ Integration Service Handling for File Targets, 183
♦ Working with Heterogeneous Targets, 187
♦ Reject Files, 188

Overview
In the Workflow Manager, you can create sessions with the following targets:
♦ Relational. You can load data to any relational database that the Integration Service can connect to. When
loading data to relational targets, you must configure the database connection to the target before you
configure the session.
♦ File. You can load data to a flat file or XML target or write data to an operating system command. For flat
file or XML targets, the Integration Service can load data to any local directory or FTP connection for the
target file. If the file target requires an FTP connection, you need to configure the FTP connection to the
host machine before you create the session.
♦ Heterogeneous. You can output data to multiple targets in the same session. You can output to multiple
relational targets, such as Oracle and Microsoft SQL Server. Or, you can output to multiple target types,
such as relational and flat file. For more information, see “Working with Heterogeneous Targets” on
page 187.

Globalization Features
You can configure the Integration Service to run sessions in either ASCII or Unicode data movement mode.

Table 10-1 describes target character sets supported by each data movement mode in PowerCenter:

Table 10-1. Support for ASCII and Unicode Data Movement Modes
♦ 7-bit ASCII. Unicode mode: supported. ASCII mode: supported.
♦ ASCII-based MBCS. Unicode mode: supported. ASCII mode: the Integration Service generates a warning message, but does not terminate the session.
♦ UTF-8. Unicode mode: supported (targets only). ASCII mode: the Integration Service generates a warning message, but does not terminate the session.
♦ EBCDIC-based SBCS. Unicode mode: supported. ASCII mode: not supported; the Integration Service terminates the session.
♦ EBCDIC-based MBCS. Unicode mode: supported. ASCII mode: not supported; the Integration Service terminates the session.

You can work with targets that use multibyte character sets with PowerCenter. You can choose a code page that
you want the Integration Service to use for relational objects and flat files. You specify code pages for relational
objects when you configure database connections in the Workflow Manager. The code page for a database
connection used as a target must be a superset of the source code page.
When you change the database connection code page to one that is not two-way compatible with the old code
page, the Workflow Manager generates a warning and invalidates all sessions that use that database connection.
Code pages you select for a file represent the code page of the data contained in these files. If you are working
with flat files, you can also specify delimiters and null characters supported by the code page you have specified
for the file.
Target code pages must be a superset of the source code page.
However, if you configure the Integration Service and Client for code page relaxation, you can select any code
page supported by PowerCenter for the target database connection. When using code page relaxation, select
compatible code pages for the source and target data to prevent data inconsistencies.
If the target contains multibyte character data, configure the Integration Service to run in Unicode mode.
When the Integration Service runs a session in Unicode mode, it uses the database code page to translate data.
If the target contains only single-byte characters, configure the Integration Service to run in ASCII mode.
When the Integration Service runs a session in ASCII mode, it does not validate code pages.

Target Connections
Before you can load data to a target, you must configure the connection properties the Integration Service uses
to connect to the target file or database. You can configure target database and FTP connections in the
Workflow Manager.

RELATED TOPICS:
♦ “Relational Database Connections” on page 28

♦ “FTP Connections” on page 34

Partitioning Targets
When you create multiple partitions in a session with a relational target, the Integration Service creates multiple
connections to the target database to write target data concurrently. When you create multiple partitions in a
session with a file target, the Integration Service creates one target file for each partition. You can configure the
session properties to merge these target files.

RELATED TOPICS:
♦ “Understanding Pipeline Partitioning” on page 247



Configuring Targets in a Session
Configure target properties for sessions in the Transformations view on Mapping tab of the session properties.
Click the Targets node to view the target properties. When you configure target properties for a session, you
define properties for each target instance in the mapping.
Figure 10-1 shows where you define target properties in a session:

Figure 10-1. Defining Target Properties in the Session Properties (the figure shows the Targets node selected in the Transformations view of the Mapping tab, with its Writers, Connections, and Properties settings)

The Targets node contains the following settings where you define properties:
♦ Writers
♦ Connections
♦ Properties

Configuring Writers
Click the Writers settings in the Transformations view to define the writer to use with each target instance.
When the mapping target is a flat file, an XML file, an SAP NetWeaver BI target, or a WebSphere MQ target,
the Workflow Manager specifies the necessary writer in the session properties. However, when the target is
relational, you can change the writer type to File Writer if you plan to use an external loader.
Note: You can change the writer type for non-reusable sessions in the Workflow Designer and for reusable
sessions in the Task Developer. You cannot change the writer type for instances of reusable sessions in the
Workflow Designer.
When you override a relational target to use the file writer, the Workflow Manager changes the properties for
that target instance on the Properties settings. It also changes the connection options you can define in the
Connections settings.
If the target contains a column with datetime values, the Integration Service compares the date formats defined
for the target column and the session. When the date formats do not match, the Integration Service uses the
date format with the lesser precision. For example, a session writes to a Microsoft SQL Server target that
includes a Datetime column with precision to the millisecond. The date format for the session is
MM/DD/YYYY HH24:MI:SS.NS. If you override the Microsoft SQL Server target with a flat file writer, the



Integration Service writes datetime values to the flat file with precision to the millisecond. If the date format for
the session is MM/DD/YYYY HH24:MI:SS, the Integration Service writes datetime values to the flat file with
precision to the second.
After you override a relational target to use a file writer, define the file properties for the target. Click Set File
Properties and choose the target to define.

RELATED TOPICS:
♦ “Configuring Fixed-Width Properties” on page 181

♦ “Configuring Delimited Properties” on page 182

Configuring Connections
View the Connections settings on the Mapping tab to define target connection information. For relational
targets, the Workflow Manager displays Relational as the target type by default. In the Value column, choose a
configured database connection for each relational target instance.
For flat file and XML targets, choose one of the following target connection types in the Type column for each
target instance:
♦ FTP. If you want to load data to a flat file or XML target using FTP, you must specify an FTP connection
when you configure target options. FTP connections must be defined in the Workflow Manager prior to
configuring sessions.
♦ Loader. Use the external loader option to improve the load speed to Oracle, DB2, Sybase IQ, or Teradata
target databases.
To use this option, you must use a mapping with a relational target definition and choose File as the writer
type on the Writers settings for the relational target instance. The Integration Service uses an external loader
to load target files to the Oracle, DB2, Sybase IQ, or Teradata database. You cannot choose external loader
if the target is defined in the mapping as a flat file, XML, MQ, or SAP BW target.
♦ Queue. Choose Queue when you want to output to a WebSphere MQ or MSMQ message queue.
♦ None. Choose None when you want to write to a local flat file or XML file.

RELATED TOPICS:
♦ “Target Database Connection” on page 167

♦ “External Loading” on page 419


♦ “Using FTP” on page 443

Configuring Properties
View the Properties settings on the Mapping tab to define target property information. The Workflow Manager
displays different properties for the different target types: relational, flat file, and XML.

RELATED TOPICS:
♦ “Working with Relational Targets” on page 166

♦ “Working with File Targets” on page 178


♦ “Working with Heterogeneous Targets” on page 187

Working with Relational Targets


When you configure a session to load data to a relational target, you define most properties in the
Transformations view on the Mapping tab. You also define some properties on the Properties tab and the
Config Object tab.



You can configure the following properties for relational targets:
♦ Target database connection. Define database connection information. For more information, see “Target
Database Connection” on page 167.
♦ Target properties. You can define target properties such as target load type, target update options, and reject
options. For more information, see “Target Properties” on page 167.
♦ Truncate target tables. The Integration Service can truncate target tables before loading data. For more
information, see “Truncating Target Tables” on page 170.
♦ Deadlock retry. You can configure the session to retry deadlocks when writing to targets or a recovery table.
For more information, see “Deadlock Retry” on page 171.
♦ Drop and recreate indexes. Use pre- and post-session SQL to drop and recreate an index on a relational
target table to optimize query speed. For more information, see “Dropping and Recreating Indexes” on
page 172.
♦ Constraint-based loading. The Integration Service can load data to targets based on primary key-foreign key
constraints and active sources in the session mapping. For more information, see “Constraint-Based
Loading” on page 172.
♦ Bulk loading. You can specify bulk mode when loading to DB2, Microsoft SQL Server, Oracle, and Sybase
databases. For more information, see “Bulk Loading” on page 174.
You can define the following properties in the session and override the properties you define in the mapping:
♦ Table name prefix. You can specify the target owner name or prefix in the session properties to override the
table name prefix in the mapping. For more information, see “Table Name Prefix” on page 176.
♦ Pre-session SQL. You can create SQL commands and execute them in the target database before loading
data to the target. For example, you might want to drop the index for the target table before loading data
into it. For more information, see “Using Pre- and Post-Session SQL Commands” on page 127.
♦ Post-session SQL. You can create SQL commands and execute them in the target database after loading data
to the target. For example, you might want to recreate the index for the target table after loading data into it.
For more information, see “Using Pre- and Post-Session SQL Commands” on page 127.
♦ Target table name. You can override the target table name for each relational target. For more information,
see “Target Table Name” on page 176.
If any target table or column name contains a database reserved word, you can create and maintain a reserved
words file containing database reserved words. When the Integration Service executes SQL against the database,
it places quotes around the reserved words. For more information, see “Reserved Words” on page 176.
When the Integration Service runs a session with at least one relational target, it performs database transactions
per target connection group. For example, it commits all data to targets in a target connection group at the
same time. For more information, see “Working with Target Connection Groups” on page 177.

Target Database Connection


Before you can run a session to load data to a target database, the Integration Service must connect to the target
database. Database connections must exist in the repository to appear on the target database list. You must
define them prior to configuring a session.
On the Connections settings in the Targets node, choose the database connection. You can select a connection
object, use a connection variable, or use a session parameter to define the connection value in a parameter file.

RELATED TOPICS:
♦ “Configuring Session Connections” on page 22

♦ “Relational Database Connections” on page 28

Target Properties
You can configure session properties for relational targets in the Transformations view on the Mapping tab, and
in the General Options settings on the Properties tab. Define the properties for each target instance in the



session. When you click the Transformations view on the Mapping tab, you can view and configure the settings
of a specific target. Select the target under the Targets node.
Table 10-2 describes the properties available in the Properties settings on the Mapping tab of the session
properties:

Table 10-2. Relational Target Properties
♦ Target Load Type. You can choose Normal or Bulk. If you select Normal, the Integration Service loads targets normally. You can choose Bulk when you load to DB2, Sybase, Oracle, or Microsoft SQL Server. If you specify Bulk for other database types, the Integration Service reverts to a normal load. Choose Normal mode if the mapping contains an Update Strategy transformation. If you choose Normal and the Microsoft SQL Server target name includes spaces, configure the following connection environment SQL in the connection object:
SET QUOTED_IDENTIFIER ON
For more information, see “Bulk Loading” on page 174.
♦ Insert*. Integration Service inserts all rows flagged for insert. Default is enabled.
♦ Update (as Update)*. Integration Service updates all rows flagged for update. Default is enabled.
♦ Update (as Insert)*. Integration Service inserts all rows flagged for update. Default is disabled.
♦ Update (else Insert)*. Integration Service updates rows flagged for update if they exist in the target, then inserts any remaining rows marked for insert. Default is disabled.
♦ Delete*. Integration Service deletes all rows flagged for delete. Default is disabled.
♦ Truncate Table. Integration Service truncates the target before loading. Default is disabled. For more information, see “Truncating Target Tables” on page 170.
♦ Reject File Directory. Reject-file directory name. By default, the Integration Service writes all reject files to the service process variable directory, $PMBadFileDir. If you specify both the directory and file name in the Reject Filename field, clear this field. The Integration Service concatenates this field with the Reject Filename field when it runs the session. You can also use the $BadFileName session parameter to specify the file directory.
♦ Reject Filename. File name or file name and path for the reject file. By default, the Integration Service names the reject file after the target instance name: target_name.bad. Optionally, use the $BadFileName session parameter for the file name. The Integration Service concatenates this field with the Reject File Directory field when it runs the session. For example, if you have “C:\reject_file\” in the Reject File Directory field, and enter “filename.bad” in the Reject Filename field, the Integration Service writes rejected rows to C:\reject_file\filename.bad.
*For more information, see “Using Session-Level Target Properties with Source Properties” on page 169. For more information about target update strategies, see the PowerCenter Transformation Guide.



Table 10-3 describes the test load options on the General Options settings on the Properties tab:

Table 10-3. Test Load Options - Relational Targets
♦ Enable Test Load. You can configure the Integration Service to perform a test load. With a test load, the Integration Service reads and transforms data without writing to targets. The Integration Service generates all session files, and performs all pre- and post-session functions, as if running the full session. The Integration Service writes data to relational targets, but rolls back the data when the session completes. For all other target types, such as flat file and SAP BW, the Integration Service does not write data to the targets. Enter the number of source rows you want to test in the Number of Rows to Test field. You cannot perform a test load on sessions using XML sources. You can perform a test load for relational targets when you configure a session for normal mode. If you configure the session for bulk mode, the session fails.
♦ Number of Rows to Test. Enter the number of source rows you want the Integration Service to test load. The Integration Service reads the number you configure for the test load.

Using Session-Level Target Properties with Source Properties


You can set session-level target properties to specify how the Integration Service inserts, updates, and deletes
rows. However, you can also set session-level properties for sources.
At the source level, you can specify whether the Integration Service inserts, updates, or deletes source rows or
whether it treats rows as data driven. If you treat source rows as data driven, you must use an Update Strategy
transformation to indicate how the Integration Service handles rows.
This section explains how the Integration Service writes data based on the source and target row properties.
PowerCenter uses the source and target row options to provide an extra check on the session-level properties. In
addition, when you use both the source and target row options, you can control inserts, updates, and deletes for
the entire session or, if you use an Update Strategy transformation, based on the data.
When you set the row-handling property for a source, you can treat source rows as inserts, deletes, updates, or
data driven according to the following guidelines:
♦ Inserts. If you treat source rows as inserts, select Insert for the target option. When you enable the Insert
target row option, the Integration Service ignores the other target row options and treats all rows as inserts.
If you disable the Insert target row option, the Integration Service rejects all rows.
♦ Deletes. If you treat source rows as deletes, select Delete for the target option. When you enable the Delete
target option, the Integration Service ignores the other target-level row options and treats all rows as deletes.
If you disable the Delete target option, the Integration Service rejects all rows.
♦ Updates. If you treat source rows as updates, the behavior of the Integration Service depends on the target
options you select.
Table 10-4 describes how the Integration Service loads the target when you configure the session to treat
source rows as updates:

Table 10-4. Effect of Target Options when You Treat Source Rows as Updates
♦ Insert. If enabled, the Integration Service uses the target update option (Update as Update, Update as Insert, or Update else Insert) to update rows. If disabled, the Integration Service rejects all rows when you select Update as Insert or Update else Insert as the target-level update option.
♦ Update as Update*. Integration Service updates all rows as updates.
♦ Update as Insert*. Integration Service updates all rows as inserts. You must also select the Insert target option.
♦ Update else Insert*. Integration Service updates existing rows and inserts other rows as if marked for insert. You must also select the Insert target option.
♦ Delete. Integration Service ignores this setting and uses the selected target update option.
*The Integration Service rejects all rows if you do not select one of the target update options.

♦ Data Driven. If you treat source rows as data driven, you use an Update Strategy transformation to specify
how the Integration Service handles rows. However, the behavior of the Integration Service also depends on
the target options you select.
Table 10-5 describes how the Integration Service loads the target when you configure the session to treat
source rows as data driven:

Table 10-5. Effect of Target Options when You Treat Source Rows as Data Driven
♦ Insert. If enabled, the Integration Service inserts all rows flagged for insert. Enabled by default. If disabled, the Integration Service rejects the following rows:
- Rows flagged for insert
- Rows flagged for update if you enable Update as Insert or Update else Insert
♦ Update as Update*. Integration Service updates all rows flagged for update. Enabled by default.
♦ Update as Insert*. Integration Service inserts all rows flagged for update. Disabled by default.
♦ Update else Insert*. Integration Service updates rows flagged for update and inserts remaining rows as if marked for insert.
♦ Delete. If enabled, the Integration Service deletes all rows flagged for delete. If disabled, the Integration Service rejects all rows flagged for delete.
*The Integration Service rejects rows flagged for update if you do not select one of the target update options.

Truncating Target Tables


The Integration Service can truncate target tables before running a session. You can choose to truncate tables on
a target-by-target basis. If you have more than one target instance, select the truncate target table option for one
target instance.
The Integration Service issues a delete or truncate command based on the target database and primary key-
foreign key relationships in the session target. To optimize performance, use the truncate table command. The
delete from command may impact performance.
Table 10-6 describes the commands that the Integration Service issues for each database:

Table 10-6. Integration Service Commands on Supported Databases
♦ DB2. With a primary key referenced by a foreign key: truncate table <table_name>*. Without: truncate table <table_name>*.
♦ Informix. With a primary key referenced by a foreign key: delete from <table_name>. Without: delete from <table_name>.
♦ ODBC. With a primary key referenced by a foreign key: delete from <table_name>. Without: delete from <table_name>.
♦ Oracle. With a primary key referenced by a foreign key: delete from <table_name>. Without: unrecoverable truncate table <table_name>.
♦ Microsoft SQL Server. With a primary key referenced by a foreign key: delete from <table_name>. Without: truncate table <table_name>**.
♦ Sybase 11.x. With a primary key referenced by a foreign key: truncate table <table_name>. Without: truncate table <table_name>.
*If you use a DB2 database on AS/400, the Integration Service issues a clrpfm command.
**If you use the Microsoft SQL Server ODBC driver, the Integration Service issues a delete statement.



If the Integration Service issues a truncate target table command and the target table instance specifies a table
name prefix, the Integration Service verifies the database user privileges for the target table by issuing a truncate
command. If the database user is not specified as the target owner name or does not have the database privilege
to truncate the target table, the Integration Service issues a delete command instead.
When the Integration Service issues a delete command, it writes the following message to the session log:
WRT_8208 Error truncating target table <target table name>. The Integration Service is
trying the DELETE FROM query.

If the Integration Service issues a delete command and the database has logging enabled, the database saves all
deleted records to the log for rollback. If you do not want to save deleted records for rollback, you can disable
logging to improve the speed of the delete.
For all databases, if the Integration Service fails to truncate or delete any selected table because the user lacks
the necessary privileges, the session fails.
If you enable truncate target tables with the following sessions, the Integration Service does not truncate target
tables:
♦ Incremental aggregation. When you enable both truncate target tables and incremental aggregation in the
session properties, the Workflow Manager issues a warning that you cannot enable truncate target tables and
incremental aggregation in the same session.
♦ Test load. When you enable both truncate target tables and test load, the Integration Service disables the
truncate table function, runs a test load session, and writes the following message to the session log:
WRT_8105 Truncate target tables option turned off for test load session.

♦ Real-time. The Integration Service does not truncate target tables when you restart a JMS or Websphere
MQ real-time session that has recovery data.

To truncate a target table:

1. In the Workflow Manager, open the session properties.


2. Click the Mapping tab, and then click the Transformations view.
3. Click the Targets node.
4. In the Properties settings, select Truncate Target Table Option for each target table you want the
Integration Service to truncate before it runs the session.
5. Click OK.

Deadlock Retry
Select the Session Retry on Deadlock option in the session properties if you want the Integration Service to
retry writes to a target database or recovery table on a deadlock. A deadlock occurs when the Integration Service
attempts to take control of the same lock for a database row.
The Integration Service may encounter a deadlock under the following conditions:
♦ A session writes to a partitioned target.
♦ Two sessions write simultaneously to the same target.
♦ Multiple sessions simultaneously write to the recovery table, PM_RECOVERY.
Encountering deadlocks can slow session performance. To improve session performance, you can increase the
number of target connection groups the Integration Service uses to write to the targets in a session. To use a
different target connection group for each target in a session, use a different database connection name for each
target instance. You can specify the same connection information for each connection name.
You can retry sessions on deadlock for targets configured for normal load. If you select this option and
configure a target for bulk mode, the Integration Service does not retry target writes on a deadlock for that
target. You can also configure the Integration Service to set the number of deadlock retries and the deadlock
sleep time period.



To retry a session on deadlock, click the Properties tab in the session properties and then scroll down to the
Performance settings.

RELATED TOPICS:
♦ “Working with Target Connection Groups” on page 177

Dropping and Recreating Indexes


After you insert significant amounts of data into a target, you normally need to drop and recreate indexes on
that table to optimize query speed. You can drop and recreate indexes by:
♦ Using pre- and post-session SQL. The preferred method for dropping and re-creating indexes is to define
an SQL statement in the Pre SQL property that drops indexes before loading data to the target. Use the Post
SQL property to recreate the indexes after loading data to the target. Define the Pre SQL and Post SQL
properties for relational targets in the Transformations view on the Mapping tab in the session properties. A
brief example follows this list.
♦ Using the Designer. The same dialog box you use to generate and execute DDL code for table creation can
drop and recreate indexes. However, this process is not automatic. Every time you run a session that
modifies the target table, you need to launch the Designer and use this feature.
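
As a minimal sketch of the pre- and post-session SQL approach, the following statements drop an index before the load and recreate it after the load completes. The table and index names are hypothetical, and the exact DROP INDEX and CREATE INDEX syntax depends on the target database:

Pre SQL:  DROP INDEX IDX_ORDERS_CUSTID
Post SQL: CREATE INDEX IDX_ORDERS_CUSTID ON T_ORDERS (CUST_ID)

Because the Integration Service runs pre-session SQL before loading data and post-session SQL after loading data, the rows are inserted into an unindexed table and the index is rebuilt once at the end of the load.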

RELATED TOPICS:
♦ “Using Pre- and Post-Session SQL Commands” on page 127

Constraint-Based Loading
In the Workflow Manager, you can specify constraint-based loading for a session. When you select this option,
the Integration Service orders the target load on a row-by-row basis. For every row generated by an active
source, the Integration Service loads the corresponding transformed row first to the primary key table, then to
any foreign key tables. Constraint-based loading depends on the following requirements:
♦ Active source. Related target tables must have the same active source.
♦ Key relationships. Target tables must have key relationships.
♦ Target connection groups. Targets must be in one target connection group.
♦ Treat rows as insert. Use this option when you insert into the target. You cannot use updates with
constraint-based loading.

Active Source
When target tables receive rows from different active sources, the Integration Service reverts to normal loading
for those tables, but loads all other targets in the session using constraint-based loading when possible. For
example, a mapping contains three distinct pipelines. The first two contain a source, source qualifier, and
target. Since these two targets receive data from different active sources, the Integration Service reverts to
normal loading for both targets. The third pipeline contains a source, Normalizer, and two targets. Since these
two targets share a single active source (the Normalizer), the Integration Service performs constraint-based
loading: loading the primary key table first, then the foreign key table.

RELATED TOPICS:
♦ “Working with Active Sources” on page 178

Key Relationships
When target tables have no key relationships, the Integration Service does not perform constraint-based
loading. Similarly, when target tables have circular key relationships, the Integration Service reverts to a normal
load. For example, you have one target containing a primary key and a foreign key related to the primary key in
a second target. The second target also contains a foreign key that references the primary key in the first target.
The Integration Service cannot enforce constraint-based loading for these tables. It reverts to a normal load.



Target Connection Groups
The Integration Service enforces constraint-based loading for targets in the same target connection group. If
you want to specify constraint-based loading for multiple targets that receive data from the same active source,
you must verify the tables are in the same target connection group. If the tables with the primary key-foreign
key relationship are in different target connection groups, the Integration Service cannot enforce constraint-
based loading when you run the workflow.
To verify that all targets are in the same target connection group, complete the following tasks:
♦ Verify all targets are in the same target load order group and receive data from the same active source.
♦ Use the default partition properties and do not add partitions or partition points.
♦ Define the same target type for all targets in the session properties.
♦ Define the same database connection name for all targets in the session properties.
♦ Choose normal mode for the target load type for all targets in the session properties.

RELATED TOPICS:
♦ “Working with Target Connection Groups” on page 177

Treat Rows as Insert


Use constraint-based loading when the session option Treat Source Rows As is set to Insert. You might get
inconsistent data if you select a different Treat Source Rows As option and you configure the session for
constraint-based loading.
When the mapping contains Update Strategy transformations and you need to load data to a primary key table
first, split the mapping using one of the following options:
♦ Load primary key table in one mapping and dependent tables in another mapping. Use constraint-based
loading to load the primary table.
♦ Perform inserts in one mapping and updates in another mapping.
Constraint-based loading does not affect the target load ordering of the mapping. Target load ordering defines
the order the Integration Service reads the sources in each target load order group in the mapping. A target load
order group is a collection of source qualifiers, transformations, and targets linked together in a mapping.
Constraint-based loading establishes the order in which the Integration Service loads individual targets within a
set of targets receiving data from a single source qualifier.

Example
The session for the mapping in Figure 10-2 is configured to perform constraint-based loading. In the first
pipeline, target T_1 has a primary key; T_2 and T_3 contain foreign keys referencing the T_1 primary key. T_3
has a primary key that T_4 references as a foreign key.
Since these four tables receive records from a single active source, SQ_A, the Integration Service loads rows to
the target in the following order:
♦ T_1
♦ T_2 and T_3 (in no particular order)
♦ T_4
The Integration Service loads T_1 first because it has no foreign key dependencies and contains a primary key
referenced by T_2 and T_3. The Integration Service then loads T_2 and T_3, but since T_2 and T_3 have no



dependencies, they are not loaded in any particular order. The Integration Service loads T_4 last, because it has
a foreign key that references a primary key in T_3.

Figure 10-2. Mapping Using Constraint-Based Loading

After loading the first set of targets, the Integration Service begins reading source B. If there are no key
relationships between T_5 and T_6, the Integration Service reverts to a normal load for both targets.
If T_6 has a foreign key that references a primary key in T_5, since T_5 and T_6 receive data from a single
active source, the Aggregator AGGTRANS, the Integration Service loads rows to the tables in the following
order:
♦ T_5
♦ T_6
T_1, T_2, T_3, and T_4 are in one target connection group if you use the same database connection for each
target, and you use the default partition properties. T_5 and T_6 are in another target connection group
together if you use the same database connection for each target and you use the default partition properties.
The Integration Service includes T_5 and T_6 in a different target connection group because they are in a
different target load order group from the first four targets.

To enable constraint-based loading:

1. In the General Options settings of the Properties tab, choose Insert for the Treat Source Rows As property.
2. Click the Config Object tab. In the Advanced settings, select Constraint Based Load Ordering.
3. Click OK.

Bulk Loading
You can enable bulk loading when you load to DB2, Sybase, Oracle, or Microsoft SQL Server.
If you enable bulk loading for other database types, the Integration Service reverts to a normal load. Bulk
loading improves the performance of a session that inserts a large amount of data to the target database.
Configure bulk loading on the Mapping tab.
When bulk loading, the Integration Service invokes the database bulk utility and bypasses the database log,
which speeds performance. Without writing to the database log, however, the target database cannot perform
rollback. As a result, you may not be able to perform recovery. Therefore, you must weigh the importance of
improved session performance against the ability to recover an incomplete session.
Note: When loading to DB2, Microsoft SQL Server, and Oracle targets, you must specify a normal load for data
driven sessions. When you specify bulk mode and data driven, the Integration Service reverts to normal load.



Committing Data
When bulk loading to Sybase and DB2 targets, the Integration Service ignores the commit interval you define
in the session properties and commits data when the writer block is full.
When bulk loading to Microsoft SQL Server and Oracle targets, the Integration Service commits data at each
commit interval. Also, Microsoft SQL Server and Oracle start a new bulk load transaction after each commit.
Tip: When bulk loading to Microsoft SQL Server or Oracle targets, define a large commit interval to reduce the
number of bulk load transactions and increase performance.

Oracle Guidelines
When bulk loading to Oracle, the Integration Service invokes the SQL*Loader.
Use the following guidelines when bulk loading to Oracle:
♦ Do not define CHECK constraints in the database.
♦ Do not define primary and foreign keys in the database. However, you can define primary and foreign keys
for the target definitions in the Designer.
♦ To bulk load into indexed tables, choose non-parallel mode and disable the Enable Parallel Mode option.
For more information, see “Relational Database Connections” on page 28.
Note that when you disable parallel mode, you cannot load multiple target instances, partitions, or sessions
into the same table.
To bulk load in parallel mode, you must drop indexes and constraints in the target tables before running a
bulk load session. After the session completes, you can rebuild them. If you use bulk loading with the session
on a regular basis, use pre- and post-session SQL to drop and rebuild indexes and key constraints.
♦ When you use the LONG datatype, verify it is the last column in the table.
♦ Specify the Table Name Prefix for the target when you use Oracle client 9i. If you do not specify the table
name prefix, the Integration Service uses the database login as the prefix.
For more information, see the Oracle documentation.

DB2 Guidelines
Use the following guidelines when bulk loading to DB2:
♦ You must drop indexes and constraints in the target tables before running a bulk load session. After the
session completes, you can rebuild them. If you use bulk loading with the session on a regular basis, use pre-
and post-session SQL to drop and rebuild indexes and key constraints.
♦ You cannot use source-based or user-defined commit when you run bulk load sessions on DB2.
♦ If you create multiple partitions for a DB2 bulk load session, you must use database partitioning for the
target partition type. If you choose any other partition type, the Integration Service reverts to normal load
and writes the following message to the session log:
ODL_26097 Only database partitioning is support for DB2 bulk load. Changing target load
type variable to Normal.

♦ When you bulk load to DB2, the DB2 database writes non-fatal errors and warnings to a message log file in
the session log directory. The message log file name is
<session_log_name>.<target_instance_name>.<partition_index>.log. You can check both the message log
file and the session log when you troubleshoot a DB2 bulk load session.
♦ If you want to bulk load flat files to DB2 for z/OS, use PowerExchange. For more information, see the
PowerExchange Interfaces for PowerCenter.
For more information, see the DB2 documentation.



Table Name Prefix
The table name prefix is the owner of the target table. For some databases, such as DB2, tables can have
different owners. If the database user specified in the database connection is not the owner of the target tables
in a session, specify the table owner for each target instance. A session can fail if the database user is not the
owner and you do not specify the table owner name.
You can specify the table owner name in the target instance or on the Mapping tab of the session properties.
When you specify the table owner name in the session properties, you override table owner name in the
transformation properties.
You can use a parameter or variable as the target table name prefix. Use any parameter or variable type that you
can define in the parameter file. For example, you can use a session parameter, $ParamMyPrefix, as the table
name prefix, and set $ParamMyPrefix to the table name prefix in the parameter file.
Note: When you specify the table owner name and you set the sqlid for a DB2 database in the connection
environment SQL, the Integration Service uses table owner name in the target instance. To use the table owner
name specified in the SET sqlid statement, do not enter a name in the target name prefix.

Target Table Name


You can override the target table name in the session properties. Override the target table name when you use a
single session to load data to different target tables. Enter a table name in the target table name field, or enter a
parameter or variable to define the target table name in the parameter file. You can use mapping parameters,
mapping variables, session parameters, workflow variables, or worklet variables in the target table name. For
example, you can use a session parameter, $ParamTgtTable, as the target table name, and set $ParamTgtTable
to the target table name in the parameter file.
Configure the target table name in the Transformations view on the Mapping tab.
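
As a sketch, the table name prefix and target table name parameters described above could be set in a parameter file section similar to the following. The folder, workflow, and session names, as well as the prefix and table values, are hypothetical:

[MyFolder.WF:wf_load_sales.ST:s_load_sales]
$ParamMyPrefix=SALES_OWNER
$ParamTgtTable=T_SALES_DAILY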

Reserved Words
If any table name or column name contains a database reserved word, such as MONTH or YEAR, the session
fails with database errors when the Integration Service executes SQL against the database. You can create and
maintain a reserved words file, reswords.txt, in the server/bin directory. When the Integration Service initializes
a session, it searches for reswords.txt. If the file exists, the Integration Service places quotes around matching
reserved words when it executes SQL against the database.
Use the following rules and guidelines when working with reserved words:
♦ The Integration Service searches the reserved words file when it generates SQL to connect to source, target,
and lookup databases.
♦ If you override the SQL for a source, target, or lookup, you must enclose any reserved word in quotes.
♦ You may need to enable some databases, such as Microsoft SQL Server and Sybase, to use SQL-92 standards
regarding quoted identifiers. Use connection environment SQL to issue the command. For example, use the
following command with Microsoft SQL Server:
SET QUOTED_IDENTIFIER ON

Sample reswords.txt File


To use a reserved words file, create a file named reswords.txt and place it in the server/bin directory. Create a
section for each database that you need to store reserved words for. Add reserved words used in any table or
column name. You do not need to store all reserved words for a database in this file. Database names and
reserved words in reswords.txt are not case sensitive.
Following is a sample reswords.txt file:
[Teradata]
MONTH
DATE
INTERVAL



[Oracle]
OPTION
START
[DB2]
[SQL Server]
CURRENT
[Informix]
[ODBC]
MONTH
[Sybase]
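
For example, with the [Teradata] section above, generated SQL against a Teradata table that contains a column named MONTH would place quotes around the reserved word, roughly as in this hypothetical statement (the table and other column names are illustrative only):

INSERT INTO SALES_SUMMARY ("MONTH", REGION, AMOUNT) VALUES (...)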

Working with Target Connection Groups


When you create a session with at least one relational target, SAP NetWeaver BI target, or dynamic MQSeries
target, you need to consider target connection groups. A target connection group is a group of targets that the
Integration Service uses to determine commits and loading. When the Integration Service performs a database
transaction, such as a commit, it performs the transaction concurrently to all targets in a target connection
group.
The Integration Service performs the following database transactions per target connection group:
♦ Deadlock retry. If the Integration Service encounters a deadlock when it writes to a target, the deadlock
affects targets in the same target connection group. The Integration Service still writes to targets in other
target connection groups. For more information, see “Deadlock Retry” on page 171.
♦ Constraint-based loading. The Integration Service enforces constraint-based loading for targets in a target
connection group. If you want to specify constraint-based loading, you must verify the primary table and
foreign table are in the same target connection group. For more information, see “Constraint-Based
Loading” on page 172.
Targets in the same target connection group meet the following criteria:
♦ Belong to the same partition.
♦ Belong to the same target load order group and transaction control unit.
♦ Have the same target type in the session.
♦ Have the same database connection name for relational targets, and the same Application connection name
for SAP NetWeaver BI targets. For more information, see the PowerExchange for SAP NetWeaver User Guide.
♦ Have the same target load type, either normal or bulk mode.
For example, suppose you create a session based on a mapping that reads data from one source and writes to two
Oracle target tables. In the Workflow Manager, you do not create multiple partitions in the session. You use the
same Oracle database connection for both target tables in the session properties. You specify normal mode for
the target load type for both target tables in the session properties. The targets in the session belong to the same
target connection group.
Suppose you create a session based on the same mapping. In the Workflow Manager, you do not create multiple
partitions. However, you use one Oracle database connection name for one target, and you use a different
Oracle database connection name for the other target. You specify normal mode for the target load type for
both target tables. The targets in the session belong to different target connection groups.
Note: When you define the target database connections for multiple targets in a session using session parameters,
the targets may or may not belong to the same target connection group. The targets belong to the same target
connection group if all session parameters resolve to the same target connection name. For example, you create
a session with two targets and specify the session parameter $DBConnection1 for one target, and
$DBConnection2 for the other target. In the parameter file, you define $DBConnection1 as Sales1 and you
define $DBConnection2 as Sales1 and run the workflow. Both targets in the session belong to the same target
connection group.
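
As a sketch of the last example, a parameter file in which both session parameters resolve to the same connection name, placing both targets in the same target connection group, might contain a section similar to the following. The folder, workflow, and session names are hypothetical:

[MyFolder.WF:wf_two_targets.ST:s_two_targets]
$DBConnection1=Sales1
$DBConnection2=Sales1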



Working with Active Sources
An active source is an active transformation the Integration Service uses to generate rows. An active source can
be any of the following transformations:
♦ Aggregator
♦ Application Source Qualifier
♦ Custom, configured as an active transformation
♦ Joiner
♦ MQ Source Qualifier
♦ Normalizer (VSAM or pipeline)
♦ Rank
♦ Sorter
♦ Source Qualifier
♦ XML Source Qualifier
♦ Mapplet, if it contains any of the above transformations
Note: Although the Filter, Router, Transaction Control, and Update Strategy transformations are active
transformations, the Integration Service does not use them as active sources in a pipeline.
Active sources affect how the Integration Service processes a session when you use any of the following
transformations or session properties:
♦ XML targets. The Integration Service can load data from different active sources to an XML target when
each input group receives data from one active source. For more information about XML targets, see the
PowerCenter XML Guide.
♦ Transaction generators. Transaction generators, such as Transaction Control transformations, become
ineffective for downstream transformations or targets if you put a transaction control point after them.
Transaction control points are transaction generators and active sources that generate commits. For more
information about effective and ineffective transaction generators, see the PowerCenter Transformation
Guide. For a list of transaction control points, see “Transformation Scope” on page 215.
♦ Mapplets. An Input transformation must receive data from a single active source. For more information
about connecting mapplets to active sources in mappings, see the PowerCenter Designer Guide.
♦ Source-based commit. Some active sources generate commits. When you run a source-based commit
session, the Integration Service generates a commit from these active sources at every commit interval. For
more information about source-based commit sessions, see “Source-Based Commits” on page 208.
♦ Constraint-based loading. To use constraint-based loading, you must connect all related targets to the same
active source. The Integration Service orders the target load on a row-by-row basis based on rows generated
by an active source. For more information about constraint-based loading, see “Constraint-Based Loading”
on page 172.
♦ Row error logging. If an error occurs downstream from an active source that is not a source qualifier, the
Integration Service cannot identify the source row information for the logged error row. For more
information about logging errors, see “Row Error Logging” on page 393.

Working with File Targets


You can output data to a flat file in either of the following ways:
♦ Use a flat file target definition. Create a mapping with a flat file target definition. Create a session using the
flat file target definition. When the Integration Service runs the session, it creates the target flat file or
generates the target data based on the flat file target definition.



♦ Use a relational target definition. Use a relational definition to write to a flat file when you want to use an
external loader to load the target. Create a mapping with a relational target definition. Create a session using
the relational target definition. Configure the session to output to a flat file by specifying the File Writer in
the Writers settings on the Mapping tab. For more information about using the external loader feature, see
“External Loading” on page 419.
You can configure the following properties for flat file targets:
♦ Target properties. You can define target properties such as partitioning options, merge options, output file
options, reject options, and command options. For more information, see “Configuring Target Properties”
on page 179.
♦ Flat file properties. You can choose to create delimited or fixed-width files, and define their properties. For
more information, see “Configuring Fixed-Width Properties” on page 181 and “Configuring Delimited
Properties” on page 182.

Configuring Target Properties


You can configure session properties for flat file targets in the Properties settings on the Mapping tab, and in
the General Options settings on the Properties tab. Define the properties for each target instance in the session.
Table 10-7 describes the properties you define on the Mapping tab for flat file target definitions:

Table 10-7. Flat File Target Properties

Target Properties Description

Merge Type Type of merge the Integration Service performs on the data for partitioned targets.
For more information about partitioning targets and creating merge files, see
“Partitioning File Targets” on page 268.

Merge File Directory   Name of the merge file directory. By default, the Integration Service writes the merge
                       file in the service process variable directory, $PMTargetFileDir.
                       If you enter a full directory and file name in the Merge File Name field, clear this field.

Merge File Name Name of the merge file. Default is target_name.out. This property is required if you
select a merge type.

Append if Exists Appends the output data to the target files and reject files for each partition. Appends
output data to the merge file if you merge the target files. You cannot use this option for
target files that are non-disk files, such as FTP target files.
If you do not select this option, the Integration Service truncates each target file before
writing the output data to the target file. If the file does not exist, the Integration Service
creates it.

Header Options Create a header row in the file target. You can choose the following options:
- No Header. Do not create a header row in the flat file target.
- Output Field Names. Create a header row in the file target with the output port names.
- Use header command output. Use the command in the Header Command field to
generate a header row. For example, you can use a command to add the date to a
header row for the file target.
Default is No Header.

Header Command Command used to generate the header row in the file target. For more information about
using commands, see “Configuring Commands for File Targets” on page 180.

Footer Command Command used to generate a footer row in the file target. For more information about
using commands, see “Configuring Commands for File Targets” on page 180.

Output Type Type of target for the session. Select File to write the target data to a file target. Select
Command to output data to a command. You cannot select Command for FTP or Queue
target connections.
For more information about processing output data with a command, see “Configuring
Commands for File Targets” on page 180.

Merge Command Command used to process the output data from all partitioned targets. For more
information about merging target data with a command, see “Partitioning File Targets”
on page 268.


Output File Directory  Name of the output directory for a flat file target. By default, the Integration Service
                       writes output files in the service process variable directory, $PMTargetFileDir.
                       If you specify both the directory and file name in the Output Filename field, clear this
                       field. The Integration Service concatenates this field with the Output Filename field
                       when it runs the session.
                       You can also use the $OutputFileName session parameter to specify the file directory.

Output File Name File name, or file name and path of the flat file target. Optionally, use the
$OutputFileName session parameter for the file name. By default, the Workflow
Manager names the target file based on the target definition used in the mapping:
target_name.out. The Integration Service concatenates this field with the Output File
Directory field when it runs the session.
If the target definition contains a slash character, the Workflow Manager replaces the
slash character with an underscore.
When you use an external loader to load to an Oracle database, you must specify a file
extension. If you do not specify a file extension, the Oracle loader cannot find the flat file
and the Integration Service fails the session.
Note: If you specify an absolute path file name when using FTP, the Integration Service
ignores the Default Remote Directory specified in the FTP connection. When you
specify an absolute path file name, do not use single or double quotes.

Reject File Directory  Name of the directory for the reject file. By default, the Integration Service writes all
                       reject files to the service process variable directory, $PMBadFileDir.
                       If you specify both the directory and file name in the Reject Filename field, clear this
                       field. The Integration Service concatenates this field with the Reject Filename field when
                       it runs the session.
                       You can also use the $BadFileName session parameter to specify the file directory.

Reject File Name File name, or file name and path of the reject file. By default, the Integration Service
names the reject file after the target instance name: target_name.bad. Optionally use
the $BadFileName session parameter for the file name.
The Integration Service concatenates this field with the Reject File Directory field when
it runs the session. For example, if you have “C:\reject_file\” in the Reject File Directory
field, and enter “filename.bad” in the Reject Filename field, the Integration Service
writes rejected rows to C:\reject_file\filename.bad.

Command Command used to process the target data. For more information, see “Configuring
Commands for File Targets” on page 180.

Set File Properties Link   Define flat file properties. For more information, see “Configuring Fixed-Width
                           Properties” on page 181 and “Configuring Delimited Properties” on page 182.
                           Set the file properties when you output to a flat file using a relational target
                           definition in the mapping.

Configuring Commands for File Targets


Use a command to process target data for a flat file target. Use any valid UNIX command or shell script on
UNIX. Use any valid DOS or batch file on Windows. The flat file writer sends the data to the command
instead of a flat file target.
Use a command to perform additional processing of flat file target data. For example, use a command to sort
target data or compress target data. You can increase session performance by pushing transformation tasks to
the command instead of the Integration Service.
To send the target data to a command, select Command for the output type and enter a command for the
Command property.
For example, to generate a compressed file from the target data, use the following command:
compress -c - > $PMTargetFileDir/myCompressedFile.Z

The Integration Service sends the output data to the command, and the command generates a compressed file
that contains the target data.
Note: You can also use service process variables, such as $PMTargetFileDir, in the command.
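For example, to sort the target data before it is written to disk, you might use a command such as the following.
This is a sketch only: it assumes a comma-delimited target sorted on the first field, and the output file name is
hypothetical:
sort -t, -k1,1 > $PMTargetFileDir/sorted_items.out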



Configuring Test Load Options
You can configure the Integration Service to perform a test load. With a test load, the Integration Service reads
and transforms data without writing to targets. The Integration Service generates all session files and performs
all pre- and post-session functions, as if running the full session. To configure a session to perform a test load,
enable test load and enter the number of rows to test.
The Integration Service writes data to relational targets, but rolls back the data when the session completes. For
all other target types, such as flat file and SAP BW, the Integration Service does not write data to the targets.
Use the following rules and guidelines when performing a test load:
♦ You cannot perform a test load on sessions using XML sources.
♦ You can perform a test load for relational targets when you configure a session for normal mode.
♦ If you configure the session for bulk mode, the session fails.
♦ Enable a test load on the session Properties tab.
Table 10-8 describes the test load options in the General Options settings on the Properties tab:

Table 10-8. Test Load Options - Flat File Targets

Property Description

Enable Test Load         You can configure the Integration Service to perform a test load.
                         With a test load, the Integration Service reads and transforms data without writing to
                         targets. The Integration Service generates all session files and performs all pre- and
                         post-session functions, as if running the full session.
                         The Integration Service writes data to relational targets, but rolls back the data when
                         the session completes. For all other target types, such as flat file and SAP BW, the
                         Integration Service does not write data to the targets.
                         Enter the number of source rows you want to test in the Number of Rows to Test field.
                         You cannot perform a test load on sessions using XML sources.
                         You can perform a test load for relational targets when you configure a session for
                         normal mode. If you configure the session for bulk mode, the session fails.

Number of Rows to Test   Enter the number of source rows you want the Integration Service to test load.
                         The Integration Service reads the number you configure for the test load.

Configuring Fixed-Width Properties


When you write data to a fixed-width file, you can edit file properties in the session properties, such as the null
character or code page. You can configure fixed-width properties for non-reusable sessions in the Workflow
Designer and for reusable sessions in the Task Developer. You cannot configure fixed-width properties for
instances of reusable sessions in the Workflow Designer.
In the Transformations view on the Mapping tab, click the Targets node and then click Set File Properties to
open the Flat Files dialog box.
To edit the fixed-width properties, select Fixed Width and click Advanced.



Table 10-9 describes the options you define in the Fixed Width Properties dialog box:

Table 10-9. Writing to a Fixed-Width Target

Fixed-Width Properties Options      Description

Null Character Enter the character you want the Integration Service to use to represent null
values. You can enter any valid character in the file code page.
For more information about using null characters for target files, see “Null
Characters in Fixed-Width Files” on page 186.

Repeat Null Character Select this option to indicate a null value by repeating the null character to fill the
field. If you do not select this option, the Integration Service enters a single null
character at the beginning of the field to represent a null value. For more
information about specifying null characters for target files, see “Null Characters in
Fixed-Width Files” on page 186.

Code Page Code page of the fixed-width file. Select a code page or a variable:
- Code page. Select the code page.
- Use Variable. Enter a user-defined workflow or worklet variable or the session
parameter $ParamName, and define the code page in the parameter file. Use the
code page name. For more information about supported code pages, see the
PowerCenter Administrator Guide.
Default is the PowerCenter Client code page.

Configuring Delimited Properties


When you write data to a delimited file, you can edit file properties in the session properties, such as the
delimiter or code page. You can configure delimited properties for non-reusable sessions in the Workflow
Designer and for reusable sessions in the Task Developer. You cannot configure delimited properties for
instances of reusable sessions in the Workflow Designer.
In the Transformations view on the Mapping tab, click the Targets node and then click Set File Properties to
open the Flat Files dialog box. To edit the delimited properties, select Delimited and click Advanced.
Table 10-10 describes the options you can define in the Delimited File Properties dialog box:

Table 10-10. Delimited File Properties

Edit Delimiter Options      Description

Delimiters Character used to separate columns of data. Delimiters can be either printable or
single-byte unprintable characters, and must be different from the escape character and
the quote character (if selected). To enter a single-byte unprintable character, click the
Browse button to the right of this field. In the Delimiters dialog box, select an unprintable
character from the Insert Delimiter list and click Add. You cannot select unprintable
multibyte characters as delimiters.

Optional Quotes Select None, Single, or Double. If you select a quote character, the Integration Service
does not treat delimiter characters within the quote characters as a delimiter. For
example, suppose an output file uses a comma as a delimiter and the Integration
Service receives the following row: 342-3849, ‘Smith, Jenna’, ‘Rockville, MD’, 6.
If you select the optional single quote character, the Integration Service ignores the
commas within the quotes and writes the row as four fields.
If you do not select the optional single quote, the Integration Service writes six separate
fields.

Code Page Code page of the delimited file. Select a code page or a variable:
- Code page. Select the code page.
- Use Variable. Enter a user-defined workflow or worklet variable or the session
parameter $ParamName, and define the code page in the parameter file. Use the code
page name. For more information about supported code pages, see the PowerCenter
Administrator Guide.
Default is the PowerCenter Client code page.



Integration Service Handling for File Targets
When you configure a session to write to file targets, you must correctly configure the flat file target definitions
and the relational target definitions. The Integration Service loads data to flat files based on the following
criteria:
♦ Write to fixed-width flat files from relational target definitions. The Integration Service adds spaces to
target columns based on transformation datatype.
♦ Write to fixed-width flat files from flat file target definitions. You must configure the precision and field
width for flat file target definitions to accommodate the total length of the target field.
♦ Generate flat file targets by transaction. You can configure the file target to generate a separate output file
for each transaction.
♦ Write multibyte data to fixed-width files. You must configure the precision of string columns to
accommodate character data. When writing shift-sensitive data to a fixed-width flat file target, the
Integration Service adds shift characters and spaces to meet file requirements.
♦ Null characters in fixed-width files. The Integration Service writes repeating or non-repeating null
characters to fixed-width target file columns differently depending on whether the characters are single-byte
or multibyte.
♦ Character set. You can write ASCII or Unicode data to a flat file target.
♦ Write metadata to flat file targets. You can configure the Integration Service to write the column header
information when you write to flat file targets.

Writing to Fixed-Width Flat Files with Relational Target Definitions


When you want to output to a fixed-width file based on a relational target definition in the mapping, consider
how the Integration Service handles spacing in the target file.
When the Integration Service writes to a fixed-width flat file based on a relational target definition in the
mapping, it adds spaces to columns based on the transformation datatype connected to the target. This allows
the Integration Service to write optional symbols necessary for the datatype, such as a negative sign or decimal
point, without sending the row to the reject file.
For example, you connect a transformation Integer(10) port to a Number(10) column in a relational target
definition. In the session properties, you override the relational target definition to use the File Writer and you
specify to output a fixed-width flat file. In the target flat file, the Integration Service appends an additional byte
to the Number(10) column to allow for negative signs that might be associated with Integer data.
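In this example, the flat file column occupies 11 bytes: 10 bytes for the digits plus 1 byte for a possible
negative sign.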
Table 10-11 describes the number of bytes the Integration Service adds to the target column and optional
characters it uses for each datatype:

Table 10-11. Datatype Modifications for File Target Columns

Datatype Connected to          Bytes Added by         Optional Characters for the Datatype
Fixed-Width Flat File          Integration Service
Target Column

Decimal                        2                      - Negative sign (-) for the mantissa.
                                                      - Decimal point (.).

Double                         7                      - Negative sign for the mantissa.
                                                      - Decimal point.
                                                      - Negative sign, e, and three digits for the exponent,
                                                        for example, -4.2e-123.

Float                          7                      - Negative sign for the mantissa.
                                                      - Decimal point.
                                                      - Negative sign, e, and three digits for the exponent.

Integer                        1                      - Negative sign for the mantissa.


Money                          2                      - Negative sign for the mantissa.
                                                      - Decimal point.

Numeric                        2                      - Negative sign for the mantissa.
                                                      - Decimal point.

Real                           7                      - Negative sign for the mantissa.
                                                      - Decimal point.
                                                      - Negative sign, e, and three digits for the exponent.

Writing to Fixed-Width Files with Flat File Target Definitions


When you want to output to a fixed-width flat file based on a flat file target definition, you must configure
precision and field width for the target field to accommodate the total length of the target field. If the data for a
target field is too long for the total length of the field, the Integration Service performs one of the following
actions:
♦ Truncates the row for string columns
♦ Writes the row to the reject file for numeric and datetime columns
Note: When the Integration Service writes a row to the reject file, it writes a message in the session log.

When a session writes to a fixed-width flat file based on a fixed-width flat file target definition in the mapping,
the Integration Service defines the total length of a field by the precision or field width defined in the target.
Fixed-width files are byte-oriented, which means the total length of a field is measured in bytes.
Table 10-12 describes how the Integration Service measures the total field length for fields in a fixed-width flat
file target definition:

Table 10-12. Field Length Measurements for Fixed-Width Flat File Targets

Datatype Target Field Property That Determines Total Field Length

Number Field width

String Precision

Datetime Field width

Table 10-13 lists the characters you must accommodate when you configure the precision or field width for flat
file target definitions to accommodate the total length of the target field:

Table 10-13. Characters to Include when Calculating Field Length for Fixed-Width Targets

Datatype Characters to Accommodate

Number       - Decimal separator.
             - Thousands separators.
             - Negative sign (-) for the mantissa.

String       - Multibyte data.
             - Shift-in and shift-out characters.
             For more information, see “Writing Multibyte Data to Fixed-Width Flat Files” on page 185.

Datetime     - Date and time separators, such as slashes (/), dashes (-), and colons (:).
             For example, the format MM/DD/YYYY HH24:MI:SS.US has a total length of 26 bytes.

When you edit the flat file target definition in the mapping, define a precision or field width large enough to
accommodate both the target data and the characters in Table 10-13.



For example, suppose you have a mapping with a fixed-width flat file target definition. The target definition
contains a number column with a precision of 10 and a scale of 2. You use a comma as the decimal separator
and a period as the thousands separator. You know some rows of data might have a negative value. Based on this
information, you know the longest possible number is formatted with the following format:
-NN.NNN.NNN,NN

Open the flat file target definition in the mapping and define the field width for this number column as a
minimum of 14 bytes.
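The 14 bytes break down as follows:
10 digits + 2 thousands separators + 1 decimal separator + 1 negative sign = 14 bytes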

Generating Flat File Targets By Transaction


You can generate a separate output file each time the Integration Service starts a new transaction. You can
dynamically name each target flat file. To generate a separate output file for each transaction, add a FileName
port to the flat file target definition. When you connect the FileName port in the mapping, the Integration
Service creates a separate target file at each commit point. The Integration Service uses the FileName port value
from the first row in each transaction to name the output file.
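For example, if the FileName port receives the hypothetical value orders_20080601.out in the first row of a
transaction, the Integration Service uses that value to name the target file it creates at the commit point for
that transaction. The FileName value in the first row of the next transaction names the next file.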

RELATED TOPICS:
♦ “Creating Target Files by Transaction” on page 218

Writing Multibyte Data to Fixed-Width Flat Files


If you plan to load multibyte data into a fixed-width flat file, configure the precision to accommodate the
multibyte data. Fixed-width files are byte-oriented, not character-oriented. So, when you configure the
precision for a fixed-width target, you need to consider the number of bytes you load into the target, rather
than the number of characters.
For string columns, the Integration Service truncates the data if the precision is not large enough to
accommodate the multibyte data.
You might work with the following types of multibyte data:
♦ Non shift-sensitive multibyte data. The file contains all multibyte data. Configure the precision in the
target definition to allow for the additional bytes.
For example, you know that the target data contains four double-byte characters, so you define the target
definition with a precision of 8 bytes.
If you configure the target definition with a precision of 4, the Integration Service truncates the data before
writing to the target.
♦ Shift-sensitive multibyte data. The file contains single-byte and multibyte data. When writing to a shift-
sensitive flat file target, the Integration Service adds shift characters and spaces to meet file requirements.
You must configure the precision in the target definition to allow for the additional bytes and the shift
characters.
Note: Delimited files are character-oriented, and you do not need to allow for additional precision for multibyte
data.

Writing Shift-Sensitive Multibyte Data


When writing to a shift-sensitive flat file target, the Integration Service adds shift characters and spaces if the
data going into the target does not meet file requirements. You need to allow at least two extra bytes in each
data column containing multibyte data so the output data precision matches the byte width of the target
column.
The Integration Service writes shift characters and spaces in the following ways:
♦ If a column begins or ends with a double-byte character, the Integration Service adds shift characters so the
column begins and ends with a single-byte shift character.
♦ If the data is shorter than the column width, the Integration Service pads the rest of the column with spaces.



♦ If the data is longer than the column width, the Integration Service truncates the data so the column ends
with a single-byte shift character.
To illustrate how the Integration Service handles a fixed-width file containing shift-sensitive data, say you want
to output the following data to the target:
SourceCol1 SourceCol2
AAAA aaaa

A is a double-byte character, a is a single-byte character.


The first target column contains eight bytes and the second target column contains four bytes.
The Integration Service must add shift characters to handle the shift-sensitive data. Because the first target
column holds only eight bytes, the Integration Service truncates the data to make room for the shift characters.
TargetCol1 TargetCol2
-oAAA-i aaaa

The following table describes the notation used in this example:

Notation Description

A Double-byte character
-o Shift-out character
-i Shift-in character

For the first target column, the Integration Service writes three of the double-byte characters to the target. It
cannot write any additional double-byte characters to the output column because the column must end in a
single-byte character. If you add two more bytes to the first target column definition, then the Integration
Service can add shift characters and write all the data without truncation.
For the second target column, the Integration Service writes all four single-byte characters to the target. It does
not add shift characters to the column because the column begins and ends with single-byte characters.

Null Characters in Fixed-Width Files


You can specify any valid single-byte or multibyte character as a null character for a fixed-width target. You can
also use a space as a null character.
The null character can be repeating or non-repeating. If the null character is repeating, the Integration Service
writes as many null characters as possible into a target column. If you specify a multibyte null character and
there are extra bytes left after writing null characters, the Integration Service pads the column with single-byte
spaces. If a column is smaller than the multibyte character specified as the null character, the session fails at
initialization.

Character Set
You can configure the Integration Service to run sessions with flat file targets in either ASCII or Unicode data
movement mode.
If you configure a session with a flat file target to run in Unicode data movement mode, the target file code
page must be a superset of the source code page. Delimiters, escape, and null characters must be valid in the
specified code page of the flat file.
If you configure a session to run in ASCII data movement mode, delimiters, escape, and null characters must be
valid in the ISO Western European Latin1 code page. Any 8-bit character you specified in previous versions of
PowerCenter is still valid.



Writing Metadata to Flat File Targets
When you write to flat file targets, you can configure the Integration Service to write the column header
information. When you enable the Output Metadata For Flat File Target option, the Integration Service writes
column headers to flat file targets. It writes the target definition port names to the flat file target in the first line,
starting with the # symbol. By default, this option is disabled.
When writing to fixed-width files, the Integration Service truncates the target definition port name if it is
longer than the column width.
For example, you have a flat file target definition with the following structure:
Port Name Datatype
ITEM_ID number
ITEM_NAME string
PRICE number

The column width for ITEM_ID is six. When you enable the Output Metadata For Flat File Target option, the
Integration Service writes the following text to a flat file:
#ITEM_ITEM_NAME PRICE
100001Screwdriver 9.50
100002Hammer 12.90
100003Small nails 3.00

For information about configuring the Integration Service to output flat file metadata, see the PowerCenter
Administrator Guide.

Working with Heterogeneous Targets


You can output data to multiple targets in the same session. When the target types or database types of those
targets differ from each other, you have a session with heterogeneous targets.
To create a session with heterogeneous targets, you can create a session based on a mapping with heterogeneous
targets. Or, you can create a session based on a mapping with homogeneous targets and select different database
connections.
A heterogeneous target has one of the following characteristics:
♦ Multiple target types. You can create a session that writes to both relational and flat file targets.
♦ Multiple target connection types. You can create a session that writes to a target on an Oracle database and
to a target on a DB2 database. Or, you can create a session that writes to multiple targets of the same type,
but you specify different target connections for each target in the session.
All database connections you define in the Workflow Manager are unique to the Integration Service, even if you
define the same connection information. For example, you define two database connections, Sales1 and Sales2.
You define the same user name, password, connect string, code page, and attributes for both Sales1 and Sales2.
Even though both Sales1 and Sales2 define the same connection information, the Integration Service treats
them as different database connections. When you create a session with two relational targets and specify Sales1
for one target and Sales2 for the other target, you create a session with heterogeneous targets.
You can create a session with heterogeneous targets in one of the following ways:
♦ Create a session based on a mapping with targets of different types or different database types. In the session
properties, keep the default target types and database types.
♦ Create a session based on a mapping with the same target types. However, in the session properties, specify
different target connections for the different target instances, or override the target type to a different type.
You can specify the following target type overrides in a session:



♦ Relational target to flat file.
♦ Relational target to any other relational database type. Verify the datatypes used in the target definition are
compatible with both databases.
♦ SAP BW target to a flat file target type.
Note: When the Integration Service runs a session with at least one relational target, it performs database
transactions per target connection group. For example, it orders the target load for targets in a target
connection group when you enable constraint-based loading.

RELATED TOPICS:
♦ “Working with Target Connection Groups” on page 177

Reject Files
During a session, the Integration Service creates a reject file for each target instance in the mapping. If the
writer or the target rejects data, the Integration Service writes the rejected row into the reject file. The reject file
and session log contain information that helps you determine the cause of the reject.
Each time you run a session, the Integration Service appends rejected data to the reject file. Depending on the
source of the problem, you can correct the mapping and target database to prevent rejects in subsequent
sessions.
Note: If you enable row error logging in the session properties, the Integration Service does not create a reject
file. It writes the reject rows to the row error tables or file.

Locating Reject Files


The Integration Service creates reject files for each target instance in the mapping. It creates reject files in the
session reject file directory. Configure the target reject file directory on the Mapping tab for the session. By
default, the Integration Service creates reject files in the $PMBadFileDir process variable directory.
When you run a session that contains multiple partitions, the Integration Service creates a separate reject file for
each partition. The Integration Service names reject files after the target instance name. The default name for
reject files is filename_partitionnumber.bad. The reject file name for the first partition does not contain a
partition number.
For example,
/home/directory/filename.bad
/home/directory/filename2.bad
/home/directory/filename3.bad

The Workflow Manager replaces slash characters in the target instance name with underscore characters.
To find a reject file name and path, view the target properties settings on the Mapping tab of session properties.
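For example, on UNIX you might check whether a target produced rejected rows with commands such as the following.
The target instance name is hypothetical, and you substitute the actual reject file directory for $PMBadFileDir:
ls -l $PMBadFileDir/t_customers*.bad
wc -l $PMBadFileDir/t_customers.bad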

Reading Reject Files


After you locate a reject file, you can read it using a text editor that supports the reject file code page. Reject
files contain rows of data rejected by the writer or the target database. Though the Integration Service writes the
entire row in the reject file, the problem generally centers on one column within the row. To help you
determine which column caused the row to be rejected, the Integration Service adds row and column indicators
to give you more information about each column:
♦ Row indicator. The first column in each row of the reject file is the row indicator. The row indicator defines
whether the row was marked for insert, update, delete, or reject.
If the session is a user-defined commit session, the row indicator might indicate whether the transaction was
rolled back due to a non-fatal error, or if the committed transaction was in a failed target connection group.



♦ Column indicator. Column indicators appear after every column of data. The column indicator defines
whether the column contains valid, overflow, null, or truncated data.
The following sample reject file shows the row and column indicators:
0,D,1921,D,Nelson,D,William,D,415-541-5145,D
0,D,1922,D,Page,D,Ian,D,415-541-5145,D
0,D,1923,D,Osborne,D,Lyle,D,415-541-5145,D
0,D,1928,D,De Souza,D,Leo,D,415-541-5145,D
0,D,2001123456789,O,S. MacDonald,D,Ira,D,415-541-514566,T

RELATED TOPICS:
♦ “User-Defined Commits” on page 212

Row Indicators
The first column in the reject file is the row indicator. The row indicator is a flag that defines the update
strategy for the data row.
Table 10-14 describes the row indicators in a reject file:

Table 10-14. Row Indicators in Reject File

Row Indicator Meaning Rejected By

0 Insert Writer or target

1 Update Writer or target

2 Delete Writer or target

3 Reject Writer

4 Rolled-back insert Writer

5 Rolled-back update Writer

6 Rolled-back delete Writer

7 Committed insert Writer

8 Committed update Writer

9 Committed delete Writer

If a row indicator is 0, 1, or 2, the writer or the target database rejected the row.
If a row indicator is 3, an update strategy expression marked the row for reject.

Column Indicators
A column indicator appears after every column of data. A column indicator defines whether the data is valid,
overflow, null, or truncated.
Note: A column indicator “D” also appears after each row indicator.

Table 10-15 describes the column indicators in a reject file:

Table 10-15. Column Indicators in Reject File

Column Indicator   Type of data                                      Writer Treats As

D                  Valid data.                                       Good data. Writer passes it to the target database.
                                                                     The target accepts it unless a database error
                                                                     occurs, such as finding a duplicate key.

O                  Overflow. Numeric data exceeded the               Bad data, if you configured the mapping target to
                   specified precision or scale for the column.      reject overflow or truncated data.

N                  Null. The column contains a null value.           Good data. Writer passes it to the target, which
                                                                     rejects it if the target database does not accept
                                                                     null values.

T                  Truncated. String data exceeded a specified       Bad data, if you configured the mapping target to
                   precision for the column, so the Integration      reject overflow or truncated data.
                   Service truncated it.

Null columns appear in the reject file with commas marking their column. An example of a null column
surrounded by good data appears as follows:
5,D,,N,5,D

Either the writer or target database can reject a row. Consult the session log to determine the cause for
rejection.



CHAPTER 11

Real-time Processing
This chapter includes the following topics:
♦ Overview, 191
♦ Understanding Real-time Data, 192
♦ Configuring Real-time Sessions, 194
♦ Terminating Conditions, 195
♦ Flush Latency, 195
♦ Commit Type, 196
♦ Message Recovery, 197
♦ Recovery File, 198
♦ Recovery Table, 199
♦ Recovery Queue and Recovery Topic, 200
♦ Recovery Ignore List, 201
♦ Stopping Real-time Sessions, 202
♦ Restarting and Recovering Real-time Sessions, 202
♦ Rules and Guidelines, 203
♦ Real-time Processing Example, 204
♦ Informatica Real-time Products, 206

Overview
This chapter contains general information about real-time processing. Real-time processing behavior depends
on the real-time source. Exceptions are noted in this chapter or are described in the corresponding product
documentation. For more information about real-time processing for your Informatica real-time product, see
the Informatica product documentation. For more information about real-time processing for PowerExchange,
see PowerExchange Interfaces for PowerCenter.
You can use PowerCenter to process data in real time. Real-time processing is on-demand processing of data
from real-time sources. A real-time session reads, processes, and writes data to targets continuously. By default,
a session reads and writes bulk data at scheduled intervals unless you configure the session for real-time
processing.

To process data in real time, the data must originate from a real-time source. Real-time sources include JMS,
WebSphere MQ, TIBCO, webMethods, MSMQ, SAP, and web services. You might want to use real-time
processing for processes that require immediate access to dynamic data, such as financial data.
To understand real-time processing with PowerCenter, you need to be familiar with the following concepts:
♦ Real-time data. Real-time data includes messages and message queues, web service messages, and changes
from a PowerExchange change data capture source. Real-time data originates from a real-time source.
♦ Real-time sessions. A real-time session is a session that processes real-time source data. A session is real-time
if the Integration Service generates a real-time flush based on the flush latency configuration and all
transformations propagate the flush to the targets. Latency is the period of time from when source data
changes on a source to when a session writes the data to a target.
♦ Real-time properties. Real-time properties determine when the Integration Service processes the data and
commits the data to the target.
− Terminating conditions. Terminating conditions determine when the Integration Service stops reading
data from the source and ends the session if you do not want the session to run continuously.
− Flush latency. Flush latency determines how often the Integration Service flushes real-time data from the
source.
− Commit type. The commit type determines when the Integration Service commits real-time data to the
target.
♦ Message recovery. If the real-time session fails, you can recover messages. When you enable message recovery
for a real-time session, the Integration Service stores source messages or message IDs in a recovery file or
table. If the session fails, you can run the session in recovery mode to recover messages the Integration
Service could not process.

Understanding Real-time Data


You can process the following types of real-time data:
♦ Messages and message queues. Process messages and message queues from WebSphere MQ, JMS, MSMQ,
SAP, TIBCO, and webMethods sources. You can read from messages and message queues. You can write to
messages, messaging applications, and message queues.
♦ Web service messages. Receive a message from a web service client through the Web Services Hub and
transform the data. You can write the data to a target or send a message back to a web service client.
♦ Change data from PowerExchange change data capture sources. Extract change data that represent changes
to a relational database or file source from the change stream and write to a target.

Messages and Message Queues


The Integration Service uses the messaging and queueing architecture to process real-time data. It can read
messages from a message queue, process the message data, and write messages to a message queue.
You can also write messages to other messaging applications. For example, the Integration Service can read
messages from a JMS source and write the data to a TIBCO target.



Figure 11-1 shows an example of how the messaging application and the Integration Service process messages
from a message queue:

Figure 11-1. Message Queue Processing

[Diagram: a messaging application writes messages to a source message queue; the Integration Service reads the
messages from the queue, processes the data, and writes reply messages to a target message queue.]

The messaging application and the Integration Service complete the following tasks to process messages from a
message queue:
1. The messaging application adds a message to a queue.
2. The Integration Service reads the message from the queue and extracts the data.
3. The Integration Service processes the data.
4. The Integration Service writes a reply message to a queue.

Web Service Messages


A web service message is a SOAP request from a web service client or a SOAP response from the Web Services
Hub. The Integration Service processes real-time data from a web service client by receiving a message request
through the Web Services Hub and processing the request. The Integration Service can send a reply back to the
web service client through the Web Services Hub, or it can write the data to a target.
Figure 11-2 shows an example of how the web service client, the Web Services Hub, and the Integration Service
process web service messages:

Figure 11-2. Web Service Message Processing

[Diagram: a web service client sends a request to the Web Services Hub, which passes it to the Integration Service;
the Integration Service writes data to a target or returns a response through the Web Services Hub to the client.]

The web service client, the Web Services Hub, and the Integration Service complete the following tasks to
process web service messages:
1. The web service client sends a SOAP request to the Web Services Hub.
2. The Web Services Hub processes the SOAP request and passes the request to the Integration Service.
3. The Integration Service runs the service request. It sends a response to the Web Services Hub or writes the
data to a target.
4. If the Integration Service sends a response to the Web Services Hub, the Web Services Hub generates a
SOAP message reply and passes the reply to the web service client.

Change Data from PowerExchange Change Data Capture Sources


Change data from PowerExchange change data capture sources represents changes to a file or relational
database. PowerExchange captures change data and stores it in the change stream. Use PowerExchange Client
for PowerCenter to extract changes from the PowerExchange change stream. PowerExchange Client for
PowerCenter connects to PowerExchange to extract data that changed since the previous session. The
Integration Service processes and writes change data to a target.



Figure 11-3 shows an example of how PowerExchange Client for PowerCenter, PowerExchange, and the
Integration Service process change data:

Figure 11-3. Change Data Processing

[Diagram: PowerExchange extracts change data from the change stream and passes it to the Integration Service
through PowerExchange Client for PowerCenter; the Integration Service transforms and writes the data to a target.]

PowerExchange Client for PowerCenter, PowerExchange, and the Integration Service complete the following
tasks to process change data:
1. PowerExchange Client for PowerCenter connects to PowerExchange.
2. PowerExchange extracts change data from the change stream.
3. PowerExchange passes change data to the Integration Service through PowerExchange Client for
PowerCenter.
4. The Integration Service transforms and writes data to a target.
For more information about change data, see PowerExchange Interfaces for PowerCenter.

Configuring Real-time Sessions


When you configure a session to process data in real time, you configure session properties that control when
the session stops reading from the source. You can configure a session to stop reading from a source after it stops
receiving messages for a set period of time, when the session reaches a message count limit, or when the session
has read messages for a set period of time. You can also configure how the Integration Service commits data to
the target and enable message recovery for failed sessions.
You can configure the following properties for a real-time session:
♦ Terminating conditions. Define the terminating conditions to determine when the Integration Service stops
reading from a source and ends the session. For more information, see “Terminating Conditions” on
page 195.
♦ Flush latency. Define a session with flush latency to read and write real-time data. Flush latency determines
how often the session commits data to the targets. For more information, see “Flush Latency” on page 195.
♦ Commit type. Define a source- or target-based commit type for real-time sessions. With a source-based
commit, the Integration Service commits messages based on the commit interval and the flush latency
interval. With a target-based commit, the Integration Service commits messages based on the flush latency
interval. For more information, see “Commit Type” on page 196.
♦ Message recovery. Enable recovery for a real-time session to recover messages from a failed session. For more
information, see “Message Recovery” on page 197.



Terminating Conditions
A terminating condition determines when the Integration Service stops reading messages from a real-time
source and ends the session. When the Integration Service reaches a terminating condition, it stops reading
from the real-time source. It processes the messages it read and commits data to the target. Then, it ends the
session.
You can configure the following terminating conditions:
♦ Idle time
♦ Message count
♦ Reader time limit
If you configure multiple terminating conditions, the Integration Service stops reading from the source when it
meets the first condition. By default, the Integration Service reads messages continuously and uses the flush
latency to determine when it flushes data from the source. After the flush, the Integration Service resets the
counters for the terminating conditions.

Idle Time
Idle time is the amount of time in seconds the Integration Service waits to receive messages before it stops
reading from the source. -1 indicates an infinite period of time.
For example, if the idle time for a JMS session is 30 seconds, the Integration Service waits 30 seconds after
reading from JMS. If no new messages arrive in JMS within 30 seconds, the Integration Service stops reading
from JMS. It processes the messages and ends the session.

Message Count
Message count is the number of messages the Integration Service reads from a real-time source before it stops
reading from the source. -1 indicates an infinite number of messages.
For example, if the message count in a JMS session is 100, the Integration Service stops reading from the source
after it reads 100 messages. It processes the messages and ends the session.
Note: The name of the message count terminating condition depends on the Informatica product. For example,
the message count for PowerExchange for SAP NetWeaver is called Packet Count. The message count for
PowerExchange Client for PowerCenter is called UOW Count.

Reader Time Limit


Reader time limit is the amount of time in seconds that the Integration Service reads source messages from the
real-time source before it stops reading from the source. Use reader time limit to read messages from a real-time
source for a set period of time. 0 indicates an infinite period of time.
For example, if you use a 10 second time limit, the Integration Service stops reading from the messaging
application after 10 seconds. It processes the messages and ends the session.

Flush Latency
Use flush latency to run a session in real time. Flush latency determines how often the Integration Service
flushes data from the source. For example, if you set the flush latency to 10 seconds, the Integration Service
flushes data from the source every 10 seconds.



For change data from a PowerExchange change data capture source, the flush latency interval is determined by
the flush latency and the unit of work (UOW) count attributes. For more information, see PowerExchange
Interfaces for PowerCenter.
The Integration Service uses the following process when it reads data from a real-time source and the session is
configured with flush latency:
1. The Integration Service reads data from the source.
The flush latency interval begins when the Integration Service reads the first message from the source.
2. At the end of the flush latency interval, the Integration Service stops reading data from the source.
3. The Integration Service processes messages and writes them to the target.
4. The Integration Service reads from the source again until it reaches the next flush latency interval.
Configure flush latency in seconds. The default value is zero, which indicates that the flush latency is disabled
and the session does not run in real time.
Configure the flush latency interval depending on how dynamic the data is and how quickly users need to access
the data. If data is outdated quickly, such as financial trading information, then configure a lower flush latency
interval so the target tables are updated as close as possible to when the changes occurred. For example, users
need updated financial data every few minutes. However, they need updated customer address changes only
once a day. Configure a lower flush latency interval for financial data and a higher flush latency interval for
address changes.
Use the following rules and guidelines when you configure flush latency:
♦ The Integration Service does not buffer messages longer than the flush latency interval.
♦ The lower you set the flush latency interval, the more frequently the Integration Service commits messages
to the target.
♦ If you use a low flush latency interval, the session can consume more system resources.
If you configure a commit interval, then a combination of the flush latency and the commit interval determines
when the data is committed to the target.

Commit Type
The Integration Service commits data to the target based on the flush latency and the commit type. You can
configure a session to use the following commit types:
♦ Source-based commit. When you configure a source-based commit, the Integration Service commits data to
the target using a combination of the commit interval and the flush latency interval. The first condition the
Integration Service meets triggers the end of the flush latency period. After the flush, the counters are reset.
For example, you set the flush latency to five seconds and the source-based commit interval to 1,000
messages. The Integration Service commits messages to the target either after reading 1,000 messages from
the source or after five seconds.
♦ Target-based commit. When you configure a target-based commit, the Integration Service ignores the
commit interval and commits data to the target based on the flush latency interval.
When writing to targets in a real-time session, the Integration Service processes commits serially and commits
data to the target in real time. It does not store data in the DTM buffer memory.

RELATED TOPICS:
♦ “Understanding Commit Points” on page 207



Message Recovery
When you enable message recovery for a real-time session, the Integration Service can recover unprocessed
messages from a failed session. The Integration Service stores source messages or message IDs in a recovery file,
recovery table, recovery queue, or recovery topic. If the session fails, run the session in recovery mode to recover
the messages the Integration Service did not process.
Depending on the real-time source and the target type, the messages or message IDs are stored in the following
storage types:
♦ Recovery file. Messages or message IDs are stored in a designated local recovery file. A session with a real-
time source and a non-relational or non-queue target uses the recovery file. For more information, see
“Recovery File” on page 198.
♦ Recovery table. Message IDs are stored in a recovery table in the target database. A session with a JMS or
WebSphere MQ source and a relational target uses the recovery table. For more information, see “Recovery
Table” on page 199.
♦ Recovery queue and recovery topic. Message IDs are stored in a recovery queue or recovery topic. A session
with a JMS or WebSphere MQ source and a JMS or WebSphere MQ target uses the recovery queue. A
session with a JMS or WebSphere MQ source and a topic target uses the recovery topic. For more
information, see “Recovery Queue and Recovery Topic” on page 200.
A session can use a combination of the storage types. For example, a session with a JMS source and a TIBCO
source uses both a recovery file and a recovery table.
When you recover a real-time session, the Integration Service restores the state of operation from the point of
interruption. It reads and processes the messages in the recovery file, recovery table, recovery queue, or recovery
topic. Then, it ends the session.
During recovery, the terminating conditions do not affect the messages the Integration Service reads from the
recovery file, recovery table, recovery queue, or recovery topic. For example, if you specified message count and
idle time for the session, the conditions apply to the messages the Integration Service reads from the source, not
the recovery file, recovery table, recovery queue, or recovery topic.
In addition to the storage types above, the Integration Service uses a recovery ignore list if the session fails under
certain conditions. For more information, see “Recovery Ignore List” on page 201.
Sessions with MSMQ sources, web service messages, or change data from a PowerExchange change data capture
source use a different recovery strategy. For more information, see the PowerExchange for MSMQ User Guide,
PowerCenter Web Services Provider Guide, or PowerExchange Interfaces for PowerCenter.

Prerequisites
Complete the following prerequisites before you enable message recovery for sessions with a JMS or WebSphere
MQ source and a JMS or WebSphere MQ target:
♦ Create the recovery queue in the JMS provider or WebSphere MQ. Or, create the recovery topic in the JMS
provider.
♦ Create the recovery queue under the same queue manager as the message queue so the commit scope is the
same.
♦ Configure the recovery queue to be persistent. If the recovery queue is not persistent, data duplication can
occur.
If you do not configure the prerequisites, the Integration Service stores recovery information in a recovery file
instead of a recovery queue or recovery topic.

Steps to Enable Message Recovery


Complete the following steps to enable message recovery for sessions:



1. In the session properties, select Resume from Last Checkpoint as the recovery strategy.
2. Specify a recovery cache directory in the session properties at each partition point.
The Integration Service stores messages in the location indicated by the recovery cache directory. The default
value for the recovery cache directory is $PMCacheDir.

Recovery File
The Integration Service stores messages or message IDs in a recovery file for real-time sessions that are enabled
for recovery and include the following source and target types:
♦ JMS source with non-relational, non-JMS, or non-WebSphere MQ targets
♦ WebSphere MQ source with non-relational, non-JMS, or non-WebSphere MQ targets
♦ SAP R/3 source and all targets
♦ webMethods source and all targets
♦ TIBCO source and all targets
The Integration Service temporarily stores messages or message IDs in a local recovery file that you configure in
the session properties. During recovery, the Integration Service processes the messages in this recovery file to
ensure that data is not lost.

Message Processing
Figure 11-4 shows how the Integration Service processes messages using the recovery file:

Figure 11-4. Message Processing with Recovery File

(Diagram: a real-time source feeds the Integration Service, which writes to the recovery file and loads the target.)

The Integration Service completes the following tasks to process messages using recovery files:
1. The Integration Service reads a message from the source.
2. For sessions with JMS and WebSphere MQ sources, the Integration Service writes the message ID to the
recovery file. For all other sessions, the Integration Service writes the message to the recovery file.
3. For sessions with SAP R/3, webMethods, or TIBCO sources, the Integration Service sends an
acknowledgement to the source to confirm it read the message. The source deletes the message.
4. The Integration Service repeats steps 1 - 3 until the flush latency is met.
5. The Integration Service processes the messages and writes them to the target.
6. The target commits the messages.
7. For sessions with JMS and WebSphere MQ sources, the Integration Service sends a batch
acknowledgement to the source to confirm it read the messages. The source deletes the messages.
8. The Integration Service clears the recovery file.
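The following Python sketch is not PowerCenter code; it only restates the eight tasks above as pseudocode to show where the recovery file fits into the flow. The source, target, and recovery_file objects and their methods are hypothetical placeholders.

def process_with_recovery_file(source, target, recovery_file, flush_latency_met):
    """Illustrative sketch of message processing with a recovery file."""
    batch = []
    while not flush_latency_met():                    # 4. repeat until flush latency is met
        msg = source.read()                           # 1. read a message from the source
        if source.kind in ("JMS", "WebSphereMQ"):
            recovery_file.write(msg.message_id)       # 2. store the message ID
        else:
            recovery_file.write(msg)                  # 2. store the whole message
            source.acknowledge(msg)                   # 3. SAP R/3, webMethods, TIBCO: acknowledge now
        batch.append(msg)
    target.write(batch)                               # 5. process the messages and write them
    target.commit()                                   # 6. the target commits the messages
    if source.kind in ("JMS", "WebSphereMQ"):
        source.acknowledge_batch(batch)               # 7. batch acknowledgement to the source
    recovery_file.clear()                             # 8. clear the recovery file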



Message Recovery
When you recover a real-time session, the Integration Service reads and processes the cached messages. After the
Integration Service reads all cached messages, it ends the session.
For sessions with JMS and WebSphere MQ sources, the Integration Service uses the message ID in the recovery
file to retrieve the message from the source.
The Integration Service clears the recovery file after the flush latency period expires and at the end of a
successful session. If the session fails after the Integration Service commits messages to the target but before it
removes the messages from the recovery file, targets can receive duplicate rows during recovery.

Session Recovery Data Flush


A recovery data flush is a process that the Integration Service uses to flush session recovery data that is in the
operating system buffer to the recovery file. You can prevent data loss if the Integration Service is not able to
write the recovery data to the recovery file. The Integration Service can fail to write recovery data in cases of an
operating system failure, hardware failure, or file system outage. The recovery data flush applies to sessions that
include a JMS or WebSphere MQ source and non-relational, non-JMS, or non-WebSphere MQ targets.
You can configure the Integration Service to flush recovery data from the operating system buffer to the
recovery file by setting the Integration Service property Flush Session Recovery Data to “Auto” or “Yes” in the
Administration Console.
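Conceptually, the recovery data flush forces buffered writes out of the operating system buffer and onto disk so that an operating system, hardware, or file system failure cannot discard them. The following Python sketch is only an analogy for that idea, not the PowerCenter implementation; the file path and record format are hypothetical.

import os

def write_recovery_record(path, record):
    """Append a recovery record and force it out of the buffers to disk."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(record + "\n")
        f.flush()                # push the data out of the application buffer
        os.fsync(f.fileno())     # ask the OS to write its buffer to disk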

Recovery Table
The Integration Service stores message IDs in a recovery table for real-time sessions that are enabled for
recovery and include the following source and target types:
♦ JMS source with relational targets
♦ WebSphere MQ source with relational targets
The Integration Service temporarily stores message IDs and commit numbers in a recovery table on each target
database. The commit number indicates the number of commits that the Integration Service committed to the
target. During recovery, the Integration Service uses the commit number to determine if it wrote the same
number of messages to all targets. The message IDs and commit numbers are verified against the recovery table
to ensure that no data is lost or duplicated.
Note: The source must use unique message IDs and provide access to the messages through the message ID.

PM_REC_STATE Table
When the Integration Service runs a real-time session that uses the recovery table and has recovery enabled, it
creates a recovery table, PM_REC_STATE, on the target database to store message IDs and commit numbers.
When the Integration Service recovers the session, it uses information in the recovery tables to determine if it
needs to write the message to the target table.

RELATED TOPICS:
♦ “Target Recovery Tables” on page 223



Message Processing
Figure 11-5 shows how the Integration Service processes messages using the recovery table:

Figure 11-5. Message Processing with Recovery Table

(Diagram: a real-time source feeds the Integration Service, which writes to the recovery table and the target table in the target database.)

The Integration Service completes the following tasks to process messages using recovery tables:
1. The Integration Service reads one message at a time until the flush latency is met.
2. The Integration Service stops reading from the source. It processes the messages and writes them to the
target.
3. The Integration Service writes the message IDs, commit numbers, and the transformation states to the
recovery table on the target database and writes the messages to the target simultaneously.
4. When the target commits the messages, the Integration Service sends an acknowledgement to the real-time
source to confirm that all messages were processed and written to the target.
5. The Integration Service continues to read messages from the source.
If the session has multiple partitions, the tasks apply to each partition.

Message Recovery
When you recover a real-time session, the Integration Service uses the message ID and the commit number in
the recovery table to determine whether it committed messages to all targets.
The Integration Service commits messages to all targets if the message ID exists in the recovery table and all
targets have the same commit number. During recovery, the Integration Service sends an acknowledgement to
the source that it processed the message.
The Integration Service does not commit messages to all targets if the targets have different commit numbers.
During recovery, the Integration Service reads the message IDs and the transformation state from the recovery
table. It processes messages and writes them to the targets that did not have the message. When the Integration
Service reads all messages from the recovery table, it ends the session.
If the session fails before the Integration Service commits messages to all targets and you restart the session in
cold start mode, targets can receive duplicate rows.
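The recovery decision described above amounts to comparing commit numbers across targets. The following Python sketch is illustrative only; it assumes a hypothetical in-memory view of the recovery table contents rather than the actual PM_REC_STATE schema.

def committed_to_all_targets(message_id, recovery_state):
    """Illustrative only. recovery_state maps each target to the message IDs and
    the commit number read from its recovery table, for example:
    {"T1": {"message_ids": {...}, "commit_number": 42}, ...}"""
    with_message = [s for s in recovery_state.values()
                    if message_id in s["message_ids"]]
    if len(with_message) != len(recovery_state):
        return False                              # at least one target is behind
    commit_numbers = {s["commit_number"] for s in recovery_state.values()}
    return len(commit_numbers) == 1               # same commit point everywhere

# During recovery: if True, send an acknowledgement to the source; if False,
# reprocess the message for the targets that did not receive it.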

Recovery Queue and Recovery Topic


The Integration Service stores message IDs in a recovery queue or recovery topic for real-time sessions that are
enabled for recovery and include the following source and target types:
♦ JMS source with JMS or WebSphere MQ targets
♦ WebSphere MQ source with JMS or WebSphere MQ targets
The Integration Service temporarily stores message IDs and commit numbers in a recovery queue or recovery
topic that you created in the JMS provider or in WebSphere MQ. The commit number indicates the number of
commits that the Integration Service committed to the target. During recovery, the Integration Service uses the
commit number to determine if it wrote the same number of messages to all targets. The message IDs and commit numbers are verified against the recovery queue or recovery topic to ensure that no data is lost or duplicated.
The Integration Service uses the same recovery queue or recovery topic for all queue targets in each session.
Create multiple recovery queues or recovery topics for sessions to improve performance.
If you do not specify the recovery queue or recovery topic name in the session properties or in the JMS
connection object, the Integration Service stores recovery information in the recovery file.

RELATED TOPICS:
♦ “Recovery Table” on page 199

Message Processing
Figure 11-6 shows how the Integration Service processes messages using the recovery queue:

Figure 11-6. Message Processing with Recovery Queue

(Diagram: a real-time source feeds the Integration Service, which writes to the recovery queue and the message queue under the queue manager.)

The Integration Service processes messages using the recovery queue or recovery topic similar to how it
processes messages using the recovery table. The Integration Service writes recovery information to the recovery
queue or recovery topic instead of the recovery table.

Message Recovery
The Integration Service recovers messages from the recovery queue or recovery topic similar to how it recovers
messages from the recovery table. The Integration Service retrieves recovery information from the recovery
queue or recovery topic instead of from the recovery table.

Recovery Ignore List


The Integration Service writes recovery information to a recovery ignore list when a session with a JMS or
WebSphere MQ source fails. The Integration Service writes recovery information to the list if there is a chance
that the source did not receive an acknowledgement. For example, the session fails before the Integration
Service sends an acknowledgement to the source but after it writes messages to the target. In this case, the
source can roll back the current transaction, but the messages in that transaction may not be immediately
available. If the messages are included in the recovery session, data duplication can occur. To prevent data
duplication, the Integration Service creates the recovery ignore list.
The recovery ignore list stores message IDs that the Integration Service wrote to the target for the failed session.
The Integration Service creates the recovery ignore list in the storage type that is used for that session, such as
the recovery file, recovery table, recovery queue, or recovery topic. During recovery, the Integration Service uses
the recovery ignore list and the storage type to determine if it wrote the messages to the target. It verifies the
message IDs against the recovery ignore list and the storage type to ensure that no data is duplicated.
When the session fails, the Integration Service writes the message to the recovery ignore list and adds a time
stamp. By default, the Integration Service deletes the message from the recovery ignore list one hour after the



time stamp. If the Integration Service finds the message in the source within the default time period, it deletes
the message from the recovery ignore list.
If you restart a stopped or failed session in cold start mode, targets may receive duplicate rows. Restart the
session with recovery to prevent data duplication. Or, restart the session in cold start mode if you can ensure
that the messages that were in the recovery ignore list are removed from the source. Use the session log to view
the message IDs. The Integration Service writes the message IDs from the recovery ignore list to the session log
if you configure verbose data tracing.
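The following Python sketch is not PowerCenter code; it only illustrates the time-stamped ignore-list behavior described above, using the default one-hour expiry. The class and method names are hypothetical.

import time

IGNORE_TTL_SECS = 3600              # default: entries expire one hour after the time stamp

class RecoveryIgnoreList:
    """Illustrative sketch of a time-stamped ignore list of message IDs."""
    def __init__(self):
        self._entries = {}          # message_id -> time stamp of the failed session

    def add(self, message_id):
        """Record a message that was written to the target before the failure."""
        self._entries[message_id] = time.time()

    def already_written(self, message_id):
        """During recovery, skip a message that is on the list and not expired;
        either way, remove the entry once the message is seen in the source."""
        stamp = self._entries.pop(message_id, None)
        if stamp is None:
            return False
        return time.time() - stamp <= IGNORE_TTL_SECS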

Stopping Real-time Sessions


A real-time session runs continuously unless it fails or you manually stop it. You can stop a session by issuing a
stop command in pmcmd or from the Workflow Monitor. You might want to stop a session to perform routine
maintenance.
When you stop a real-time session, the Integration Service processes messages in the pipeline based on the
following real-time sources:
♦ JMS and WebSphere MQ. The Integration Service processes messages it read up until you issued the stop. It
writes messages to the targets.
♦ MSMQ, SAP, TIBCO, webMethods, and web service messages. The Integration Service does not process
messages if you stop a session before the Integration Service writes all messages to the target.
When you stop a real-time session with a JMS or a WebSphere MQ source, the Integration Service performs the
following tasks:
1. The Integration Service stops reading messages from the source.
If you stop a real-time recovery session, the Integration Service stops reading from the source after it
recovers all messages.
2. The Integration Service processes messages in the pipeline and writes to the targets.
3. The Integration Service sends an acknowledgement to the source.
4. The Integration Service clears the recovery table or recovery file to avoid data duplication when you restart
the session.
When you restart the session, the Integration Service starts reading from the source. It restores the session and
transformation state of operation to resume the session from the point of interruption.

Restarting and Recovering Real-time Sessions


You can resume a stopped or failed real-time session. To resume a session, you must restart or recover the
session. The Integration Service can recover a session automatically if you enabled the session for automatic task
recovery.
The following sections describe recovery information that is specific to real-time sessions.

RELATED TOPICS:
♦ “Recovering Workflows” on page 221

Restarting Real-time Sessions


When you restart a session, the Integration Service resumes the session based on the real-time source.
Depending on the real-time source, it restarts the session with or without recovery.



You can restart a task or workflow in cold start mode. When you restart a task or workflow in cold start mode,
the Integration Service discards the recovery information and restarts the task or workflow.

Recovering Real-time Sessions


If you enabled session recovery, you can recover a failed or aborted session. When you recover a session, the
Integration Service continues to process messages from the point of interruption. The Integration Service
recovers messages according to the real-time source.
The Integration Service uses the following session recovery types:
♦ Automatic recovery. The Integration Service restarts the session if you configured the workflow to
automatically recover terminated tasks. The Integration Service recovers any unprocessed data and restarts
the session regardless of the real-time source.
♦ Manual recovery. Use a Workflow Monitor or Workflow Manager menu command or pmcmd command to
recover the session. For some real-time sources, you must recover the session before you restart it or the
Integration Service will not process messages from the failed session.

Restart and Recover Commands


You can restart or recover a session in the Workflow Manager, Workflow Monitor, or pmcmd. The Integration
Service resumes the session based on the real-time source.
Table 11-1 describes the behavior when you restart or recover a session with the following commands:

Table 11-1. Restart and Recover Commands for Real-time Sessions

♦ Restart Task, Restart Workflow, Restart Workflow from Task. Restarts the task or workflow. For JMS and WebSphere MQ sessions, the Integration Service recovers before it restarts the task or workflow. Note: If the session includes a JMS or WebSphere MQ source and another real-time source, the Integration Service performs recovery for all real-time sources before it restarts the task or workflow.
♦ Recover Task, Recover Workflow, Recover Workflow from Task. Recovers the task or workflow.
♦ Cold Start Task, Cold Start Workflow, Cold Start Workflow from Task. Discards the recovery information and restarts the task or workflow.

RELATED TOPICS:
♦ “Steps to Recover Workflows and Tasks” on page 234
♦ “Restarting a Task or Workflow Without Recovery” on page 344

Rules and Guidelines


Rules and Guidelines for Real-time Sessions
Use the following rules and guidelines when you run real-time sessions:
♦ The session fails if a mapping contains a Transaction Control transformation.
♦ The session fails if a mapping contains any transformation with Generate Transactions enabled.
♦ The session fails if a mapping contains any transformation with the transformation scope set to all input.



♦ The session fails if a mapping contains any transformation that has row transformation scope and receives
input from multiple transaction control points.
♦ The session fails if the load scope for the target is set to all input.
♦ The Integration Service ignores flush latency when you run a session in debug mode.
♦ If the mapping contains a relational target, configure the load type for the target to normal.
♦ If the mapping contains an XML target definition, select Append to Document for the On Commit option
in the target definition.
♦ The Integration Service is resilient to connection failures to WebSphere MQ. It is not resilient to any other
messaging system.

Rules and Guidelines for Message Recovery


The Integration Service fails sessions that have message recovery enabled and contain any of the following
conditions:
♦ The source definition is the master source for a Joiner transformation.
♦ You configure multiple source definitions to run concurrently for the same target load order group.
♦ The mapping contains an XML target definition.
♦ You edit the recovery file or the mapping before you restart the session and you run a session with a recovery
strategy of Restart or Resume.
♦ The Integration Service cannot connect to the recovery queue or recovery topic.
♦ The Integration Service does not write the recovery message to the recovery queue or recovery topic.
If the number of messages that the Integration Service reads or writes from the message queue exceeds the
message size limit, increase the message size limit or decrease the flush latency.

Real-time Processing Example


The following example shows how you can use PowerExchange for IBM WebSphere MQ and PowerCenter to
process real-time data.
You want to process purchase orders in real-time. A purchase order can include multiple items from multiple
suppliers. However, the purchase order does not contain the supplier or the item cost. When you receive a
purchase order, you must calculate the total cost for each supplier. You have a master database that contains
your suppliers and their respective items and item cost. You use PowerCenter to look up the supplier and item
cost based on the item ID. You also use PowerCenter to write the total supplier cost to a relational database.
Your database administrator recommends that you update the target up to 1,000 messages in a single commit.
You also want to update the target every 2,000 milliseconds to ensure that the target is always current.
To process purchase orders in real time, you create and configure a mapping.



Figure 11-7 shows a mapping that processes purchase orders in real time:

Figure 11-7. Real-time Processing Sample Mapping

The sample mapping includes the following components:


♦ Source. WebSphere MQ. Each message is in XML format and contains one purchase order.
♦ XML Parser transformation. Receives purchase order information from the MQ Source Qualifier
transformation. It parses the purchase order ID and the quantity from the XML file.
♦ Lookup transformation. Looks up the supplier details for the purchase order ID. It passes the supplier
information, the purchase item ID, and item cost to the Expression transformation.
♦ Expression transformation. Calculates the order cost for the supplier.
♦ Target. Oracle relational database. It contains the supplier information and the total supplier cost.
You create and configure a session and workflow with the following properties:

Property Value

Message count 1,000

Flush latency interval 2,000 milliseconds

Commit type Source-based commit

Workflow schedule Run continuously

The following steps describe how the Integration Service processes the session in real-time:
1. The Integration Service reads messages from the WebSphere MQ queue until it reads 1,000 messages or
after 2,000 milliseconds. When it meets either condition, it stops reading from the WebSphere MQ queue.
2. The Integration Service looks up supplier information and calculates the order cost.
3. The Integration Service writes the supplier information and order cost to the Oracle relational target.
4. The Integration Service starts to read messages from the WebSphere MQ queue again.
5. The Integration Service repeats steps 1 through 4 as you configured the workflow to run continuously.
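The following Python sketch is not PowerCenter code; it only walks through the four steps above with the example settings (1,000 messages or 2,000 milliseconds), using hypothetical queue, lookup, and target objects.

import time

MESSAGE_COUNT = 1000
FLUSH_LATENCY_MS = 2000

def process_purchase_orders(mq_queue, supplier_lookup, oracle_target):
    """Illustrative loop for the purchase-order example."""
    while True:                                    # the workflow runs continuously
        orders, started = [], time.time()
        # 1. Read until 1,000 messages or 2,000 milliseconds, whichever comes first.
        while (len(orders) < MESSAGE_COUNT
               and (time.time() - started) * 1000 < FLUSH_LATENCY_MS):
            msg = mq_queue.read(timeout_ms=100)    # hypothetical reader
            if msg is not None:
                orders.append(msg)
        rows = []
        for order in orders:
            # 2. Look up the supplier details and calculate the order cost.
            supplier = supplier_lookup(order.item_id)
            rows.append((supplier.name, supplier.item_cost * order.quantity))
        # 3. Write the supplier information and order cost to the relational target.
        oracle_target.write_and_commit(rows)
        # 4. Go back to reading from the WebSphere MQ queue.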



Informatica Real-time Products
You can use the following products to read, transform, and write real-time data:
♦ PowerExchange for JMS. Use PowerExchange for JMS to read from JMS sources and write to JMS targets.
You can read from JMS messages, JMS provider message queues, or JMS provider based on message topic.
You can write to JMS provider message queues or to a JMS provider based on message topic.
JMS providers are message-oriented middleware systems that can send and receive JMS messages. During a
session, the Integration Service connects to the Java Naming and Directory Interface (JNDI) to determine
connection information. When the Integration Service determines the connection information, it connects
to the JMS provider to read or write JMS messages.
♦ PowerExchange for WebSphere MQ. Use PowerExchange for WebSphere MQ to read from WebSphere MQ
message queues and write to WebSphere MQ message queues or database targets. PowerExchange for
WebSphere MQ interacts with the WebSphere MQ queue manager, message queues, and WebSphere MQ
messages during data extraction and loading.
♦ PowerExchange for TIBCO. Use PowerExchange for TIBCO to read messages from TIBCO and write
messages to TIBCO in TIB/Rendezvous or AE format.
The Integration Service receives TIBCO messages from a TIBCO daemon, and it writes messages through a
TIBCO daemon. The TIBCO daemon transmits the target messages across a local or wide area network.
Target listeners subscribe to TIBCO target messages based on the message subject.
♦ PowerExchange for webMethods. Use PowerExchange for webMethods to read documents from
webMethods sources and write documents to webMethods targets.
The Integration Service connects to a webMethods broker that sends, receives, and queues webMethods
documents. The Integration Service reads and writes webMethods documents based on a defined document
type or the client ID. The Integration Service also reads and writes webMethods request/reply documents.
♦ PowerExchange for MSMQ. Use PowerExchange for MSMQ to read from MSMQ sources and write to
MSMQ targets.
The Integration Service connects to the Microsoft Messaging Queue to read data from messages or write
data to messages. The queue can be public or private and transactional or non-transactional.
♦ PowerExchange for SAP NetWeaver. Use PowerExchange for SAP NetWeaver to read from SAP using
outbound IDocs or write to SAP using inbound IDocs using Application Link Enabling (ALE).
The Integration Service can read from outbound IDocs and write to a relational target. The Integration
Service can read data from a relational source and write the data to an inbound IDoc. The Integration
Service can capture changes to the master data or transactional data in the SAP application database in real
time.
♦ PowerCenter Web Services Provider. Use PowerCenter Web Services Provider to expose transformation
logic as a service through the Web Services Hub and write client applications to run real-time web services.
You can create a service mapping to receive a message from a web service client, transform it, and write it to
any target PowerCenter supports. You can also create a service mapping with both a web service source and
target definition to receive a message request from a web service client, transform the data, and send the
response back to the web service client.
The Web Services Hub receives requests from web service clients and passes them to the gateway. The
Integration Service or the Repository Service process the requests and send a response to the web service
client through the Web Services Hub.
♦ PowerExchange. Use PowerExchange to extract and load relational and non-relational data, extract change
data, and extract change data in real time.
To extract data, the Integration Service reads change data from PowerExchange on the machine hosting the
source. You can extract and load data from multiple sources and targets, such as DB2/390, DB2/400, and
Oracle. You can also use a data map from a PowerExchange Listener as a non-relational source. For more
information, see PowerExchange Interfaces for PowerCenter.



CHAPTER 12

Understanding Commit Points


This chapter includes the following topics:
♦ Overview, 207
♦ Target-Based Commits, 208
♦ Source-Based Commits, 208
♦ User-Defined Commits, 212
♦ Understanding Transaction Control, 215
♦ Setting Commit Properties, 218

Overview
A commit interval is the interval at which the Integration Service commits data to targets during a session. The
commit point can be a factor of the commit interval, the commit interval type, and the size of the buffer blocks.
The commit interval is the number of rows you want to use as a basis for the commit point. The commit
interval type is the type of rows that you want to use as a basis for the commit point. You can choose between
the following commit types:
♦ Target-based commit. The Integration Service commits data based on the number of target rows and the key
constraints on the target table. The commit point also depends on the buffer block size, the commit interval,
and the Integration Service configuration for writer timeout.
♦ Source-based commit. The Integration Service commits data based on the number of source rows. The
commit point is the commit interval you configure in the session properties.
♦ User-defined commit. The Integration Service commits data based on transactions defined in the mapping
properties. You can also configure some commit and rollback options in the session properties.
Source-based and user-defined commit sessions have partitioning restrictions. If you configure a session with
multiple partitions to use source-based or user-defined commit, you can choose pass-through partitioning at
certain partition points in a pipeline.

RELATED TOPICS:
♦ “Setting Partition Types” on page 283

Target-Based Commits
During a target-based commit session, the Integration Service commits rows based on the number of target
rows and the key constraints on the target table. The commit point depends on the following factors:
♦ Commit interval. The number of rows you want to use as a basis for commits. Configure the target commit
interval in the session properties.
♦ Writer wait timeout. The amount of time the writer waits before it issues a commit. Configure the writer
wait timeout in the Integration Service setup.
♦ Buffer blocks. Blocks of memory that hold rows of data during a session. You can configure the buffer block
size in the session properties, but you cannot configure the number of rows the block holds.
When you run a target-based commit session, the Integration Service may issue a commit before, on, or after
the configured commit interval. The Integration Service uses the following process to issue commits:
♦ When the Integration Service reaches a commit interval, it continues to fill the writer buffer block. When
the writer buffer block fills, the Integration Service issues a commit.
♦ If the writer buffer fills before the commit interval, the Integration Service writes to the target, but waits to
issue a commit. It issues a commit when one of the following conditions is true:
− The writer is idle for the amount of time specified by the Integration Service writer wait timeout option.
− The Integration Service reaches the commit interval and fills another writer buffer.
Note: When you choose target-based commit for a session containing an XML target, the Workflow Manager
disables the On Commit session property on the Transformations view of the Mapping tab.
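The interaction of the commit interval, the writer buffer blocks, and the writer wait timeout described above can be sketched in a few lines. The following Python sketch is not PowerCenter code; the next_block, write_block, and commit callables are hypothetical placeholders for the writer internals.

import time

COMMIT_INTERVAL = 10000        # rows, set in the session properties
WRITER_WAIT_TIMEOUT = 60       # seconds, set in the Integration Service setup

def target_based_writer(next_block, write_block, commit):
    """Illustrative only: commits happen on buffer-block boundaries once the
    commit interval is reached, or after the writer sits idle too long with
    uncommitted rows already written to the target."""
    uncommitted_rows = 0
    last_write = time.time()
    while True:
        block = next_block(timeout=1)        # hypothetical buffer-block feed
        if block is None:
            # Writer is idle: commit written rows after the wait timeout.
            if uncommitted_rows and time.time() - last_write > WRITER_WAIT_TIMEOUT:
                commit()
                uncommitted_rows = 0
            continue
        write_block(block)                   # write the full block to the target
        uncommitted_rows += len(block)
        last_write = time.time()
        if uncommitted_rows >= COMMIT_INTERVAL:
            commit()                         # block boundary at or after the interval
            uncommitted_rows = 0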

Source-Based Commits
During a source-based commit session, the Integration Service commits data to the target based on the number
of rows from some active sources in a target load order group. These rows are referred to as source rows.
When the Integration Service runs a source-based commit session, it identifies the commit source for each pipeline
in the mapping. The Integration Service generates a commit row from these active sources at every commit
interval. The Integration Service writes the name of the transformation used for source-based commit intervals
into the session log:
Source-based commit interval based on... TRANSFORMATION_NAME

The Integration Service might commit fewer rows to the target than the number of rows produced by the active
source. For example, you have a source-based commit session that passes 10,000 rows through an active source,
and 3,000 rows are dropped due to transformation logic. The Integration Service issues a commit to the target
when the 7,000 remaining rows reach the target.
The number of rows held in the writer buffers does not affect the commit point for a source-based commit
session. For example, you have a source-based commit session that passes 10,000 rows through an active source.
When those 10,000 rows reach the targets, the Integration Service issues a commit. If the session completes
successfully, the Integration Service issues commits after 10,000, 20,000, 30,000, and 40,000 source rows.
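The key point, that the commit is triggered by counting rows at the commit source rather than rows that reach the target, can be shown in a few lines. The following Python sketch is illustrative only and reuses the example of 10,000 source rows of which 3,000 are dropped by transformation logic; the transform, write, and commit callables are hypothetical.

COMMIT_INTERVAL = 10000          # source rows per commit

def source_based_pipeline(source_rows, transform, write, commit):
    """Illustrative only: count rows at the commit source, not at the target."""
    rows_from_commit_source = 0
    for row in source_rows:
        rows_from_commit_source += 1
        out = transform(row)             # may drop the row (returns None)
        if out is not None:
            write(out)                   # e.g. 7,000 of 10,000 rows reach the target
        if rows_from_commit_source % COMMIT_INTERVAL == 0:
            commit()                     # commit point based on source rows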
If the targets are in the same transaction control unit, the Integration Service commits data to the targets at the
same time. If the session fails or aborts, the Integration Service rolls back all uncommitted data in a transaction
control unit to the same source row.
If the targets are in different transaction control units, the Integration Service performs the commit when each
target receives the commit row. If the session fails or aborts, the Integration Service rolls back each target to the
last commit point. It might not roll back to the same source row for targets in separate transaction control
units.



Note: Source-based commit may slow session performance if the session uses a one-to-one mapping. A one-to-
one mapping is a mapping that moves data from a Source Qualifier, XML Source Qualifier, or Application
Source Qualifier transformation directly to a target.

RELATED TOPICS:
♦ “Understanding Transaction Control Units” on page 217

Determining the Commit Source


When you run a source-based commit session, the Integration Service generates commits at all source qualifiers
and transformations that do not propagate transaction boundaries. This includes the following active sources:
♦ Source Qualifier
♦ Application Source Qualifier
♦ MQ Source Qualifier
♦ XML Source Qualifier when you only connect ports from one output group
♦ Normalizer (VSAM)
♦ Aggregator with the All Input transformation scope
♦ Joiner with the All Input transformation scope
♦ Rank with the All Input transformation scope
♦ Sorter with the All Input transformation scope
♦ Custom with one output group and with the All Input transformation scope
♦ A multiple input group transformation with one output group connected to multiple upstream transaction
control points
♦ Mapplet, if it contains one of the above transformations
A mapping can have one or more target load order groups, and a target load order group can have one or more
active sources that generate commits. The Integration Service uses the commits generated by the active source
that is closest to the target definition. This is known as the commit source.
For example, you have the mapping in Figure 12-1:

Figure 12-1. Mapping with a Single Commit Source

(Figure callout: the Transformation Scope property is All Input.)

The mapping contains a Source Qualifier transformation and an Aggregator transformation with the All Input
transformation scope. The Aggregator transformation is closer to the targets than the Source Qualifier
transformation and is therefore used as the commit source for the source-based commit session.



Also, suppose you have the mapping in Figure 12-2:

Figure 12-2. Mapping with Multiple Commit Sources

(Figure callout: the Transformation Scope property is All Input.)

The mapping contains a target load order group with one source pipeline that branches from the Source
Qualifier transformation to two targets. One pipeline branch contains an Aggregator transformation with the
All Input transformation scope, and the other contains an Expression transformation. The Integration Service
identifies the Source Qualifier transformation as the commit source for t_monthly_sales and the Aggregator as
the commit source for T_COMPANY_ALL. It performs a source-based commit for both targets, but uses a
different commit source for each.

RELATED TOPICS:
♦ “Understanding Transaction Control” on page 215
♦ “Working with Active Sources” on page 178

Switching from Source-Based to Target-Based Commit


If the Integration Service identifies a target in the target load order group that does not receive commits from
an active source that generates commits, it reverts to target-based commit for that target only.
The Integration Service writes the name of the transformation used for source-based commit intervals into the
session log. When the Integration Service switches to target-based commit, it writes a message in the session
log.
A target might not receive commits from a commit source in the following circumstances:
♦ The target receives data from the XML Source Qualifier transformation, and you connect multiple output
groups from an XML Source Qualifier transformation to downstream transformations. An XML Source
Qualifier transformation does not generate commits when you connect multiple output groups downstream.
♦ The target receives data from an active source with multiple output groups other than an XML Source
Qualifier transformation. For example, the target receives data from a Custom transformation that you do
not configure to generate transactions. Multiple output group active sources neither generate nor propagate
commits.

Connecting XML Sources in a Mapping


An XML Source Qualifier transformation does not generate commits when you connect multiple output groups
downstream. When you use an XML Source Qualifier transformation in a mapping, the Integration Service can use
different commit types for targets in this session depending on the transformations used in the mapping:
♦ You put a commit source between the XML Source Qualifier transformation and the target. The
Integration Service uses source-based commit for the target because it receives commits from the commit
source. The active source is the commit source for the target.
♦ You do not put a commit source between the XML Source Qualifier transformation and the target. The
Integration Service uses target-based commit for the target because it receives no commits.



Suppose you have the mapping in Figure 12-3:

Figure 12-3. Mapping with Targets Connected to a Commit Source

(Figure callouts: Targets connected to an XML Source Qualifier transformation with multiple connected output groups; the Integration Service uses target-based commit when loading to these targets. Target connected to an active source that generates commits, AGG_Sales; the Integration Service uses source-based commit when loading to this target. Transformation Scope = All Input.)

This mapping contains an XML Source Qualifier transformation with multiple output groups connected
downstream. Because you connect multiple output groups downstream, the XML Source Qualifier
transformation does not generate commits. You connect the XML Source Qualifier transformation to two
relational targets, T_STORE and T_PRODUCT. Therefore, these targets do not receive any commit generated
by an active source. The Integration Service uses target-based commit when loading to these targets.
However, the mapping includes an active source that generates commits, AGG_Sales, between the XML Source
Qualifier transformation and T_YTD_SALES. The Integration Service uses source-based commit when loading
to T_YTD_SALES.

Connecting Multiple Output Group Custom Transformations in a Mapping


Multiple output group Custom transformations that you do not configure to generate transactions neither
generate nor propagate commits. Therefore, the Integration Service can use different commit types for targets
in this session depending on the transformations used in the mapping:
♦ You put a commit source between the Custom transformation and the target. The Integration Service uses
source-based commit for the target because it receives commits from the active source. The active source is
the commit source for the target.
♦ You do not put a commit source between the Custom transformation and the target. The Integration
Service uses target-based commit for the target because it receives no commits.
Suppose you have the mapping in Figure 12-4:

Figure 12-4. Mapping a Custom Transformation with a Commit Source

(Figure callouts: Targets connected to a multiple output group active source, CT_XML_Parser; the Integration Service uses target-based commit when loading to these targets. Target connected to an active source that generates commits, AGG_store_orders; the Integration Service uses source-based commit when loading to this target. Transformation Scope is All Input.)



The mapping contains a multiple output group Custom transformation, CT_XML_Parser, which drops the
commits generated by the Source Qualifier transformation. Therefore, targets T_store_name and T_store_addr
do not receive any commits generated by an active source. The Integration Service uses target-based commit
when loading to these targets.
However, the mapping includes an active source that generates commits, AGG_store_orders, between the
Custom transformation and T_store_orders. The Integration Service uses source-based commit when loading
to T_store_orders.
Note: You can configure a Custom transformation to generate transactions when the Custom transformation
procedure outputs transactions. When you do this, configure the session for user-defined commit.

User-Defined Commits
During a user-defined commit session, the Integration Service commits and rolls back transactions based on a
row or set of rows that pass through a Transaction Control transformation. The Integration Service evaluates
the transaction control expression for each row that enters the transformation. The return value of the
transaction control expression defines the commit or rollback point.
You can also create a user-defined commit session when the mapping contains a Custom transformation
configured to generate transactions. When you do this, the procedure associated with the Custom
transformation defines the transaction boundaries.
When the Integration Service evaluates a commit row, it commits all rows in the transaction to the target or
targets. When it evaluates a rollback row, it rolls back all rows in the transaction from the target or targets. The
Integration Service writes a message to the session log at each commit and rollback point. The session details are
cumulative. The following message is a sample commit message from the session log:
WRITER_1_1_1> WRT_8317
USER-DEFINED COMMIT POINT Wed Oct 15 08:15:29 2003
===================================================
WRT_8036 Target: TCustOrders (Instance Name: [TCustOrders])
WRT_8038 Inserted rows - Requested: 1003 Applied: 1003 Rejected:
0 Affected: 1023

When the Integration Service writes all rows in a transaction to all targets, it issues commits sequentially for
each target.
The Integration Service rolls back data based on the return value of the transaction control expression or error
handling configuration. If the transaction control expression returns a rollback value, the Integration Service
rolls back the transaction. If an error occurs, you can choose to roll back or commit at the next commit point.
If the transaction control expression evaluates to a value other than commit, rollback, or continue, the
Integration Service fails the session.
When the session completes, the Integration Service may write data to the target that was not bound by commit
rows. You can choose to commit at end of file or to roll back that open transaction.
Note: If you use bulk loading with a user-defined commit session, the target may not recognize the transaction
boundaries. If the target connection group does not support transactions, the Integration Service writes the
following message to the session log:
WRT_8324 Warning: Target Connection Group’s connection doesn’t support transactions.
Targets may not be loaded according to specified transaction boundaries rules.
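A transaction control expression returns a value such as TC_COMMIT_BEFORE, TC_ROLLBACK_BEFORE, or TC_CONTINUE_TRANSACTION for each row. The following Python sketch is not PowerCenter code; it only illustrates how such per-row return values bound transactions, with the expression supplied as a hypothetical Python callable and the target object standing in for the writer.

# Return values named after the PowerCenter transaction control constants.
TC_CONTINUE_TRANSACTION = "CONTINUE"
TC_COMMIT_BEFORE = "COMMIT_BEFORE"
TC_ROLLBACK_BEFORE = "ROLLBACK_BEFORE"

def user_defined_commit(rows, tc_expression, target):
    """Illustrative only: commit or roll back the open transaction based on
    the value the transaction control expression returns for each row."""
    open_transaction = []
    for row in rows:
        action = tc_expression(row)              # evaluated for every row
        if action == TC_COMMIT_BEFORE:
            target.commit(open_transaction)      # commit the rows read so far
            open_transaction = []
        elif action == TC_ROLLBACK_BEFORE:
            target.rollback(open_transaction)    # discard the rows read so far
            open_transaction = []
        open_transaction.append(row)
    # Rows left here form an open transaction: commit on end of file or roll
    # back, depending on the session properties.
    return open_transaction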

Rolling Back Transactions


The Integration Service rolls back transactions in the following circumstances:
♦ Rollback evaluation. The transaction control expression returns a rollback value.
♦ Open transaction. You choose to roll back at the end of file.



♦ Roll back on error. You choose to roll back commit transactions if the Integration Service encounters a non-
fatal error.
♦ Roll back on failed commit. If any target connection group in a transaction control unit fails to commit, the
Integration Service rolls back all uncommitted data to the last successful commit point.

RELATED TOPICS:
♦ “Understanding Transaction Control Units” on page 217

Rollback Evaluation
If the transaction control expression returns a rollback value, the Integration Service rolls back the transaction
and writes a message to the session log indicating that the transaction was rolled back. It also indicates how
many rows were rolled back.
The following message is a sample message that the Integration Service writes to the session log when the
transaction control expression returns a rollback value:
WRITER_1_1_1> WRT_8326 User-defined rollback processed
WRITER_1_1_1> WRT_8331 Rollback statistics
WRT_8162 ===================================================
WRT_8330 Rolled back [333] inserted, [0] deleted, [0] updated rows for the target
[TCustOrders]

Roll Back Open Transaction


If the last row in the transaction control expression evaluates to TC_CONTINUE_TRANSACTION, the
session completes with an open transaction. If you choose to roll back that open transaction, the Integration
Service rolls back the transaction and writes a message to the session log indicating that the transaction was
rolled back.
The following message is a sample message indicating that Commit on End of File is disabled in the session
properties:
WRITER_1_1_1> WRT_8168 End loading table [TCustOrders] at: Wed Nov 05 10:21:56 2003
WRITER_1_1_1> WRT_8325 Final rollback executed for the target [TCustOrders] at end of
load

The following message is a sample message indicating that Commit on End of File is enabled in the session
properties:
WRITER_1_1_1> WRT_8143
Commit at end of Load Order Group Wed Nov 05 08:15:29 2003

Roll Back on Error


You can choose to roll back a transaction at the next commit point if the Integration Service encounters a non-
fatal error. When the Integration Service encounters a non-fatal error, it processes the error row and continues
processing the transaction. If the transaction boundary is a commit row, the Integration Service rolls back the
entire transaction and writes it to the reject file.
The following table describes row indicators in the reject file for rolled-back transactions:

Row Indicator Description

4 Rolled-back insert

5 Rolled-back update

6 Rolled-back delete

Note: The Integration Service does not roll back a transaction if it encounters an error before it processes any
row through the Transaction Control transformation.



Roll Back on Failed Commit
When the Integration Service reaches the commit point for all targets in a transaction control unit, it issues
commits sequentially for each target. If the commit fails for any target connection group within a transaction
control unit, the Integration Service rolls back all data to the last successful commit point. The Integration
Service cannot roll back committed transactions, but it does write the transactions to the reject file.
For example, consider the mapping in Figure 12-5 on page 214, which illustrates the following situation. This
mapping has one transaction control unit and three target connection groups. The target names contain
information about the target connection group. For example, TCG1_T1 represents the first target connection
group and the first target.
The Integration Service uses the following logic when it processes the mapping in Figure 12-5 on page 214:
1. The Integration Service reaches the third commit point for all targets.
2. It begins to issue commits sequentially for each target.
3. The Integration Service successfully commits to TCG1_T1 and TCG1_T2.
4. The commit fails for TCG2_T3.
5. The Integration Service does not issue a commit for TCG3_T4.
6. The Integration Service rolls back TCG2_T3 and TCG3_T4 to the second commit point, but it cannot
roll back TCG1_T1 and TCG1_T2 to the second commit point because it successfully committed at the
third commit point.
7. The Integration Service writes the rows to the reject file from TCG2_T3 and TCG3_T4. These are the
rollback rows associated with the third commit point.
8. The Integration Service writes the rows to the reject file from TCG1_T1 and TCG1_T2. These are the
commit rows associated with the third commit point.
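The following Python sketch is not PowerCenter code; it only traces the sequential commit and rollback logic in steps 1 through 8 above for a set of hypothetical target connection group objects and a hypothetical reject file.

class CommitError(Exception):
    """Raised by a hypothetical target connection group when its commit fails."""

def commit_transaction_control_unit(connection_groups, reject_file):
    """Illustrative only: issue commits sequentially for each target connection
    group; on the first failure, roll back the failed and remaining groups to
    the last successful commit point and write rows to the reject file."""
    committed = []
    remaining = list(connection_groups)
    while remaining:
        group = remaining.pop(0)
        try:
            group.commit()                      # steps 2-3: sequential commits
            committed.append(group)
        except CommitError:                     # step 4: the commit fails (e.g. TCG2_T3)
            not_committed = [group] + remaining # step 5: skip the remaining commits
            for g in not_committed:
                g.rollback_to_last_commit()     # step 6: roll back to the previous commit
                reject_file.write(g.rolled_back_rows())   # step 7: rollback rows
            for g in committed:                 # committed groups cannot be rolled back
                reject_file.write(g.committed_rows())     # step 8: commit rows
            break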
Figure 12-5 illustrates Integration Service behavior when it rolls back on a failed commit:

Figure 12-5. Roll Back on Failed Commit Example

(Figure callouts: For TCG1_T1 and TCG1_T2, the third commit is successful (3) and the rows appear in the reject file (8). For TCG2_T3, the third commit fails (4); the Integration Service rolls back to the second commit (6) and the rows appear in the reject file (7). For TCG3_T4, the Integration Service does not issue the third commit (5); it rolls back to the second commit (6) and the rows appear in the reject file (7).)

The following table describes row indicators in the reject file for committed transactions in a failed transaction
control unit:

Row Indicator Description

7 Committed insert

8 Committed update

9 Committed delete



Understanding Transaction Control
PowerCenter lets you define transactions that the Integration Service uses when it processes transformations
and when it commits and rolls back data at a target. You can define a transaction based on a varying number of
input rows. A transaction is a set of rows bound by commit or rollback rows, the transaction boundaries. Some
rows may not be bound by transaction boundaries. This set of rows is an open transaction. You can choose to
commit at end of file or to roll back open transactions when you configure the session.
The Integration Service can process input rows for a transformation one row at a time, for all rows in a
transaction, or for all source rows together. Processing a transformation for all rows in a transaction lets you
include transformations, such as an Aggregator, in a real-time session.
Transaction boundaries originate from transaction control points. A transaction control point is a
transformation that defines or redefines the transaction boundary in the following ways:
♦ Generates transaction boundaries. The transformations that define transaction boundaries differ, depending
on the session commit type:
− Target-based and user-defined commit. Transaction generators generate transaction boundaries. A
transaction generator is a transformation that generates both commit and rollback rows. The Transaction
Control and Custom transformation are transaction generators.
− Source-based commit. Some active sources generate commits. They do not generate rollback rows. Also,
transaction generators generate commit and rollback rows.
♦ Drops incoming transaction boundaries. When a transformation drops incoming transaction boundaries,
and does not generate commits, the Integration Service outputs all rows into an open transaction. All active
sources that generate commits and transaction generators drop incoming transaction boundaries.
For a list of transaction control points, see Table 12-1 on page 216.

RELATED TOPICS:
♦ “Determining the Commit Source” on page 209

♦ “Setting Commit Properties” on page 218


♦ “Transformation Scope” on page 215

Transformation Scope
You can configure how the Integration Service applies the transformation logic to incoming data with the
Transformation Scope transformation property. When the Integration Service processes a transformation, it
either drops transaction boundaries or preserves transaction boundaries, depending on the transformation scope
and the mapping configuration.
You can choose one of the following values for the transformation scope:
♦ Row. Applies the transformation logic to one row of data at a time. Choose Row when a row of data does not
depend on any other row. When you choose Row for a transformation connected to multiple upstream
transaction control points, the Integration Service drops transaction boundaries and outputs all rows from
the transformation as an open transaction. When you choose Row for a transformation connected to a single
upstream transaction control point, the Integration Service preserves transaction boundaries.
♦ Transaction. Applies the transformation logic to all rows in a transaction. Choose Transaction when a row of
data depends on all rows in the same transaction, but does not depend on rows in other transactions. When
you choose Transaction, the Integration Service preserves incoming transaction boundaries. It resets any
cache, such as an aggregator or lookup cache, when it receives a new transaction.
When you choose Transaction for a multiple input group transformation, you must connect all input groups
to the same upstream transaction control point.
♦ All Input. Applies the transformation logic on all incoming data. When you choose All Input, the
Integration Service drops incoming transaction boundaries and outputs all rows from the transformation as
an open transaction. Choose All Input when a row of data depends on all rows in the source.



Table 12-1 lists the transformation scope values available for each transformation:

Table 12-1. Transformation Scope Property Values

♦ Aggregator. Transaction: optional; transaction control point. All Input: default.
♦ Application Source Qualifier. n/a; transaction control point.
♦ Custom*. Row: optional; transaction control point when configured to generate commits or when connected to multiple upstream transaction control points. Transaction: optional; transaction control point when configured to generate commits. All Input: default; always a transaction control point; generates commits when it has one output group or when configured to generate commits, otherwise it generates an open transaction.
♦ Expression. Row: default. Transaction and All Input: do not display.
♦ External Procedure. Row: default. Transaction and All Input: do not display.
♦ Filter. Row: default. Transaction and All Input: do not display.
♦ HTTP. Row: default (read only).
♦ Java*. Row: default for passive transformations. Transaction: optional for active transformations. All Input: default for active transformations.
♦ Joiner. Transaction: optional; transaction control point. All Input: default.
♦ Lookup. Row: default. Transaction and All Input: do not display.
♦ MQ Source Qualifier. n/a; transaction control point.
♦ Normalizer (VSAM). n/a; transaction control point.
♦ Normalizer (relational). Row: default. Transaction and All Input: do not display.
♦ Rank. Transaction: optional; transaction control point. All Input: default.
♦ Router. Row: default. Transaction and All Input: do not display.
♦ Sorter. Transaction: optional; transaction control point. All Input: default.
♦ Sequence Generator. Row: default. Transaction and All Input: do not display.
♦ Source Qualifier. n/a; transaction control point.
♦ SQL. Row: default for script mode SQL transformations. Transaction: optional; transaction control point when configured to generate commits. All Input: default for query mode SQL transformations.
♦ Stored Procedure. Row: default. Transaction and All Input: do not display.
♦ Transaction Control. Row: default; transaction control point. Transaction and All Input: do not display.
♦ Union. Row: default. Transaction and All Input: do not display.
♦ Unstructured Data. Row: default (read only).
♦ Update Strategy. Row: default. Transaction and All Input: do not display.
♦ XML Generator. Row: optional. Transaction: default; Transaction when the flush on commit is set to create a new document. All Input: does not display.
♦ XML Parser. Row: default. Transaction and All Input: do not display.
♦ XML Source Qualifier. n/a; transaction control point.

*For more information about how the Transformation Scope property affects Custom or Java transformations, see “Custom Transformation” or “Java Transformation” in the PowerCenter Transformation Guide.

Understanding Transaction Control Units


A transaction control unit is the group of targets connected to an active source that generates commits or an
effective transaction generator. A transaction control unit is a subset of a target load order group and may
contain multiple target connection groups.
When the Integration Service reaches the commit point for all targets in a transaction control unit, it issues
commits sequentially for each target.
Figure 12-6 illustrates transaction control units with a Transaction Control transformation:

Figure 12-6. Transaction Control Units

(Diagram: one target load order group contains two transaction control units. Transaction control unit 1 contains target connection groups 1, 2, and 3; transaction control unit 2 contains target connection group 4.)

Note that T5_ora1 uses the same connection name as T1_ora1 and T2_ora1. Because T5_ora1 is connected to
a separate Transaction Control transformation, it is in a separate transaction control unit and target connection
group. If you connect T5_ora1 to tc_TransactionControlUnit1, it will be in the same transaction control unit
as all targets, and in the same target connection group as T1_ora1 and T2_ora1.

RELATED TOPICS:
♦ “Working with Target Connection Groups” on page 177

Rules and Guidelines


Consider the following rules and guidelines when you work with transaction control:



♦ Transformations with Transaction transformation scope must receive data from a single transaction control
point.
♦ The Integration Service uses the transaction boundaries defined by the first upstream transaction control
point for transformations with Transaction transformation scope.
♦ Transaction generators can be effective or ineffective for a target. The Integration Service uses the
transaction generated by an effective transaction generator when it loads data to a target. For more
information about effective and ineffective transaction generators, see the PowerCenter Transformation
Guide.
♦ The Workflow Manager prevents you from using incremental aggregation in a session with an Aggregator
transformation with Transaction transformation scope.
♦ Transformations with All Input transformation scope cause a transaction generator to become ineffective for
a target in a user-defined commit session. For more information about using transaction generators in
mappings, see the PowerCenter Transformation Guide.
♦ The Integration Service resets any cache at the beginning of each transaction for Aggregator, Joiner, Rank,
and Sorter transformations with Transaction transformation scope.
♦ You can choose the Transaction transformation scope for Joiner transformations when you use sorted input.
♦ When you add a partition point at a transformation with Transaction transformation scope, the Workflow
Manager uses the pass-through partition type by default. You cannot change the partition type.

Creating Target Files by Transaction


You can generate a separate output file each time the Integration Service starts a new transaction. You can
dynamically name each target flat file.
To generate a separate output file for each transaction, add a FileName port to the flat file target definition.
When you connect the FileName port in the mapping, the Integration Service writes a separate target file at each
commit. The Integration Service uses the FileName port value from the first row in each transaction to name
the output file. For more information about creating target files by transaction, see the PowerCenter Designer
Guide.

Setting Commit Properties


When you create a session, you can configure commit properties. The properties you set depend on the type of
mapping and the type of commit you want the Integration Service to perform. Configure commit properties in
the General Options settings of the Properties tab.
Table 12-2 describes the session commit properties that you set in the General Options settings of the
Properties tab:

Table 12-2. Session Commit Properties

Commit Type
- Target-Based: Selected by default if no transaction generator or only ineffective transaction generators are in the mapping.
- Source-Based: Choose for source-based commit if no transaction generator or only ineffective transaction generators are in the mapping.
- User-Defined: Selected by default if effective transaction generators are in the mapping.

Commit Interval*
- Target-Based: Default is 10,000.
- Source-Based: Default is 10,000.
- User-Defined: n/a

Commit on End of File
- Target-Based: Commits data at the end of the file. Enabled by default. You cannot disable this option.
- Source-Based: Commits data at the end of the file. Clear this option if you want the Integration Service to roll back open transactions.
- User-Defined: Commits data at the end of the file. Clear this option if you want the Integration Service to roll back open transactions.

Roll Back Transactions on Errors
- Target-Based, Source-Based, and User-Defined: If the Integration Service encounters a non-fatal error, you can choose to roll back the transaction at the next commit point. When the Integration Service encounters a transformation error, it rolls back the transaction if the error occurs after the effective transaction generator for the target.

* Tip: When you bulk load to Microsoft SQL Server or Oracle targets, define a large commit interval. Microsoft SQL Server and
Oracle start a new bulk load transaction after each commit. Increasing the commit interval reduces the number of bulk load
transactions and increases performance.

CHAPTER 13

Recovering Workflows
This chapter includes the following topics:
♦ Overview, 221
♦ State of Operation, 222
♦ Recovery Options, 225
♦ Configuring Workflow Recovery, 226
♦ Configuring Task Recovery, 227
♦ Resuming Sessions, 229
♦ Working with Repeatable Data, 230
♦ Steps to Recover Workflows and Tasks, 234
♦ Rules and Guidelines for Session Recovery, 235

Overview
Workflow recovery allows you to continue processing the workflow and workflow tasks from the point of
interruption. You can recover a workflow if the Integration Service can access the workflow state of operation.
The workflow state of operation includes the status of tasks in the workflow and workflow variable values. The
Integration Service stores the state in memory or on disk, based on how you configure the workflow:
♦ Enable recovery. When you enable a workflow for recovery, the Integration Service saves the workflow state
of operation in a shared location. You can recover the workflow if it terminates, stops, or aborts. The
workflow does not have to be running.
♦ Suspend. When you configure a workflow to suspend on error, the Integration Service stores the workflow
state of operation in memory. You can recover the suspended workflow if a task fails. You can fix the task
error and recover the workflow.
The Integration Service recovers tasks in the workflow based on the recovery strategy of the task. By default, the
recovery strategy for Session and Command tasks is to fail the task and continue running the workflow. You
can configure the recovery strategy for Session and Command tasks. The strategy for all other tasks is to restart
the task.
When you have high availability, PowerCenter recovers a workflow automatically if a service process that is
running the workflow fails over to a different node. You can configure a running workflow to recover a task
automatically when the task terminates. PowerCenter also recovers a session and workflow after a database
connection interruption.

When the Integration Service runs in safe mode, it stores the state of operation for workflows configured for
recovery. If the workflow fails when the Integration Service fails over to a backup node, the Integration Service does
not automatically recover the workflow. You can manually recover the workflow if you have the appropriate
privileges on the Integration Service.

State of Operation
When you recover a workflow or session, the Integration Service restores the workflow or session state of
operation to determine where to begin recovery processing. The Integration Service stores the workflow state of
operation in memory or on disk based on the way you configure the workflow. The Integration Service stores
the session state of operation based on the way you configure the session.

Workflow State of Operation


The Integration Service stores the workflow state of operation when you enable the workflow for recovery or for
suspension. When the workflow is suspended, the state of operation is in memory.
When you enable a workflow for recovery, the Integration Service stores the workflow state of operation in the
shared location, $PMStorageDir. The Integration Service can restore the state of operation to recover a
stopped, aborted, or terminated workflow. When it performs recovery, it restores the state of operation to
recover the workflow from the point of interruption. When the workflow completes, the Integration Service
removes the workflow state of operation from the shared folder.
The workflow state of operation includes the following information:
♦ Active service requests
♦ Completed and running task status
♦ Workflow variable values
When you run concurrent workflows, the Integration Service appends the instance name or the workflow run
ID to the workflow recovery storage file in $PMStorageDir.
When you enable a workflow for recovery, the Integration Service does not store the session state of operation
by default. You can configure the session recovery strategy to save the session state of operation.

Session State of Operation


When you configure the session recovery strategy to resume from the last checkpoint, the Integration Service
stores the session state of operation in the shared location, $PMStorageDir. The Integration Service also saves
relational target recovery information in target database tables. When the Integration Service performs
recovery, it restores the state of operation to recover the session from the point of interruption. It uses the target
recovery data to determine how to recover the target tables.
You can configure the session to save the session state of operation even if you do not save the workflow state of
operation. You can recover the session, or you can recover the workflow from the session.
The session state of operation includes the following information:
♦ Source. If the output from a source is not deterministic and repeatable, the Integration Service saves the
result from the SQL query to a shared storage file in $PMStorageDir.
♦ Transformation. The Integration Service creates checkpoints in $PMStorageDir to determine where to start
processing the pipeline when it runs a recovery session.
When you run a session with an incremental Aggregator transformation, the Integration Service creates a
backup of the Aggregator cache files in $PMCacheDir at the beginning of a session run. The Integration
Service promotes the backup cache to the initial cache at the beginning of a session recovery run.



♦ Relational target recovery data. The Integration Service writes recovery information to recovery tables in
the target database to determine the last row committed to the target when the session was interrupted.

RELATED TOPICS:
♦ “Working with Repeatable Data” on page 230

Target Recovery Tables


When the Integration Service runs a session that has a resume recovery strategy, it writes to recovery tables on
the target database system. When the Integration Service recovers the session, it uses information in the
recovery tables to determine where to begin loading data to target tables.
If you want the Integration Service to create the recovery tables, grant table creation privilege to the database
user name configured in the target database connection. If you do not want the Integration Service to create the
recovery tables, create the recovery tables manually.
The Integration Service creates the following recovery tables in the target database:
♦ PM_RECOVERY. Contains target load information for the session run. The Integration Service removes
the information from this table after each successful session and initializes the information at the beginning
of subsequent sessions.
♦ PM_TGT_RUN_ID. Contains information the Integration Service uses to identify each target on the
database. The information remains in the table between session runs. If you manually create this table, you
must create a row and enter a value other than zero for LAST_TGT_RUN_ID to ensure that the session
recovers successfully.
♦ PM_REC_STATE. Contains information the Integration Service uses to determine if it needs to write
messages to the target table during recovery for a real-time session.
If you edit or drop the recovery tables before you recover a session, the Integration Service cannot recover the
session. If you disable recovery, the Integration Service does not remove the recovery tables from the target
database. You must manually remove the recovery tables.
Table 13-1 describes the format of PM_RECOVERY:

Table 13-1. PM_RECOVERY Table Definition

Column Name Datatype

REP_GID VARCHAR(240)

WFLOW_ID INTEGER

WFLOW_RUN_ID INTEGER

WFLOW_RUN_INS_NAME VARCHAR(240)

SUBJ_ID INTEGER

TASK_INST_ID INTEGER

TGT_INST_ID INTEGER

PARTITION_ID INTEGER

TGT_RUN_ID INTEGER

RECOVERY_VER INTEGER

CHECK_POINT INTEGER

ROW_COUNT INTEGER



Table 13-2 describes the format of PM_TGT_RUN_ID:

Table 13-2. PM_TGT_RUN_ID Table Definition

Column Name Datatype

LAST_TGT_RUN_ID INTEGER

Table 13-3 describes the format of PM_REC_STATE:

Table 13-3. PM_REC_STATE Table Definition

Column Name Datatype

OWNER_TYPE_ID INTEGER

REP_GID VARCHAR(240)

FOLDER_ID INTEGER

WFLOW_ID INTEGER

WFLOW_RUN_INS_NAME VARCHAR(240)

WLET_ID INTEGER

TASK_INST_ID INTEGER

WID_INST_ID INTEGER

GROUP_ID INTEGER

PART_ID INTEGER

PLUGIN_ID INTEGER

APPL_ID VARCHAR(38)

SEQ_NUM INTEGER

VERSION INTEGER

CHKP_NUM INTEGER

STATE_DATA VARCHAR(1024)

Oracle uses the NUMBER datatype instead of the INTEGER datatype.


Note: When concurrent recovery sessions write to the same target database, the Integration Service may
encounter a deadlock on PM_RECOVERY. To retry writing to PM_RECOVERY on deadlock, configure the
Session Retry on Deadlock option for the session.

RELATED TOPICS:
♦ “Deadlock Retry” on page 171

♦ “PM_REC_STATE Table” on page 199

Creating Target Recovery Tables


You can manually create the target recovery tables. Informatica provides SQL scripts in the following directory:
<PowerCenter installation_dir>\server\bin\RecoverySQL

Run one of the following scripts to create the recovery tables in the target database:

Script Database

create_schema_db2.sql IBM DB2
create_schema_inf.sql Informix
create_schema_ora.sql Oracle
create_schema_sql.sql Microsoft SQL Server
create_schema_syb.sql Sybase
create_schema_ter.sql Teradata
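For example, on an Oracle target you might change to the RecoverySQL directory shown above and run the Oracle script with SQL*Plus. This is only a sketch: the connection placeholders stand for the database user configured in the target connection, and your database client invocation may differ:

sqlplus <target user>/<target password>@<target database> @create_schema_ora.sql

If you create the tables by hand instead of running a script, remember that PM_TGT_RUN_ID must contain a row with a value other than zero for LAST_TGT_RUN_ID, for example:

sqlplus <target user>/<target password>@<target database> <<EOF
INSERT INTO PM_TGT_RUN_ID (LAST_TGT_RUN_ID) VALUES (1);
COMMIT;
EOF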

Recovery Options
To perform recovery, you must configure the mapping, workflow tasks, and the workflow for recovery.
Table 13-4 describes the options that you can configure for recovery:

Table 13-4. Configurable Options for Recovery

Suspend Workflow on Error (Workflow): Suspends the workflow when a task in the workflow fails. You can fix the failed tasks and recover a suspended workflow. For more information, see "Recovering Suspended Workflows" on page 227.

Suspension Email (Workflow): Sends an email when the workflow suspends. For more information, see "Recovering Suspended Workflows" on page 227.

Enable HA Recovery (Workflow): Saves the workflow state of operation in a shared location. You do not need high availability to enable workflow recovery. For more information, see "Configuring Workflow Recovery" on page 226.

Automatically Recover Terminated Tasks (Workflow): Recovers terminated Session and Command tasks while the workflow is running. You must have the high availability option. For more information, see "Automatically Recovering Terminated Tasks" on page 229.

Maximum Automatic Recovery Attempts (Workflow): The number of times the Integration Service attempts to recover a Session or Command task. For more information, see "Automatically Recovering Terminated Tasks" on page 229.

Recovery Strategy (Session, Command): The recovery strategy for a Session or Command task. Determines how the Integration Service recovers a Session or Command task during workflow recovery and how it recovers a session during session recovery. For more information, see "Configuring Task Recovery" on page 227.

Fail Task If Any Command Fails (Command): Enables the Command task to fail if any of the commands in the task fail. If you do not set this option, the task continues to run when any of the commands fail. You can use this option with Suspend Workflow on Error to suspend the workflow if any command in the task fails. For more information, see "Configuring Task Recovery" on page 227.

Output is Deterministic (Transformation): Indicates that the transformation always generates the same set of data from the same input data. The Integration Service can resume a session from the last checkpoint when the output is repeatable and deterministic. When you enable this option with the Output is Repeatable option for a relational source qualifier, the Integration Service does not save the SQL results to shared storage. For more information, see "Output is Deterministic" on page 231.

Output is Repeatable (Transformation): Indicates whether the transformation generates rows in the same order between session runs. The Integration Service can resume a session from the last checkpoint when the output is repeatable and deterministic. When you enable this option with the Output is Deterministic option for a relational source qualifier, the Integration Service does not save the SQL results to shared storage. For more information, see "Output is Repeatable" on page 231.

Warning: If you configure a transformation as repeatable and deterministic, it is your responsibility to ensure
that the data is repeatable. If you try to recover a session with transformations that do not generate repeatable
and deterministic data, the recovery process can result in corrupted data.

Configuring Workflow Recovery


To configure a workflow for recovery, you must enable the workflow for recovery or configure the workflow to
suspend on task error. When the workflow is configured for recovery, you can recover it if it stops, aborts,
terminates, or suspends.
Table 13-5 describes each recoverable workflow status:

Table 13-5. Recoverable Workflow Status

Aborted: You abort the workflow in the Workflow Monitor or through pmcmd. You can also choose to abort all running workflows when you disable the service process in the Administration Console. You can recover an aborted workflow if you enable the workflow for recovery. You can recover an aborted workflow in the Workflow Monitor or by using pmcmd.

Stopped: You stop the workflow in the Workflow Monitor or through pmcmd. You can also choose to stop all running workflows when you disable the service or service process in the Administration Console. You can recover a stopped workflow if you enable the workflow for recovery. You can recover a stopped workflow in the Workflow Monitor or by using pmcmd.

Suspended: A task fails and the workflow is configured to suspend on a task error. If multiple tasks are running, the Integration Service suspends the workflow when all running tasks either succeed or fail. You can fix the errors that caused the task or tasks to fail before you run recovery. By default, a workflow continues after a task fails. To suspend the workflow when a task fails, configure the workflow to suspend on task error.

Terminated: The service process running the workflow shuts down unexpectedly. Tasks terminate on all nodes running the workflow. A workflow can terminate when a task in the workflow terminates and you do not have the high availability option. You can recover a terminated workflow if you enable the workflow for recovery. When you have high availability, the service process fails over to another node and workflow recovery starts.

Note: A failed workflow is a workflow that completes with failure. You cannot recover a failed workflow.
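The Stopped and Aborted statuses in Table 13-5 can also be produced from the command line. The following sketch uses angle-bracket placeholders for the service, domain, user, folder, and workflow names, and option behavior can vary slightly by PowerCenter version:

pmcmd stopworkflow -sv <integration service name> -d <domain name> -u <user name> -p <password> -f <folder name> <workflow name>
pmcmd abortworkflow -sv <integration service name> -d <domain name> -u <user name> -p <password> -f <folder name> <workflow name>

If the workflow is enabled for recovery, you can then recover it as described in "Steps to Recover Workflows and Tasks" on page 234.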

For more information about workflow status, see "Workflow and Task Status" on page 346.



Recovering Stopped, Aborted, and Terminated Workflows
When you enable a workflow for recovery, the Integration Service saves the workflow state of operation to a file
during the workflow run. You can recover a stopped, terminated, or aborted workflow. Enable recovery on the
Properties tab of the workflow.

Recovering Suspended Workflows


You can configure a workflow to suspend if a task in the workflow fails. By default, a workflow continues to run
when a task fails. You can suspend the workflow at task failure, fix the task that failed, and recover the
workflow. When you suspend a workflow, the workflow state of operation stays in memory. You can fix the
error that caused the task to fail and recover the workflow from the point of interruption. If the task fails again,
the Integration Service suspends the workflow again. You can recover a suspended workflow, but you cannot
restart it. Configure a workflow to suspend on the General tab of the workflow properties.
You can also configure the workflow to send an email when the workflow suspends.

RELATED TOPICS:
♦ “Working with Suspension Email” on page 244

Configuring Task Recovery


When you recover a workflow, the Integration Service recovers the tasks based on the recovery strategy for each
task. Depending on the task, the recovery strategy can be fail task and continue workflow, resume from the last
checkpoint, or restart task.
When you enable workflow recovery, you can recover a task that you abort or stop. You can recover a task that
terminates due to network or service process failures. When you configure a workflow to suspend on error, you
can recover a failed task when you recover the workflow.
Table 13-6 describes each recoverable task status:

Table 13-6. Recoverable Task Statuses

Aborted: You abort the workflow or task in the Workflow Monitor or through pmcmd. You can also choose to abort all running workflows when you disable the service or service process in the Administration Console. You can also configure a session to abort based on mapping conditions. You can recover the workflow in the Workflow Monitor to recover the task or you can recover the workflow using pmcmd.

Stopped: You stop the workflow or task in the Workflow Monitor or through pmcmd. You can also choose to stop all running workflows when you disable the service or service process in the Administration Console. You can recover the workflow in the Workflow Monitor to recover the task or you can recover the workflow using pmcmd.

Failed: The Integration Service failed the task due to errors. You can recover a failed task using workflow recovery when the workflow is configured to suspend on task failure. When the workflow is not suspended, you can recover a failed task by recovering just the session or recovering the workflow from the session. You can fix the error and recover the workflow in the Workflow Monitor or you can recover the workflow using pmcmd.

Terminated: The Integration Service stops unexpectedly or loses network connection to the master service process. You can recover the workflow in the Workflow Monitor or you can recover the workflow using pmcmd after the Integration Service restarts.



Task Recovery Strategies
Each task in a workflow has a recovery strategy. When the Integration Service recovers a workflow, it recovers
tasks based on the recovery strategy:
♦ Restart task. When the Integration Service recovers a workflow, it restarts each recoverable task that is
configured with a restart strategy. You can configure Session and Command tasks with a restart recovery
strategy. All other tasks have a restart recovery strategy by default.
♦ Fail task and continue workflow. When the Integration Service recovers a workflow, it does not recover the
task. The task status becomes failed, and the Integration Service continues running the workflow.
Configure a fail recovery strategy if you want to complete the workflow, but you do not want to recover the
task. You can configure Session and Command tasks with the fail task and continue workflow recovery
strategy.
♦ Resume from the last checkpoint. The Integration Service recovers a stopped, aborted, or terminated
session from the last checkpoint. You can configure a Session task with a resume strategy.
Table 13-7 describes the recovery strategy for each task type:

Table 13-7. Recovery Strategy by Task Type

Assignment: Restart task.

Command: Restart task, or fail task and continue workflow. Default is fail task and continue workflow.

Control: Restart task.

Decision: Restart task.

Email: Restart task. The Integration Service might send duplicate email.

Event-Raise: Restart task.

Event-Wait: Restart task.

Session: Resume from the last checkpoint, restart task, or fail task and continue workflow. Default is fail task and continue workflow.

Timer: Restart task. If you use a relative time from the start time of a task or workflow, set the timer with the original value less the passed time.

Worklet: n/a. The Integration Service does not recover a worklet. You can recover the session in the worklet by expanding the worklet in the Workflow Monitor and choosing Recover Task.

RELATED TOPICS:
♦ “Configuring Task Recovery” on page 227

♦ “Session Task Strategies” on page 229


♦ “Resuming Sessions” on page 229

Command Task Strategies


When you configure a Command task, you can choose a recovery strategy to restart or fail:
♦ Fail task and continue workflow. If you want to suspend the workflow on Command task error, you must
configure the task with a fail strategy. If the Command task has more than one command, and you configure
a fail strategy, you need to configure the task to fail if any command fails.
♦ Restart task. When the Integration Service recovers a workflow, it restarts a Command task that is
configured with a restart strategy.



Configure the recovery strategy on the Properties page of the Command task.

RELATED TOPICS:
♦ “Working with the Command Task” on page 100

Session Task Strategies


When you configure a session for recovery, you can recover the session when you recover a workflow, or you
can recover the session without running the rest of the workflow.
When you configure a session, you can choose a recovery strategy of fail, restart, or resume:
♦ Resume from the last checkpoint. The Integration Service saves the session state of operation and maintains
target recovery tables. If the session aborts, stops, or terminates, the Integration Service uses the saved
recovery information to resume the session from the point of interruption.
♦ Restart task. The Integration Service runs the session again when it recovers the workflow. When you
recover with restart task, you might need to remove the partially loaded data in the target or design a
mapping to skip the duplicate rows.
♦ Fail task and continue workflow. When the Integration Service recovers a workflow, it does not recover the
session. The session status becomes failed, and the Integration Service continues running the workflow.
Configure the recovery strategy on the Properties page of the Session task.

Automatically Recovering Terminated Tasks


When you have the high availability option, you can configure automatic recovery of terminated tasks. When
you enable automatic task recovery, the Integration Service recovers terminated Session and Command tasks
without user intervention if the workflow is still running. You configure the number of times the Integration
Service attempts to recover the task. Enable automatic task recovery in the workflow properties.

Resuming Sessions
When you configure session recovery to resume from the last checkpoint, the Integration Service creates
checkpoints in $PMStorageDir to determine where to start processing session recovery. When the Integration
Service resumes a session, it restores the session state of operation, including the state of each source, target, and
transformation. The Integration Service determines how much of the source data it needs to process.
When the Integration Service resumes a session, the recovery session must produce the same data as the original
session. The session is not valid if you configure recovery to resume from the last checkpoint, but the session
cannot produce repeatable data.
The Integration Service can recover flat file sources including FTP sources. It can truncate or append to flat file
and FTP targets.
When you recover a session from the last checkpoint, the Integration Service restores the session state of
operation to determine the type of recovery it can perform:
♦ Incremental. The Integration Service starts processing data at the point of interruption. It does not read or
transform rows that it processed before the interruption. By default, the Integration Service attempts to
perform incremental recovery.
♦ Full. The Integration Service reads all source rows again and performs all transformation logic if it cannot
perform incremental recovery. The Integration Service begins writing to the target at the last commit point.
If any session component requires full recovery, the Integration Service performs full recovery on the session.



Table 13-8 describes when the Integration Service performs incremental or full recovery, depending on the
session configuration:

Table 13-8. Incremental and Full Recovery Session Recovery Situations

Commit type
- Incremental recovery: The session uses a source-based commit. The mapping does not contain any transformation that generates commits.
- Full recovery: The session uses a target-based commit or user-defined commit.

Transformation Scope
- Incremental recovery: Transformations propagate transactions and the transformation scope must be Transaction or Row.
- Full recovery: At least one transformation is configured with the All Input transformation scope.

File Source
- Incremental recovery: A file source supports incremental reads.
- Full recovery: n/a

FTP Source
- Incremental recovery: The FTP server must support the seek operation to allow incremental reads.
- Full recovery: The FTP server does not support the seek operation.

Relational Source
- Incremental recovery: A relational source supports incremental reads when the output is deterministic and repeatable. If the output is not deterministic and repeatable, the Integration Service supports incremental relational source reads by staging SQL results to a storage file.
- Full recovery: n/a

VSAM Source
- Incremental recovery: n/a
- Full recovery: The Integration Service performs full recovery.

XML Source
- Incremental recovery: n/a
- Full recovery: The Integration Service performs full recovery.

XML Generator Transformation
- Incremental recovery: An XML Generator transformation must be configured with Transaction transformation scope.
- Full recovery: n/a

XML Target
- Incremental recovery: An XML target must be configured to generate a new XML document on commit.
- Full recovery: n/a

Working with Repeatable Data


When you configure recovery to resume from the last checkpoint, the recovery session must be able to produce
the same data in the same order as the original session.
When you validate a session, the Workflow Manager verifies that the transformations are configured to produce
repeatable and deterministic data. The session is not valid if you configure recovery to resume from the last
checkpoint, but the transformations are not configured for repeatable and deterministic data.
Session data is repeatable when all targets receive repeatable data from the following mapping objects:
♦ Source. The output data from the source is repeatable between the original run and the recovery run. For
more information, see “Source Repeatability” on page 230.
♦ Transformation. The output data from each transformation to the target is repeatable. For more
information, see “Transformation Repeatability” on page 231.

Source Repeatability
You can resume a session from the last checkpoint when each source generates the same set of data and the
order of the output is repeatable between runs. Source data is repeatable based on the type of source in the
session.



Relational Source
A relational source might produce data that is not the same or in the same order between workflow runs. When
you configure recovery to resume from the last checkpoint, the Integration Service stores the SQL result in a
cache file to guarantee the output order for recovery.
If you know the SQL result will be the same between workflow runs, you can configure the source qualifier to
indicate that the data is repeatable and deterministic. When the relational source output is deterministic and
the output is always repeatable, the Integration Service does not store the SQL result in a cache file. When the
relational output is not repeatable, the Integration Service can skip creating the cache file if a transformation in
the mapping always produces ordered data.

SDK Source
If an SDK source produces repeatable data, you can enable Output is Deterministic and Output is Repeatable
in the SDK Source Qualifier transformation.

Flat File Source


A flat file does not change between session and recovery runs. If you change a source file before you recover a
session, the recovery session might produce unexpected results.

Transformation Repeatability
You can configure a session to resume from the last checkpoint when transformations in the session produce the
same data between the session and recovery run. All transformations have properties that determine if the
transformation can produce repeatable data. A transformation can produce the same data between a session and
recovery run if the output is deterministic and the output is repeatable.
Warning: If you configure a transformation as repeatable and deterministic, it is your responsibility to ensure
that the data is repeatable. If you try to recover a session with transformations that do not generate repeatable
and deterministic data, the recovery process can result in corrupted data.

Output is Deterministic
A transformation generates deterministic output when it always creates the same output data from the same
input data.

Output is Repeatable
A transformation generates repeatable data when it generates rows in the same order between session runs.
Transformations produce repeatable data based on the transformation type, the transformation configuration,
or the mapping configuration.
Transformations produce repeatable data in the following circumstances:
♦ Always. The order of the output data is consistent between session runs even if the order of the input data is
inconsistent between session runs.
♦ Based on input order. The transformation produces repeatable data between session runs when the order of
the input data from all input groups is consistent between session runs. If the input data from any input
group is not ordered, then the output is not ordered.
When a transformation generates repeatable data based on input order, during session validation, the
Workflow Manager validates the mapping to determine if the transformation can produce repeatable data.
For example, an Expression transformation produces repeatable data only if it receives repeatable data.
♦ Never. The order of the output data is inconsistent between session runs.



Configuring a Mapping for Recovery
You can configure a mapping to enable transformations in the session to produce the same data between the
session and recovery run. When a mapping contains a transformation that never produces repeatable data, you
can add a transformation that always produces repeatable data immediately after it.
For example, you connect a transformation that never produces repeatable data directly to a transformation that
produces repeatable data based on input order. You cannot configure recovery to resume from the last
checkpoint unless the data is repeatable. To enable the session for recovery, you can add a transformation that
always produces repeatable data after the transformation that never produces repeatable data.
The following figure shows a mapping that you cannot recover with resume from the last checkpoint:

[Figure: the Union and Custom transformations never produce repeatable data; the Lookup transformation produces repeatable data only if it receives repeatable data.]

The mapping contains two Source Qualifier transformations that produce repeatable data. The mapping
contains a Union and Custom transformation that never produce repeatable data. The Lookup transformation
produces repeatable data when it receives repeatable data. Therefore, the target does not receive repeatable data
and you cannot configure the session to resume recovery.
You can modify the mapping to enable resume recovery. Add a Sorter transformation configured for distinct
output rows immediately after the transformations that never output repeatable data. Add the Sorter
transformation after the Custom transformation.
The following figure shows the mapping with a Sorter transformation connected to the Custom transformation:

[Figure: the Sorter transformation, configured for distinct output rows, is added immediately after the Custom transformation and always produces repeatable data.]

The Lookup transformation produces repeatable data because it receives repeatable data from the Sorter
transformation.



Table 13-9 describes when transformations produce repeatable data:

Table 13-9. Repeatable Data in Transformations

Aggregator: Always.

Application Source Qualifier*: Based on input order.

Unstructured Data: Based on input order. Configure the property according to the transformation procedure behavior.

Custom*: Based on input order. Configure the property according to the transformation procedure behavior.

Expression: Based on input order.

External Procedure*: Based on input order. Configure the property according to the transformation procedure behavior.

Filter: Based on input order.

HTTP: Based on input order. Configure the property according to the transformation procedure behavior.

Joiner: Based on input order.

Java: Based on input order. Configure the property according to the transformation procedure behavior.

Lookup, dynamic: Always. The lookup source must be the same as a target in the session.

Lookup, static: Based on input order.

MQ Source Qualifier: Always.

Normalizer, pipeline: Based on input order.

Normalizer, VSAM: Always. The normalizer generates source data in the form of unique primary keys. When you resume a session the session might generate different key values than if it completed successfully.

Rank: Always.

Router: Based on input order.

Sequence Generator: Always. The Integration Service stores the current value to the repository.

Sorter, configured for distinct output rows: Always.

Sorter, not configured for distinct output rows: Based on input order.

Source Qualifier, flat file: Always.

Source Qualifier, relational*: Based on input order. Configure the transformation according to the source data. The Integration Service stages the data if the data is not repeatable.

SQL Transformation: Based on input order. Configure the transformation according to the source data.

Stored Procedure*: Based on input order. Configure the property according to the transformation procedure behavior.

Transaction Control: Based on input order.

Union: Never.

Update Strategy: Based on input order.

XML Generator: Always.

XML Parser: Based on input order. Configure the transformation according to the source data.

XML Source Qualifier: Always.

* You can configure the Output is Repeatable and Output is Deterministic properties for some transformations. Or you can add
a transformation that produces repeatable data immediately after the transformation.

Steps to Recover Workflows and Tasks


You can recover a workflow if you configure the workflow for recovery. You can recover a session when you
configure a session recovery strategy. When you configure a session recovery strategy, you do not have to enable
workflow recovery to recover a session.
You can use one of the following methods to recover a workflow or task:
♦ Recover a workflow. Continue processing the workflow from the point of interruption.
♦ Recover a session. Recover a session but not the rest of the workflow.
♦ Recover a workflow from a session. Recover a session and continue processing a workflow.
If the Integration Service uses operating system profiles, recover the session or workflow using the same
operating system profile that the Integration Service used to run the session or workflow.
If you want to restart a workflow or task without recovery, you can restart the workflow or task in cold start
mode. Recovery behavior for real-time sessions varies depending on the real-time source.

RELATED TOPICS:
♦ “Restarting and Recovering Real-time Sessions” on page 202

♦ “Restarting a Task or Workflow Without Recovery” on page 344

Recovering a Workflow
When you recover a workflow, the Integration Service restores the workflow state of operation and continues
processing from the point of failure. The Integration Service uses the task recovery strategy to recover the task
that failed.
You configure a workflow for recovery by configuring the workflow to suspend when a task fails, or by enabling
recovery in the Workflow Properties.
You can recover a workflow using the Workflow Manager, the Workflow Monitor, or pmcmd.

To recover a workflow using the Workflow Manager:

1. Select the workflow in the Navigator or open the workflow in the Workflow Designer workspace.
2. Right-click the workflow and choose Recover Workflow.
The Integration Service recovers the interrupted tasks and runs the rest of the workflow.

To recover a workflow using the Workflow Monitor:

1. Select the workflow in the Workflow Monitor.


2. Right-click the workflow and choose Recover.
The Integration Service recovers the failed tasks and runs the rest of the workflow.
You can also use the pmcmd recoverworkflow command to recover a workflow.
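For example, a recovery run started from the command line might look like the following sketch. The angle-bracket values are placeholders for your environment, and option behavior can vary slightly by PowerCenter version:

pmcmd recoverworkflow -sv <integration service name> -d <domain name> -u <user name> -p <password> -f <folder name> <workflow name>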



Recovering a Session
You can recover a failed, terminated, aborted, or stopped session without recovering the workflow. If the
workflow completed, you can recover the session without running the rest of the workflow. You must configure
a recovery strategy of restart or resume from the last checkpoint to recover a session. The Integration Service
recovers the session according to the task recovery strategy. You do not need to suspend the workflow or enable
workflow recovery to recover a session.

To recover a session from the Workflow Monitor:

1. Double-click the workflow in the Workflow Monitor to expand it and display the task.
2. Right-click the session and choose Recover Task.
The Integration Service recovers the failed session according to the recovery strategy.
You can also use the pmcmd starttask command with the -recover option to recover a session.
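The following sketch shows one possible form of that command, where the angle-bracket values are placeholders and -w identifies the workflow that contains the session; verify the exact option names against the pmcmd reference for your version:

pmcmd starttask -sv <integration service name> -d <domain name> -u <user name> -p <password> -f <folder name> -w <workflow name> -recover <session name>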

RELATED TOPICS:
♦ “Task Recovery Strategies” on page 228

Recovering a Workflow From a Session


If a session stops, aborts, or terminates and the workflow does not complete, you can recover the workflow from
a session if you configured a session recovery strategy. When you recover the session, the Integration Service
uses the recovery strategy to recover the session and continue the workflow. You can recover a session even if
you do not suspend the workflow or enable workflow recovery.

To recover a workflow from a session in the Workflow Monitor:

1. Double-click the workflow in the Workflow Monitor to expand it and display the session.
2. Right-click the session and choose Recover Workflow from Task.
The Integration Service recovers the failed session according to the recovery strategy.
You can use the pmcmd startworkflow command with the -recover option to recover a workflow from a session.
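A minimal sketch of that command, again with placeholder names and with option names as described in this section:

pmcmd startworkflow -sv <integration service name> -d <domain name> -u <user name> -p <password> -f <folder name> -recover <workflow name>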
Note: To recover a session within a worklet, expand the worklet and then choose to recover the task.

Rules and Guidelines for Session Recovery


Use the following rules and guidelines when recovering sessions:
♦ The Integration Service creates a new session log when it runs a recovery session.
♦ A session reports performance statistics for the last successful run.
♦ You can recover a session containing a transformation that uses the random number generator (RAND)
function if you provide a seed parameter.

Configuring Recovery to Resume from the Last Checkpoint


Use the following rules and guidelines when configuring recovery to resume from last checkpoint:
♦ You must use pass-through partitioning for each transformation.
♦ You cannot configure recovery to resume from the last checkpoint for a session that runs on a grid.
♦ When you configure a session for full pushdown optimization, the Integration Service runs the session on
the database. As a result, it cannot perform incremental recovery if the session fails. When you perform
recovery for sessions that contain SQL overrides, the Integration Service must drop and recreate views.



♦ When you modify a workflow or session between the interrupted run and the recovery run, you might get
unexpected results. The Integration Service does not prevent recovery for a modified workflow. The recovery
workflow or session log displays a message if the workflow or the task was modified since the last run.
♦ The pre-session command and pre-SQL commands run only once when you resume a session from the last
checkpoint. If a pre- or post- command or SQL command fails, the Integration Service runs the command
again during recovery. Design the commands so you can rerun them.
♦ You cannot configure a session to resume if it writes to a relational target in bulk mode.

Unrecoverable Workflows or Tasks


In some cases, the Integration Service cannot recover a workflow or task. You cannot recover a workflow or task
under the following circumstances:
♦ You change the number of partitions. If you change the number of partitions after a session fails, the
recovery session fails.
♦ The interrupted task has a fail recovery strategy. If you configure a Command or Session recovery to fail
and continue the workflow recovery, the task is not recoverable.
♦ Recovery storage file is missing. The Integration Service fails the recovery session or workflow if the
recovery storage file is missing from $PMStorageDir or if the definition of $PMStorageDir changes between
the original and recovery run.
♦ Recovery table is empty or missing from the target database. The Integration Service fails a recovery session
under the following circumstances:
− You deleted the table after the Integration Service created it.
− The session enabled for recovery failed immediately after the Integration Service removed the recovery
information from the table.
You might get inconsistent data if you perform recovery under the following circumstances:
♦ The sources or targets change after the initial session. If you drop or create indexes or edit data in the
source or target tables before recovering a session, the Integration Service may return missing or repeat rows.
♦ The source or target code pages change after the initial session failure. If you change the source or target
code page, the Integration Service might return incorrect data. You can perform recovery if the code pages
are two-way compatible with the original code pages.



CHAPTER 14

Sending Email
This chapter includes the following topics:
♦ Overview, 237
♦ Configuring Email on UNIX, 238
♦ Configuring Email on Windows, 239
♦ Working with Email Tasks, 240
♦ Working with Post-Session Email, 242
♦ Working with Suspension Email, 244
♦ Using Service Variables to Address Email, 245
♦ Tips, 245

Overview
You can send email to designated recipients when the Integration Service runs a workflow. For example, if you
want to track how long a session takes to complete, you can configure the session to send an email containing
the time and date the session starts and completes. Or, if you want the Integration Service to notify you when a
workflow suspends, you can configure the workflow to send email when it suspends.
To send email when the Integration Service runs a workflow, perform the following steps:
♦ Configure the Integration Service to send email. Before creating Email tasks, configure the Integration
Service to send email. For more information, see “Configuring Email on UNIX” on page 238 and
“Configuring Email on Windows” on page 239.
If you use a grid or high availability in a Windows environment, you must use the same Microsoft Outlook
profile on each node to ensure the Email task can succeed.
♦ Create Email tasks. Before you can configure a session or workflow to send email, you need to create an
Email task. For more information, see “Working with Email Tasks” on page 240.
♦ Configure sessions to send post-session email. You can configure the session to send an email when the
session completes or fails. You create an Email task and use it for post-session email. For more information,
see “Working with Post-Session Email” on page 242.
When you configure the subject and body of post-session email, use email variables to include information
about the session run, such as session name, status, and the total number of rows loaded. You can also use
email variables to attach the session log or other files to email messages. For more information, see “Email
Variables and Format Tags” on page 242.

♦ Configure workflows to send suspension email. You can configure the workflow to send an email when the
workflow suspends. You create an Email task and use it for suspension email. For more information, see
“Working with Suspension Email” on page 244.
The Integration Service sends the email based on the locale set for the Integration Service process running the
session.
You can use parameters and variables in the email user name, subject, and text. For Email tasks and suspension
email, you can use service, service process, workflow, and worklet variables. For post-session email, you can use
any parameter or variable type that you can define in the parameter file. For example, you can use the
$PMSuccessEmailUser or $PMFailureEmailUser service variable to specify the email recipient for post-session
email. For more information about using service variables in email addresses, see “Using Service Variables to
Address Email” on page 245. For more information about defining parameters and variables in parameter files,
see “Parameter Files” on page 403.

Configuring Email on UNIX


The Integration Service on UNIX uses rmail to send email. To send email, the user who starts Informatica
Services must have the rmail tool installed in the path.
If you want to send email to more than one person, separate the email address entries with a comma. Do not
put spaces between addresses.

To verify the rmail tool is accessible on AIX:

1. Log in to the UNIX system as the PowerCenter user who starts the Informatica Services.
2. Type the following lines at the prompt and press Enter:
rmail <your fully qualified email address>,<second fully qualified email address>
From <your_user_name>

3. To indicate the end of the message, type ^D.


You should receive a blank email from the email account of the user you specify in the From line. If not,
locate the directory where rmail resides and add that directory to the path.

To verify the rmail tool is accessible on all other UNIX machines:

1. Log in to the UNIX system as the PowerCenter user who starts the Informatica Services.
2. Type the following line at the prompt and press Enter:
rmail <your fully qualified email address>,<second fully qualified email address>

3. To indicate the end of the message, enter a period (.) on a separate line and press Enter. Or, type ^D.
You should receive a blank email from the email account of the PowerCenter user. If not, locate the
directory where rmail resides and add that directory to the path.
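For example, the following sketch locates rmail and appends its directory to the path for the user who starts Informatica Services. The /usr/sbin location is only an illustration; use whichever directory the search returns on your system:

which rmail || find / -name rmail -type f 2>/dev/null
export PATH=$PATH:/usr/sbin

Add the export line to that user's shell profile so the setting persists across logins.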
After you verify that rmail is installed correctly, you can send email.

RELATED TOPICS:
♦ “Working with Email Tasks” on page 240



Configuring Email on Windows
The Integration Service on Windows uses Microsoft Outlook to send email using the MAPI interface. You must
meet the following requirements to send email on an Integration Service on Windows:
♦ Install the Microsoft Outlook mail client on each node configured to run the Integration Service.
♦ Run Microsoft Outlook on a Microsoft Exchange Server.
Complete the following steps to configure the Integration Service on Windows to send email:
1. Configure a Microsoft Outlook profile.
2. Configure Logon network security.
3. Create distribution lists in the Personal Address Book in Microsoft Outlook.
4. Verify the Integration Service is configured to send email using the Microsoft Outlook profile you created
in step 1.
The Integration Service on Windows sends email in MIME format. You can include characters in the subject
and body that are not in 7-bit ASCII. For more information about the MIME format or the MIME decoding
process, see the email documentation.

Step 1. Configure a Microsoft Outlook User


You must set up a profile for a Microsoft Outlook user before you can configure the Integration Service to send
email. The user profile must contain the following services:
♦ Microsoft Exchange Server
♦ Personal Address Book
Note: If you have high availability or if you use a grid, use the same profile for each node configured to run a
service process.

To configure a Microsoft Outlook user:

1. Open the Control Panel on the machine running the Integration Service process.
2. Double-click the Mail (or Mail and Fax) icon.
3. On the Services tab of the user Properties dialog box, click Show Profiles.
The Mail dialog box displays the list of profiles configured for the computer.
4. If you have already set up a Microsoft Outlook profile, skip to “Step 2. Configure Logon Network
Security” on page 240. If you do not already have a Microsoft Outlook profile set up, continue to step 5.
5. Click Add in the mail properties window.
The Microsoft Outlook Setup Wizard appears.
6. Select Use The Following Information Services and then select Microsoft Exchange Server. Click Next.
7. Enter a profile name and click Next.
8. Enter the name of the Microsoft Exchange Server. Enter the mailbox name. Click Next.
9. Indicate whether you travel with a computer. Click Next.
10. Enter the path to a personal address book. Click Next.
11. Indicate whether you want to run Outlook when you start Windows. Click Next.
The Setup Wizard indicates that you have successfully configured an Outlook profile.
12. Click Finish.



Step 2. Configure Logon Network Security
You must configure the Logon Network Security before you run the Microsoft Exchange Server.

To configure Logon Network Security for the Microsoft Exchange Server:

1. Open the Control Panel on the machine running the Integration Service process.
2. Double-click the Mail (or Mail and Fax) icon.
The User Properties sheet appears.
3. On the Services tab, select Microsoft Exchange Server and click Properties.
4. Click the Advanced tab. Set the Logon network security option to NT Password Authentication.
5. Click OK.

Step 3. Create Distribution Lists


When the Integration Service runs on Windows, you can enter one email address in the Workflow Manager. If
you want to send email to multiple recipients, create a distribution list containing these addresses in the
Personal Address Book in Microsoft Outlook. Enter the distribution list name as the recipient when configuring
email.
For more information about working with a Personal Address Book, refer to Microsoft Outlook documentation.

Step 4. Verify the Integration Service Settings


After you create the Microsoft Outlook profile, verify the Integration Service is configured to send email as that
Microsoft Outlook user. You may need to verify the profile with the domain administrator.

To verify the Microsoft Exchange profile in the Integration Service:

1. From the Administration Console, click the Properties tab for the Integration Service.
2. In the Configuration Properties tab, select Edit.
3. In the MSExchangeProfile field, verify that the name of Microsoft Exchange profile matches the Microsoft
Outlook profile you created.

Working with Email Tasks


You can send email during a workflow using the Email task on the Workflow Manager. You can create reusable
Email tasks in the Task Developer for any type of email. Or, you can create non-reusable Email tasks in the
Workflow and Worklet Designer.
Use Email tasks in any of the following locations:
♦ Session properties. You can configure the session to send email when the session completes or fails. For more
information, see “Working with Post-Session Email” on page 242.
♦ Workflow properties. You can configure the workflow to send email when the workflow is interrupted. For
more information, see “Working with Suspension Email” on page 244.
♦ Workflows or worklets. You can include an Email task anywhere in the workflow or worklet to send email
based on a condition you define. For more information, see “Using Email Tasks in a Workflow or Worklet”
on page 241.



Using Email Tasks in a Workflow or Worklet
Use Email tasks anywhere in a workflow or worklet. For example, you might configure a workflow to send an
email if a certain number of rows fail for a session.
For example, you may have a Session task in the workflow and you want the Integration Service to send an
email if more than 20 rows are dropped. To do this, you create a condition in the link, and create a non-reusable
Email task. The workflow sends an email if more than 20 rows are dropped.

Email Address Tips and Guidelines


Consider the following tips and guidelines when you enter the email address in an Email task:
♦ Enter the email address using 7-bit ASCII characters only.
♦ You can use service, service process, workflow, and worklet variables in the email address.
♦ You can send email to any valid email address. On Windows, the mail recipient does not have to have an
entry in the Global Address book of the Microsoft Outlook profile.
♦ If the Integration Service runs on Windows, you can send email to multiple recipients by creating a
distribution list in the Personal Address book. All recipients must also be in the Global Address book. You
cannot enter multiple addresses separated by commas or semicolons.
♦ If the Integration Service runs on UNIX, you can enter multiple email addresses separated by a comma. Do
not include spaces between email addresses.
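For example, if the Integration Service runs on UNIX, you might enter the following comma-separated list (the
addresses are hypothetical):
admin@example.com,etl_support@example.com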

Steps to Create an Email Task


You can create Email tasks in the Task Developer, Worklet Designer, and Workflow Designer.

To create an Email task in the Task Developer:

1. In the Task Developer, click Tasks > Create.


The Create Task dialog box appears.
2. Select an Email task and enter a name for the task. Click Create.
The Workflow Manager creates an Email task in the workspace.
3. Click Done.
4. Double-click the Email task in the workspace.
The Edit Tasks dialog box appears.
5. Click Rename to enter a name for the task.
6. Enter a description for the task in the Description field.
7. Click the Properties tab.
8. Enter the fully qualified email address of the mail recipient in the Email User Name field.
9. Enter the subject of the email in the Email Subject field. You can use a service, service process, workflow, or
worklet variable in the email subject. Or, you can leave this field blank.
10. Click the Open button in the Email Text field to open the Email Editor.
11. Enter the text of the email message in the Email Editor. You can use service, service process, workflow, and
worklet variables in the email text. Or, you can leave the Email Text field blank.
Note: You can incorporate format tags and email variables in a post-session email. However, you cannot add
them to an Email task outside the context of a session.
12. Click OK twice to save the changes.



RELATED TOPICS:
♦ “Email Address Tips and Guidelines” on page 241
♦ “Email Variables and Format Tags” on page 242

Working with Post-Session Email


You can configure a session to send email when it fails or succeeds. You can create separate email tasks for
success and failure email.
The Integration Service sends post-session email at the end of a session, after executing post-session shell
commands or stored procedures. When the Integration Service encounters an error sending the email, it writes
a message to the Log Service. It does not fail the session.
You can specify a reusable Email task that you create in the Task Developer for either success email or failure
email. Or, you can create a non-reusable Email task for each session property. When you create a non-reusable
Email task for a session, you cannot use the Email task in a workflow or worklet.
You cannot specify a non-reusable Email task you create in the Workflow or Worklet Designer for post-session
email.
You can use parameters and variables in the email user name, subject, and text. Use any parameter or variable
type that you can define in the parameter file. For example, you can use the service variable
$PMSuccessEmailUser or $PMFailureEmailUser for the email recipient. Ensure that you specify the values of
the service variables for the Integration Service that runs the session. You can also enter a parameter or variable
within the email subject or text, and define it in the parameter file.

RELATED TOPICS:
♦ “Using Service Variables to Address Email” on page 245

Email Variables and Format Tags


Use email variables and format tags in an email message for post-session emails. You can use some email
variables in the subject of the email. With email variables, you can include important session information in the
email, such as the number of rows loaded, the session completion time, or read and write statistics. You can also
attach the session log or other relevant files to the email. Use format tags in the body of the message to make the
message easier to read.
Note: The Integration Service does not limit the type or size of attached files. However, since large attachments
can cause problems with the email system, avoid attaching excessively large files, such as session logs generated
using verbose tracing. The Integration Service generates an error message in the email if an error occurs
attaching the file.
The following table describes the email variables that you can use in a post-session email:

Email Variable Description

%a<filename> Attach the named file. The file must be local to the Integration Service. The following file
names are valid: %a<c:\data\sales.txt> or %a</users/john/data/sales.txt>. The email does
not display the full path for the file. Only the attachment file name appears in the email.
Note: The file name cannot include the greater than character (>) or a line break.

%b Session start time.

%c Session completion time.

%d Name of the repository containing the session.

%e Session status.

%g Attach the session log to the message. The Integration Service attaches a session log if
you configure the session to create a log file. If you do not configure the session to create
a log file or if you run a session on a grid, the Integration Service creates a temporary file
in the PowerCenter Services installation directory and attaches the file. If the Integration
Service does not use operating system profiles, verify that the user that starts Informatica
Services has permissions on PowerCenter Services installation directory to create a
temporary log file. If the Integration Service uses operating system profiles, verify that the
operating system user of the operating system profile has permissions on PowerCenter
Services installation directory to create a temporary log file.

%i Session elapsed time (session completion time - session start time).

%l Total rows loaded.

%m Name of the mapping used in the session.

%n Name of the folder containing the session.

%r Total rows rejected.

%s Session name.

%t Source and target table details, including read throughput in bytes per second and write
throughput in rows per second. The Integration Service includes all information displayed
in the session detail dialog box.

%u Repository user name.

%v Integration Service name.

%w Workflow name.

%y Session run mode (normal or recovery).

%z Workflow run instance name.


Note: The Integration Service ignores %a, %g, and %t when you include them in the email subject. Include these variables in
the email message only.

The following table lists the format tags you can use in an Email task:

Formatting Format Tag

tab \t

new line \n
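For example, a post-session email body (a sketch only) might combine email variables with format tags to place
each statistic on its own indented line:
Session %s completed with status: %e\n
\tRows loaded:\t%l\n
\tRows rejected:\t%r\n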

Configuring Post-Session Email


You can configure post-session email to use a reusable or non-reusable Email task.

Using a Reusable Email Task


Use the following steps to configure post-session email to use a reusable Email task.

To configure post-session email to use a reusable Email task:

1. Open the session properties and click the Components tab.


2. Select Reusable in the Type column for the success email or failure email field.
3. Click the Open button in the Value column to select the reusable Email task.
4. Select the Email task in the Object Browser dialog box and click OK.
5. Optionally, edit the Email task for this session property by clicking the Edit button in the Value column.
If you edit the Email task for either success email or failure email, the edits only apply to this session.
6. Click OK to close the session properties.



Using a Non-Reusable Email Task
Use the following steps to configure success email or failure email to use a non-reusable Email task.

To configure success email or failure email to use a non-reusable Email task:

1. Open the session properties and click the Components tab.


2. Select Non-Reusable in the Type column for the success email or failure email field.
3. Open the email editor using the Open button.
4. Edit the Email task and click OK.
5. Click OK to close the session properties.

RELATED TOPICS:
♦ “Working with Email Tasks” on page 240

Sample Email
The following example shows user-entered text from a sample post-session email configuration using
variables:
Session complete.
Session name: %s
Integration Service name: %v
%l
%r
%e
%b
%c
%i
%g

The following is sample output from the configuration above:


Session complete.
Session name: sInstrTest
Integration Service name: Node01IS
Total Rows Loaded = 1
Total Rows Rejected = 0
Completed
Start Time: Tue Nov 22 12:26:31 2005
Completion Time: Tue Nov 22 12:26:41 2005
Elapsed time: 0:00:10 (h:m:s)

Working with Suspension Email


You can configure a workflow to send email when the Integration Service suspends the workflow. For example,
when a task fails, the Integration Service suspends the workflow and sends the suspension email. You can fix the
error and recover the workflow.
If another task fails while the Integration Service is suspending the workflow, you do not get the suspension
email again. However, the Integration Service sends another suspension email if another task fails after you
recover the workflow.
Configure suspension email on the General tab of the workflow properties. You can use service, service process,
workflow, and worklet variables in the email user name, subject, and text. For example, you can use the service
variable $PMSuccessEmailUser or $PMFailureEmailUser for the email recipient. Ensure that you specify the
values of the service variables for the Integration Service that runs the session. You can also enter a parameter or
variable within the email subject or text, and define it in the parameter file.



To configure suspension email:

1. In the Workflow Designer, open the workflow.


2. Click Workflows > Edit to open the workflow properties.
3. On the General tab, select Suspend on Error.
4. Click the Browse Emails button to select a reusable Email task.
Note: The Workflow Manager returns an error if you do not have any reusable Email task in the folder.
Create a reusable Email task in the folder before you configure suspension email.
5. Choose a reusable Email task and click OK.
6. Click OK to close the workflow properties.

RELATED TOPICS:
♦ “Suspending the Workflow” on page 81
♦ “Using Service Variables to Address Email” on page 245

Using Service Variables to Address Email


Use service variables to address email in Email tasks, post-session email, and suspension email. When you
configure the Integration Service, you configure the service variables. You may need to verify these variables
with the domain administrator. You can use the following service variables as the email recipient:
♦ $PMSuccessEmailUser. Defines the email address of the user to receive email when a session completes
successfully. Use this variable with post-session email. You can also use it to address email in standalone
Email tasks or suspension email.
♦ $PMFailureEmailUser. Defines the email address of the user to receive email when a session completes with
failure or when the Integration Service suspends a workflow. Use this variable with post-session or
suspension email. You can also use it to address email in standalone Email tasks.
When you use one of these service variables, the Integration Service sends email to the address configured for
the service variable. $PMSuccessEmailUser and $PMFailureEmailUser are optional process variables. Verify that
you define a variable before using it to address email.
You might use this functionality when you have an administrator who troubleshoots all failed sessions. Instead
of entering the administrator email address for each session, use the email variable $PMFailureEmailUser as the
recipient for post-session email. If the administrator changes, you can correct all sessions by editing the
$PMFailureEmailUser service variable, instead of editing the email address in each session.
You might also use this functionality when you have different administrators for different Integration Services.
If you deploy a folder from one repository to another or otherwise change the Integration Service that runs the
session, the new service sends email to users associated with the new service when you use process variables
instead of hard-coded email addresses.
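For example, a post-session failure email might be configured as follows (the subject text is illustrative only):
Email User Name: $PMFailureEmailUser
Email Subject: Session %s failed on %v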

Tips
The following suggestions can extend the capabilities of Email tasks.

When the Integration Service runs on Windows, configure a Microsoft Outlook profile for each node.
If you run the Integration Service on multiple nodes in a Windows environment, create a Microsoft Outlook
profile for each node. To use the profile on multiple nodes for multiple users, create a generic Microsoft
Outlook profile, such as “PowerCenter,” and use this profile on each node in the domain. Use the same profile
on each node to ensure that the Microsoft Exchange Profile you configured for the Integration Service matches
the profile on each node.

Use service variables to address email.


Use service variables to address email in Email tasks, post-session email, and suspension email. When the service
variables $PMSuccessEmailUser and $PMFailureEmailUser are configured for the Integration Service, use them
to address email. You can change the email recipient for all sessions the service runs by editing the service
variables. It is easier to deploy sessions into production if you define service variables for both development and
production servers.

Generate and send post-session reports.


Use a post-session success command to generate a report file and attach that file to a success email. For example,
you create a batch file called Q3rpt.bat that generates a sales report, and you are running Microsoft Outlook on
Windows.
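For example (the file and directory names are hypothetical), you might enter c:\reports\Q3rpt.bat as the
post-session success command and add the following line to the success email text so that the generated report
is attached:
%a<c:\reports\q3_sales.txt>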

Use other mail programs.


If you do not have Microsoft Outlook, use a post-session success command to invoke a command line email
program, such as Windmill. In this case, you do not have to enter the email user name or subject, since the
recipients, email subject, and body text will be contained in the batch file, sendmail.bat.



CHAPTER 15

Understanding Pipeline Partitioning


This chapter includes the following topics:
♦ Overview, 247
♦ Partitioning Attributes, 248
♦ Dynamic Partitioning, 250
♦ Cache Partitioning, 252
♦ Mapping Variables in Partitioned Pipelines, 253
♦ Partitioning Rules, 254
♦ Configuring Partitioning, 255

Overview
You create a session for each mapping you want the Integration Service to run. Each mapping contains one or
more pipelines. A pipeline consists of a source qualifier and all the transformations and targets that receive data
from that source qualifier. When the Integration Service runs the session, it can achieve higher performance by
partitioning the pipeline and performing the extract, transformation, and load for each partition in parallel.
A partition is a pipeline stage that executes in a single reader, transformation, or writer thread. The number of
partitions in any pipeline stage equals the number of threads in the stage. By default, the Integration Service
creates one partition in every pipeline stage.
If you have the Partitioning option, you can configure multiple partitions for a single pipeline stage. You can
configure partitioning information that controls the number of reader, transformation, and writer threads that
the master thread creates for the pipeline. You can configure how the Integration Service reads data from the
source, distributes rows of data to each transformation, and writes data to the target. You can configure the
number of source and target connections to use.
Complete the following tasks to configure partitions for a session:
♦ Set partition attributes including partition points, the number of partitions, and the partition types. For
more information about partitioning attributes, see “Partitioning Attributes” on page 248.
♦ You can enable the Integration Service to set partitioning at run time. When you enable dynamic
partitioning, the Integration Service scales the number of session partitions based on factors such as the
source database partitions or the number of nodes in a grid. For more information about dynamic
partitioning, see “Dynamic Partitioning” on page 250.
♦ After you configure a session for partitioning, you can configure memory requirements and cache directories
for each transformation. For more information about cache partitioning, see “Cache Partitioning” on
page 252.

♦ The Integration Service evaluates mapping variables for each partition in a target load order group. You can
use variable functions in the mapping to set the variable values. For more information about mapping
variables in partitioned pipelines, see “Mapping Variables in Partitioned Pipelines” on page 253.
♦ When you create multiple partitions in a pipeline, the Workflow Manager verifies that the Integration
Service can maintain data consistency in the session using the partitions. When you edit object properties in
the session, you can impact partitioning and cause a session to fail. For information about how the
Workflow Manager validates partitioning, see “Partitioning Rules” on page 254.
♦ You add or edit partition points in the session properties. When you change partition points you can define
the partition type and add or delete partitions. For more information about configuring partition
information, see “Configuring Partitioning” on page 255.

Partitioning Attributes
You can set the following attributes to partition a pipeline:
♦ Partition points. Partition points mark thread boundaries and divide the pipeline into stages. The
Integration Service redistributes rows of data at partition points.
♦ Number of partitions. A partition is a pipeline stage that executes in a single thread. If you purchase the
Partitioning option, you can set the number of partitions at any partition point. When you add partitions,
you increase the number of processing threads, which can improve session performance.
♦ Partition types. The Integration Service creates a default partition type at each partition point. If you have
the Partitioning option, you can change the partition type. The partition type controls how the Integration
Service distributes data among partitions at partition points.

Partition Points
By default, the Integration Service sets partition points at various transformations in the pipeline. Partition
points mark thread boundaries and divide the pipeline into stages. A stage is a section of a pipeline between any
two partition points. When you set a partition point at a transformation, the new pipeline stage includes that
transformation.
Figure 15-1 shows the default partition points and pipeline stages for a mapping with one pipeline:

Figure 15-1. Default Partition Points and Stages in a Sample Mapping
[Figure: default partition points (*) divide the sample pipeline into four stages, First Stage through Fourth Stage.]

When you add a partition point, you increase the number of pipeline stages by one. Similarly, when you delete
a partition point, you reduce the number of stages by one. Partition points mark the points in the pipeline
where the Integration Service can redistribute data across partitions.
For example, if you place a partition point at a Filter transformation and define multiple partitions, the
Integration Service can redistribute rows of data among the partitions before the Filter transformation processes
the data. The partition type you set at this partition point controls the way in which the Integration Service
passes rows of data to each partition.



RELATED TOPICS:
♦ “Working with Partition Points” on page 259

Number of Partitions
The number of threads that process each pipeline stage depends on the number of partitions. A partition is a
pipeline stage that executes in a single reader, transformation, or writer thread. The number of partitions in any
pipeline stage equals the number of threads in that stage.
You can define up to 64 partitions at any partition point in a pipeline. When you increase or decrease the
number of partitions at any partition point, the Workflow Manager increases or decreases the number of
partitions at all partition points in the pipeline. The number of partitions remains consistent throughout the
pipeline. If you define three partitions at any partition point, the Workflow Manager creates three partitions at
all other partition points in the pipeline. In certain circumstances, the number of partitions in the pipeline
must be set to one.
Increasing the number of partitions or partition points increases the number of threads. Therefore, increasing
the number of partitions or partition points also increases the load on the node. If the node contains enough
CPU bandwidth, processing rows of data in a session concurrently can increase session performance. However,
if you create a large number of partitions or partition points in a session that processes large amounts of data,
you can overload the system.
The number of partitions you create equals the number of connections to the source or target. If the pipeline
contains a relational source or target, the number of partitions at the source qualifier or target instance equals
the number of connections to the database. If the pipeline contains file sources, you can configure the session to
read the source with one thread or with multiple threads.
Figure 15-2 shows the threads in a mapping with three partitions:

Figure 15-2. Thread Creation for a Mapping with Three Partitions
[Figure: with three partitions and the default partition points, the session runs 3 reader threads (First Stage), 6 transformation threads (Second and Third Stages), and 3 writer threads (Fourth Stage).]

When you define three partitions across the mapping, the master thread creates three threads at each pipeline
stage, for a total of 12 threads.
The Integration Service runs the partition threads concurrently. When you run a session with multiple
partitions, the threads run as follows:
1. The reader threads run concurrently to extract data from the source.
2. The transformation threads run concurrently in each transformation stage to process data. The Integration
Service redistributes data among the partitions at each partition point.
3. The writer threads run concurrently to write data to the target.

Partitioning Multiple Input Group Transformations


The master thread creates a reader and transformation thread for each pipeline in the target load order group. A
target load order group has multiple pipelines when it contains a transformation with multiple input groups.



When you connect more than one pipeline to a multiple input group transformation, the Integration Service
maintains the transformation threads or creates a new transformation thread depending on whether or not the
multiple input group transformation is a partition point:
♦ Partition point does not exist at multiple input group transformation. When a partition point does not
exist at a multiple input group transformation, the Integration Service processes one thread at a time for the
multiple input group transformation and all downstream transformations in the stage.
♦ Partition point exists at multiple input group transformation. When a partition point exists at a multiple
input group transformation, the Integration Service creates a new pipeline stage and processes the stage with
one thread for each partition. The Integration Service creates one transformation thread for each partition
regardless of the number of output groups the transformation contains.

Partition Types
When you configure the partitioning information for a pipeline, you must define a partition type at each
partition point in the pipeline. The partition type determines how the Integration Service redistributes data
across partition points.
The Integration Service creates a default partition type at each partition point. If you have the Partitioning
option, you can change the partition type. The partition type controls how the Integration Service distributes
data among partitions at partition points. You can create different partition types at different points in the
pipeline.
You can define the following partition types in the Workflow Manager:
♦ Database partitioning. The Integration Service queries the IBM DB2 or Oracle database system for table
partition information. It reads partitioned data from the corresponding nodes in the database. You can use
database partitioning with Oracle or IBM DB2 source instances on a multi-node tablespace. You can use
database partitioning with DB2 targets.
♦ Hash auto-keys. The Integration Service uses a hash function to group rows of data among partitions. The
Integration Service groups the data based on a partition key. The Integration Service uses all grouped or
sorted ports as a compound partition key. You may need to use hash auto-keys partitioning at Rank, Sorter,
and unsorted Aggregator transformations.
♦ Hash user keys. The Integration Service uses a hash function to group rows of data among partitions. You
define the number of ports to generate the partition key.
♦ Key range. With key range partitioning, the Integration Service distributes rows of data based on a port or
set of ports that you define as the partition key. For each port, you define a range of values. The Integration
Service uses the key and ranges to send rows to the appropriate partition. Use key range partitioning when
the sources or targets in the pipeline are partitioned by key range.
♦ Pass-through. In pass-through partitioning, the Integration Service processes data without redistributing
rows among partitions. All rows in a single partition stay in the partition after crossing a pass-through
partition point. Choose pass-through partitioning when you want to create an additional pipeline stage to
improve performance, but do not want to change the distribution of data across partitions.
♦ Round-robin. The Integration Service distributes data evenly among all partitions. Use round-robin
partitioning where you want each partition to process approximately the same number of rows.

Dynamic Partitioning
If the volume of data grows or you add more CPUs, you might need to adjust partitioning so the session run
time does not increase. When you use dynamic partitioning, you can configure the partition information so the
Integration Service determines the number of partitions to create at run time.
The Integration Service scales the number of session partitions at run time based on factors such as source
database partitions or the number of nodes in a grid.



If any transformation in a stage does not support partitioning, or if the partition configuration does not support
dynamic partitioning, the Integration Service does not scale partitions in the pipeline. The data passes through
one partition.
Complete the following tasks to scale session partitions with dynamic partitioning:
♦ Set the partitioning. The Integration Service increases the number of partitions based on the partitioning
method you choose. For more information about dynamic partitioning methods, see “Configuring Dynamic
Partitioning” on page 251.
♦ Set session attributes for dynamic partitions. You can set session attributes that identify source and target
file names and directories. The session uses the session attributes to create the partition-level attributes for
each partition it creates at run time. For more information about setting session attributes for dynamic
partitions, see “Configuring Partition-Level Attributes” on page 252.
♦ Configure partition types. You can edit partition points and partition types using the Partitions view on the
Mapping tab of session properties. For information about using dynamic partitioning with different
partition types, see “Using Dynamic Partitioning with Partition Types” on page 252. For information about
configuring partition types, see “Configuring Partitioning” on page 255.
Note: Do not configure dynamic partitioning for a session that contains manual partitions. If you set dynamic
partitioning to a value other than disabled and you manually partition the session, the session is invalid.

Configuring Dynamic Partitioning


Configure dynamic partitioning on the Config Object tab of session properties. Configure dynamic
partitioning using one of the following methods:
♦ Disabled. Do not use dynamic partitioning. Defines the number of partitions on the Mapping tab.
♦ Based on number of partitions. Sets the partitions to a number that you define in the Number of Partitions
attribute. Use the $DynamicPartitionCount session parameter, or enter a number greater than 1.
♦ Based on number of nodes in grid. Sets the partitions to the number of nodes in the grid running the
session. If you configure this option for sessions that do not run on a grid, the session runs in one partition
and logs a message in the session log.
♦ Based on source partitioning. Determines the number of partitions using database partition information.
The number of partitions is the maximum of the number of partitions at the source.
♦ Based on number of CPUs. Sets the number of partitions equal to the number of CPUs on the node that
prepares the session. If the session is configured to run on a grid, dynamic partitioning sets the number of
partitions equal to the number of CPUs on the node that prepares the session multiplied by the number of
nodes in the grid.
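For example (the numbers are illustrative), if the node that prepares the session has 4 CPUs and the session runs
on a grid of 3 nodes, partitioning based on the number of CPUs creates 4 x 3 = 12 partitions. If the same session
does not run on a grid, it creates 4 partitions.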

RELATED TOPICS:
♦ “Database Partitioning Partition Type” on page 284

Rules and Guidelines for Dynamic Partitioning


Use the following rules and guidelines with dynamic partitioning:
♦ Dynamic partitioning uses the same connection for each partition.
♦ You cannot use dynamic partitioning with XML sources and targets.
♦ You cannot use dynamic partitioning with the Debugger.
♦ Sessions that use SFTP fail if you enable dynamic partitioning.
♦ When you set dynamic partitioning to a value other than disabled, and you manually partition the session
on the Mapping tab, you invalidate the session.
♦ The session fails if you use a parameter other than $DynamicPartitionCount to set the number of partitions.
♦ The following dynamic partitioning configurations cause a session to run with one partition:

− You override the default cache directory for an Aggregator, Joiner, Lookup, or Rank transformation. The
Integration Service partitions a transformation cache directory when the default is $PMCacheDir.
− You override the Sorter transformation default work directory. The Integration Service partitions the
Sorter transformation work directory when the default is $PMTempDir.
− You use an open-ended range of numbers or date keys with a key range partition type.
− You use datatypes other than numbers or dates as keys in key range partitioning.
− You use key range relational target partitioning.
− You create a user-defined SQL statement or a user-defined source filter.
− You set dynamic partitioning to the number of nodes in the grid, and the session does not run on a grid.
− You use pass-through relational source partitioning.
− You use dynamic partitioning with an Application Source Qualifier.
− You use SDK or PowerConnect sources and targets with dynamic partitioning.

Using Dynamic Partitioning with Partition Types


The following rules apply to using dynamic partitioning with different partition types:
♦ Pass-through partitioning. If you change the number of partitions at a partition point, the number of
partitions in each pipeline stage changes. If you use pass-through partitioning with a relational source, the
session runs in one partition in the stage.
♦ Key range partitioning. You must define a closed range of numbers or date keys to use dynamic
partitioning. The keys must be numeric or date datatypes. Dynamic partitioning does not scale partitions
with key range partitioning on relational targets.
♦ Database partitioning. When you use database partitioning, the Integration Service creates session
partitions based on the source database partitions. Use database partitioning with Oracle and IBM DB2
sources.
♦ Hash auto-keys, hash user keys, or round-robin. Use hash user keys, hash auto-keys, and round-robin
partition types to distribute rows with dynamic partitioning. Use hash user keys and hash auto-keys
partitioning when you want the Integration Service to distribute rows to the partitions by group. Use round-
robin partitioning when you want the Integration Service to distribute rows evenly to partitions.

Configuring Partition-Level Attributes


When you use dynamic partitioning, the Integration Service defines the partition-level attributes for each
partition it creates at run time. It names the file and directory attributes based on session-level attribute names
that you define in the session properties.
For example, you define the session reject file name as accting_detail.bad. When the Integration Service creates
partitions at run time, it creates a reject file for each partition, such as accting_detail1.bad, accting_detail2.bad,
accting_detail3.bad.

Cache Partitioning
When you create a session with multiple partitions, the Integration Service may use cache partitioning for the
Aggregator, Joiner, Lookup, Rank, and Sorter transformations. When the Integration Service partitions a cache,
it creates a separate cache for each partition and allocates the configured cache size to each partition. The
Integration Service stores different data in each cache, where each cache contains only the rows needed by that
partition. As a result, the Integration Service requires a portion of total cache memory for each partition.
After you configure the session for partitioning, you can configure memory requirements and cache directories
for each transformation in the Transformations view on the Mapping tab of the session properties. To
configure the memory requirements, calculate the total requirements for a transformation, and divide by the
number of partitions. To improve performance, you can configure separate directories for each partition.
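For example (the figures are illustrative), if an Aggregator transformation requires a total of 24 MB of cache and
the session runs with three partitions, configure approximately 8 MB of cache for the transformation so that each
of the three partition caches receives the memory it needs.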
The following table describes the situations when the Integration Service uses cache partitioning for each
applicable transformation:

Transformation Description

Aggregator Transformation You create multiple partitions in a session with an Aggregator transformation.
You do not have to set a partition point at the Aggregator transformation.

Joiner Transformation You create a partition point at the Joiner transformation.

Lookup Transformation You create a hash auto-keys partition point at the Lookup transformation.

Rank Transformation You create multiple partitions in a session with a Rank transformation. You do
not have to set a partition point at the Rank transformation.

Sorter Transformation You create multiple partitions in a session with a Sorter transformation. You do
not have to set a partition point at the Sorter transformation.

RELATED TOPICS:
♦ “Partitioning Joiner Transformations” on page 272
♦ “Partitioning Lookup Transformations” on page 277
♦ “Session Caches” on page 455

Mapping Variables in Partitioned Pipelines


When you specify multiple partitions in a target load order group that uses mapping variables, the Integration
Service evaluates the value of a mapping variable in each partition separately. The Integration Service uses the
following process to evaluate variable values:
1. It updates the current value of the variable separately in each partition according to the variable function
used in the mapping.
2. After loading all the targets in a target load order group, the Integration Service combines the current
values from each partition into a single final value based on the aggregation type of the variable.
3. If there is more than one target load order group in the session, the final current value of a mapping
variable in a target load order group becomes the current value in the next target load order group.
4. When the Integration Service finishes loading the last target load order group, the final current value of the
variable is saved into the repository.
Use one of the following variable functions in the mapping to set the variable value:
♦ SetCountVariable
♦ SetMaxVariable
♦ SetMinVariable
Table 15-1 describes how the Integration Service calculates variable values across partitions:

Table 15-1. Variable Value Calculations with Partitioned Sessions

Variable Function Variable Value Calculation Across Partitions

SetCountVariable Integration Service calculates the final count values from all partitions.

SetMaxVariable Integration Service compares the final variable value for each partition and saves the
highest value.

SetMinVariable Integration Service compares the final variable value for each partition and saves the
lowest value.

Note: Use variable functions only once for each mapping variable in a pipeline. The Integration Service
processes variable functions as it encounters them in the mapping. The order in which the Integration Service
encounters variable functions in the mapping may not be the same for every session run. This may cause
inconsistent results when you use the same variable function multiple times in a mapping.
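For example (the values are illustrative), suppose a mapping variable with a Max aggregation type is set with
SetMaxVariable and the session runs with three partitions whose final values are 10, 25, and 17. The Integration
Service saves 25 as the current value for the next target load order group, or to the repository after the last
target load order group.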

Partitioning Rules
You can create multiple partitions in a pipeline if the Integration Service can maintain data consistency when it
processes the partitioned data. When you create a session, the Workflow Manager validates each pipeline for
partitioning.

Partition Restrictions for Editing Objects


When you edit object properties, you can impact your ability to create multiple partitions in a session or to run
an existing session with multiple partitions.

Before You Create a Session


When you create a session, the Workflow Manager checks the mapping properties. Mappings dynamically pick
up changes to shortcuts, but not to reusable objects, such as reusable transformations and mapplets. Therefore,
if you edit a reusable object in the Designer after you save a mapping and before you create a session, you must
open and resave the mapping for the Workflow Manager to recognize the changes to the object.

After You Create a Session with Multiple Partitions


When you edit a mapping after you create a session with multiple partitions, the Workflow Manager does not
invalidate the session even if the changes violate partitioning rules. The Integration Service fails the session the
next time it runs unless you edit the session so that it no longer violates partitioning rules.
The following changes to mappings can cause session failure:
♦ You delete a transformation that was a partition point.
♦ You add a transformation that is a default partition point.
♦ You move a transformation that is a partition point to a different pipeline.
♦ You change a transformation that is a partition point in any of the following ways:
− The existing partition type is invalid.
− The transformation can no longer support multiple partitions.
− The transformation is no longer a valid partition point.
♦ You disable partitioning or you change the partitioning between a single node and a grid in a transformation
after you create a pipeline with multiple partitions.
♦ You switch the master and detail source for the Joiner transformation after you create a pipeline with
multiple partitions.



Partition Restrictions for PowerExchange
You can specify multiple partitions for PowerExchange and PowerExchange Client for PowerCenter. However,
there are additional restrictions. For more information about these products, see the product documentation.

Configuring Partitioning
When you create or edit a session, you can change the partitioning for each pipeline in a mapping. If the
mapping contains multiple pipelines, you can specify multiple partitions in some pipelines and single partitions
in others. You update partitioning information using the Partitions view on the Mapping tab of session
properties.
Add, delete, or edit partition points on the Partitions view of session properties. If you add a key range partition
point, you can define the keys in each range.
Table 15-2 lists the configuration options for the Partitions view on the Mapping tab:

Table 15-2. Options on Session Properties Partitions View on the Mapping Tab

Partitions View Option Description

Add Partition Point Click to add a new partition point. When you add a partition point, the
transformation name appears under the Partition Points node.

Delete Partition Point Click to delete the selected partition point. You cannot delete certain partition points.

Edit Partition Point Click to edit the selected partition point. This opens the Edit Partition Point dialog
box. For information about the options in this dialog box, see Table 15-3 on
page 256.

Key Range Displays the key and key ranges for the partition point, depending on the partition
type.
For key range partitioning, specify the key ranges.
For hash user keys partitioning, this field displays the partition key.
The Workflow Manager does not display this area for other partition types.

Edit Keys Click to add or remove the partition key for key range or hash user keys
partitioning. You cannot create a partition key for hash auto-keys, round-robin, or
pass-through partitioning.

Configuring a Partition Point


You can perform the following tasks when you edit or add a partition point:
♦ Specify the partition type at the partition point.
♦ Add and delete partitions.
♦ Enter a description for each partition.



Figure 15-3 shows the configuration options where you edit partition points:

Figure 15-3. Edit Partition Point Dialog Box


Selected Partition Point
Add a partition.

Delete a partition.

Select a partition.
Enter the partition description.

Specify the partition type.

Table 15-3 describes the configuration options for partition points:

Table 15-3. Edit Partition Point Dialog Box Options

Partition Options Description

Select Partition Type Changes the partition type.

Partition Names Selects individual partitions from this dialog box to configure.

Add a Partition Adds a partition. You can add up to 64 partitions at any partition point. The
number of partitions must be consistent across the pipeline. Therefore, if you
define three partitions at one partition point, the Workflow Manager defines three
partitions at all partition points in the pipeline.

Delete a Partition Deletes the selected partition. Each partition point must contain at least one
partition.

Description Enter an optional description for the current partition.

You can enter a description for each partition you create. To enter a description, select the partition in the Edit
Partition Point dialog box, and then enter the description in the Description field.



Steps for Adding Partition Points to a Pipeline
You add partition points from the Mapping tab of the session properties.

To add a partition point:

1. On the Partitions view of the Mapping tab, select a transformation that is not already a partition point, and
click the Add a Partition Point button.
Tip: You can select a transformation from the Non-Partition Points node.

2. Select the partition type for the partition point or accept the default value.
3. Click OK.
The transformation appears in the Partition Points node in the Partitions view on the Mapping tab of the
session properties.

CHAPTER 16

Working with Partition Points


This chapter includes the following topics:
♦ Overview, 259
♦ Adding and Deleting Partition Points, 260
♦ Partitioning Relational Sources, 262
♦ Partitioning File Sources, 263
♦ Partitioning Relational Targets, 267
♦ Partitioning File Targets, 268
♦ Partitioning Custom Transformations, 270
♦ Partitioning Joiner Transformations, 272
♦ Partitioning Lookup Transformations, 277
♦ Partitioning Sorter Transformations, 279
♦ Restrictions for Transformations, 280

Overview
Partition points mark the boundaries between threads in a pipeline. The Integration Service redistributes rows
of data at partition points. You can add partition points to increase the number of transformation threads and
increase session performance. For information about adding and deleting partition points, see “Adding and
Deleting Partition Points” on page 260.
When you configure a session to read a source database, the Integration Service creates a separate connection
and SQL query to the source database for each partition. You can customize or override the SQL query. For
more information about partitioning relational sources, see “Partitioning Relational Sources” on page 262.
When you configure a session to load data to a relational target, the Integration Service creates a separate
connection to the target database for each partition at the target instance. You configure the reject file names
and directories for the target. The Integration Service creates one reject file for each target partition. For more
information about partitioning relational targets, see “Partitioning Relational Targets” on page 267.
You can configure a session to read a source file with one thread or with multiple threads. You must choose the
same connection type for all partitions that read the file. For more information about partitioning source files,
see “Partitioning File Sources” on page 263.
When you configure a session to write to a file target, you can write the target output to a separate file for each
partition or to a merge file that contains the target output for all partitions. You can configure connection
settings and file properties for each target partition. For more information about configuring target files, see
“Partitioning File Targets” on page 268.

When you create a partition point at transformations, the Workflow Manager sets the default partition type.
You can change the partition type depending on the transformation type.

Adding and Deleting Partition Points


Partition points mark the thread boundaries in a pipeline and divide the pipeline into stages. When you add
partition points, you increase the number of transformation threads, which can improve session performance.
The Integration Service can redistribute rows of data at partition points, which can also improve session
performance.
When you create a session, the Workflow Manager creates one partition point at each transformation in the
pipeline.
Table 16-1 lists the transformations with partition points:

Table 16-1. Transformation Partition Points

Source Qualifier, Normalizer
Description: Controls how the Integration Service extracts data from the source and passes it to the source qualifier.
Restrictions: You cannot delete this partition point.

Rank, Unsorted Aggregator
Description: Ensures that the Integration Service groups rows properly before it sends them to the transformation.
Restrictions: You can delete these partition points if the pipeline contains only one partition or if the Integration Service passes all rows in a group to a single partition before they enter the transformation.

Target Instances
Description: Controls how the writer passes data to the targets.
Restrictions: You cannot delete this partition point.

Multiple Input Group
Description: The Workflow Manager creates a partition point at a multiple input group transformation when it is configured to process each partition with one thread, or when a downstream multiple input group Custom transformation is configured to process each partition with one thread. For example, the Workflow Manager creates a partition point at a Joiner transformation that is connected to a downstream Custom transformation configured to use one thread per partition. This ensures that the Integration Service uses one thread to process each partition at a Custom transformation that requires one thread per partition.
Restrictions: You cannot delete this partition point.

Rules and Guidelines


The following rules and guidelines apply when you add and delete partition points:
♦ You cannot create a partition point at a source instance.
♦ You cannot create a partition point at a Sequence Generator transformation or an unconnected
transformation.
♦ You can add a partition point at any other transformation provided that no partition point receives input
from more than one pipeline stage.
♦ You cannot delete a partition point at a Source Qualifier transformation, a Normalizer transformation for
COBOL sources, or a target instance.

♦ You cannot delete a partition point at a multiple input group Custom transformation that is configured to
use one thread per partition.
♦ You cannot delete a partition point at a multiple input group transformation that is upstream from a
multiple input group Custom transformation that is configured to use one thread per partition.
♦ The following partition types have restrictions with dynamic partitioning:
− Pass-through. When you use dynamic partitioning, if you change the number of partitions at a partition
point, the number of partitions in each pipeline stage changes.
− Key Range. To use key range with dynamic partitioning you must define a closed range of numbers or
date keys. If you use an open-ended range, the session runs with one partition.
You can add and delete partition points at other transformations in the pipeline according to these rules.
Figure 16-1 shows the valid partition points in a mapping:

Figure 16-1. Sample Mapping Showing Valid Partition Points
[Figure: a sample mapping in which the valid partition points (*) are the source qualifier, Expression transformation EXP_3, and the target instance.]

In this mapping, the Workflow Manager creates partition points at the source qualifier and target instance by
default. You can place an additional partition point at Expression transformation EXP_3.
If you place a partition point at EXP_3 and define one partition, the master thread creates the following
threads:

[Figure: a reader thread (First Stage), transformation threads (Second and Third Stages), and a writer thread (Fourth Stage), with partition points (*) at the source qualifier, EXP_3, and the target instance.]

In this case, each partition point receives data from only one pipeline stage, so EXP_3 is a valid partition point.



The following transformations are not valid partition points:

Transformation Reason

Source Source instance.

SG_1 Sequence Generator transformation.

EXP_1 and EXP_2 If you could place a partition point at EXP_1 or EXP_2, you would create an additional
pipeline stage that processes data from the source qualifier to EXP_1 or EXP_2. In this
case, EXP_3 would receive data from two pipeline stages, which is not allowed.

Partitioning Relational Sources


When you run a session that partitions relational or Application sources, the Integration Service creates a
separate connection to the source database for each partition. It then creates an SQL query for each partition.
You can customize the query for each source partition by entering filter conditions in the Transformation view
on the Mapping tab. You can also override the SQL query for each source partition using the Transformations
view on the Mapping tab.
Note: When you create a custom SQL query to read database tables and you set database partitioning, the
Integration Service reverts to pass-through partitioning and prints a message in the session log.

Entering an SQL Query


You can enter an SQL override if you want to customize the SELECT statement in the SQL query. The SQL
statement you enter on the Transformations view of the Mapping tab overrides any customized SQL query that
you set in the Designer when you configure the Source Qualifier transformation.
The SQL query also overrides any key range and filter condition that you enter for a source partition. So, if you
also enter a key range and source filter, the Integration Service uses the SQL query override to extract source
data.
If you create a key that contains null values, you can extract the nulls by creating another partition and entering
an SQL query or filter to extract null values.
To enter an SQL query for each partition, click the Browse button in the SQL Query field. Enter the query in
the SQL Editor dialog box, and then click OK.
If you entered an SQL query in the Designer when you configured the Source Qualifier transformation, that
query appears in the SQL Query field for each partition. To override this query, click the Browse button in the
SQL Query field, revise the query in the SQL Editor dialog box, and then click OK.
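For example, to extract rows whose key contains null values into a dedicated partition, you might enter an SQL
override such as the following for that partition (the table and column names follow the CUSTOMERS example used
later in this chapter and are illustrative):
SELECT CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, CUSTOMERS.LAST_NAME FROM CUSTOMERS WHERE CUSTOMERS.CUSTOMER_ID IS NULL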

Entering a Filter Condition


If you specify key range partitioning at a relational source qualifier, you can enter an additional filter condition.
When you do this, the Integration Service generates a WHERE clause that includes the filter condition you
enter in the session properties.
The filter condition you enter on the Transformations view of the Mapping tab overrides any filter condition
that you set in the Designer when you configure the Source Qualifier transformation.



If you use key range partitioning, the filter condition works in conjunction with the key ranges. For example,
you want to select data based on customer ID, but you do not want to extract information for customers
outside the USA. Define the following key ranges:
CUSTOMER_ID      Start Range      End Range
Partition #1                      135000
Partition #2     135000

If you know that the IDs for customers outside the USA fall within the range for a particular partition, you can
enter a filter in that partition to exclude them. Therefore, you enter the following filter condition for the second
partition:
CUSTOMERS.COUNTRY = ‘USA’

When the session runs, the following queries for the two partitions appear in the session log:
READER_1_1_1> RR_4010 SQ instance [SQ_CUSTOMERS] SQL Query [SELECT
CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, CUSTOMERS.LAST_NAME FROM CUSTOMERS WHERE
CUSTOMERS.CUSTOMER_ID < 135000]
[...]
READER_1_1_2> RR_4010 SQ instance [SQ_CUSTOMERS] SQL Query [SELECT
CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, CUSTOMERS.LAST_NAME FROM CUSTOMERS WHERE
CUSTOMERS.COUNTRY = ‘USA’ AND 135000 <= CUSTOMERS.CUSTOMER_ID]

To enter a filter condition, click the Browse button in the Source Filter field. Enter the filter condition in the
SQL Editor dialog box, and then click OK.
If you entered a filter condition in the Designer when you configured the Source Qualifier transformation, that
query appears in the Source Filter field for each partition. To override this filter, click the Browse button in the
Source Filter field, change the filter condition in the SQL Editor dialog box, and then click OK.

Partitioning File Sources


When a session uses a file source, you can configure it to read the source with one thread or with multiple
threads. The Integration Service creates one connection to the file source when you configure the session to read
with one thread, and it creates multiple concurrent connections to the file source when you configure the
session to read with multiple threads.
Use the following types of partitioned file sources:
♦ Flat file. You can configure a session to read flat file, XML, or COBOL source files.
♦ Command. You can configure a session to use an operating system command to generate source data rows or
generate a file list. For more information about using a command to generate source data, see “Working
with File Sources” on page 151.
When connecting to file sources, you must choose the same connection type for all partitions. You may choose
different connection objects as long as each object is of the same type.
To specify single- or multi-threaded reading for flat file sources, configure the source file name property for
partitions 2-n. To configure for single-threaded reading, pass empty data through partitions 2-n. To configure
for multi-threaded reading, leave the source file name blank for partitions 2-n.
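For example (the file name is hypothetical), to read a single flat file with three threads, you might leave the
source file name blank for partitions 2 and 3:
Partition #1 source file name: orders.dat
Partition #2 source file name: (blank)
Partition #3 source file name: (blank)
To read the same file with a single thread instead, pass empty data through partitions 2 and 3.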

Rules and Guidelines for Partitioning File Sources


Use the following rules and guidelines when you configure a file source session with multiple partitions:
♦ Use pass-through partitioning at the source qualifier.
♦ Use single- or multi-threaded reading with flat file or COBOL sources.
♦ Use single-threaded reading with XML sources.

♦ You cannot use multi-threaded reading if the source files are non-disk files, such as FTP files or WebSphere
MQ sources.
♦ If you use a shift-sensitive code page, use multi-threaded reading if the following conditions are true:
− The file is fixed-width.
− The file is not line sequential.
− You did not enable user-defined shift state in the source definition.
♦ To read data from three flat files concurrently, you must specify three partitions at the source qualifier.
Accept the default partition type, pass-through.
♦ If you configure a session for multi-threaded reading, and the Integration Service cannot create multiple
threads to a file source, it writes a message to the session log and reads the source with one thread.
♦ When the Integration Service uses multiple threads to read a source file, it may not read the rows in the file
sequentially. If sort order is important, configure the session to read the file with a single thread. For
example, sort order may be important if the mapping contains a sorted Joiner transformation and the file
source is the sort origin.
♦ You can also use a combination of direct and indirect files to balance the load.
♦ Session performance for multi-threaded reading is optimal with large source files. The load may be
unbalanced if the amount of input data is small.
♦ You cannot use a command for a file source if the command generates source data and the session is
configured to run on a grid or is configured with the resume from the last checkpoint recovery strategy.

Using One Thread to Read a File Source


When the Integration Service uses one thread to read a file source, it creates one connection to the source. The
Integration Service reads the rows in the file or file list sequentially. You can configure single-threaded reading
for direct or indirect file sources in a session:
♦ Reading direct files. You can configure the Integration Service to read from one or more direct files. If you
configure the session with more than one direct file, the Integration Service creates a concurrent connection
to each file. It does not create multiple connections to a file.
♦ Reading indirect files. When the Integration Service reads an indirect file, it reads the file list and then reads
the files in the list sequentially. If the session has more than one file list, the Integration Service reads the file
lists concurrently, and it reads the files in the list sequentially.

Using Multiple Threads to Read a File Source


When the Integration Service uses multiple threads to read a source file, it creates multiple concurrent
connections to the source. The Integration Service may or may not read the rows in a file sequentially.
You can configure multi-threaded reading for direct or indirect file sources in a session:
♦ Reading direct files. When the Integration Service reads a direct file, it creates multiple reader threads to
read the file concurrently. You can configure the Integration Service to read from one or more direct files.
For example, if a session reads from two files and you create five partitions, the Integration Service may
distribute one file between two partitions and one file between three partitions.
♦ Reading indirect files. When the Integration Service reads an indirect file, it creates multiple threads to read
the file list concurrently. It also creates multiple threads to read the files in the list concurrently. The
Integration Service may use more than one thread to read a single file.

Configuring for File Partitioning


After you create partition points and configure partitioning information, you can configure source connection
settings and file properties on the Transformations view of the Mapping tab. Click the source instance name
you want to configure under the Sources node. When you click the source instance name for a file source, the
Workflow Manager displays connection and file properties in the session properties.

You can configure the source file names and directories for each source partition. The Workflow Manager
generates a file name and location for each partition.
Table 16-2 describes the file properties settings for file sources in a mapping:

Table 16-2. File Properties Settings for File Sources

Attribute Description

Input Type Type of source input. You can choose the following types of source input:
- File. For flat file, COBOL, or XML sources.
- Command. For source data or a file list generated by a command.
You cannot use a command to generate XML source data.

Concurrent read     Order in which multiple partitions read input rows from a source file. You can
partitioning        choose the following options:
                    - Optimize throughput. The Integration Service does not preserve input row order.
                    - Keep relative input row order. The Integration Service preserves the input row
                      order for the rows read by each partition.
                    - Keep absolute input row order. The Integration Service preserves the input row
                      order for all rows read by all partitions.

Source File         Directory name of flat file source. By default, the Integration Service looks in
Directory           the service process variable directory, $PMSourceFileDir, for file sources.
                    If you specify both the directory and file name in the Source Filename field,
                    clear this field. The Integration Service concatenates this field with the Source
                    Filename field when it runs the session.
                    You can also use the $InputFileName session parameter to specify the file location.

Source File Name File name, or file name and path of flat file source. Optionally, use the $InputFileName
session parameter for the file name.
The Integration Service concatenates this field with the Source File Directory field when
it runs the session. For example, if you have “C:\data\” in the Source File Directory field,
then enter “filename.dat” in the Source Filename field. When the Integration Service
begins the session, it looks for “C:\data\filename.dat”.
By default, the Workflow Manager enters the file name configured in the source
definition.

Source File Type You can choose the following source file types:
- Direct. For source files that contain the source data.
- Indirect. For source files that contain a list of files. When you select Indirect, the
Integration Service finds the file list and reads each listed file when it runs the session.

Command Type Type of source data the command generates. You can choose the following command
types:
- Command generating data for commands that generate source data input rows.
- Command generating file list for commands that generate a file list.

Command Command used to generate the source file data.

RELATED TOPICS:
♦ “Configuring Commands for File Sources” on page 152

Configuring Sessions to Use a Single Thread


To configure a session to read a file with a single thread, pass empty data through partitions 2-n. To pass empty
data, create a file with no data, such as “empty.txt,” and put it in the source file directory. Then, use “empty.txt”
as the source file name.
Note: You cannot configure single-threaded reading for partitioned sources that use a command to generate
source data.

Table 16-3 describes the session configuration and the Integration Service behavior when it uses a single thread
to read source files:

Table 16-3. Configuring Source File Name for Single-Threaded Reading

Source File Name    Value            Integration Service Behavior

Partition #1        ProductsA.txt    Integration Service creates one thread to read ProductsA.txt.
Partition #2        empty.txt        It reads rows in the file sequentially. After it reads the
Partition #3        empty.txt        file, it passes the data to three partitions in the
                                     transformation pipeline.

Partition #1        ProductsA.txt    Integration Service creates two threads. It creates one thread
Partition #2        empty.txt        to read ProductsA.txt, and it creates one thread to read
Partition #3        ProductsB.txt    ProductsB.txt. It reads the files concurrently, and it reads
                                     rows in the files sequentially.

If you use FTP to access source files, you can choose a different connection for each direct file.

RELATED TOPICS:
♦ “Using FTP” on page 443

Configuring Sessions to Use Multiple Threads


To configure a session to read a file with multiple threads, leave the source file name blank for partitions 2-n.
The Integration Service uses partitions 2-n to read a portion of the previous partition file or file list. The
Integration Service ignores the directory field of that partition.
To configure a session to read from a command with multiple threads, enter a command for each partition or
leave the command property blank for partitions 2-n. If you enter a command for each partition, the
Integration Service creates a thread to read the data generated by each command. Otherwise, the Integration
Service uses partitions 2-n to read a portion of the data generated by the command for the first partition.
Table 16-4 describes the session configuration and the Integration Service behavior when it uses multiple
threads to read source files:

Table 16-4. Configuring Source File Name for Multi-Threaded Reading

Attribute       Value            Integration Service Behavior

Partition #1    ProductsA.txt    Integration Service creates three threads to concurrently read
Partition #2    <blank>          ProductsA.txt.
Partition #3    <blank>

Partition #1    ProductsA.txt    Integration Service creates three threads to read ProductsA.txt
Partition #2    <blank>          and ProductsB.txt concurrently. Two threads read ProductsA.txt
Partition #3    ProductsB.txt    and one thread reads ProductsB.txt.

Table 16-5 describes the session configuration and the Integration Service behavior when it uses multiple
threads to read source data piped from a command:

Table 16-5. Configuring Commands for Multi-Threaded Reading

Attribute       Value        Integration Service Behavior

Partition #1    CommandA     Integration Service creates three threads to concurrently read data
Partition #2    <blank>      piped from the command.
Partition #3    <blank>

Partition #1    CommandA     Integration Service creates three threads to read data piped from
Partition #2    <blank>      CommandA and CommandB. Two threads read the data piped from
Partition #3    CommandB     CommandA and one thread reads the data piped from CommandB.

Configuring Concurrent Read Partitioning
By default, the Integration Service does not preserve row order when multiple partitions read from a single file
source. To preserve row order when multiple partitions read from a single file source, configure concurrent read
partitioning. You can configure the following options:
♦ Optimize throughput. The Integration Service does not preserve row order when multiple partitions read
from a single file source. Use this option if the order in which multiple partitions read from a file source is
not important.
♦ Keep relative input row order. Preserves the sort order of the input rows read by each partition. Use this
option if you want to preserve the sort order of the input rows read by each partition.
Table 16-6 shows an example sort order of a file source with 10 rows by two partitions:

Table 16-6. Keep Relative Input Row Order

Partition Rows Read

Partition #1 1,3,5,8,9

Partition #2 2,4,6,7,10

♦ Keep absolute input row order. Preserves the sort order of all input rows read by all partitions. Use this
option if you want to preserve the sort order of the input rows each time the session runs. In a pass-through
mapping with passive transformations, the order of the rows written to the target will be in the same order as
the input rows.
Table 16-7 shows an example sort order of a file source with 10 rows by two partitions:

Table 16-7. Keep Absolute Input Row Order

Partition Rows Read

Partition #1 1,2,3,4,5

Partition #2 6,7,8,9,10

Note: By default, the Integration Service uses the Keep absolute input row order option in sessions configured
with the resume from the last checkpoint recovery strategy.

Partitioning Relational Targets


When you configure a pipeline to load data to a relational target, the Integration Service creates a separate
connection to the target database for each partition at the target instance. It concurrently loads data for each
partition into the target database.
Configure partition attributes for targets in the pipeline on the Mapping tab of session properties. For relational
targets, you configure the reject file names and directories. The Integration Service creates one reject file for
each target partition.
Table 16-8 describes the partitioning attributes for relational targets in a pipeline:

Table 16-8. Partitioning Relational Target Attributes

Attribute Description

Reject File Directory Location for the target reject files. Default is $PMBadFileDir.

Reject File Name Name of reject file. Default is <target name><partition number>.bad. You can also
use the session parameter, $BadFileName, as defined in the parameter file.

Database Compatibility
When you configure a session with multiple partitions at the target instance, the Integration Service creates one
connection to the target for each partition. If you configure multiple target partitions in a session that loads to
a database or ODBC target that does not support multiple concurrent connections to tables, the session fails.
When you create multiple target partitions in a session that loads data to an Informix database, you must create
the target table with row-level locking. If you insert data from a session with multiple partitions into an
Informix target configured for page-level locking, the session fails and returns the following message:
WRT_8206 Error: The target table has been created with page level locking. The session
can only run with multi partitions when the target table is created with row level
locking.

Sybase IQ does not allow multiple concurrent connections to tables. If you create multiple target partitions in a
session that loads to Sybase IQ, the Integration Service loads all of the data in one partition.

Partitioning File Targets


When you configure a session to write to a file target, you can write the target output to a separate file for each
partition or to a merge file that contains the target output for all partitions. When you run the session, the
Integration Service writes to the individual output files or to the merge file concurrently. You can also send the
data for a single partition or for all target partitions to an operating system command.
You can configure connection settings and file properties for each target partition. You configure these settings
in the Transformations view on the Mapping tab. You can also configure the session to use partitioned FTP file
targets.

Configuring Connection Settings


Use the Connections settings in the Transformations view on the Mapping tab to configure the connection type
for all target partitions. You can choose different connection objects for each partition, but they must all be of
the same type.
Use one of the following connection types with target files:
♦ None. Write the partitioned target files to the local machine.
♦ FTP. Transfer the partitioned target files to another machine. You can transfer the files to any machine to
which the Integration Service can connect.
♦ Loader. Use an external loader that can load from multiple output files. This option appears if the pipeline
loads data to a relational target and you choose a file writer in the Writers settings on the Mapping tab. If
you choose a loader that cannot load from multiple output files, the Integration Service fails the session.
♦ Message Queue. Transfer the partitioned target files to a WebSphere MQ message queue.
Note: You can merge target files if you choose a local or FTP connection type for all target partitions. You
cannot merge output files from sessions with multiple partitions if you use an external loader or a WebSphere
MQ message queue as the target connection type.

Table 16-9 describes the connection options for file targets in a mapping:

Table 16-9. File Targets Connection Options

Attribute Description

Connection Type Choose an FTP, external loader, or message queue connection. Select None for a
local connection.
The connection type is the same for all partitions.

Value For an FTP, external loader, or message queue connection, click the Open button
in this field to select the connection object.
You can specify a different connection object for each partition.

Configuring File Properties


Use the Properties settings in the Transformations view on the Mapping tab to configure file properties for flat
file targets.
Table 16-10 describes the file properties for file targets in a mapping:

Table 16-10. Target File Properties

Attribute Description

Merge Type Type of merge the Integration Service performs on the data for partitioned targets.
When merging target files, the Integration Service writes the output for all
partitions to the merge file or a command when the session runs.
You cannot merge files if the session uses an external loader or a message queue.

Merge File Directory Location of the merge file. Default is $PMTargetFileDir.

Merge File Name Name of the merge file. Default is <target name>.out.

Append if Exists Appends the output data to the target files and reject files for each partition.
Appends output data to the merge file if you merge the target files. You cannot use
this option for target files that are non-disk files, such as FTP target files.
If you do not select this option, the Integration Service truncates each target file
before writing the output data to the target file. If the file does not exist, the
Integration Service creates it.

Output Type Type of target for the session. Select File to write the target data to a file target.
Select Command to send target data to a command. You cannot select Command
for FTP or queue target connection.

Header Options Create a header row in the file target.

Header Command Command used to generate the header row in the file target.

Footer Command Command used to generate a footer row in the file target.

Merge Command Command used to process merged target data.

Output File Directory Location of the target file. Default is $PMTargetFileDir.

Output File Name Name of target file. Default is <target name><partition number>.out. You can also
use the session parameter, $OutputFileName, as defined in the parameter file.

Reject File Directory Location for the target reject files. Default is $PMBadFileDir.

Reject File Name Name of reject file. Default is <target name><partition number>.bad. You can also
use the session parameter, $BadFileName, as defined in the parameter file.

Command Command used to process the target output data for a single partition.

Configuring Commands for Partitioned File Targets


Use a command to process target data for a single partition or process merge data for all target partitions in a
session. On UNIX, use any valid UNIX command or shell script. On Windows, use any valid DOS or batch
file. The Integration Service sends the data to a command instead of a flat file target or merge file.

Use a command to process the following types of target data:
♦ Target data for a single partition. You can enter a command for each target partition. The Integration
Service sends the target data to the command when the session runs.
To send the target data for a single partition to a command, select Command for the Output Type. Enter a
command for the Command property for the partition in the session properties.
♦ Merge data for all target partitions. You can enter a command to process the merge data for all partitions.
The Integration Service concurrently sends the target data for all partitions to the command when the
session runs. The command may not maintain the order of the target data.
To send merge data for all partitions to a command, select Command as the Output Type and enter a
command for the Merge Command Line property in the session properties.

RELATED TOPICS:
♦ “Working with File Targets” on page 178

Configuring Merge Options


You can merge target data for the partitions in a session. When you merge target data, the Integration Service
creates a merge file for all target partitions.
You can configure the following merge file options:
♦ Sequential Merge. The Integration Service creates an output file for all partitions and then merges them into
a single merge file at the end of the session. The Integration Service sequentially adds the output data for
each partition to the merge file. The Integration Service creates the individual target file using the Output
File Name and Output File Directory values for the partition.
♦ File list. The Integration Service creates a target file for all partitions and creates a file list that contains the
paths of the individual files. The Integration Service creates the individual target file using the Output File
Name and Output File Directory values for the partition. If you write the target files to the merge directory
or a directory under the merge directory, the file list contains relative paths. Otherwise, the list file contains
absolute paths. Use this file as a source file if you use the target files as source files in another mapping.
♦ Concurrent Merge. The Integration Service concurrently writes the data for all target partitions to the
merge file. It does not create intermediate files for each partition. Since the Integration Service writes to the
merge file concurrently for all partitions, the sort order of the data in the merge file may not be sequential.
For more information about merging target files in sessions that use an FTP connection, see “Configuring FTP
in a Session” on page 445.

Partitioning Custom Transformations


When a mapping contains a Custom transformation, Java transformation, SQL transformation, or HTTP
transformation, you can edit the following partitioning information:
♦ Add multiple partitions. You can create multiple partitions when the Custom transformation allows
multiple partitions. For more information, see “Working with Multiple Partitions” on page 271.
♦ Create partition points. You can create a partition point at a Custom transformation even when the
transformation does not allow multiple partitions. For more information, see “Creating Partition Points” on
page 271.
The Java, SQL, and HTTP transformations were built using the Custom transformation and have the same
partitioning features. Not all transformations created using the Custom transformation have the same
partitioning features as the Custom transformation.
When you configure a Custom transformation to process each partition with one thread, the Workflow
Manager adds partition points depending on the mapping configuration. For more information, see “Working
with Threads” on page 271.

Working with Multiple Partitions
You can configure a Custom transformation to allow multiple partitions in mappings. You can add partitions to
the pipeline if you set the Is Partitionable property for the transformation. You can select the following values
for the Is Partitionable option:
♦ No. The transformation cannot be partitioned. The transformation and other transformations in the same
pipeline are limited to one partition. You might choose No if the transformation processes all the input data
together, such as data cleansing.
♦ Locally. The transformation can be partitioned, but the Integration Service must run all partitions in the
pipeline on the same node. Choose Locally when different partitions of the transformation must share objects
in memory.
♦ Across Grid. The transformation can be partitioned, and the Integration Service can distribute each
partition to different nodes.
Note: When you add multiple partitions to a mapping that includes a multiple input or output group Custom
transformation, you define the same number of partitions for all groups.

Creating Partition Points


You can create a partition point at a Custom transformation even when the transformation does not allow
multiple partitions. Use the following rules and guidelines when you create a partition point at a Custom
transformation:
♦ You can define the partition type for each input group in the transformation. You cannot define the partition
type for output groups.
♦ Valid partition types are pass-through, round-robin, key range, and hash user keys.

Working with Threads


To configure a Custom transformation so the Integration Service uses one thread to process the transformation
for each partition, enable the Requires Single Thread Per Partition property on the Custom transformation. The
Workflow Manager creates a pass-through partition point based on the number of input groups and the location
of the Custom transformation in the mapping.

One Input Group


When a single input group Custom transformation is downstream from a multiple input group Custom
transformation that does not have a partition point, the Workflow Manager places a pass-through partition
point at the closest upstream multiple input group transformation.
For example, consider the following mapping:

Multiple Input Groups

Single Input Group

Requires one thread for each


partition.

* Partition Point Does not require one thread for each partition.

CT_quartile contains one input group and is downstream from a multiple input group transformation.
CT_quartile requires one thread for each partition, but the upstream Custom transformation does not. The
Workflow Manager creates a partition point at the closest upstream multiple input group transformation,
CT_Sort.

Multiple Input Groups


The Workflow Manager places a partition point at a multiple input group Custom transformation that requires
a single thread for each partition.
For example, consider the following mapping:

Multiple Input Groups

* Partition Point
Requires one thread for each
partition.

Does not require one thread for each partition.

CT_Order_class and CT_Order_Prep have multiple input groups, but only CT_Order_Prep requires one
thread for each partition. The Workflow Manager creates a partition point at CT_Order_Prep.

Partitioning Joiner Transformations


When you create a partition point at the Joiner transformation, the Workflow Manager sets the partition type
to hash auto-keys when the transformation scope is All Input. The Workflow Manager sets the partition type to
pass-through when the transformation scope is Transaction.
You must create the same number of partitions for the master and detail source. If you configure the Joiner
transformation for sorted input, you can change the partition type to pass-through. You can specify only one
partition if the pipeline contains the master source for a Joiner transformation and you do not add a partition
point at the Joiner transformation.
The Integration Service uses cache partitioning when you create a partition point at the Joiner transformation.
When you use partitioning with a Joiner transformation, you can create multiple partitions for the master and
detail source of a Joiner transformation.
If you do not create a partition point at the Joiner transformation, you can create n partitions for the detail
source, and one partition for the master source (1:n).
Note: You cannot add a partition point at the Joiner transformation when you configure the Joiner
transformation to use the row transformation scope.

Partitioning Sorted Joiner Transformations


When you include a Joiner transformation that uses sorted input, you must verify the Joiner transformation
receives sorted data. If the sources contain large amounts of data, you may want to configure partitioning to
improve performance. However, partitions that redistribute rows can rearrange the order of sorted data, so it is
important to configure partitions to maintain sorted data.
For example, when you use a hash auto-keys partition point, the Integration Service uses a hash function to
determine the best way to distribute the data among the partitions. However, it does not maintain the sort
order, so you must follow specific partitioning guidelines to use this type of partition point.

When you join data, you can partition data for the master and detail pipelines in the following ways:
♦ 1:n. Use one partition for the master source and multiple partitions for the detail source. The Integration
Service maintains the sort order because it does not redistribute master data among partitions.
♦ n:n. Use an equal number of partitions for the master and detail sources. When you use n:n partitions, the
Integration Service processes multiple partitions concurrently. You may need to configure the partitions to
maintain the sort order depending on the type of partition you use at the Joiner transformation.
Note: When you use 1:n partitions, do not add a partition point at the Joiner transformation. If you add a
partition point at the Joiner transformation, the Workflow Manager adds an equal number of partitions to both
master and detail pipelines.
Use different partitioning guidelines, depending on where you sort the data:
♦ Using sorted flat files. Use one of the following partitioning configurations:
− Use 1:n partitions when you have one flat file in the master pipeline and multiple flat files in the detail
pipeline. Configure the session to use one reader-thread for each file.
− Use n:n partitions when you have one large flat file in the master and detail pipelines. Configure partitions
to pass all sorted data in the first partition, and pass empty file data in the other partitions.
♦ Using sorted relational data. Use one of the following partitioning configurations:
− Use 1:n partitions for the master and detail pipeline.
− Use n:n partitions. If you use a hash auto-keys partition, configure partitions to pass all sorted data in the
first partition.
♦ Using the Sorter transformation. Use n:n partitions. If you use a hash auto-keys partition at the Joiner
transformation, configure each Sorter transformation to use hash auto-keys partition points as well.
Add only pass-through partition points between the sort origin and the Joiner transformation.

Using Sorted Flat Files


Use 1:n partitions when you have one flat file in the master pipeline and multiple flat files in the detail pipeline.
When you use 1:n partitions, the Integration Service maintains the sort order because it does not redistribute
data among partitions. When you have one large flat file in each master and detail pipeline, use n:n partitions
and add a pass-through or hash auto-keys partition at the Joiner transformation. When you add a hash auto-
keys partition point, you must configure partitions to pass all sorted data in the first partition to maintain the
sort order.

Using 1:n Partitions


If the session uses one flat file in the master pipeline and multiple flat files in the detail pipeline, use one
partition for the master source and n partitions for the detail file sources (1:n). Add a pass-through partition
point at the detail Source Qualifier transformation. Do not add a partition point at the Joiner transformation.
The Integration Service maintains the sort order when you create one partition for the master source because it
does not redistribute sorted data among partitions.
When you have multiple files in the detail pipeline that have the same structure, pass the files to the Joiner
transformation using the following guidelines:
♦ Configure the mapping with one source and one Source Qualifier transformation in each pipeline.
♦ Specify the path and file name for each flat file in the Properties settings of the Transformations view on the
Mapping tab of the session properties.
♦ Each file must use the same file properties as configured in the source definition.
♦ The range of sorted data in the flat files can overlap. You do not need to use a unique range of data for each
file.

Figure 16-2 shows sorted file data joined using 1:n partitioning:

Figure 16-2. Sorted File Data with 1:n Partitions

[Figure: one flat file in the master pipeline is read through a Source Qualifier transformation. Three flat files
(Flat File 1, Flat File 2, Flat File 3) in the detail pipeline are read through a Source Qualifier transformation
with a pass-through partition point. Both pipelines feed the Joiner transformation. Sorted output depends on the
join type.]

The Joiner transformation may output unsorted data depending on the join type. If you use a full outer or
detail outer join, the Integration Service processes unmatched master rows last, which can result in unsorted
data.

Using n:n Partitions


If the session uses sorted flat file data, use n:n partitions for the master and detail pipelines. You can add a pass-
through partition or hash auto-keys partition at the Joiner transformation. If you add a pass-through partition
at the Joiner transformation, maintain the sort order in mappings. For more information, see the PowerCenter
Transformation Guide.
If you add a hash auto-keys partition point at the Joiner transformation, you can maintain the sort order by
passing all sorted data to the Joiner transformation in a single partition. When you pass sorted data in one
partition, the Integration Service maintains the sort order when it redistributes data using a hash function.
To allow the Integration Service to pass all sorted data in one partition, configure the session to use the sorted
file for the first partition and empty files for the remaining partitions.
The Integration Service redistributes the rows among multiple partitions and joins the sorted data.
Figure 16-3 shows sorted file data passed through a single partition to maintain sort order:

Figure 16-3. Sorted File Data Passed Through a Single Partition

[Figure: two Source Qualifier transformations feed a Joiner transformation that has a hash auto-keys partition
point. All sorted data passes through the first partition; the remaining partitions pass no data.]

This example shows sorted data passed in a single partition to maintain the sort order. The first partition
contains sorted file data while all other partitions pass empty file data. At the Joiner transformation, the
Integration Service distributes the data among all partitions while maintaining the order of the sorted data.

Using Sorted Relational Data


When you join relational data, use 1:n partitions for the master and detail pipeline. When you use 1:n
partitions, you cannot add a partition point at the Joiner transformation. If you use n:n partitions, you can add
a pass-through or hash auto-keys partition at the Joiner transformation. If you use a hash auto-keys partition
point, you must configure partitions to pass all sorted data in the first partition to maintain sort order.

Using 1:n Partitions


If the session uses sorted relational data, use one partition for the master source and n partitions for the detail
source (1:n). Add a key-range or pass-through partition point at the Source Qualifier transformation. Do not
add a partition point at the Joiner transformation. The Integration Service maintains the sort order when you
create one partition for the master source because it does not redistribute data among partitions.
Figure 16-4 shows sorted relational data with 1:n partitioning:

Figure 16-4. Sorted Relational Data with 1:n Partitioning

[Figure: a relational source and Source Qualifier transformation in the master pipeline, and a relational source
and Source Qualifier transformation with a key-range or pass-through partition point in the detail pipeline, feed
the Joiner transformation. Sorted output depends on the join type.]

The Joiner transformation may output unsorted data depending on the join type. If you use a full outer or
detail outer join, the Integration Service processes unmatched master rows last, which can result in unsorted
data.

Using n:n Partitions


If the session uses sorted relational data, use n:n partitions for the master and detail pipelines and add a pass-
through or hash auto-keys partition point at the Joiner transformation. When you use a pass-through partition
at the Joiner transformation, maintain sorted data in mappings. For more information, see the PowerCenter
Transformation Guide.
When you use a hash auto-keys partition point, you maintain the sort order by passing all sorted data to the
Joiner transformation in a single partition. Add a key-range partition point at the Source Qualifier
transformation that contains all source data in the first partition. When you pass sorted data in one partition,
the Integration Service redistributes data among multiple partitions using a hash function and joins the sorted
data.

Figure 16-5 shows sorted relational data passed through a single partition to maintain the sort order:

Figure 16-5. Sorted Relational Data Passed Through a Single Partition

[Figure: two relational sources are read through Source Qualifier transformations with key-range partition points
and feed a Joiner transformation that has a hash auto-keys partition point. All sorted data passes through the
first partition; the remaining partitions pass no data.]

This example shows sorted relational data passed in a single partition to maintain the sort order. The first
partition contains sorted relational data while all other partitions pass empty data. After the Integration Service
joins the sorted data, it redistributes data among multiple partitions.

Using Sorter Transformations


If the session uses the Sorter transformations to sort data, use n:n partitions for the master and detail pipelines.
Use a hash auto-keys partition point at the Sorter transformation to group the data. You can add a pass-through
or hash auto-keys partition point at the Joiner transformation.
The Integration Service groups data into partitions of the same hash values, and the Sorter transformation sorts
the data before passing it to the Joiner transformation. When the Integration Service processes the Joiner
transformation configured with a hash auto-keys partition, it maintains the sort order by processing the sorted
data using the same partitions it uses to route the data from each Sorter transformation.
Figure 16-6 shows Sorter transformations used with hash auto-keys to maintain sort order:

Figure 16-6. Using Sorter Transformations with Hash Auto-Keys to Maintain Sort Order

[Figure: two sources with unsorted data are read through Source Qualifier transformations. Each pipeline passes
through a Sorter transformation with a hash auto-keys partition point before reaching the Joiner transformation,
which has a hash auto-keys or pass-through partition point.]

Note: For best performance, use sorted flat files or sorted relational data. You may want to calculate the
processing overhead for adding Sorter transformations to the mapping.

Optimizing Sorted Joiner Transformations with Partitions
When you use partitions with a sorted Joiner transformation, you may optimize performance by grouping data
and using n:n partitions.

Add a Hash Auto-keys Partition Upstream of the Sort Origin


To obtain expected results and get best performance when partitioning a sorted Joiner transformation, you must
group and sort data. To group data, ensure that rows with the same key value are routed to the same partition.
The best way to ensure that data is grouped and distributed evenly among partitions is to add a hash auto-keys
or key-range partition point before the sort origin. Placing the partition point before you sort the data ensures
that you maintain grouping and sort the data within each group.

Use n:n Partitions


You may be able to improve performance for a sorted Joiner transformation by using n:n partitions. When you
use n:n partitions, the Joiner transformation reads master and detail rows concurrently and does not need to
cache all of the master data. This reduces memory usage and speeds processing. When you use 1:n partitions,
the Joiner transformation caches all the data from the master pipeline and writes the cache to disk if the
memory cache fills. When the Joiner transformation receives the data from the detail pipeline, it must then read
the data from disk to compare the master and detail pipelines.

Partitioning Lookup Transformations


You can configure cache partitioning for a Lookup transformation. You can create multiple partitions for static
and dynamic lookup caches.
The cache for a pipeline Lookup transformation is built in an independent pipeline from the pipeline that
contains the Lookup transformation. You can create multiple partitions in both pipelines.

Cache Partitioning Lookup Transformations


Use cache partitioning for static and dynamic caches, and named and unnamed caches. When you create a
partition point at a connected Lookup transformation, use cache partitioning under the following conditions:
♦ Use the hash auto-keys partition type for the Lookup transformation.
♦ The lookup condition must contain only equality operators, as shown in the example after this list.
♦ The database is configured for case-sensitive comparison.
For example, if the lookup condition contains a string port and the database is not configured for case-
sensitive comparison, the Integration Service does not perform cache partitioning and writes the following
message to the session log:
CMN_1799 Cache partitioning requires case sensitive string comparisons. Lookup will not
use partitioned cache as the database is configured for case insensitive string
comparisons.
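
For example, assuming hypothetical lookup and input ports named ITEM_ID, IN_ITEM_ID, PRICE, and IN_PRICE (these
names are illustrative and do not come from this guide), the following sketch contrasts a lookup condition that
supports cache partitioning with one that does not:
ITEM_ID = IN_ITEM_ID     (equality operator only; cache partitioning is possible)
PRICE <= IN_PRICE        (non-equality operator; the Integration Service does not partition the cache)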

The Integration Service uses cache partitioning when you create a hash auto-keys partition point at the Lookup
transformation.
When the Integration Service creates cache partitions, it begins creating caches for the Lookup transformation
when the first row of any partition reaches the Lookup transformation. If you configure the Lookup
transformation for concurrent caches, the Integration Service builds all caches for the partitions concurrently.

Sharing Partitioned Caches
Use the following guidelines when you share partitioned Lookup caches:
♦ Lookup transformations can share a partitioned cache if the transformations meet the following conditions:
− The cache structures are identical. The lookup/output ports for the first shared transformation must
match the lookup/output ports for the subsequent transformations.
− The transformations have the same lookup conditions, and the lookup condition columns are in the same
order.
♦ You cannot share a partitioned cache with a non-partitioned cache.
♦ When you share Lookup caches across target load order groups, you must configure the target load order
groups with the same number of partitions.
♦ If the Integration Service detects a mismatch between Lookup transformations sharing an unnamed cache, it
rebuilds the cache files.
♦ If the Integration Service detects a mismatch between Lookup transformations sharing a named cache, it
fails the session.

Partitioning Pipeline Lookup Transformation Cache


A pipeline Lookup transformation is enabled for caching by default. You can partition the lookup source to
improve performance when the Integration Service builds the lookup cache. The Lookup transformation begins
processing rows when the lookup source is cached.
When you configure a pipeline Lookup transformation, the lookup source and source qualifier are in a different
pipeline from the Lookup transformation. The pipeline is a partial pipeline because it contains no target. The
Integration Service reads the source data in the partial pipeline. You can create multiple partitions in the
pipeline to improve processing performance.
The Integration Service passes source data from the partial pipeline to the other pipeline when it builds the
cache. When the number of partitions in the partial pipeline is different from the number of partitions for the
Lookup transformation, the Integration Service creates a partition point. If the Lookup transformation has a
hash auto-keys partition point, the Integration Service creates the same number of partitions in the cache as in
the Lookup transformation. Otherwise the cache has one partition.

Figure 16-7 shows the partitions for a session that contains a pipeline Lookup transformation and a Source
Qualifier lookup source:

Figure 16-7. Partitioning a Pipeline Lookup Transformation

[Figure: the partial pipeline that reads the lookup source table has three partitions. The pipeline that contains
the Lookup transformation has four partitions, and the Lookup transformation has a hash auto-keys partition
point.]

The Integration Service processes the Employee rows in three partitions. The pipeline containing the Lookup
transformation has four partitions. Since the Lookup transformation has a hash auto-key partition point, the
cache is partitioned into four partitions.

Partitioning Sequence Generator Transformations


If you configure multiple partitions in a session on a grid that uses an uncached Sequence Generator
transformation, the sequence numbers the Integration Service generates for each partition are not consecutive.

Partitioning Sorter Transformations


If you configure multiple partitions in a session that uses a Sorter transformation, the Integration Service sorts
data in each partition separately. The Workflow Manager lets you choose hash auto-keys, key-range, or pass-
through partitioning when you add a partition point at the Sorter transformation.
Use hash auto-keys partitioning when you place the Sorter transformation before an Aggregator transformation
configured to use sorted input. Hash auto-keys partitioning groups rows with the same values into the same
partition based on the partition key. After grouping the rows, the Integration Service passes the rows through
the Sorter transformation. The Integration Service processes the data in each partition separately, but hash auto-
keys partitioning accurately sorts all of the source data because rows with matching values are processed in the
same partition. You can delete the default partition point at the Aggregator transformation.
Use key-range partitioning when you want to send all rows in a partitioned session from multiple partitions into
a single partition for sorting. When you merge all rows into a single partition for sorting, the Integration
Service can process all of the data together.

Use pass-through partitioning if you already used hash partitioning in the pipeline. This ensures that the data
passing into the Sorter transformation is correctly grouped among the partitions. Pass-through partitioning
increases session performance without increasing the number of partitions in the pipeline.

Configuring Sorter Transformation Work Directories


The Integration Service creates temporary files for each Sorter transformation in a pipeline. It reads and writes
data to these files while it performs the sort. The Integration Service stores these files in the Sorter
transformation work directories.
By default, the Workflow Manager sets the work directories for all partitions at Sorter transformations to
$PMTempDir. You can specify a different work directory for each partition in the session properties.

Restrictions for Transformations


Some restrictions on the number of partitions depend on the types of transformations in the pipeline. These
restrictions apply to all transformations, including reusable transformations, transformations created in
mappings and mapplets, and transformations, mapplets, and mappings referenced by shortcuts.
Table 16-11 describes the restrictions on the number of partitions for transformations:

Table 16-11. Restrictions on the Number of Partitions for Transformations

Transformation Restrictions

Custom Transformation By default, you can only specify one partition if the pipeline contains a Custom
transformation.
However, this transformation contains an option on the Properties tab to allow
multiple partitions. If you enable this option, you can specify multiple partitions at
this transformation. Do not select Is Partitionable if the Custom transformation
procedure performs the procedure based on all the input data together, such as
data cleansing.

External Procedure         By default, you can only specify one partition if the pipeline contains an
Transformation             External Procedure transformation.
                           This transformation contains an option on the Properties tab to allow multiple
                           partitions. If this option is enabled, you can specify multiple partitions at
                           this transformation.

Joiner Transformation Specify only one partition if the pipeline contains the master source for a Joiner
transformation and you do not add a partition point at the Joiner transformation.

XML Target Instance Specify only one partition if the pipeline contains XML targets.

Sequence numbers generated by Normalizer and Sequence Generator transformations might not be sequential
for a partitioned source, but they are unique.

Restrictions for Numerical Functions


The numerical functions CUME, MOVINGSUM, and MOVINGAVG calculate running totals and averages
on a row-by-row basis. According to the way you partition a pipeline, the order that rows of data pass through
a transformation containing one of these functions can change. Therefore, a session with multiple partitions
that uses CUME, MOVINGSUM, or MOVINGAVG functions may not always return the same calculated
result.
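
For example, assuming a hypothetical numeric port named PRICE, the following expression returns a running average
over a rowset of five rows as rows reach the transformation:
MOVINGAVG( PRICE, 5 )
Because partitioning can change which rows each partition receives and the order in which they arrive, the running
averages that this expression calculates can differ from run to run in a multi-partition session.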

CHAPTER 17

Working with Partition Types


This chapter includes the following topics:
♦ Overview, 281
♦ Setting Partition Types, 283
♦ Database Partitioning Partition Type, 284
♦ Hash Auto-Keys Partition Type, 287
♦ Hash User Keys Partition Type, 287
♦ Key Range Partition Type, 288
♦ Pass-Through Partition Type, 291
♦ Round-Robin Partition Type, 292

Overview
The Integration Service creates a default partition type at each partition point. If you have the Partitioning
option, you can change the partition type. The partition type controls how the Integration Service distributes
data among partitions at partition points.
When you configure the partitioning information for a pipeline, you must define a partition type at each
partition point in the pipeline. The partition type determines how the Integration Service redistributes data
across partition points.
You can define the following partition types in the Workflow Manager:
♦ Database partitioning. The Integration Service queries the IBM DB2 or Oracle system for table partition
information. It reads partitioned data from the corresponding nodes in the database. Use database
partitioning with Oracle or IBM DB2 source instances on a multi-node tablespace. Use database
partitioning with DB2 targets. For more information, see “Database Partitioning Partition Type” on
page 284.
♦ Hash partitioning. Use hash partitioning when you want the Integration Service to distribute rows to the
partitions by group. For example, you need to sort items by item ID, but you do not know how many items
have a particular ID number.
You can use the following types of hash partitioning:
− Hash auto-keys. The Integration Service uses all grouped or sorted ports as a compound partition key. You
may need to use hash auto-keys partitioning at Rank, Sorter, and unsorted Aggregator transformations.
For more information, see “Hash Auto-Keys Partition Type” on page 287.
− Hash user keys. The Integration Service uses a hash function to group rows of data among partitions. You
define the number of ports to generate the partition key. For more information, see “Hash User Keys Partition
Type” on page 287.
♦ Key range. You specify one or more ports to form a compound partition key. The Integration Service passes
data to each partition depending on the ranges you specify for each port. Use key range partitioning where
the sources or targets in the pipeline are partitioned by key range. For more information, see “Key Range
Partition Type” on page 288.
♦ Pass-through. The Integration Service passes all rows at one partition point to the next partition point
without redistributing them. Choose pass-through partitioning where you want to create an additional
pipeline stage to improve performance, but do not want to change the distribution of data across partitions.
For more information, see “Pass-Through Partition Type” on page 291.
♦ Round-robin. The Integration Service distributes data evenly among all partitions. Use round-robin
partitioning where you want each partition to process approximately the same number of rows. For more
information, see “Round-Robin Partition Type” on page 292.

Setting Partition Types in the Pipeline


You can create different partition types at different points in the pipeline.
Figure 17-1 shows a mapping where you can create partition types to increase session performance:

Figure 17-1. Sample Mapping

This mapping reads data about items and calculates average wholesale costs and prices. The mapping must read
item information from three flat files of various sizes, and then filter out discontinued items. It sorts the active
items by description, calculates the average prices and wholesale costs, and writes the results to a relational
database in which the target tables are partitioned by key range.
You can delete the default partition point at the Aggregator transformation because hash auto-keys partitioning
at the Sorter transformation sends all rows that contain items with the same description to the same partition.
Therefore, the Aggregator transformation receives data for all items with the same description in one partition
and can calculate the average costs and prices for this item correctly.
When you use this mapping in a session, you can increase session performance by defining different partition
types at the following partition points in the pipeline:
♦ Source qualifier. To read data from the three flat files concurrently, you must specify three partitions at the
source qualifier. Accept the default partition type, pass-through.
♦ Filter transformation. Since the source files vary in size, each partition processes a different amount of data.
Set a partition point at the Filter transformation, and choose round-robin partitioning to balance the load
going into the Filter transformation.
♦ Sorter transformation. To eliminate overlapping groups in the Sorter and Aggregator transformations, use
hash auto-keys partitioning at the Sorter transformation. This causes the Integration Service to group all
items with the same description into the same partition before the Sorter and Aggregator transformations
process the rows. You can delete the default partition point at the Aggregator transformation.
♦ Target. Since the target tables are partitioned by key range, specify key range partitioning at the target to
optimize writing data to the target.

Setting Partition Types
The Workflow Manager sets a default partition type for each partition point in the pipeline. At the source
qualifier and target instance, the Workflow Manager specifies pass-through partitioning. For Rank and unsorted
Aggregator transformations, for example, the Workflow Manager specifies hash auto-keys partitioning when the
transformation scope is All Input. When you create a new partition point, the Workflow Manager sets the
partition type to the default partition type for that transformation. You can change the default type.
You must specify pass-through partitioning for all transformations that are downstream from a transaction
generator or an active source that generates commits and upstream from a target or a transformation with
Transaction transformation scope. Also, if you configure the session to use constraint-based loading, you must
specify pass-through partitioning for all transformations that are downstream from the last active source.
If workflow recovery is enabled, the Workflow Manager sets the partition type to pass-through unless the
partition point is either an Aggregator transformation or a Rank transformation.
Table 17-1 lists valid partition types and the default partition type for different partition points in the pipeline:

Table 17-1. Valid Partition Types for Partition Points

Transformation (Partition Point)       Valid Partition Types                                      Default Partition Type

Source definition                      Not a valid partition point                                -
Source Qualifier (relational sources)  Key range, pass-through, database partitioning            Pass-through
                                       (Oracle, DB2)
Source Qualifier (flat file sources)   Pass-through                                               Pass-through
Web Service Source Qualifier           Pass-through                                               Pass-through
XML Source Qualifier                   Pass-through                                               Pass-through
Normalizer (COBOL sources)             Pass-through                                               Pass-through
Normalizer (relational)                Round-robin, hash user keys, key range, pass-through      Pass-through
Aggregator (sorted)                    Pass-through                                               Pass-through
Aggregator (unsorted)                  Hash auto-keys, pass-through                               Based on transformation scope*
Custom                                 Round-robin, hash user keys, key range, pass-through      Pass-through
Expression                             Round-robin, hash user keys, key range, pass-through      Pass-through
External Procedure                     Round-robin, hash user keys, key range, pass-through      Pass-through
Filter                                 Round-robin, hash user keys, key range, pass-through      Pass-through
HTTP                                   Pass-through                                               Pass-through
Java                                   Round-robin, hash user keys, key range, pass-through      Pass-through
Joiner                                 Hash auto-keys, pass-through                               Based on transformation scope*
Lookup                                 Round-robin, hash auto-keys, hash user keys, key range,   Pass-through
                                       pass-through
Rank                                   Hash auto-keys, pass-through                               Based on transformation scope*
Router                                 Round-robin, hash user keys, key range, pass-through      Pass-through
Sequence Generator                     Not a valid partition point                                -
Sorter                                 Hash auto-keys, key range, pass-through                    Based on transformation scope*
Stored Procedure                       Round-robin, hash user keys, key range, pass-through      Pass-through
Transaction Control                    Round-robin, hash user keys, key range, pass-through      Pass-through
Union                                  Round-robin, hash user keys, key range, pass-through      Pass-through
Update Strategy                        Round-robin, hash user keys, key range, pass-through      Pass-through
Unconnected transformation             Not a valid partition point                                -
Web Service Consumer                   Pass-through                                               Pass-through
XML Generator                          Pass-through                                               Pass-through
XML Parser                             Pass-through                                               Pass-through
Relational target definition           Round-robin, hash user keys, key range, pass-through,     Pass-through
                                       database partitioning (DB2)
Flat file target definition            Round-robin, hash user keys, key range, pass-through      Pass-through
Web Service target                     Pass-through                                               Pass-through
XML target definition                  Not a valid partition point                                -

* The default partition type is pass-through when the transformation scope is Transaction and hash auto-keys when the transformation scope is All
Input.

Database Partitioning Partition Type


You can optimize session performance by using the database partitioning partition type for source and target
databases. When you use source database partitioning, the Integration Service queries the database system for
table partition information and fetches data into the session partitions. When you use target database
partitioning, the Integration Service loads data into corresponding database partition nodes.
Use database partitioning for Oracle and IBM DB2 sources and IBM DB2 targets. Use any number of pipeline
partitions and any number of database partitions. However, you can improve performance when the number of
pipeline partitions equals the number of database partitions.

Partitioning Database Sources


When you use source database partitioning, the Integration Service queries the database system catalog for
partition information. It distributes the data from the database partitions among the session partitions.



If the session has more partitions than the database, the Integration Service generates SQL for each database
partition and redistributes the data to the session partitions at the next partition point.

Database Partitioning with One Source


When you use database partitioning with a source qualifier with one source, the Integration Service generates
SQL queries for each database partition and distributes the data from the database partitions among the session
partitions equally.
For example, when a session has three partitions, and the database has five partitions, the Integration Service
executes SQL queries in the session partitions against the database partitions. The first and second session
partitions receive data from two database partitions. The third session partition receives data from one database
partition.
When you use an Oracle database, the Integration Service generates SQL statements similar to the following
statements for partition 1:
SELECT <column list> FROM <table name> PARTITION <database_partition1 name> UNION ALL
SELECT <column list> FROM <table name> PARTITION <database_partition4 name>

When you use an IBM DB2 database, the Integration Service creates SQL statements similar to the following
for partition 1:
SELECT <column list> FROM <table name>
WHERE (nodenumber(<column 1>)=0 OR nodenumber(<column 1>) = 3)
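
Following the same pattern, the queries for the remaining session partitions in this example might look similar to the following sketch. The exact assignment of database partitions to session partitions can differ from run to run:
Oracle, session partition 2:
SELECT <column list> FROM <table name> PARTITION <database_partition2 name> UNION ALL
SELECT <column list> FROM <table name> PARTITION <database_partition5 name>
Oracle, session partition 3:
SELECT <column list> FROM <table name> PARTITION <database_partition3 name>
IBM DB2, session partition 2:
SELECT <column list> FROM <table name>
WHERE (nodenumber(<column 1>)=1 OR nodenumber(<column 1>) = 4)
IBM DB2, session partition 3:
SELECT <column list> FROM <table name>
WHERE (nodenumber(<column 1>)=2)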

Partitioning a Source Qualifier with Multiple Sources


A relational source qualifier can receive data from multiple source tables. The Integration Service creates SQL
queries for database partitions based on the number of partitions in the database table with the most partitions.
It creates an SQL join condition to retrieve the data from the database partitions.
For example, a source qualifier receives data from two source tables. Each source table has two partitions. If the
session has three partitions and the database table has two partitions, one of the session partitions receives no
data.
The Integration Service generates the following SQL statements for Oracle:
Session Partition 1:
SELECT <column list> FROM t1 PARTITION (p1), t2 WHERE <join clause>
Session Partition 2:
SELECT <column list> FROM t1 PARTITION (p2), t2 WHERE <join clause>
Session Partition 3:
No SQL query.

The Integration Service generates the following SQL statements for IBM DB2:
Session Partition 1:
SELECT <column list> FROM t1,t2 WHERE (nodenumber(t1.column1)=0) AND <join clause>
Session Partition 2:
SELECT <column list> FROM t1,t2 WHERE (nodenumber(t1.column1)=1) AND <join clause>
Session Partition 3:
No SQL query.

Integration Service Handling with Source Database Partitioning


The Integration Service uses the following rules for database partitioning:
♦ If you specify database partitioning for a database other than Oracle or IBM DB2, the Integration Service
reads the data in a single partition and writes a message to the session log.
♦ If the number of session partitions is more than the number of partitions for the table in the database, the
excess partitions receive no data. The session log describes which partitions do not receive data.
♦ If the number of session partitions is less than the number of partitions for the table in the database, the
Integration Service distributes the data equally to the session partitions. Some session partitions receive data
from more than one database partition.



♦ When you use database partitioning with dynamic partitioning, the Integration Service determines the
number of session partitions when the session begins.
♦ Session performance with partitioning depends on the data distribution in the database partitions. The
Integration Service generates SQL queries to the database partitions. The SQL queries perform union or join
commands, which can result in large query statements that have a performance impact.

Rules and Guidelines for Source Database Partitioning


Use the following rules and guidelines when you use the database partitioning partition type with relational
sources:
♦ You cannot use database partitioning when you configure the session to use source-based or user-defined
commits, constraint-based loading, or workflow recovery.
♦ When you configure a source qualifier for database partitioning, the Integration Service reverts to pass-
through partitioning under the following circumstances:
− The database table is stored on one database partition.
− You run the session in debug mode.
− You specify database partitioning for a session with one partition.
− You use pushdown optimization. Pushdown optimization works with the other partition types.
♦ When you create an SQL override to read database tables and you set database partitioning, the Integration
Service reverts to pass-through partitioning and writes a message to the session log.
♦ If you create a user-defined join, the Integration Service adds the join to the SQL statements it generates for
each partition.
♦ If you create a source filter, the Integration Service adds it to the WHERE clause in the SQL query for each
partition.

Target Database Partitioning


You can use target database partitioning for IBM DB2 databases only. When you load data to an IBM DB2
table stored on a multi-node tablespace, you can optimize session performance by using the database
partitioning partition type. When you use database partitioning, the Integration Service queries the DB2
system for table partition information and loads partitioned data to the corresponding nodes in the target
database.
By default, the Integration Service fails the session when you use database partitioning for non-DB2 targets.
However, you can configure the Integration Service to default to pass-through partitioning when you use
database partitioning for non-DB2 relational targets. Set the Integration Service property
TreatDBPartitionAsPassThrough to Yes in the Administration Console.
You can specify database partitioning for the target partition type with any number of pipeline partitions and
any number of database nodes. However, you can improve load performance further when the number of
pipeline partitions equals the number of database nodes.

Rules and Guidelines


Use the following rules and guidelines when you use database partitioning with database targets:
♦ You cannot use database partitioning when you configure the session to use source-based or user-defined
commit, constraint-based loading, or session recovery.
♦ The target table must contain a partition key, and you must link all not-null partition key columns in the
target instance to a transformation in the mapping.
♦ Enable high precision for the session when an IBM DB2 target table partition key is a Bigint column. The
Integration Service might fail the session when a partition key is a Bigint column and you do not enable high
precision for the session.



♦ If you create multiple partitions for a DB2 bulk load session, use database partitioning for the target
partition type. If you choose any other partition type, the Integration Service reverts to normal load and
writes the following message to the session log:
ODL_26097 Only database partitioning is supported for DB2 bulk load. Changing target load
type variable to Normal.

♦ If you configure a session for database partitioning, the Integration Service reverts to pass-through
partitioning under the following circumstances:
− The DB2 target table is stored on one node.
− You run the session in debug mode using the Debugger.
− You configure the Integration Service to treat the database partitioning partition type as pass-through
partitioning and you use database partitioning for a non-DB2 relational target.

Hash Auto-Keys Partition Type


Use hash auto-keys partitioning at or before Rank, Sorter, Joiner, and unsorted Aggregator transformations to
ensure that rows are grouped properly before they enter these transformations.
Figure 17-2 shows a mapping with hash auto-keys partitioning. The Integration Service distributes rows to each
partition according to group before they enter the Sorter and Aggregator transformations:

Figure 17-2. Hash Auto-Keys Partitioning

Hash auto-keys partitioning groups data at the Sorter.

In this mapping, the Sorter transformation sorts items by item description. If items with the same description
exist in more than one source file, rows for the same item can end up in different partitions. Without hash
auto-keys partitioning, the Aggregator transformation might calculate average costs and prices for each item
incorrectly.
To prevent errors in the cost and prices calculations, set a partition point at the Sorter transformation and set
the partition type to hash auto-keys. When you do this, the Integration Service redistributes the data so that all
items with the same description reach the Sorter and Aggregator transformations in a single partition.
For information about partition points where you can specify hash partitioning, see Table 17-1 on page 283.

Hash User Keys Partition Type


In hash user keys partitioning, the Integration Service uses a hash function to group rows of data among
partitions based on a user-defined partition key. You choose the ports that define the partition key.




When you specify hash auto-keys partitioning in the preceding mapping, the Sorter transformation receives
rows of data grouped by the sort key, such as ITEM_DESC. If the item description is long, and you know that
each item has a unique ID number, you can specify hash user keys partitioning at the Sorter transformation and
select ITEM_ID as the hash key. This might improve the performance of the session since the hash function
usually processes numerical data more quickly than string data.
If you select hash user keys partitioning at any partition point, you must specify a hash key. The Integration
Service uses the hash key to distribute rows to the appropriate partition according to group.
If you specify hash user keys partitioning at a transformation, the Integration Service uses the key to group data
based on the ports you select as the key. For example, if you specify ITEM_DESC as the hash key, the
Integration Service distributes data so that all rows that contain items with the same description go to the same
partition.
To specify the hash key, select the partition point on the Partitions view of the Mapping tab, and click Edit
Keys. This displays the Edit Partition Key dialog box. The Available Ports list displays the connected input and
input/output ports in the transformation. To specify the hash key, select one or more ports from this list, and
then click Add.
To rearrange the order of the ports that define the key, select a port in the Selected Ports list and click the up or
down arrow.

Key Range Partition Type


With key range partitioning, the Integration Service distributes rows of data based on a port or set of ports that
you define as the partition key. For each port, you define a range of values. The Integration Service uses the key
and ranges to send rows to the appropriate partition.
For example, if you specify key range partitioning at a Source Qualifier transformation, the Integration Service
uses the key and ranges to create the WHERE clause when it selects data from the source. Therefore, you can
have the Integration Service pass all rows that contain customer IDs less than 135000 to one partition and all
rows that contain customer IDs greater than or equal to 135000 to another partition.
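For illustration, with the customer ID ranges just described, the generated reader queries might look similar to the following sketch. The table and column names follow the Customers example in "Adding Key Ranges" below; the column list is abbreviated:
Session partition 1:
SELECT <column list> FROM CUSTOMERS WHERE CUSTOMERS.CUSTOMER_ID < 135000
Session partition 2:
SELECT <column list> FROM CUSTOMERS WHERE CUSTOMERS.CUSTOMER_ID >= 135000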
Use key range partitioning in mappings where the source and target tables are partitioned by key range.
Figure 17-3 shows a mapping where key range partitioning can optimize writing to the target table:

Figure 17-3. Mapping Where Key Range Partitioning Can Increase Performance

Key range partitioning at the target optimizes writing to the target tables.

The target table in the database is partitioned by ITEM_ID as follows:


♦ Partition 1: 0001–2999



♦ Partition 2: 3000–5999
♦ Partition 3: 6000–9999
To optimize writing to the target table, complete the following tasks:
1. Set the partition type at the target instance to key range.
2. Create three partitions.
3. Choose ITEM_ID as the partition key.
The Integration Service uses this key to pass data to the appropriate partition.
4. Set the key ranges as follows:
ITEM_ID        Start Range    End Range
Partition #1                  3000
Partition #2   3000           6000
Partition #3   6000

When you set the key range, the Integration Service sends all items with IDs less than 3000 to the first
partition. It sends all items with IDs between 3000 and 5999 to the second partition. Items with IDs greater
than or equal to 6000 go to the third partition.
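
For reference, a target table partitioned this way could be created with DDL similar to the following Oracle sketch. The column list is illustrative; only the ITEM_ID partition key and the ranges come from the example:
CREATE TABLE T_ITEM_PRICES (
    ITEM_ID    NUMBER,
    ITEM_DESC  VARCHAR2(72),
    PRICE      NUMBER(10,2)
)
PARTITION BY RANGE (ITEM_ID) (
    PARTITION P1 VALUES LESS THAN (3000),
    PARTITION P2 VALUES LESS THAN (6000),
    PARTITION P3 VALUES LESS THAN (10000)
);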

Adding a Partition Key


To specify the partition key for key range partitioning, select the partition point on the Partitions view of the
Mapping tab, and click Edit Keys. This displays the Edit Partition Key dialog box. The Available Ports list
displays the connected input and input/output ports in the transformation. To specify the partition key, select
one or more ports from this list, and then click Add.
Figure 17-4 shows one port selected as the partition key for the target table T_ITEM_PRICES:

Figure 17-4. Edit Partition Key Dialog Box


To rearrange the order of the ports that define the partition key, select a port in the Selected Ports list and click
the up or down arrow.
In key range partitioning, the order of the ports does not affect how the Integration Service redistributes rows
among partitions, but it can affect session performance. For example, you might configure the following
compound partition key:
Selected Ports
ITEMS.DESCRIPTION
ITEMS.DISCONTINUED_FLAG



Since boolean comparisons are usually faster than string comparisons, the session may run faster if you arrange
the ports in the following order:
Selected Ports
ITEMS.DISCONTINUED_FLAG
ITEMS.DESCRIPTION

Adding Key Ranges


After you identify the ports that make up the partition key, you must enter the ranges for each port on the
Partitions view of the Mapping tab.
You can leave the start or end range blank for a partition. When you leave the start range blank, the Integration
Service uses the minimum data value as the start range. When you leave the end range blank, the Integration
Service uses the maximum data value as the end range.
For example, you can add the following ranges for a key based on CUSTOMER_ID in a pipeline that contains
two partitions:
CUSTOMER_ID    Start Range    End Range
Partition #1                  135000
Partition #2   135000

When the Integration Service reads the Customers table, it sends all rows that contain customer IDs less than
135000 to the first partition and all rows that contain customer IDs equal to or greater than 135000 to the
second partition. The Integration Service eliminates rows that contain null values or values that fall outside the
key ranges.
When you configure a pipeline to load data to a relational target, if a row contains null values in any column
that defines the partition key or if a row contains a value that falls outside all of the key ranges, the Integration
Service sends that row to the first partition.
When you configure a pipeline to read data from a relational source, the Integration Service reads rows that fall
within the key ranges. It does not read rows with null values in any partition key column.
If you want to read rows with null values in the partition key, use pass-through partitioning and create an SQL
override.

Adding Filter Conditions


If you specify key range partitioning for a relational source, you can specify optional filter conditions or
override the SQL query.

Rules and Guidelines for Creating Key Ranges


Use the following rules and guidelines when you create key ranges:
♦ The partition key must contain at least one port.
♦ If you choose key range partitioning at any partition point, you must specify a range for each port in the
partition key.
♦ Use the standard PowerCenter date format to enter dates in key ranges.
♦ The Workflow Manager does not validate overlapping string ranges, overlapping numeric ranges, gaps, or
missing ranges.
♦ When you define key range partitioning at a Source Qualifier transformation, the Integration Service
defaults to pass-through partitioning if you change the SQL statement in the Source Qualifier
transformation.
♦ If a row contains a null value in any column that defines the partition key, or if a row contains values that fall
outside all of the key ranges, the Integration Service sends that row to the first partition.

Pass-Through Partition Type


In pass-through partitioning, the Integration Service processes data without redistributing rows among
partitions. Therefore, all rows in a single partition stay in that partition after crossing a pass-through partition
point.
When you add a partition point to a pipeline, the master thread creates an additional pipeline stage. Use pass-
through partitioning when you want to increase data throughput, but you do not want to increase the number
of partitions.
You can specify pass-through partitioning at any valid partition point in a pipeline.
Figure 17-5 shows a mapping where pass-through partitioning can increase data throughput:

Figure 17-5. Pass-Through Partitioning

(The figure shows a reader thread as the first stage, a transformation thread as the second stage, and a writer thread as the third stage.)

By default, this mapping contains partition points at the source qualifier and target instance. Since this
mapping contains an XML target, you can configure only one partition at any partition point.
In this case, the master thread creates one reader thread to read data from the source, one transformation thread
to process the data, and one writer thread to write data to the target. Each pipeline stage processes the rows as
follows:
Source Qualifier     Transformations      Target Instance
(First Stage)        (Second Stage)       (Third Stage)
Row Set 1            -                    -
Row Set 2            Row Set 1            -
Row Set 3            Row Set 2            Row Set 1
Row Set 4            Row Set 3            Row Set 2
...                  ...                  ...
Row Set n            Row Set n-1          Row Set n-2
(Each row represents one unit of processing time.)
Because the pipeline contains three stages, the Integration Service can process three sets of rows concurrently.
If the Expression transformations are very complicated, processing the second (transformation) stage can take a
long time and cause low data throughput. To improve performance, set a partition point at Expression
transformation EXP_2 and set the partition type to pass-through. This creates an additional pipeline stage. The
master thread creates an additional transformation thread:

(The pipeline now has a reader thread as the first stage, transformation threads as the second and third stages, and a writer thread as the fourth stage.)



The Integration Service can now process four sets of rows concurrently as follows:
Source Qualifier     FIL_1 & EXP_1        EXP_2 & LKP_1        Target Instance
(First Stage)        Transformations      Transformations      (Fourth Stage)
                     (Second Stage)       (Third Stage)
Row Set 1            -                    -                    -
Row Set 2            Row Set 1            -                    -
Row Set 3            Row Set 2            Row Set 1            -
Row Set 4            Row Set 3            Row Set 2            Row Set 1
...                  ...                  ...                  ...
Row Set n            Row Set n-1          Row Set n-2          Row Set n-3
(Each row represents one unit of processing time.)
By adding an additional partition point at Expression transformation EXP_2, you replace one long running
transformation stage with two shorter running transformation stages. Data throughput depends on the longest
running stage. So in this case, data throughput increases.

Round-Robin Partition Type


In round-robin partitioning, the Integration Service distributes rows of data evenly to all partitions. Each
partition processes approximately the same number of rows. For more information about the partition points
where you can specify round-robin partitioning, see Table 17-1 on page 283.
Use round-robin partitioning when you need to distribute rows evenly and do not need to group data among
partitions. In a pipeline that reads data from file sources of different sizes, use round-robin partitioning to
ensure that each partition receives approximately the same number of rows.
Figure 17-6 shows a mapping where round-robin partitioning helps distribute rows before they enter a Filter
transformation:

Figure 17-6. Round-Robin Partitioning

Round-robin partitioning distributes data evenly at the Filter transformation.

The session based on this mapping reads item information from three flat files of different sizes:
♦ Source file 1: 80,000 rows
♦ Source file 2: 5,000 rows
♦ Source file 3: 15,000 rows
When the Integration Service reads the source data, the first partition begins processing 80% of the data, the
second partition processes 5% of the data, and the third partition processes 15% of the data.
To distribute the workload more evenly, set a partition point at the Filter transformation and set the partition
type to round-robin. The Integration Service distributes the data so that each partition processes approximately
one-third of the data.



CHAPTER 18

Using Pushdown Optimization


This chapter includes the following topics:
♦ Overview, 293
♦ Running Pushdown Optimization Sessions, 294
♦ Working with Databases, 295
♦ Working with Dates, 297
♦ Working with Expressions, 298
♦ Error Handling, Logging, and Recovery, 303
♦ Working with Slowly Changing Dimensions, 304
♦ Working with Sequences and Views, 304
♦ Using the $$PushdownConfig Mapping Parameter, 308
♦ Configuring Sessions for Pushdown Optimization, 309

Overview
You can push transformation logic to the source or target database using pushdown optimization. When you
use pushdown optimization, the Integration Service translates the transformation logic into SQL queries and
sends the SQL queries to the database. The source or target database executes the SQL queries to process the
transformations. The amount of transformation logic you can push to the database depends on the database,
transformation logic, and mapping and session configuration.
When you run a session configured for pushdown optimization, the Integration Service analyzes the mapping
and writes one or more SQL statements based on the mapping transformation logic. The Integration Service
analyzes the transformation logic, mapping, and session configuration to determine the transformation logic it
can push to the database. The Integration Service executes any SQL statement generated against the source or
target tables, and it processes any transformation logic that it cannot push to the database.
Use the Pushdown Optimization Viewer to preview the SQL statements and mapping logic that the Integration
Service can push to the source or target database. You can also use the Pushdown Optimization Viewer to view
the messages related to pushdown optimization.

Figure 18-1 shows a mapping containing transformation logic that can be pushed to the source database:

Figure 18-1. Sample Mapping Used in a Pushdown Optimization Session

This mapping contains a Filter transformation that filters out all items except those with an ID greater than
1005. The Integration Service can push the transformation logic to the database. It generates the following SQL
statement to process the transformation logic:
INSERT INTO ITEMS(ITEM_ID, ITEM_NAME, ITEM_DESC, n_PRICE) SELECT ITEMS.ITEM_ID,
ITEMS.ITEM_NAME, ITEMS.ITEM_DESC, CAST(ITEMS.PRICE AS INTEGER) FROM ITEMS WHERE
(ITEMS.ITEM_ID >1005)

The Integration Service generates an INSERT SELECT statement that retrieves the ID, name, description, and
price columns from the source table and filters the data using a WHERE clause. The Integration Service does not
extract data from the database at this time.

Running Pushdown Optimization Sessions


When you run a session configured for pushdown optimization, the Integration Service analyzes the mapping
and transformations to determine the transformation logic it can push to the database. If the mapping contains
a mapplet, the Integration Service expands the mapplet and processes the transformations in the mapplet as part
of the mapping.
You can configure pushdown optimization in the following ways:
♦ Using source-side pushdown optimization. The Integration Service pushes as much transformation logic as
possible to the source database.
♦ Using target-side pushdown optimization. The Integration Service pushes as much transformation logic as
possible to the target database.
♦ Using full pushdown optimization. The Integration Service pushes as much transformation logic as possible
to both source and target databases. If you configure a session for full pushdown optimization, and the
Integration Service cannot push all the transformation logic to the database, it performs source-side or
target-side pushdown optimization instead.

Running Source-Side Pushdown Optimization Sessions


When you run a session configured for source-side pushdown optimization, the Integration Service analyzes the
mapping from the source to the target or until it reaches a downstream transformation it cannot push to the
database.
The Integration Service generates and executes a SELECT statement based on the transformation logic for each
transformation it can push to the database. Then, it reads the results of this SQL statement and continues to
run the session.
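
For example, if the mapping shown in the Overview, a Source Qualifier followed by a Filter transformation on ITEM_ID greater than 1005, runs with source-side pushdown optimization, the generated SQL would be similar to the following sketch. The Integration Service then reads the result set and performs the remaining work, such as loading the target:
SELECT ITEMS.ITEM_ID, ITEMS.ITEM_NAME, ITEMS.ITEM_DESC, ITEMS.PRICE FROM ITEMS
WHERE (ITEMS.ITEM_ID > 1005)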

Running Target-Side Pushdown Optimization Sessions


When you run a session configured for target-side pushdown optimization, the Integration Service analyzes the
mapping from the target to the source or until it reaches an upstream transformation it cannot push to the
database. It generates an INSERT, DELETE, or UPDATE statement based on the transformation logic for each



transformation it can push to the database. The Integration Service processes the transformation logic up to the
point that it can push the transformation logic to the target database. Then, it executes the generated SQL.

Running Full Pushdown Optimization Sessions


To use full pushdown optimization, the source and target must be on the same database. When you run a
session configured for full pushdown optimization, the Integration Service analyzes the mapping starting with
the source and analyzes each transformation in the pipeline until it analyzes the target. It generates and executes
SQL statements against the source and target based on the transformation logic it can push to the database.
When you run a session with large quantities of data and full pushdown optimization, the database server must
run a long transaction. Consider the following database performance issues when you generate a long
transaction:
♦ A long transaction uses more database resources.
♦ A long transaction locks the database for longer periods of time. This reduces database concurrency and
increases the likelihood of deadlock.
♦ A long transaction increases the likelihood of an unexpected event.

Integration Service Behavior with Full Optimization


When you configure a session for full optimization, the Integration Service determines how much
transformation logic it can push to the database. When it can push all transformation logic to the database, it
generates an INSERT SELECT statement to run on the database. The statement incorporates transformation
logic from all the transformations in the mapping.
If the Integration Service can push only part of the transformation logic to the database, it pushes as much
transformation logic to the source and target database as possible. It then processes the remaining
transformation logic. For example, a mapping contains the following transformations:

(In the mapping, the Source Qualifier and Aggregator transformations can be pushed to the source database, and the Expression transformation and target can be pushed to the target database.)

The Rank transformation cannot be pushed to the database. If you configure the session for full pushdown
optimization, the Integration Service pushes the Source Qualifier transformation and the Aggregator
transformation to the source, processes the Rank transformation, and pushes the Expression transformation and
target to the target database. The Integration Service does not fail the session if it can push only part of the
transformation logic to the database.

Working with Databases


You can configure pushdown optimization for the following databases:
♦ Oracle
♦ IBM DB2
♦ Teradata
♦ Microsoft SQL Server
♦ Sybase ASE



♦ Databases that use ODBC drivers
When you use native drivers, the Integration Service generates SQL statements using native database SQL.
When you use ODBC drivers, the Integration Service generates SQL statements using ANSI SQL. The
Integration Service can generate more functions when it generates SQL statements using native language
instead of ANSI SQL.
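A date-to-character conversion is a simple illustration of the dialect difference. TO_CHAR with a format string is native Oracle syntax, while ANSI SQL expresses a similar conversion with CAST and offers no format control. The table and column names below are illustrative, not from the examples in this chapter:
Native Oracle SQL:
SELECT TO_CHAR(ORDERS.ORDER_DATE, 'MM/DD/YYYY') FROM ORDERS
ANSI SQL (the form available to ODBC connections):
SELECT CAST(ORDERS.ORDER_DATE AS VARCHAR(30)) FROM ORDERS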
When the Integration Service reads data, it can convert data to a format that differs from the database format.
Or, the database can use different default settings for handling null values, case sensitivity, and sort order than
the Integration Service. When these formats and settings differ, the transformation logic processed on the
database can produce different data than the same transformation logic processed by the Integration Service.
The database can produce different output than the Integration Service when the following settings and
conversions are different:
♦ Nulls treated as the highest or lowest value. The returned results can differ depending on how null values
are treated. The Integration Service and a database can treat null values differently. For example, you want to
push a Sorter transformation to an Oracle database. In the session, you configure nulls as the lowest value in
the sort order. Oracle treats null values as the highest value in the sort order.
♦ Sort order. The returned results can differ depending on the sort order applied while processing the
transformation logic. The Integration Service and a database can use different sort orders. For example, you
want to push the transformations in a session to a Microsoft SQL Server database, which is configured to use
a sort order that is not case sensitive. You configure the session properties to use the binary sort order, which
is case sensitive. The results differ based on whether the Integration Service or Microsoft SQL Server
database processes the transformation logic.
♦ Case sensitivity. The returned results can differ depending on whether case sensitivity is enabled. For
example, the Integration Service uses case sensitive queries and the database does not. A Filter
transformation uses the following filter condition: IIF(col_varchar2 = 'CA', TRUE, FALSE). You need the
database to return rows that match ‘CA.’ However, if you push this transformation logic to a Microsoft SQL
Server database that is not case sensitive, it returns rows that match the values ‘Ca,’ ‘ca,’ ‘cA,’ and ‘CA.’
♦ Numeric values converted to character values. The format of the character value can vary depending on the
database or Integration Service that converts the numeric value. The Integration Service and a database can
convert the same numeric value to a character value in different formats. The database can convert numeric
values to an unacceptable character format. For example, a table contains the number 1234567890. When
the Integration Service converts the number to a character value, it inserts the characters ‘1234567890.’
However, a database might convert the number to ‘1.2E9.’ The two sets of characters represent the same
value. However, if you require the characters in the format ‘1234567890,’ you can disable pushdown
optimization.
♦ Precision. Transformation datatypes use a default numeric precision that can vary from the native datatypes.
For example, a transformation Decimal datatype has a precision of 1-28, while a Teradata Decimal datatype
has a precision of 1-18. The results can vary if the database uses a different precision than the Integration
Service.

Using ODBC Drivers


When you use ODBC drivers, the Integration Service cannot detect the database type and must generate ANSI
SQL. In some cases, ANSI SQL is not compatible with the database syntax. The following sections describe
problems that you can encounter when you use ODBC drivers, instead of the native drivers, with different
databases.

IBM DB2
You can encounter the following problems using ODBC drivers with an IBM DB2 database:
♦ A session containing a Sorter transformation fails if the sort is distinct and not case sensitive, and one of the
sort keys is a string datatype.
♦ A session containing a Lookup transformation fails for source-side or full pushdown optimization.



♦ A session that requires type casting fails if the casting is from float or double to string, or if it requires any
other type of casting that IBM DB2 databases disallow.

Microsoft SQL Server


You can encounter the following problems using ODBC drivers with a Microsoft SQL Server database:
♦ A session containing a Sorter transformation fails if the sort is distinct and not case sensitive.
♦ A pushdown optimization session fails when it loads datetime data to a Microsoft SQL Server target using an
ODBC driver. The Integration Service converts datetime data to the ODBC Timestamp datatype, which is
not a supported Microsoft SQL Server datatype.

Sybase ASE
You can encounter the following problems using ODBC drivers with a Sybase ASE database:
♦ The session fails when it performs datatype conversions and you use Sybase ASE 12.5 or below.
♦ The session fails if you use a Joiner transformation configured for a full outer join.

Teradata
You can encounter the following problems using ODBC drivers with a Teradata database:
♦ The session fails if it converts a numeric datatype with precision greater than 18.
♦ The session fails when you use full pushdown optimization for a session containing a Sorter transformation.
♦ A sort on a distinct key can give inconsistent results if the sort is not case sensitive and one port is a character
port.
♦ An Integration Service and a database can produce different results for a session that contains an Aggregator
transformation if the group by port is of a string datatype and is not case sensitive.
♦ A session containing a Lookup transformation fails if it is configured for target-side pushdown optimization.
♦ A session that contains a date-to-string datatype conversion fails.

Working with Dates


The Integration Service and database can process dates differently. When you configure the session to push date
conversion to the database, you can receive unexpected results or the session can fail.
The database can produce different output than the Integration Service when the following date settings and
conversions are different:
♦ Date values converted to character values. The Integration Service converts the transformation Date/Time
datatype to the native datatype that supports subsecond precision in the database. The session fails if you
configure the datetime format in the session to a format that the database does not support. For example,
when the Integration Service performs the ROUND function on a date, it stores the date value in a character
column, using the format MM/DD/YYYY HH:MI:SS.US. When the database performs this function, it
stores the date in the default date format for the database. If the database is Oracle, it stores the date as the
default DD-MON-YY. If you require the date to be in the format MM/DD/YYYY HH:MI:SS.US, you can
disable pushdown optimization.
♦ Date formats for TO_CHAR and TO_DATE functions. The Integration Service uses the date format in the
TO_CHAR or TO_DATE function when the Integration Service pushes the function to the database. The
database converts each date string to a datetime value supported by the database.
For example, the Integration Service pushes the following expression to the database:
TO_DATE( DATE_PROMISED, 'MM/DD/YY' )



The database interprets the date string in the DATE_PROMISED port based on the specified date format
string MM/DD/YY. The database converts each date string, such as 01/22/98, to the supported date value,
such as Jan 22 1998 00:00:00.
If the Integration Service pushes a date format to an IBM DB2, a Microsoft SQL Server, or a Sybase database
that the database does not support, the Integration Service stops pushdown optimization and processes the
transformation.
The Integration Service converts all dates before pushing transformations to an Oracle or Teradata database.
If the database does not support the date format after the date conversion, the session fails.
♦ HH24 date format. You cannot use the HH24 format in the date format string for Teradata. When the
Integration Service generates SQL for a Teradata database, it uses the HH format string instead.
♦ Blank spaces in date format strings. You cannot use blank spaces in the date format string in Teradata.
When the Integration Service generates SQL for a Teradata database, it substitutes the space with ‘B.’
♦ Handling subsecond precision for a Lookup transformation. If you enable subsecond precision for a
Lookup transformation, the database and Integration Service perform the lookup comparison using the
subsecond precision, but return different results. Unlike the Integration Service, the database does not
truncate the lookup results based on subsecond precision. For example, you configure the Lookup
transformation to show subsecond precision to the millisecond. If the lookup result is 8:20:35.123456, a
database returns 8:20:35.123456, but the Integration Service returns 8:20:35.123.
♦ SYSDATE built-in variable. When you use the SYSDATE built-in variable, the Integration Service returns
the current date and time for the node running the service process. However, when you push the
transformation logic to the database, the SYSDATE variable returns the current date and time for the
machine hosting the database. If the time zone of the machine hosting the database is not the same as the
time zone of the machine running the Integration Service process, the results can vary.

Working with Expressions


When you use pushdown optimization, the Integration Service converts the expression in the transformation or
in the workflow link by determining equivalent operators, variables, and functions in the database. If there is no
equivalent operator, variable, or function, the Integration Service processes the transformation logic. For
example, the Integration Service translates the aggregate function STDDEV() to STDDEV_SAMP() on
Teradata and to STDEV() on Microsoft SQL Server. The Integration Service processes any transformation that
uses the FIRST() aggregate function because no database supports this function.
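For example, an aggregation that computes STDDEV(ITEMS.PRICE) on the ITEMS table used in the Overview would be translated into SQL similar to the following simplified sketch; the actual generated query also reflects the group by ports and the rest of the pipeline:
Teradata:
SELECT STDDEV_SAMP(ITEMS.PRICE) FROM ITEMS
Microsoft SQL Server:
SELECT STDEV(ITEMS.PRICE) FROM ITEMS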
Note: The Integration Service logs a message in the workflow log and the Pushdown Optimization Viewer when
it cannot push an expression to the database. Use the message to determine the reason why it could not push the
expression to the database.
The tables in this section summarize the availability of PowerCenter operators, variables, and functions in
databases.

Operators
Table 18-1 summarizes the availability of PowerCenter operators in databases. An X indicates that the operator
can be pushed to the database using source-side, target-side, or full pushdown optimization. An S indicates that
the operator can be pushed to the database using source-side pushdown optimization only:

Table 18-1. Operators Available in Databases

(The table shows availability for Oracle, IBM DB2, Microsoft SQL Server, Sybase ASE, Teradata, and ODBC connections.)

+-*/ X X X X X X

% X X X X X

|| S S X S S

= > < >= <= <> X X X X X X

!= X X X X X X

^= X X X X X X

not and or X X X X X X

Variables
Table 18-2 summarizes the availability of PowerCenter variables in databases. Columns marked with an X
indicate variables that can be pushed to the database using source-side, target-side, or full pushdown
optimization:

Table 18-2. PowerCenter Variables Available in Databases

(The table shows availability for Oracle, IBM DB2, Microsoft SQL Server, Sybase ASE, Teradata, and ODBC connections.)

SESSSTARTTIME X X X X X X

SYSDATE X X X X X

WORKFLOWSTARTTIME

Functions
Table 18-3 summarizes the availability of PowerCenter functions in databases. An X indicates that the function
can be pushed to the database using source-side, target-side, or full pushdown optimization. An S indicates that
the function can be pushed to the database using source-side pushdown optimization only:

Table 18-3. PowerCenter Functions Available in Databases


Microsoft SQL Server

Function Exceptions
Sybase ASE
Teradata
IBM DB2

Oracle

ODBC

ABS() X X X X X X

ABORT()

ADD_TO_DATE() X S X S X You cannot push transformation logic to a Teradata


database for a transformation that uses an expression
containing ADD_TO_DATE to change days, hours,
minutes, or seconds.

AES_DECRYPT()

AES_ENCRYPT()

ASCII() X X X X

AVG() X X X X X X

CEIL() X X X X S

CHOOSE()

CHR() X X X X

CHRCODE()

COMPRESS()

CONCAT() S S X S S

COS() X X X X X X


COSH() X S X S X

COUNT() X X X X X X

CRC32()

CUME() X

DATE_COMPARE() S S S S S S

DATE_DIFF()

DECODE() X X X X X X

DECODE_BASE64()

DECOMPRESS()

ENCODE_BASE64()

ERROR()

EXP() X X X X X X

FIRST()

FLOOR() X X X X S

FV()

GET_DATE_PART() X X X X X

GREATEST() X

IIF() X X X X X X

IN() X X X X X X

INDEXOF()

INITCAP() X

INSTR() S X X S S

IS_DATE()

IS_NUMBER()

IS_SPACES()

ISNULL() X X X X X X

LAST()

LAST_DAY() X Oracle returns the date up to the second. If the input


date contains subseconds, Oracle trims the date to the
second.

LEAST() X

LENGTH() X X X X X

LOWER() X X X X X X

LPAD() X

LTRIM() X X X X X If you use LTRIM in transformation logic, the database


treats the argument (' ') as NULL, but the Integration
Service treats the argument (' ') as spaces.

LOG() X S X S S

MAKE_DATE_TIME()

MAX() X X X X X X

MD5()

MEDIAN()

METAPHONE()

MIN() X X X X X X

MOD() X X X X X

MOVINGAVG()

MOVINGSUM()

NPER()

PERCENTILE()

PMT()

POWER() X X X X X X

PV()

RAND()

RATE()

REG_EXTRACT()

REG_MATCH()

REG_REPLACE()

REVERSE()

REPLACECHR()

REPLACESTR()

RPAD() X

RTRIM() X X X X X If you use RTRIM in transformation logic, the database


treats the argument (' ') as NULL, but the Integration
Service treats the argument (' ') as spaces.

ROUND(DATE) X

ROUND(NUMBER) X X X X S

SET_DATE_PART()

SIGN() X X X X S

SIN() X X X X X X

SINH() X S X S X

SOUNDEX() X X X X If you use SOUNDEX in transformation logic, the


database treats the argument (' ') as NULL, but the
Integration Service treats the argument (' ') as spaces.

STDDEV() X X X X An IBM DB2 database and the Integration Service


produce different results for STDDEV. IBM DB2 uses a
different algorithm than other databases to calculate
STDDEV.

SUBSTR() S S X S S

SUM() X X X X X X

SQRT() X X X X X X

SYSDATE() X X X X X X When you push SYSDATE to the database, the


database server returns the timestamp in the time zone
of the database server, not the Integration Service.

SYSTIMESTAMP() X X X X X X When you push SYSTIMESTAMP to the database, the


database server returns the time stamp in the time zone
of the database server, not the Integration Service. If
you specify the format for SYSTIMESTAMP and push
SYSTIMESTAMP to an IBM DB2 or a Sybase database,
the database ignores the format and returns the
complete time stamp.

TAN() X X X X X X

TANH() X S X S X

TO_BIGINT X X X X X X

TO_CHAR(DATE) X X X X S

TO_CHAR(NUMBER) X X X X X

TO_DATE() X X X X X

TO_DECIMAL() X X X X X X

TO_FLOAT() X X X X X X

TO_INTEGER() X S X S X

TRUNC(DATE) X

TRUNC(NUMBER) X X X S S

UPPER() X X X X X X

VARIANCE() X X X X An IBM DB2 database and the Integration Service


produce different results for VARIANCE. IBM DB2 uses
a different algorithm than other databases to calculate
VARIANCE.

Error Handling, Logging, and Recovery


The Integration Service and database process error handling, logging, and recovery differently.

Error Handling
When the Integration Service pushes transformation logic to the database, it cannot track errors that occur in
the database. As a result, it handles errors differently than when it processes the transformations in the session.
When the Integration Service runs a session configured for full pushdown optimization and an error occurs, the
database handles the errors. When the database handles errors, the Integration Service does not write reject rows
to the reject file.

Logging
When the Integration Service pushes transformation logic to the database, it cannot trace all the events that
occur inside the database server. The statistics the Integration Service can trace depend on the type of
pushdown optimization. When you push transformation logic to the database, the Integration Service generates
a session log with the following differences:
♦ The session log does not contain details for transformations processed by the database.
♦ The session log does not contain the thread busy percentage when the session is configured for full
pushdown optimization.
♦ The session log contains the number of loaded rows when the session is configured for source-side, target-
side, and full pushdown optimization.
♦ The session log does not contain the number of rows read from the source when the Integration Service uses
full pushdown optimization and pushes all transformation logic to the database.
♦ The session log contains the number of rows read from each source when the Integration Service uses source-
side pushdown optimization.

Recovery
If you configure a session for full pushdown optimization and the session fails, the Integration Service cannot
perform incremental recovery because the database processes the transformations. Instead, the database rolls
back the transactions. If the database server fails, it rolls back transactions when it restarts. If the Integration
Service fails, the database server rolls back the transaction.



If the failure occurs while the Integration Service is creating temporary sequence objects or views in the
database, which is before any rows have been processed, the Integration Service runs the generated SQL on the
database again.
If the failure occurs before the database processes all rows, the Integration Service performs the following tasks:
1. If applicable, the Integration Service drops and recreates temporary view or sequence objects in the
database to ensure duplicate values are not produced.
2. The Integration Service runs the generated SQL on the database again.
If the failure occurs while the Integration Service is dropping the temporary view or sequence objects from the
database, which is after all rows are processed, the Integration Service tries to drop the temporary objects again.

RELATED TOPICS:
♦ “Working with Sequences and Views” on page 304

Working with Slowly Changing Dimensions


You can push Type 1 and Type 3 slowly changing dimensions logic to a database. The slowly changing
dimensions logic in a mapping can consist of multiple transformations. The rules and guidelines of each
transformation determine how much slowly changing dimensions logic you can push to a database.
Use the following rules and guidelines when you configure the Integration Service to push slowly changing
dimensions transformation logic to a database:
♦ You can push transformations included in Type 1 and Type 3 slowly changing dimensions mappings to an
Oracle or IBM DB2 database.
♦ The source data must not have duplicate rows. The database can become deadlocked if it makes multiple
updates to the same row.
♦ You must create the slowly changing dimensions mapping using the 8.5 version of the Slowly Changing
Dimensions Wizard. You cannot push the slowly changing dimensions logic to the database if it was created
by the Slowly Changing Dimensions Wizard from a previous version.

RELATED TOPICS:
♦ “Pushdown Optimization and Transformations” on page 315

Working with Sequences and Views


To push transformation logic to a database, the Integration Service might create temporary objects in the
database. The Integration Service creates a temporary sequence object in the database to push Sequence
Generator transformation logic to the database.
The Integration Service creates a temporary view in the database in the following cases:
♦ To push a Source Qualifier transformation with a SQL override to the database.
♦ To push a filtered lookup, an unconnected relational lookup, or a Lookup transformation with a SQL
override to the database.
After the database transaction completes, the Integration Service drops sequence and view objects created for
pushdown optimization.
Note: The Integration Service does not parse or validate the SQL overrides. If you configure a session to push
the Source Qualifier or Lookup transformation with an SQL override to the database, test the SQL override
against the database before you run the session.



Sequences
To push Sequence Generator transformation logic to a database, you must configure the session for pushdown
optimization with sequences.
If you configure a session to push Sequence Generator transformation logic to a database, the Integration
Service completes the following tasks:
1. Creates a sequence object in the database. The Integration Service generates the sequence object in the
database based on the Sequence Generator transformation logic. The Integration Service creates a unique
name for each sequence object. To create a unique sequence object name, it adds the prefix PM_S to a value
generated by a hash function.
2. Generates the SQL query and executes it against the database. The Integration Service generates and
executes the SQL query to push the Sequence Generator transformation logic to the database.
3. Drops the sequence object from the database. When the transaction completes, the Integration Service
drops the sequence object that it created in the database.

RELATED TOPICS:
♦ “Configuring Sessions for Pushdown Optimization” on page 309

Sequence Creation Example


You create the following mapping that uses a Sequence Generator transformation to generate primary keys for a
relational target:

When the Integration Service pushes transformation logic to the database, it executes the following SQL
statement to create the sequence object in the source database:
CREATE SEQUENCE PM_S6UHW42OGXTY7NICHYIOSRMC5XQ START WITH 1 INCREMENT BY 1 MINVALUE 0
MAXVALUE 9223372036854775807 NOCYCLE CACHE 9223372036854775807

After the Integration Service creates the sequence object, the Integration Service executes the SQL query to
process the transformation logic contained in the mapping:
INSERT INTO STORE_SALES(PRIMARYKEY, QUARTER, SALES, STORE_ID) SELECT
CAST(PM_S6UHW42OGXTY7NICHYIOSRMC5XQ.NEXTVAL AS FLOAT),
CAST(CAST(SALES_BYSTOREQUARTER_SRC.QUARTER AS FLOAT) AS VARCHAR2(10)),
CAST(CAST(SALES_BYSTOREQUARTER_SRC.SALES AS NUMBER(10, 2)) AS NUMBER(25, 2)),
CAST(SALES_BYSTOREQUARTER_SRC.STORE_ID AS NUMBER(0, 0)) FROM SALES_BYSTOREQUARTER_SRC

After the session completes, the Integration Service drops the sequence object from the database. If the session
fails, the Integration Service drops and recreates the sequence object before performing recovery tasks.

Views
You must configure the session for pushdown optimization with views to enable the Integration Service to
create the view objects in the database.
The Integration Service creates a view object under the following conditions:
♦ You configure pushdown optimization for a Source Qualifier or Lookup transformation configured with an
SQL override.
♦ You configure pushdown optimization for a Lookup transformation configured with a filter.



♦ You configure pushdown optimization for an unconnected Lookup transformation.
If you configure the session for pushdown optimization with views, the Integration Service completes the
following tasks:
1. Creates a view in the database. The Integration Service creates a view in the database based on the lookup
filter, unconnected lookup, or SQL override in the Source Qualifier or Lookup transformation. To create a
unique view name, the Integration Service adds the prefix PM_V to a value generated by a hash function.
2. Executes an SQL query against the view. After the Integration Service creates a view object, it executes an
SQL query against the view created in the database to push the transformation logic to the source.
3. Drops the view from the database. When the transaction completes, the Integration Service drops the view
it created.

RELATED TOPICS:
♦ “Configuring Sessions for Pushdown Optimization” on page 309

View Creation Example


You create the following mapping that searches for 94117 zip codes in a customer database:

You want the search to return customers whose names match variations of the name Johnson, including names
such as Johnsen, Jonssen, and Jonson. To perform the name matching, you enter the following SQL override
for the Source Qualifier transformation:
SELECT CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, CUSTOMERS.FIRST_NAME,
CUSTOMERS.LAST_NAME, CUSTOMERS.ADDRESS1, CUSTOMERS.ADDRESS2, CUSTOMERS.CITY,
CUSTOMERS.STATE, CUSTOMERS.POSTAL_CODE, CUSTOMERS.PHONE, CUSTOMERS.EMAIL FROM CUSTOMERS
WHERE CUSTOMERS.LAST_NAME LIKE 'John%' OR CUSTOMERS.LAST_NAME LIKE 'Jon%'

When the Integration Service pushes transformation logic for this session to the database, it executes the
following SQL statement to create a view in the source database:
CREATE VIEW PM_V4RZRW5GWCKUEWH35RKDMDPRNXI (CUSTOMER_ID, COMPANY, FIRST_NAME, LAST_NAME,
ADDRESS1, ADDRESS2, CITY, STATE, POSTAL_CODE, PHONE, EMAIL) AS SELECT
CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, CUSTOMERS.FIRST_NAME, CUSTOMERS.LAST_NAME,
CUSTOMERS.ADDRESS1, CUSTOMERS.ADDRESS2, CUSTOMERS.CITY, CUSTOMERS.STATE,
CUSTOMERS.POSTAL_CODE, CUSTOMERS.PHONE, CUSTOMERS.EMAIL FROM CUSTOMERS WHERE
CUSTOMERS.LAST_NAME LIKE 'John%' OR CUSTOMERS.LAST_NAME LIKE 'Jon%'

After the Integration Service creates the view, it executes an SQL query to perform the transformation logic in
the mapping:
SELECT PM_V4RZRW5GWCKUEWH35RKDMDPRNXI.CUSTOMER_ID,
PM_V4RZRW5GWCKUEWH35RKDMDPRNXI.COMPANY, PM_V4RZRW5GWCKUEWH35RKDMDPRNXI.FIRST_NAME,
PM_V4RZRW5GWCKUEWH35RKDMDPRNXI.LAST_NAME, PM_V4RZRW5GWCKUEWH35RKDMDPRNXI.ADDRESS1,
PM_V4RZRW5GWCKUEWH35RKDMDPRNXI.ADDRESS2, PM_V4RZRW5GWCKUEWH35RKDMDPRNXI.CITY,
PM_V4RZRW5GWCKUEWH35RKDMDPRNXI.STATE, PM_V4RZRW5GWCKUEWH35RKDMDPRNXI.POSTAL_CODE,
PM_V4RZRW5GWCKUEWH35RKDMDPRNXI.PHONE, PM_V4RZRW5GWCKUEWH35RKDMDPRNXI.EMAIL FROM
PM_V4RZRW5GWCKUEWH35RKDMDPRNXI WHERE (PM_V4RZRW5GWCKUEWH35RKDMDPRNXI.POSTAL_CODE =
94117)

After the session completes, the Integration Service drops the view from the database. If the session fails, the
Integration Service drops and recreates the view before performing recovery tasks.



Troubleshooting Orphaned Sequences and Views
The Integration Service might not drop a sequence or view object from a database if the Integration Service,
session, or connectivity fails. In this case, manually remove these objects from the database.
Note: Orphaned sequence and view objects in the database do not affect performance.

Complete the following tasks to remove an orphaned sequence or view object from a database:
1. Identify the orphaned objects in the database. You can identify orphaned objects based on the session logs
or a query on the database. Analyze the session log to determine orphaned objects from a session run. Run
the database query to determine all orphaned objects in the database at a given time.
2. Remove the orphaned objects from the database. You can execute SQL statements to drop the orphaned
objects you identified.

Identifying Orphaned Objects Using Session Logs


The Integration Service writes an event log when it creates and drops a view or sequence object. If an
Integration Service, session, or connection fails when a session is running, you can check the session log to
determine sequence or view objects that were not dropped during the session.
For example, if the Integration Service drops the view PM_V4RZRW, the session log displays the following
message:
MAPPING> TM_6356 Starting pushdown cleanup SQL for source [CUSTOMERS]. : (Tue Feb 14
13:23:46 2006)

MAPPING> TM_6358 Executing pushdown cleanup SQL for source: DROP VIEW PM_V4RZRW
MAPPING> TM_6360 Completed pushdown cleanup SQL for source [CUSTOMERS ] successfully. : (Tue
Feb 14 13:23:46 2006)

Identifying Orphaned Objects Using an SQL Query


If the Integration Service does not drop the sequence or view objects, you can execute an SQL query on the
database to identify all orphaned sequence or view objects created by the Integration Service. If the Integration
Service ran multiple sessions, or if multiple Integration Services write to the same database account, the SQL
query returns the orphaned objects from every session that ran and failed to drop its sequence or view objects.
When the Integration Service creates a sequence or view object in the database, it adds the prefix PM_S to the
names of sequence objects and PM_V to the names of view objects. You can search for these objects based on
the prefix to identify them.
The following queries show the syntax to search for sequence objects created by the Integration Service:
IBM DB2:
SELECT SEQNAME FROM SYSCAT.SEQUENCES
WHERE SEQSCHEMA = CURRENT SCHEMA
AND SEQNAME LIKE 'PM\_S%' ESCAPE '\'

Oracle:
SELECT SEQUENCE_NAME FROM USER_SEQUENCES
WHERE SEQUENCE_NAME LIKE 'PM\_S%' ESCAPE '\'

The following queries show the syntax to search for view objects created by the Integration Service:
IBM DB2:
SELECT VIEWNAME FROM SYSCAT.VIEWS
WHERE VIEWSCHEMA = CURRENT SCHEMA
AND VIEWNAME LIKE 'PM\_V%' ESCAPE '\'

Oracle:
SELECT VIEW_NAME FROM USER_VIEWS
WHERE VIEW_NAME LIKE 'PM\_V%' ESCAPE '\'



Microsoft SQL Server or Sybase ASE:
SELECT NAME FROM SYSOBJECTS
WHERE TYPE='V' AND NAME LIKE 'PM\_V%' ESCAPE '\'

Teradata:
SELECT TableName FROM DBC.Tables
WHERE CreatorName = USER
AND TableKind = 'V'
AND TableName LIKE 'PM\_V%' ESCAPE '\'

Removing the Orphaned Objects


After you get a list of the sequence and view objects created by the Integration Service, execute an SQL DROP
statement to drop the sequence or view objects from the database.
The following query shows the syntax to drop sequence objects created by the Integration Service on any
database:
DROP SEQUENCE <sequence name>

The following query shows the syntax to drop view objects created by the Integration Service on any database:
DROP VIEW <view name>
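
If many objects have accumulated, you can generate the DROP statements from the database catalog instead of
writing them individually. The following Oracle queries are a minimal sketch of this approach; they reuse the
catalog views from the identification queries above, and you should review the generated statements before you
run them:
SELECT 'DROP SEQUENCE ' || SEQUENCE_NAME || ';' FROM USER_SEQUENCES
WHERE SEQUENCE_NAME LIKE 'PM\_S%' ESCAPE '\'

SELECT 'DROP VIEW ' || VIEW_NAME || ';' FROM USER_VIEWS
WHERE VIEW_NAME LIKE 'PM\_V%' ESCAPE '\'

Each query returns one DROP statement per orphaned object. Other databases follow the same pattern with the
catalog tables shown in “Identifying Orphaned Objects Using an SQL Query.”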

Using the $$PushdownConfig Mapping Parameter


Depending on the database workload, you might want to use source-side, target-side, or full pushdown
optimization at different times. For example, use source-side or target-side pushdown optimization during the
peak hours of the day, but use full pushdown optimization from midnight until 2 a.m. when database activity is
low.
To use different pushdown optimization configurations at different times, use the $$PushdownConfig mapping
parameter. The parameter lets you run a session using the different types of pushdown optimization. The
settings in the $$PushdownConfig parameter override the pushdown optimization settings in the session
properties.
Complete the following steps to configure the mapping parameter:
1. Create $$PushdownConfig in the Mapping Designer.
2. When you add the $$PushdownConfig mapping parameter in the Mapping Designer, use the following
values:

Field Value

Name $$PushdownConfig

Type Parameter

Datatype String

Precision or Scale 20

Aggregation n/a

Initial Value None

Description Optional

3. When you configure the session, select $$PushdownConfig for the Pushdown Optimization attribute.
4. Define the parameter in the parameter file.



5. Enter one of the following values for $$PushdownConfig in the parameter file:

Value Description

None Specify to make the Integration Service process all transformation logic for the
session.

Source [Seq View] Specify to make the Integration Service push as much of the transformation logic
to the source database as possible.

Target [Seq View] Specify to make the Integration Service push as much of the transformation logic
to the target database as possible.

Full [Seq View] Specify to make the Integration Service push as much of the transformation logic
to the source and target databases. The Integration Service processes any
transformation logic that it cannot push to a database.

Optionally, specify ‘Seq’ or ‘View’ after ‘Source,’ ‘Target,’ or ‘Full’ to enable the Integration Service to
create sequence or view objects, respectively, in the database. For example, enter ‘Source View’ to use
source-side pushdown optimization and enable the creation of view objects in the source database.
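For example, the following parameter file entry, sketched with hypothetical folder, workflow, and session
names, runs the session with full pushdown optimization and allows view creation during an off-peak run. A
second parameter file that sets $$PushdownConfig=Source could be used for runs during peak hours:
[SalesFolder.WF:wf_store_sales.ST:s_m_store_sales]
$$PushdownConfig=Full View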

RELATED TOPICS:
♦ “Working with Sequences and Views” on page 304

Configuring Sessions for Pushdown Optimization


You configure a session for pushdown optimization in the session properties. However, you might need to edit
the transformation, mapping, or session configuration to push more transformation logic to the database. Use
the Pushdown Optimization Viewer to examine the transformations that can be pushed to the database.

Pushdown Options
You can configure the following pushdown optimization options in the session properties:
♦ Pushdown Optimization. Select the type of pushdown optimization. If you use the $$PushdownConfig
mapping parameter, ensure that you configured the mapping parameter and defined a value for it in the
parameter file.
♦ Allow Temporary View for Pushdown. Select to allow the Integration Service to create temporary view
objects in the database when it pushes the session to the database. The Integration Service must create a view
in the database when the session contains an SQL override in the Source Qualifier transformation or Lookup
transformation, a filtered lookup, or an unconnected lookup.
♦ Allow Temporary Sequence for Pushdown. Select to allow the Integration Service to create temporary
sequence objects in the database. The Integration Service must create a sequence object in the database if the
session contains a Sequence Generator transformation.
You can use the Pushdown Optimization Viewer to determine if you need to edit the mapping, transformation,
or session configuration to push more transformation logic to the database. The Pushdown Optimization
Viewer informs you whether it can push transformation logic to the database using source-side, target-side, or
full pushdown optimization. If you can push transformation logic to the database, the Pushdown Optimization
Viewer lists all transformations that can be pushed to the database.
You can also select a pushdown option or pushdown group in the Pushdown Optimization Viewer to view the
corresponding SQL statement that is generated for the specified selections.
Note: When you select a pushdown option or pushdown group, you do not change the pushdown configuration.
To change the configuration, you must update the pushdown option in the session properties.



Partitioning
You can push a session with multiple partitions to a database if the partition types are pass-through partitioning
or key range partitioning.

Pushdown Optimization for Pass-Through Partitioning


When you configure pushdown optimization for a session with pass-through partitioning, the database
processes data without redistributing rows among partitions. All rows in a single partition stay in the partition
after crossing a pass-through partition point.
You must configure all partition points for pass-through partitioning to push all transformation logic to the
database. For example, a session has four partition points. You configure the first three partition points for pass-
through partitioning and the last partition point for hash auto-keys partitioning. The Integration Service
pushes all transformation logic to the database, except the transformations at and after the last partition point.
The Integration Service processes the transformations at and after the last partition point.

Pushdown Optimization for Key-Range Partitioning


When you configure pushdown optimization for a session with key-range partitioning at the Source Qualifier
transformation, the Integration Service merges all the rows into the first partition and passes empty data for
each subsequent partition. The Integration Service creates an SQL statement for each partition. If the
Integration Service pushes only part of the transformation logic to the database, it does not redistribute the
rows across partitions when it runs the session.
The session must meet the following criteria to enable the Integration Service to push all transformation
logic to the database:
♦ The end key range for each partition must equal the start range for the next partition to merge all rows into
the first partition. The end key range cannot overlap with the next partition. For example, if the end range
for the first partition is 3386, then the start range for the second partition must be 3386.
♦ You must configure the partition point at the Source Qualifier transformation to use key range
partitioning and all subsequent partition points to use either hash auto-keys or pass-through partitioning.



Example of Pushdown Optimization for Session with Multiple Partitions
Figure 18-2 shows a mapping that contains a Sorter transformation with hash auto-keys partitioning:

Figure 18-2. Sample Mapping with Two Partitions

The first key range is 1313 - 3340, and the second key range is 3340 - 9354. The SQL statement merges all the
data into the first partition:
SELECT ITEMS.ITEM_ID, ITEMS.ITEM_NAME, ITEMS.ITEM_DESC FROM ITEMS1 ITEMS WHERE
(ITEMS.ITEM_ID >= 1313) AND (ITEMS.ITEM_ID < 9354) ORDER BY ITEMS.ITEM_ID

The SQL statement selects items 1313 through 9354, which includes all values in the key range, and merges the
data from both partitions into the first partition.
The SQL statement for the second partition passes empty data:
SELECT ITEMS.ITEM_ID, ITEMS.ITEM_NAME, ITEMS.ITEM_DESC FROM ITEMS1 ITEMS WHERE (1 = 0)
ORDER BY ITEMS.ITEM_ID

Rules and Guidelines


Use the following rules and guidelines when you configure the Integration Service to push sessions with
multiple partitions to a database.
The Integration Service can push a session with multiple partitions to the database in the following situations:
♦ If the session uses pass-through partitioning at the partition point at the Source Qualifier transformation
and all subsequent partition points, the Integration Service can push the transformation logic to the database
using source-side, target-side, or full pushdown optimization.
♦ If the session uses key range partitioning at the Source Qualifier transformation and contains hash auto-keys
or pass-through partitions in downstream partition points, the Integration Service can push the
transformation logic to the database using source-side or full pushdown optimization.



If pushdown optimization merges data from multiple partitions of a transformation into the first partition and
the Integration Service processes the transformation logic for a downstream transformation, the Integration
Service does not redistribute the rows among the partitions in the downstream transformation. It continues to
pass the rows to the first partition and pass empty data in the other partitions.

Target Load Rules


Target load rules can affect whether you can push a session to a database.
Table 18-4 shows how pushdown optimization works with different target load options. Columns marked with
an X indicate the pushdown optimization type is available for the target option:

Table 18-4. Pushdown Optimization with Target Load Options

Target Option Source Target Full

Insert X X X

Delete X X X

Update as update X X X

Update as insert X X X

Update else insert X X

Use the following rules and guidelines when you configure the Integration Service to push the target
transformation logic to a database and configure the target load rules:
♦ If you do not achieve performance gains when you use full pushdown optimization and the source rows are
treated as delete or update, use source-side pushdown optimization.
♦ You cannot use full pushdown optimization and treat source rows as delete or update if the session contains
a Union transformation and the Integration Service pushes transformation logic to a Sybase database.

Viewing Pushdown Groups


When you configure a session for pushdown optimization, the Integration Service generates SQL statements
based on the transformation logic. The group of transformations that can be processed as one SQL statement is
called a pushdown group.
When you push transformation logic to the database, the Integration Service might create multiple pushdown
groups depending on the number of pipelines, sources, targets, and the type of pushdown optimization you use.
If the session has multiple partitions, the Integration Service executes an SQL statement for each partition in
the group. If you join pipelines, transformations in each pipeline merge into one pushdown group. If the same
transformation is part of the transformation logic pushed to two or more targets, the transformation is part of
the pushdown group for each target.
You can view pushdown groups using the Pushdown Optimization Viewer. When you view pushdown groups in
the Pushdown Optimization Viewer, you can identify the transformations that can be pushed to the database
and those that the Integration Service processes. The Pushdown Optimization Viewer also displays messages
that you can use to determine how to edit transformations or mappings to push more transformation logic to
the database. The Pushdown Optimization Viewer cannot display the SQL that runs in the session if you use
mapping variables or if you configure the session to run on a grid.
When you view the generated SQL, the names of temporary view and sequence objects differ from the names of
the view and sequence objects generated during a session. The Integration Service uses a hash function to create
a unique name for each sequence and view object it generates.



Figure 18-3 shows a mapping displayed in the Pushdown Optimization Viewer. It contains two pushdown
groups that can be pushed to the source and target database:

Figure 18-3. Sample Mapping with Two Pushdown Groups

In the figure, Pipeline 1 corresponds to Pushdown Group 1 and Pipeline 2 corresponds to Pushdown Group 2.

Pipeline 1 and Pipeline 2 originate from different sources and contain transformations that are valid for
pushdown optimization. The Integration Service creates a pushdown group for each target, and generates an
SQL statement for each pushdown group. Because the two pipelines are joined, the transformations up to and
including the Joiner transformation are part of both pipelines and are included in both pushdown groups.
To view pushdown groups, open the Pushdown Optimization Viewer. The Pushdown Optimization Viewer
previews the pushdown groups and SQL statements that the Integration Service generates at run time.

To view pushdown groups:

1. In the Workflow Manager, open a session configured for pushdown optimization.


2. On the Mapping tab, select Pushdown Optimization in the left pane or View Pushdown Optimization in
the right pane.
The Pushdown Optimization Viewer displays the pushdown groups and the transformations that comprise
each group. It displays the SQL statement for each partition if you configure multiple partitions in the
pipeline. You can view messages and SQL statements generated for each pushdown group and pushdown
option. Pushdown options include None, To Source, To Target, Full, and $$PushdownConfig.



Figure 18-4 shows a mapping containing one pipeline with two partitions that can be pushed to the source
database:

Figure 18-4. Pushdown Optimization Viewer

The figure callouts identify where you select the type of optimization, the number that indicates the pushdown
group, the list of pushdown groups to view, the value for the connection variable, and the SQL statements
generated for the pushdown group.

3. Select a pushdown option in the Pushdown Optimization Viewer to preview the SQL statements.
The pushdown option in the viewer does not affect the optimization that occurs at run time. To change
pushdown optimization for a session, edit the session properties.
4. If you configure the session to use a connection variable, click Preview Result for Connection to select a
connection value to preview.
If the session uses a connection variable, you must choose a connection value each time you open the
Pushdown Optimization Viewer. The Workflow Manager does not save the value you select, and the
Integration Service does not use this value at run time.
If an SQL override contains the $$$SessStartTime variable, the Pushdown Optimization Viewer does not
expand this variable when you preview pushdown optimization.



CHAPTER 19

Pushdown Optimization and


Transformations
This chapter includes the following topics:
♦ Overview, 315
♦ Aggregator Transformation, 316
♦ Expression Transformation, 317
♦ Filter Transformation, 317
♦ Joiner Transformation, 318
♦ Lookup Transformation, 318
♦ Router Transformation, 320
♦ Sequence Generator Transformation, 320
♦ Sorter Transformation, 321
♦ Source Qualifier Transformation, 321
♦ Target, 322
♦ Union Transformation, 323
♦ Update Strategy Transformation, 324

Overview
When you configure pushdown optimization, the Integration Service tries to push each transformation to the
database. The following criteria affect whether the Integration Service can push the transformation to the
database:
♦ Type of transformation
♦ Location of the transformation in the mapping
♦ Mapping and session configuration for the transformation
♦ The expressions contained in the transformation
The criteria might also affect the type of pushdown optimization that the Integration Service can perform and
the type of database to which the transformation can be pushed.

The Integration Service can push logic of the following transformations to the database:
♦ Aggregator
♦ Expression
♦ Filter
♦ Joiner
♦ Lookup
♦ Router
♦ Sequence Generator
♦ Sorter
♦ Source Qualifier
♦ Target
♦ Union
♦ Update Strategy

Rules and Guidelines


Use the following rules and guidelines when you configure the Integration Service to push transformation logic
to a database.
The Integration Service processes the transformation logic if any of the following conditions are true:
♦ The transformation logic updates a mapping variable and saves it to the repository database.
♦ The transformation contains a variable port.
♦ The session is configured to override the default values of input or output ports.
♦ The database does not have an equivalent operator, variable, or function that is used in an expression in the
transformation.
♦ The mapping contains too many branches. When you branch a pipeline, the SQL statement required to
represent the mapping logic becomes more complex. The Integration Service cannot generate SQL for a
mapping that contains more than 64 two-way branches, 43 three-way branches, or 32 four-way branches. If
the mapping branches exceed these limitations, the Integration Service processes the downstream
transformations.
The Integration Service processes all transformations in the mapping if any of the following conditions are true:
♦ The session is a data profiling or debug session.
♦ The session is configured to log row errors.

Aggregator Transformation
The following table shows the pushdown types for each database to which you can push the Aggregator
transformation:

Database Pushdown Type

IBM DB2 Source-side, Full

Microsoft SQL Server Source-side, Full

Oracle Source-side, Full

Sybase ASE Source-side, Full

Teradata Source-side, Full

ODBC Source-side, Full

The Integration Service processes the Aggregator transformation if any of the following conditions are true:
♦ The session and mapping are configured for incremental aggregation.
♦ The transformation contains a nested aggregate function.
♦ The transformation contains a conditional clause in an aggregate expression. Examples of both expression
types appear after this list.
♦ The transformation uses a FIRST(), LAST(), MEDIAN(), or PERCENTILE() function in any port
expression.
♦ An output port is not an aggregate or a part of the group by port.
♦ The transformation is downstream from another Aggregator transformation. The Integration Service pushes
the first Aggregator transformation to the database and processes all downstream Aggregator
transformations.
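For example, the following aggregate expressions, sketched with hypothetical SALES and QUANTITY ports,
illustrate the nested and conditional cases above. The first nests one aggregate function inside another, and the
second adds a filter condition as the second argument to SUM. The Integration Service processes an Aggregator
transformation that uses either form instead of pushing it to the database:
MAX( SUM( SALES ) )
SUM( SALES, QUANTITY > 0 )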

Expression Transformation
The following table shows the pushdown types for each database to which you can push the Expression
transformation:

Database Pushdown Type

IBM DB2 Source-side, Target-side, Full

Microsoft SQL Server Source-side, Target-side, Full

Oracle Source-side, Target-side, Full

Sybase ASE Source-side, Target-side, Full

Teradata Source-side, Target-side, Full

ODBC Source-side, Target-side, Full

The Integration Service processes the Expression transformation if the transformation calls an unconnected
Stored Procedure.

Filter Transformation
The following table shows the pushdown types for each database to which you can push the Filter
transformation:

Database Pushdown Type

IBM DB2 Source-side, Full

Microsoft SQL Server Source-side, Full

Oracle Source-side, Full

Sybase ASE Source-side, Full

Teradata Source-side, Full

ODBC Source-side, Full

The Integration Service processes the Filter transformation if the filter expression cannot be pushed to the
database. For example, if the filter expression contains an operator that cannot be pushed to the database, the
Integration Service does not push the filter expression to the database.

Joiner Transformation
The following table shows the pushdown types for each database to which you can push the Joiner
transformation:

Database Pushdown Type

IBM DB2 Source-side, Full

Microsoft SQL Server Source-side, Full

Oracle Source-side, Full

Sybase ASE Source-side, Full

Teradata Source-side, Full

ODBC Source-side, Full

The Integration Service processes the Joiner transformation if any of the following conditions are true:
♦ The Joiner transformation is downstream from an Aggregator transformation.
♦ The Integration Service cannot push the master and detail pipelines of the Joiner transformation to the
database.
♦ The Joiner transformation is configured with an outer join, and the master or detail source is a multi-table
join. The Integration Service cannot generate SQL to represent an outer join combined with a multi-table
join.
♦ The Joiner transformation is configured with a full outer join and configured for pushdown optimization to
Sybase.
♦ The session is configured to mark all source rows as updates and configured for pushdown optimization to
Teradata.

Lookup Transformation
When you configure a Lookup transformation for pushdown optimization, the database performs a lookup on
the database lookup table. The following table shows the pushdown types for each database to which you can
push the Lookup transformation:

Database Pushdown Type

IBM DB2 Source-side, Target-side, Full

Microsoft SQL Server Source-side, Full

Oracle Source-side, Target-side, Full

Sybase ASE Source-side, Full

Teradata Source-side, Full

ODBC Source-side, Full

Use the following rules and guidelines when you configure the Integration Service to push Lookup
transformation logic to a database:
♦ The database does not use PowerCenter caches when processing transformation logic.
♦ The Integration Service processes all transformations after a pipeline branch when multiple Lookup
transformations are present in different branches of pipeline, and the branches merge downstream.
♦ A session configured for target-side pushdown optimization fails if the session requires datatype conversion.
The Integration Service processes the Lookup transformation if any of the following conditions are true:
♦ The Lookup transformation is a pipeline lookup.
♦ The transformation uses a dynamic cache.
♦ The transformation is configured to return the first, last, or any matching value. To use pushdown
optimization, you must configure the Lookup transformation to report an error on multiple matches.
♦ The transformation is downstream from an Aggregator transformation.
♦ The session is configured to mark all source rows as updates and configured for pushdown optimization to
Teradata.
♦ The session is configured for source-side pushdown optimization and the lookup table and source table are
on different databases.
♦ The session is configured for target-side pushdown optimization and the lookup table and target table are on
different databases.
♦ The session is configured for full pushdown optimization and the source and target tables are not in the
same database as the lookup table.

Unconnected Lookup Transformation


Use the following rules and guidelines when you configure the Integration Service to push an unconnected
Lookup transformation to a database:
♦ If the Lookup transformation contains an SQL override or a filter, configure the session for pushdown
optimization with a view.
♦ The database might perform slower than the Integration Service if the session contains multiple
unconnected lookups. The generated SQL might be complex because the Integration Service creates an outer
join each time it invokes an unconnected lookup. Test the session with and without pushdown optimization
to determine which session has better performance.
The Integration Service processes an unconnected Lookup transformation if you configure target-side
pushdown optimization for the unconnected lookup.

Lookup Transformation with an SQL Override


Use the following rules and guidelines when you configure the Integration Service to push a Lookup
transformation with an SQL override to a database:
♦ Configure the session for pushdown optimization with a view.
♦ You cannot append an ORDER BY clause to the SQL statement in the lookup override. The session fails if
you append an ORDER BY clause.



♦ The order of the columns in the lookup override must match the order of the ports in the Lookup
transformation. If you reverse the order of the ports in the lookup override, the query results transpose the
values.
♦ The session fails if the SELECT statement in the SQL override refers to a database sequence.
The Integration Service processes a Lookup transformation with an SQL override if the transformation contains
Informatica outer join syntax in the SQL override. Use ANSI outer join syntax in the SQL override to push the
transformation to a database.
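For example, the following lookup override is a minimal sketch that uses ANSI outer join syntax with
hypothetical CUSTOMERS and ORDERS tables, so the override does not prevent pushdown on the grounds of
join syntax, assuming the other rules in this section are met. Keep the select list in the same order as the ports
in the Lookup transformation:
SELECT CUSTOMERS.CUSTOMER_ID, ORDERS.ORDER_ID, ORDERS.ORDER_TOTAL
FROM CUSTOMERS LEFT OUTER JOIN ORDERS
ON CUSTOMERS.CUSTOMER_ID = ORDERS.CUSTOMER_ID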

Router Transformation
The following table shows the pushdown types for each database to which you can push the Router
transformation:

Database Pushdown Type

IBM DB2 Source-side, Full

Microsoft SQL Server Source-side, Full

Oracle Source-side, Full

Sybase ASE Source-side, Full

Teradata Source-side, Full

ODBC Source-side, Full

You can use source-side pushdown when all output groups merge into one transformation that can be pushed to
the source database.
The Integration Service processes the Router transformation if the router expression cannot be pushed to the
database. For example, if the expression contains an operator that cannot be pushed to the database, the
Integration Service does not push the expression to the database.

Sequence Generator Transformation


The following table shows the pushdown types for each database to which you can push the Sequence
Generator transformation:

Database Pushdown Type

IBM DB2 Source-side, Target-side, Full

Oracle Source-side, Target-side, Full

The Integration Service processes the Sequence Generator transformation if any of the following conditions are
true:
♦ The transformation is reusable.
♦ The transformation is connected to multiple targets.
♦ The transformation connects the CURRVAL port.
♦ The Integration Service cannot push all of the logic for the Sequence Generator transformation to the
database. For example, a Sequence Generator transformation creates sequence values that are supplied to two
branches of a pipeline. When you configure pushdown optimization, the database can create sequence values

for only one pipeline branch. When the Integration Service cannot push all of the Sequence Generator logic
to the database, the following message appears:
Pushdown optimization stops at the transformation <transformation name> because the
upstream Sequence Generator <Sequence Generator transformation name> cannot be pushed
entirely to the database.

♦ The pipeline branches before the Sequence Generator transformation and then joins back together after the
Sequence Generator transformation.
♦ The pipeline branches after the Sequence Generator transformation and does not join back together.
♦ A sequence value passes through an Aggregator, a Filter, a Joiner, a Sorter, or a Union transformation.
♦ The Sequence Generator transformation provides sequence values to a transformation downstream from a
Source Qualifier transformation that is configured to select distinct rows.
The Integration Service processes a transformation downstream from the Sequence Generator transformation if
it uses the NEXTVAL port of the Sequence Generator transformation in CASE expressions and is configured
for pushdown optimization to IBM DB2.

Sorter Transformation
The following table shows the pushdown types for each database to which you can push the Sorter
transformation:

Database Pushdown Type

IBM DB2 Source-side, Full

Microsoft SQL Server Source-side, Full

Oracle Source-side, Full

Sybase ASE Source-side, Full

Teradata Source-side, Full

ODBC Source-side, Full

Use the following rules and guidelines when you configure the Integration Service to push Sorter
transformation logic to a database:
♦ The Integration Service pushes the Sorter transformation to the database and processes downstream
transformations when the Sorter transformation is configured for a distinct sort.
♦ The Integration Service processes the Sorter transformation when the Sorter transformation is downstream
from a Union transformation and the port used as a sort key in the Sorter transformation is not projected
from the Union transformation to the Sorter transformation.

Source Qualifier Transformation


The following table shows the pushdown types for each database to which you can push the Source Qualifier
transformation:

Database Pushdown Type

IBM DB2 Source-side, Full

Microsoft SQL Server Source-side, Full

Oracle Source-side, Full

Sybase ASE Source-side, Full

Teradata Source-side, Full

ODBC Source-side, Full

The Integration Service processes the Source Qualifier transformation logic when any of the following
conditions are true:
♦ The transformation contains Informatica outer join syntax in the SQL override or a user-defined join. Use
ANSI outer join syntax in the SQL override to enable the Integration Service to push the Source Qualifier
transformation to the database.
♦ The source is configured for database partitioning.
♦ The source is an Oracle source that uses an XMLType datatype.

Source Qualifier Transformation with an SQL Override


Use the following rules and guidelines when you configure pushdown optimization for a session containing a
Source Qualifier transformation with an SQL override:
♦ The SELECT statement in a custom SQL query must list the port names in the order in which they appear
in the transformation. If the ports are not in the correct order, the session can fail or output unexpected
results.
♦ Configure the session for pushdown optimization with a view.
♦ The session fails if the SELECT statement in the SQL override refers to a database sequence.
♦ The session fails if the SQL override contains an ORDER BY clause and you push the Source Qualifier
transformation logic to an IBM DB2, a Microsoft SQL Server, a Sybase ASE, or a Teradata database.
♦ If a Source Qualifier transformation is configured to select distinct values and contains an SQL override, the
Integration Service ignores the distinct configuration.
♦ If the session contains multiple partitions, specify the SQL override for all partitions.
♦ Test the SQL override query on the source database before you push it to the database because PowerCenter
does not validate the override SQL syntax. The session fails if the SQL syntax is not compatible with the
source database.

Target
The following table shows the pushdown types for each database to which you can push the target logic:

Database Pushdown Type

IBM DB2 Target-side, Full

Microsoft SQL Server Target-side, Full

Oracle Target-side, Full

Sybase ASE Target-side, Full

Teradata Target-side, Full

ODBC Target-side, Full



The Integration Service processes the target logic when you configure the session for full pushdown
optimization and any of the following conditions are true:
♦ The target includes a target update override.
♦ The session is configured for constraint-based loading, and the target load order group contains more than
one target.
♦ The session uses an external loader.
If you configure full pushdown optimization and the target uses a different connection than the source, the
Integration Service cannot push all the transformation logic to one database. Instead, it pushes as much
transformation logic as possible to the source database and pushes any remaining transformation logic to the
target database when possible.
The Integration Service processes the target logic when you configure the session for target-side pushdown
optimization and any of the following conditions are true:
♦ The target includes a target update override.
♦ The target is configured for database partitioning.
♦ The session is configured for bulk loading.
♦ The session uses an external loader. Use source-side pushdown optimization with an external loader to
enable the Integration Service to push the transformation logic to the source database.

Union Transformation
The following table shows the pushdown types for each database to which you can push the Union
transformation:

Database Pushdown Type

IBM DB2 Source-side, Full

Microsoft SQL Server Source-side, Full

Oracle Source-side, Full

Sybase ASE Source-side, Full

Teradata Source-side, Full

ODBC Source-side, Full

The Integration Service processes the Union transformation logic when any of the following conditions are
true:
♦ The Integration Service cannot push all input groups to the source database.
♦ The input groups do not originate from the same source.
♦ Some ports in the Union transformation are not connected to the upstream transformation that outputs
distinct rows.



Update Strategy Transformation
The following table shows the pushdown types for each database to which you can push the Update Strategy
transformation:

Database Pushdown Type

IBM DB2 Full

Microsoft SQL Server Full

Oracle Full

Sybase ASE Full

Teradata Full

ODBC Full

Use the following rules and guidelines when you configure the Integration Service to push Update Strategy
transformation logic to a database:
♦ The generated SQL for an Update Strategy transformation with an update operation can be complex. Run
the session with and without pushdown optimization to determine which configuration is faster.
♦ If there are multiple operations to the same row, the Integration Service and database can process the
operations differently. To ensure that new rows are not deleted or updated when pushed to a database, source
rows are processed in the following order: delete transactions, update transactions, and then insert
transactions.
♦ If the Update Strategy transformation contains more than one insert, update, or delete operation, the
Integration Service generates and runs the insert, update, and delete SQL statements serially. The
Integration Service runs the three statements even if they are not required. This might decrease performance.
♦ The Integration Service ignores rejected rows when using full pushdown optimization. It does not write
reject rows to a reject file.
The Integration Service processes the Update Strategy transformation if the update strategy expression cannot
be pushed to the database. For example, if the expression contains an operator that cannot be pushed to the
database, the Integration Service does not push the expression to the database.
The Integration Service processes the Update Strategy transformation if any of the following conditions are true:
♦ The transformation is upstream from an Aggregator, a Joiner, or a distinct Sorter transformation and the
session is configured to treat the source rows as data driven.
♦ The transformation uses operations other than the insert operation and the Integration Service cannot push
all transformation logic to the database.
♦ The update strategy expression returns a value that is not numeric and not Boolean.
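For example, the following update strategy expression is a minimal sketch that uses a hypothetical
ORDER_TOTAL port. It resolves to the numeric constants DD_UPDATE or DD_REJECT, so it does not
violate the numeric-or-Boolean rule above and can be pushed provided the comparison operator has a database
equivalent:
IIF( ORDER_TOTAL > 0, DD_UPDATE, DD_REJECT )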



CHAPTER 20

Running Concurrent Workflows


This chapter includes the following topics:
♦ Overview, 325
♦ Configuring Unique Workflow Instances, 326
♦ Configuring Concurrent Workflows of the Same Name, 327
♦ Using Parameters and Variables, 328
♦ Steps to Configure Concurrent Workflows, 329
♦ Starting and Stopping Concurrent Workflows, 329
♦ Monitoring Concurrent Workflows, 331
♦ Viewing Session and Workflow Logs, 331
♦ Rules and Guidelines for Concurrent Workflows, 332

Overview
Important: Effective in version 8.6, Informatica deprecated the functionality to run a concurrent workflow with
the same instance name. While this functionality is still in the 8.6 version, it will be removed in a future release.
If you have concurrent workflows that use the same instance name, you must manually reconfigure them to use
unique instance names.
A concurrent workflow is a workflow that can run as multiple instances concurrently. A workflow instance is a
representation of a workflow. When you configure a concurrent workflow, you enable the Integration Service to
run one instance of the workflow multiple times concurrently, or you define unique instances of the workflow
that run concurrently.
Configure a concurrent workflow with one of the following workflow options:
♦ Allow concurrent workflows with the same instance name. Configure one workflow instance to run
multiple times concurrently. Each instance has the same source, target, and variable parameters. The
Integration Service identifies each instance by the run ID. The run ID is a number that identifies a workflow
instance that has run. For more information about running the same instance concurrently, see
“Configuring Concurrent Workflows of the Same Name” on page 327.
♦ Configure unique workflow instances to run concurrently. Define each workflow instance name and
configure a workflow parameter file for the instance. You can define different sources, targets, and variables
in the parameter file. For more information about configuring the workflow instances, see “Configuring
Unique Workflow Instances” on page 326.

When you run concurrent workflows, the Workflow Monitor displays each workflow by workflow name and
instance name. If the workflow has no unique instance names, the Workflow Monitor displays the same
workflow name for each concurrent workflow run. For more information about viewing concurrent workflows
in the Workflow Monitor, see “Monitoring Concurrent Workflows” on page 331.
The Integration Service appends either an instance name or a run ID and time stamp to the workflow and
session log names to create unique log files for concurrent workflows. For more information about viewing logs
for concurrent workflows, see “Viewing Session and Workflow Logs” on page 331.

Configuring Unique Workflow Instances


You can configure more than one instance of a workflow and run each instance at the same time. When you
configure a workflow instance, you provide a unique name for the instance and configure a workflow parameter
file for the instance.
Configure workflow instances to run a workflow with different sources and targets. For example, your
organization receives sales data from three divisions. You create a workflow that reads the sales data and writes it
to the database. You configure three instances of the workflow. Each instance has a different workflow
parameter file that defines which sales file to process. You can run all instances of the workflow concurrently.
When you start the workflow, you can choose which instances to run. When you configure a concurrent
workflow to run with unique instances, you can run the instances concurrently. To run one instance multiple
times concurrently, configure the workflow to run with the same instance name.

RELATED TOPICS:
♦ “Steps to Configure Concurrent Workflows” on page 329

♦ “Starting and Stopping Concurrent Workflows” on page 329

Recovering Workflow Instances by Instance Name


You can recover workflow instances from the Workflow Monitor or pmcmd. When you enable a workflow for
recovery, the Integration Service appends the workflow run ID to the recovery storage file name.
When you recover a concurrent workflow, identify the instance that you want to recover. In the Workflow
Monitor, right-click the instance to recover. When you recover with pmcmd, enter the instance name parameter.
Note: You cannot recover a session from the last checkpoint if the workflow updates a relational target. The
recovery session fails.

Rules and Guidelines


Use the following rules and guidelines when you run concurrent instances with unique instance names:
♦ The Integration Service can persist variables between concurrent workflow runs for concurrent workflows
with unique instance names.
♦ You can stop or abort a workflow by instance name from pmcmd.
♦ The Workflow Monitor displays the instance name for each workflow run. The instance name appears in the
workflow log and the Run Properties panel of the Workflow Monitor.
♦ When you configure a concurrent workflow to run with a unique instance name, the log file names contain
the instance name. You can archive the log files or save them by instance name and time stamp.



Configuring Concurrent Workflows of the Same Name

Important: Effective in version 8.6, Informatica deprecated the functionality to run a concurrent workflow with
the same instance name. While this functionality is still in the 8.6 version, it will be removed in a future release.
If you have concurrent workflows that use the same instance name, you must manually reconfigure them to use
unique instance names.
You can enable a workflow to run concurrently without defining unique instance names. You can run more
than one instance of the same workflow name. The Integration Service distinguishes between each workflow
instance by a run identifier number, or run ID. Each workflow run has a unique run ID. The Integration
Service appends the run ID to the workflow and session log names, recovery file names, and other temporary
file names to create separate files for each workflow.
Run concurrent workflows with the same instance name when the workflows read from a real-time source, such
as a message queue or web service. For example, you manage data from multiple project teams. Create a
workflow that reads data from a message queue that determines the source data and targets. You can run the
instance multiple times concurrently and pass different connection parameters to the workflow instances from
the message queue.

Running Concurrent Web Service Workflows


When you run a web service workflow, the Integration Service can run more than one instance of a workflow to
improve performance. When you configure a workflow to run as a web service, you configure the number of
workflow instances to run on a hub and when to start a new workflow instance.
When you enable a workflow as a web service, the Workflow Designer enables the workflow to run concurrently
with the same workflow name. The Web Services Hub determines when to start a new instance of a web service
workflow based on the Maximum Run Count Per Hub and the Service Time property you configure for the web
service.
When the Web Services Hub starts a web service workflow instance, the instance has the same name as the other
workflow instance.
Note: When you enable a workflow as a web service, the Workflow Designer enables the workflow to run
concurrently by default.

Configuring Workflow Instances of the Same Name


When you enable a workflow to run concurrently with the same instance name, you can also configure
workflow instances and parameter files for the workflow. You can start each instance more than one time
concurrently.
For example, if you define a workflow and create two instances, you can start the workflow and run both
instances. You can start the workflow again to run the same instances concurrently.
The Workflow Monitor Task View shows four instances running concurrently:
wf_sales [Instance1]
wf_sales [Instance2]
wf_sales [Instance1]
wf_sales [Instance2]

RELATED TOPICS:
♦ “Starting and Stopping Concurrent Workflows” on page 329

Recovering Workflow Instances of the Same Name


When you enable the workflow for recovery, the Integration Service appends the run ID to the workflow
recovery storage file. You can recover workflows of the same name from the Workflow Monitor or pmcmd.



When you recover a concurrent workflow, you must identify which instance to recover. In the Workflow
Monitor, right-click the instance to recover. When you recover with pmcmd, you enter the run ID parameter.
Note: You cannot recover a session from the last checkpoint if the workflow updates a relational target. The
recovery session fails with an error message.

Rules and Guidelines


Use the following rules and guidelines when you run concurrent instances of the same instance name:
♦ The Integration Service overwrites variables between concurrent workflow runs when the variables are the
same for each run.
♦ You can stop or abort a workflow by run ID from pmcmd.
♦ You can stop or abort workflow tasks by run ID from pmcmd.
♦ The Workflow Monitor does not display the run ID for each instance. The run ID appears in the workflow
log, session log, and the Run Properties panel of the Workflow Monitor.
♦ When you configure a concurrent workflow to run with the same instance name, the log file names always
contain time stamps.

Using Parameters and Variables


To prevent conflicts, configure a parameter file for each workflow instance.
Table 20-1 lists the parameters to configure for concurrent workflows:

Table 20-1. Session Parameters to Override for Concurrent Workflows

Parameter Type Parameter Name

Database Connection $DBConnectionName

Source File $InputFileName

Target File $OutputFileName

Reject File $BadFileName

Lookup File $LookupFileName

The Integration Service persists workflow variables by workflow run instance name.
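For example, each instance of a workflow that loads regional sales files can reference its own parameter file.
The following sketch uses hypothetical folder, workflow, session, instance, and file names:
Parameter file for the east_region instance:
[Sales.WF:wf_load_sales.ST:s_m_load_sales]
$InputFileName=sales_east.dat

Parameter file for the west_region instance:
[Sales.WF:wf_load_sales.ST:s_m_load_sales]
$InputFileName=sales_west.dat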

Accessing the Run Instance Name or Run ID


When you enable a workflow to run concurrently with unique instance names, the Integration Service
distinguishes between workflow run instances by the run instance name. You can configure the same run
instance name for more than one workflow because each workflow instance is defined by a combination of the
workflow name and the run instance name. When you enable a workflow to run concurrently with the same
instance name, the Integration Service distinguishes between workflow run instances by the run ID.
The built-in variables $PMWorkflowRunInstanceName and $PMWorkflowRunId return the workflow run
instance name and run ID as string values. These variables are read-only. You can access them in the workflow
or the mapping to retrieve the name or run ID of the workflow instance. You can apply these variables to
expressions, file-watch events, or data. You can also use them to configure unique file names.
For example, create a pre-defined Event-Wait task to delete an indicator file after it appears. Define the file
name with $PMWorkflowRunInstanceName. When you run two concurrent workflows with unique instance
names, each workflow Event-Wait task waits for and deletes a different indicator file.
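A minimal sketch of such a file name, using the built-in $PMSourceFileDir directory variable and a
hypothetical indicator file prefix, might be:
$PMSourceFileDir/start_$PMWorkflowRunInstanceName.ind

With instances named east_region and west_region, the two concurrent runs wait for start_east_region.ind and
start_west_region.ind.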



Note: When you run a workflow that is not enabled to run concurrently, $PMWorkflowRunInstanceName has
no value.

Steps to Configure Concurrent Workflows


You can enable a workflow for concurrent execution when you create or edit the workflow.

To enable a workflow for concurrent execution:

1. In the Workflow Manager, open the Workflow.


2. On the workflow General tab, enable concurrent execution.
The workflow is enabled to run concurrently with the same instance name.
3. To configure different instance names, click Configure Concurrent Execution.
The Configure Concurrent Execution dialog box appears.
4. Choose one of the following options:
♦ Allow concurrent run only with unique instance name. The Integration Service can run concurrent
workflows if the instance names are unique.
♦ Allow concurrent run with the same instance name. The Integration Service can run concurrent
workflows with the same name.
Important: Effective in version 8.6, Informatica deprecated the functionality to run a concurrent workflow
with the same instance name. While this functionality is still in the 8.6 version, it will be removed in a
future release. If you have concurrent workflows that use the same instance name, you must manually
reconfigure them to use unique instance names.
5. Optionally, click the Add button to add workflow instance names.
The workflow instance name is not case sensitive. The Workflow Designer validates the characters in the
instance name. You cannot use the following special characters in the instance name:
$ . + - = ~ ` ! % ^ & * () [] {} ' \ " ; : / ? , < > \\ | \t \r \n

6. Optionally, enter the path to a workflow parameter file for the instance. To use different sources, targets, or
variables for each workflow instance, configure a parameter file for each instance.
7. Click OK.

Starting and Stopping Concurrent Workflows


You can start concurrent workflows in the Workflow Designer or the Workflow Monitor. You can also start
workflows from pmcmd. To run unique workflow instances choose the instances to run when you start the
workflow.

Starting Workflow Instances from Workflow Designer


You can choose which workflow instances to run when you start a workflow from the Workflow Designer.
Follow these steps to start a workflow that has at least one instance defined.



To start workflow instances from the Workflow Designer:

1. Open the folder containing the workflow.


2. From the Navigator, select the workflow that you want to start.
3. Right-click the workflow and select Start Workflow Advanced.
4. Choose the workflow run instances to start. By default, all instances are selected. You can clear all the
workflow instances and choose the instances to start.
5. Click OK to start the workflow instances.
The Workflow Monitor displays each concurrent workflow name and instance name.

Starting One Concurrent Workflow


When a concurrent workflow does not have unique instance names or when you do not want to run the
configured instances, you can start a workflow with the Workflow Designer. If you start the concurrent
workflow with the Start Workflow option, the Integration Service runs the workflow with the attributes and
variables you define on the workflow Properties and Variables tabs. The Integration Service does not run any of
the configured workflow instances.

To start one concurrent workflow instance:

1. Open the folder containing the workflow.


2. From the Navigator, select the workflow that you want to start.
3. Right-click the workflow in the Navigator and choose Start Workflow.
The Integration Service runs one instance of the workflow with the attributes from the workflow Properties
and Variables tabs.

Starting Concurrent Workflows from the Command Line


You can start one workflow instance at a time from the command line. The pmcmd startworkflow command has
a parameter for an instance name. When you start a workflow from the command line and you enter an
instance name parameter, the Integration Service runs that instance of the workflow. To run more than one
workflow instance, run the pmcmd startworkflow command multiple times.
If you do not enter an instance name parameter with startworkflow, the Integration Service runs the workflow
with the attributes and variables you define on the workflow Properties and Variables tabs. The Integration
Service does not run any of the configured workflow instances.
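The following commands are a minimal sketch of starting two instances of one workflow from the command
line. The run instance name option is shown as -rin from memory, and the service, domain, user, folder,
workflow, and parameter file names are placeholders; verify the exact startworkflow options in the pmcmd
documentation for your version:
pmcmd startworkflow -sv Int_Svc -d Domain_Dev -u Administrator -p password -f Sales -rin east_region -paramfile /data/params/east.txt wf_load_sales
pmcmd startworkflow -sv Int_Svc -d Domain_Dev -u Administrator -p password -f Sales -rin west_region -paramfile /data/params/west.txt wf_load_sales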

Creating Workflow Instances from the Command Line


You can dynamically create an instance when you start the workflow with pmcmd. Enter an instance name and
parameter file name. If the instance name is not configured, the Integration Service generates an instance. The
Integration Service can persist variables for the instance in the repository, but the instance does not appear on
the Concurrent Execution Configuration dialog box for the workflow.
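As a sketch with the same placeholder names, the following command starts a dynamically created instance and passes it a parameter file:

# the instance name store3_workflow is not configured in the workflow; the Integration Service creates it at run time
pmcmd startworkflow -sv Int_Svc -d Domain_Sales -u Administrator -p password -f Sales -rin store3_workflow -paramfile /data/store3/store3_params.txt wf_store_sales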

Stopping or Aborting Concurrent Workflows


You can stop or abort a concurrent workflow from the Workflow Monitor or pmcmd. To stop or abort the
workflow from the Workflow Monitor, right-click the workflow in the Navigator and select Stop or Abort. The
Workflow Monitor displays the status of the stop or abort command in the Output window.
To stop or abort a concurrent workflow from pmcmd, identify the workflow instance by entering the instance
name or the workflow run ID parameter on the command line. To stop or abort a task in a concurrent
workflow, enter the workflow instance name or run ID of the concurrent workflow that contains the task to
stop. When a concurrent workflow does not have unique instance names, you can find the workflow run ID in
the workflow log or the workflow run properties of the Workflow Monitor.
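For example, the following commands show the general pattern; the names are placeholders, and the exact option names for the instance name and the workflow run ID (shown here as -rin and -wfrunid) should be confirmed in the Command Reference:

# stop an instance by name, abort another instance by run ID
pmcmd stopworkflow -sv Int_Svc -d Domain_Sales -u Administrator -p password -f Sales -rin store1_workflow wf_store_sales
pmcmd abortworkflow -sv Int_Svc -d Domain_Sales -u Administrator -p password -f Sales -wfrunid 845 wf_store_sales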



Monitoring Concurrent Workflows
When you run a concurrent workflow, the Workflow Monitor displays each workflow run by workflow name. If
the workflow has a unique instance name, the Workflow Monitor displays the instance name with the workflow
name.
The following figure shows concurrent workflow and instance names in the Workflow Monitor Task view:

When the workflow has unique instance names, the Workflow Monitor displays the instance name for each workflow run.

When you view concurrent workflows in Gantt Chart View, the Workflow Monitor displays one timeline for
each workflow name or workflow instance name. You can scroll the Time Window to view information about
specific workflow runs.

Viewing Session and Workflow Logs


The Integration Service names concurrent workflow session and workflow log files based on the way you
configure concurrency:
♦ Unique instance names. The Integration Service appends the instance name to the log file name.
♦ Instances of the same name. The Integration Service appends a run ID and time stamp to the log file name.
The Integration Service writes the run ID and the workflow type to the workflow log. The workflow type
indicates whether the workflow is a concurrent workflow.
For example:
Workflow SALES_REV started with run id [108], run instance name [WF_CONCURRENT_SALES1],
run type [Concurrent Run with Unique Instance Name].

Each session log also includes an entry that describes the workflow run ID and instance name:
Workflow: [SALES_REV] Run Instance Name: [WF_CONCURRENT_SALES1] Run Id: [108]

Log Files for Unique Workflow Instances


When you configure a workflow to run concurrently with unique instance names, the Integration Service
creates logs for each instance. Each log file name includes the instance name:
<workflow_name>.<workflow_instance_name>
<session_name>.<workflow_instance_name>

For example, if the workflow log file name is wf_store_sales.log and the instance name is store1_workflow, the
Integration Service creates the following log file names for the binary workflow log file and the text workflow
log file:
wf_store_sales.log.store1_workflow.bin
wf_store_sales.log.store1_workflow

To avoid overwriting the log files, you can archive the log files or save the log files by time stamp.



RELATED TOPICS:
♦ Archiving Log Files, 379

Log Files for Workflow Instances of the Same Name


When you configure the workflow to run concurrently with the same instance name, the Integration Service
creates logs for each instance. Each log file name includes a run ID and time stamp by default:
<workflow_name>.<runID>.<timestamp>
<session_name>.<run ID>.<timestamp>

For example, if the workflow log file name is wf_store_sales.log, the run ID is 845, and the workflow runs on
July 12, 2007 at 11:20:45, the Integration Service creates the following log file names for the binary workflow
log file and the text workflow log file:
wf_store_sales.log.845.20070712112045.bin
wf_store_sales.log.845.20070712112045

When you configure the workflow to run concurrently with the same instance name, and you also define
instance names, the Integration Service appends the instance name and the time stamp to the log file name. For
example:
<workflow_name>.<instance_name>.<run ID>.20070712112045.bin
<session_name>.<instance_name>.<run ID>.20070712112045.bin

The Integration Service writes the instance name and run ID to the workflow log. For example:
Workflow wf_Stores started with run ID[86034], run instance name[Store1_workflow]

Rules and Guidelines for Concurrent Workflows


Use the following rules and guidelines for concurrent workflows:
♦ You cannot reference workflow run instances in parameter files. To use separate parameters for each instance,
you must configure different parameter files.
♦ If you use the same cache file name for more than one concurrent workflow instance, each workflow instance
is valid. However, sessions fail if conflicts occur when they write to the cache.
♦ You can use pmcmd to restart concurrent workflows by run ID or instance name.
♦ If you configure multiple instances of a workflow and you schedule the workflow, the Integration Service
runs all instances at the scheduled time. You cannot run instances on separate schedules.
♦ Configure a worklet to run concurrently on the worklet General tab.
♦ You must enable a worklet to run concurrently if the parent workflow is enabled to run concurrently.
Otherwise the workflow is invalid.
♦ You can enable a worklet to run concurrently and place it in two non-concurrent workflows. The Integration
Service can run the two worklets concurrently.
♦ Two workflows enabled to run concurrently can run the same worklet. One workflow can run two instances
of the same worklet if the worklet has no persisted variables.
♦ A session in a worklet can run concurrently with a session in another worklet of the same instance name
when the session does not contain persisted variables.
The following transformations have restrictions with concurrent workflows:
♦ Aggregator transformation. You cannot use incremental aggregation in a concurrent workflow. The
session fails.
♦ Lookup transformation. Use the following rules and guidelines for Lookup transformations in concurrent
workflows:
− You can use static or dynamic lookup cache with concurrent workflows.



− When the cache is non-persistent, the Integration Service adds the workflow run ID as a prefix to the
cache file name.
− When the cache is an unnamed persistent cache, the Integration Service adds the run instance name as a
prefix to the cache file name.
− If the cache is an unnamed persistent cache and the current workflow is configured to allow concurrent
runs with the same instance name, the session fails.
− If the lookup cache name is parameterized, the Integration Service names the cache file with the parameter
value. Pass a different file name for each run instance.
♦ Sequence Generator transformation. To avoid generating the same set of sequence numbers for concurrent
workflows, configure the number of cached values in the Sequence Generator transformation.

CHAPTER 21

Monitoring Workflows
This chapter includes the following topics:
♦ Overview, 335
♦ Using the Workflow Monitor, 336
♦ Customizing Workflow Monitor Options, 340
♦ Using Workflow Monitor Toolbars, 342
♦ Working with Tasks and Workflows, 342
♦ Workflow and Task Status, 346
♦ Using the Gantt Chart View, 347
♦ Using the Task View, 349
♦ Tips, 350

Overview
You can monitor workflows and tasks in the Workflow Monitor. A workflow is a set of instructions that tells an
Integration Service how to run tasks. Integration Services run on nodes or grids. The nodes, grids, and services
are all part of a domain.
With the Workflow Monitor, you can view details about a workflow or task in Gantt Chart view or Task view.
You can also view details about the Integration Service, nodes, and grids.
The Workflow Monitor displays workflows that have run at least once. You can run, stop, abort, and resume
workflows from the Workflow Monitor. The Workflow Monitor continuously receives information from the
Integration Service and Repository Service. It also fetches information from the repository to display historic
information.
The Workflow Monitor consists of the following windows:
♦ Navigator window. Displays monitored repositories, Integration Services, and repository objects.
♦ Output window. Displays messages from the Integration Service and the Repository Service.
♦ Properties window. Displays details about services, workflows, worklets, and tasks.
♦ Time window. Displays progress of workflow runs.
♦ Gantt Chart view. Displays details about workflow runs in chronological (Gantt Chart) format.
♦ Task view. Displays details about workflow runs in a report format, organized by workflow run.
The Workflow Monitor displays time relative to the time configured on the Integration Service machine. For
example, a folder contains two workflows. One workflow runs on an Integration Service in the local time zone,
and the other runs on an Integration Service in a time zone two hours later. If you start both workflows at 9
a.m. local time, the Workflow Monitor displays the start time as 9 a.m. for one workflow and as 11 a.m. for the
other workflow.
Figure 21-1 shows the Workflow Monitor in Gantt Chart view:

Figure 21-1. Workflow Monitor. The figure labels the Navigator window, the Time window, the Gantt Chart view, the Task view, the Output window, and the Properties window.

Toggle between Gantt Chart view and Task view by clicking the tabs on the bottom of the Workflow Monitor.
You can view and hide the Output and Properties windows in the Workflow Monitor. To view or hide the
Output window, click View > Output. To view or hide the Properties window, click View > Properties View.
You can also dock the Output and Properties windows at the bottom of the Workflow Monitor workspace. To
dock the Output or Properties window, right-click a window and select Allow Docking. If the window is
floating, drag the window to the bottom of the workspace. If you do not allow docking, the windows float in
the Workflow Monitor workspace.

Using the Workflow Monitor


The Workflow Monitor provides options to view information about workflow runs. After you open the
Workflow Monitor and connect to a repository, you can view dynamic information about workflow runs by
connecting to an Integration Service.
You can customize the Workflow Monitor display by configuring the maximum days or workflow runs the
Workflow Monitor shows. You can also filter tasks and Integration Services in both Gantt Chart and Task view.
Complete the following steps to monitor workflows:
1. Open the Workflow Monitor.
2. Connect to the repository containing the workflow.
3. Connect to the Integration Service.
4. Select the workflow you want to monitor.



5. Select Gantt Chart view or Task view.

Opening the Workflow Monitor


You can open the Workflow Monitor in the following ways:
♦ Select Start > Programs > Informatica PowerCenter [version] > Client > Workflow Monitor from the
Windows Start menu.
♦ Configure the Workflow Manager to open the Workflow Monitor when you run a workflow from the
Workflow Manager.
♦ Click Tools > Workflow Monitor from the Designer, Workflow Manager, or Repository Manager. Or, click
the Workflow Monitor icon on the Tools toolbar. When you use a Tools button to open the Workflow
Monitor, PowerCenter uses the same repository connection to connect to the repository and opens the same
folders.
You can open multiple instances of the Workflow Monitor on one machine using the Windows Start menu.

To open the Workflow Monitor when you start a workflow:

1. In the Workflow Manager, click Tools > Options.


2. In the General tab, select Launch Workflow Monitor When Workflow Is Started.

To open the Workflow Monitor from the Workflow Manager:

1. In the Workflow Manager, connect to a repository.


2. In the Navigator, right-click an Integration Service or a repository and select Run Monitor.
The Workflow Monitor appears.

Connecting to a Repository
When you open the Workflow Monitor, you must connect to a repository. Connect to repositories by clicking
Repository > Connect. Enter the repository name and connection information.
After you connect to a repository, the Workflow Monitor displays a list of Integration Services available for the
repository. The Workflow Monitor can monitor multiple repositories, Integration Services, and workflows at
the same time.
Note: If you are not connected to a repository, you can remove the repository from the Navigator. Select the
repository in the Navigator and click Edit > Delete. The Workflow Monitor displays a message verifying that
you want to remove the repository from the Navigator list. Click Yes to remove the repository. You can connect
to the repository again at any time.

Connecting to an Integration Service


When you connect to a repository, the Workflow Monitor displays all Integration Services associated with the
repository. This includes active and deleted Integration Services. To monitor tasks and workflows that run on
an Integration Service, you must connect to the Integration Service. In the Navigator, the Workflow Monitor
displays a red icon over deleted Integration Services.
To connect to an Integration Service, right-click it and select Connect. When you connect to an Integration
Service, you can view all folders that you have permission for. To disconnect from an Integration Service, right-
click it and select Disconnect. When you disconnect from an Integration Service, or when the Workflow
Monitor cannot connect to an Integration Service, the Workflow Monitor displays disconnected for the
Integration Service status.
The Workflow Monitor is resilient to the Integration Service. If the Workflow Monitor loses connection to the
Integration Service, LMAPI tries to reestablish the connection for the duration of the PowerCenter Client
resilience time-out period.



After the connection is reestablished, the Workflow Monitor retrieves the workflow status from the repository.
Depending on your Workflow Monitor advanced settings, you may have to reopen the workflow to view the
latest status of child tasks.
You can also ping an Integration Service to verify that it is running. Right-click the Integration Service in the
Navigator and select Ping Integration Service. You can view the ping response time in the Output window.
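You can also verify that an Integration Service is running from the command line with pmcmd. A minimal sketch, with placeholder service and domain names:

pmcmd pingservice -sv Int_Svc -d Domain_Sales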
Note: You can also open an Integration Service in the Navigator without connecting to it. When you open an
Integration Service, the Workflow Monitor gets workflow run information stored in the repository. It does not
get dynamic workflow run information from currently running workflows.

Filtering Tasks and Integration Services


You can filter tasks and Integration Services in both Gantt Chart view and Task view. Use the Filters menu to
hide tasks and Integration Services you do not want to view in the Workflow Monitor.

Filtering Tasks
You can view all or some workflow tasks. You can filter tasks you do not want to view. For example, if you want
to view only Session tasks, you can hide all other tasks. You can view all tasks at any time.
You can also filter deleted tasks. To filter deleted tasks, click Filters > Deleted Tasks.

To filter tasks:

1. Click Filters > Tasks.


The Filter Tasks dialog box appears.
2. Clear the tasks you want to hide, and select the tasks you want to view.
3. Click OK.
Note: When you filter a task, the Gantt Chart view displays a red link between tasks to indicate a filtered task.
You can double-click the link to view the tasks you hid.

Filtering Integration Services


When you connect to a repository, the Workflow Monitor displays all Integration Services associated with the
repository. You can filter out Integration Services to view only Integration Services you want to monitor.
When you hide an Integration Service, the Workflow Monitor hides the Integration Service from the Navigator
for the Gantt Chart and Task views. You can show the Integration Service again at any time.
You can hide unconnected Integration Services. When you hide a connected Integration Service, the Workflow
Monitor asks if you want to disconnect from the Integration Service and then filter it. You must disconnect
from an Integration Service before hiding it.

To filter Integration Services:

1. In the Navigator, right-click a repository to which you are connected and select Filter Integration Services.
-or-
Connect to a repository and click Filters > Integration Services.
The Filter Integration Services dialog box appears.
2. Select the Integration Services you want to view and clear the Integration Services you want to filter. Click
OK.
If you are connected to an Integration Service that you clear, the Workflow Monitor prompts you to
disconnect from the Integration Service before filtering.
3. Click Yes to disconnect from the Integration Service and filter it.
The Workflow Monitor hides the Integration Service from the Navigator.



Click No to remain connected to the Integration Service. If you click No, you cannot filter the Integration
Service.
Tip: To filter an Integration Service in the Navigator, right-click it and select Filter Integration Service.

Opening and Closing Folders


You can select which folders to open and close in the Workflow Monitor. When you open a folder, the
Workflow Monitor displays the number of workflow runs that you configured in the Workflow Monitor
options.
You can open and close folders in the Gantt Chart and Task views. When you open a folder, it opens in both
views. To open a folder, right-click it in the Navigator and select Open. Or, you can double-click the folder.

RELATED TOPICS:
♦ “Configuring General Options” on page 340

Viewing Statistics
You can view statistics about the objects you monitor in the Workflow Monitor. Click View > Statistics. The
Statistics window displays the following information:
♦ Number of opened repositories. Number of repositories you are connected to in the Workflow Monitor.
♦ Number of connected Integration Services. Number of Integration Services you connected to since you
opened the Workflow Monitor.
♦ Number of fetched tasks. Number of tasks the Workflow Monitor fetched from the repository during the
period specified in the Time window.
You can also view statistics about nodes and sessions.

RELATED TOPICS:
♦ “Viewing Integration Service Properties” on page 354

♦ “Viewing Session Statistics” on page 357

Viewing Properties
You can view properties for the following items:
♦ Tasks. You can view properties, such as task name, start time, and status. For more information about
viewing Task progress, see “Viewing Task Progress Details” on page 357. For information about viewing
command task details, see “Viewing Command Task Run Properties” on page 358. For information about
viewing session task details, see “Viewing Session Task Run Properties” on page 359.
♦ Sessions. You can view properties about the Session task and session run, such as mapping name and number
of rows successfully loaded. You can also view load statistics about the session run. For more information
about session details, see “Viewing Session Task Run Properties” on page 359. You can also view
performance details about the session run. For more information, see “Viewing Performance Details” on
page 361.
♦ Workflows. You can view properties such as start time, status, and run type. For more information about
viewing workflow details, see “Viewing Workflow Run Properties” on page 356.
♦ Links. When you double-click a link between tasks in Gantt Chart view, you can view tasks that you filtered
out.
♦ Integration Services. You can view properties such as Integration Service version and startup time. You can
also view the sessions and workflows running on the Integration Service. For more information about
viewing Integration Service details, see “Viewing Integration Service Properties” on page 354.
♦ Grid. You can view properties such as the name, Integration Service type, and code page of a node in the
Integration Service grid. You can view these details in the Integration Service Monitor. For more information
about the Integration Service Monitor, see “Viewing Integration Service Properties” on page 354.



♦ Folders. You can view properties such as the number of workflow runs displayed in the Time window. For
more information about viewing folder details, see “Viewing Repository Folder Details” on page 355.
To view properties for all objects, right-click the object and select Properties. You can right-click items in the
Navigator or the Time window in either Gantt Chart view or Task view.
To view link properties, double-click the link in the Time window of Gantt Chart view. When you view link
properties, you can double-click a task in the Link Properties dialog box to view the properties for the filtered
task.

Customizing Workflow Monitor Options


You can configure how the Workflow Monitor displays general information, workflows, and tasks. You can
configure general options such as the maximum number of days or workflow runs that the Workflow Monitor
displays. You can also configure options specific to Gantt Chart and Task view.
Click Tools > Options to configure Workflow Monitor options.
You can configure the following options in the Workflow Monitor:
♦ General. Customize general options such as the maximum number of workflow runs to display and whether
to receive messages from the Workflow Manager. See “Configuring General Options” on page 340.
♦ Gantt Chart view. Configure Gantt Chart view options such as workspace color, status colors, and time
format. See “Configuring Gantt Chart View Options” on page 341.
♦ Task view. Configure which columns to display in Task view. See “Configuring Task View Options” on
page 341.
♦ Advanced. Configure advanced options such as the number of workflow runs the Workflow Monitor holds
in memory for each Integration Service. See “Configuring Advanced Options” on page 341.

Configuring General Options


You can customize general options such as the maximum number of days to display and which text editor to use
for viewing session and workflow logs.
Table 21-1 describes the options you can configure on the General tab:

Table 21-1. Workflow Monitor General Tab

♦ Maximum Days. Number of tasks the Workflow Monitor displays, up to a maximum number of days. Default is 5.
♦ Maximum Workflow Runs per Folder. Maximum number of workflow runs the Workflow Monitor displays for each folder. Default is 200.
♦ Receive Messages from Workflow Manager. Select to receive messages from the Workflow Manager. The Workflow Manager sends messages when you start or schedule a workflow in the Workflow Manager. The Workflow Monitor displays these messages in the Output window.
♦ Receive Notifications from Repository Service. Select to receive notification messages in the Workflow Monitor and view them in the Output window. You must be connected to the repository to receive notifications. Notification messages include information about objects that another user creates, modifies, or deletes. You receive notifications about folders and Integration Services. The Repository Service notifies you of the changes so you know objects you are working with may be out of date. You also receive notices posted by the user who manages the Repository Service.



Configuring Gantt Chart View Options
You can configure Gantt Chart view options such as workspace color, status colors, and time format.
Table 21-2 describes the options you can configure on the Gantt Chart tab:

Table 21-2. Workflow Monitor Gantt Chart Tab

♦ Status Color. Select a status and configure the color for the status. The Workflow Monitor displays tasks with the selected status in the colors you select. You can select two colors to display a gradient.
♦ Recovery Color. Configure the color for recovery sessions. The Workflow Monitor uses the status color for the body of the status bar, and it uses the recovery color as a gradient in the status bar.
♦ Workspace Color. Select a color for each workspace component.
♦ Time Format. Select a display format for the Time window.

Configuring Task View Options


You can select the columns you want to display in Task view. You can also reorder the columns and specify a
default column width.

Configuring Advanced Options


You can configure advanced options such as the number of workflow runs the Workflow Monitor holds in
memory for each Integration Service.
Table 21-3 describes the options you can configure on the Advanced tab:

Table 21-3. Workflow Monitor Advanced Tab

♦ Expand Running Workflows Automatically. Expands running workflows in the Navigator.
♦ Refresh Workflow Tasks When the Connection to the Integration Service Is Re-established. Refreshes workflow tasks when you reconnect to the Integration Service.
♦ Expand Workflow Runs When Opening the Latest Runs. Expands workflows when you open the latest run.
♦ Hide Folders/Workflows That Do Not Contain Any Runs When Filtering By Running/Schedule Runs. Hides folders or workflows under the Workflow Run column in the Time window when you filter running or scheduled tasks.
♦ Highlight the Entire Row When an Item Is Selected. Highlights the entire row in the Time window for selected items. When you disable this option, the Workflow Monitor highlights the item in the Workflow Run column in the Time window.
♦ Open Latest 20 Runs At a Time. Number of workflow runs that you can open at a time. Default is 20.
♦ Minimum Number of Workflow Runs (Per Integration Service) the Workflow Monitor Will Accumulate in Memory. Specifies the minimum number of workflow runs for each Integration Service that the Workflow Monitor holds in memory before it starts releasing older runs from memory. When you connect to an Integration Service, the Workflow Monitor fetches the number of workflow runs specified on the General tab for each folder you connect to. When the number of runs is less than the number specified in this option, the Workflow Monitor stores new runs in memory until it reaches this number.



Using Workflow Monitor Toolbars
The Workflow Monitor toolbars allow you to select tools and tasks quickly. You can perform the following
toolbar operations:
♦ Display or hide a toolbar.
♦ Create a new toolbar.
♦ Add or remove buttons.
By default, the Workflow Monitor displays the following toolbars:
♦ Standard. Contains buttons to connect to and disconnect from repositories, print, view print previews,
search the workspace, show or hide the navigator in task view, and show or hide the output window.
♦ Integration Service. Contains buttons to connect to and disconnect from Integration Services, ping an
Integration Service, and perform workflow operations.
♦ View. Contains buttons to configure time increments and show properties, workflow logs, or session logs.
♦ Filters. Contains buttons to display most recent runs, and to filter tasks, Integration Services, and folders.
After a toolbar appears, it displays until you exit the Workflow Monitor or hide the toolbar. You can drag each
toolbar to resize and reposition it.

Working with Tasks and Workflows


You can perform the following tasks with objects in the Workflow Monitor:
♦ Run a task or workflow.
♦ Resume a suspended workflow.
♦ Restart a task or workflow without recovery.
♦ Stop or abort a task or workflow.
♦ Schedule and unschedule a workflow.
♦ View session logs and workflow logs.
♦ View history names.

Opening and Displaying Previous Workflow Runs


In both the Gantt Chart view and Task View, you can open and view previous workflow runs.

To open previous runs:

1. In the Navigator or Workflow Run List, select the workflow with the runs you want to see.
2. Right-click the workflow and select Open Latest 20 Runs.
The menu option is disabled when the latest 20 workflow runs are already open.
Up to 20 of the latest runs appear.



To display workflow runs in Task View:

1. Click on the Display Recent Runs icon.


2. Select the number of runs you want to display.


The runs appear in the Workflow Run List.

Running a Task, Workflow, or Worklet


The Workflow Monitor displays workflows that have run at least once. In the Workflow Monitor, you can run a
workflow or any task or worklet in the workflow. To run a workflow or part of a workflow, right-click the
workflow or task and select a restart option. When you select restart, the task, workflow, or worklet runs on the
Integration Service you specify in the workflow properties.
You can also run part of a workflow. When you run part of a workflow, the Integration Service runs the
workflow from the selected task to the end of the workflow.
Restart behavior for real-time sessions depends on the real-time source.

RELATED TOPICS:
♦ “Manually Starting a Workflow” on page 79

♦ “Restarting and Recovering Real-time Sessions” on page 202

To run a workflow from the Workflow Monitor:

1. In the Navigator, select the workflow you want to run.


2. Right-click the workflow in the Navigator and select Restart.
-or-
Click Task > Restart.
The Integration Service runs the workflow you specify.

To run a task from the Workflow Monitor:

1. In the Navigator, select the task or worklet you want to run.


2. Right-click the task or worklet in the Navigator and select Restart Task.
The Integration Service runs the task or worklet you specify. It does not run the rest of the workflow.

To run a part of a workflow from the Workflow Monitor:

1. In the Navigator, select the task from which you want to run the workflow.
2. Right-click the task and select Restart Workflow from Task.
-or-
Click Task > Restart.
The Integration Service runs the workflow starting with the task you specify.



Recovering a Workflow or Worklet
In the workflow properties, you can choose to suspend the workflow or worklet if a session fails. After you fix
the errors that caused the session to fail, recover the workflow in the Workflow Monitor. When you recover a
workflow, the Integration Service recovers the failed session, and continues running the rest of the tasks in the
workflow path.
Recovery behavior for real-time sessions depends on the real-time source.

To recover a workflow or worklet:

1. In the Navigator, select the workflow or worklet you want to recover.


2. Click Tasks > Recover.
-or-
Right-click the workflow or worklet in the Navigator and select Recover.
The Workflow Monitor displays Integration Service messages about the recover command in the Output
window.

RELATED TOPICS:
♦ “Restarting and Recovering Real-time Sessions” on page 202

♦ “Recovery Options” on page 225

Restarting a Task or Workflow Without Recovery


You can restart a task or workflow without recovery by using a cold start. Cold start is a start mode that the
Integration Service uses to restart a task or workflow without recovery. When you restart a failed task or
workflow that has recovery enabled, the Integration Service does not process recovery data. The Integration
Service clears the state of operation and the recovery file or table before it restarts the task or workflow. Use a
cold start when you do not want to recover data, for example, because you already cleaned up the target system.

To restart a task or workflow without recovery:

1. In the Navigator, select the task or workflow you want to restart.


2. Click Tasks > Cold Start Task or Workflows > Cold Start Workflow.
-or-
Right-click the task or workflow in the Navigator and select Cold Start Task or Cold Start Workflow.

Stopping or Aborting Tasks and Workflows


You can stop or abort a task, workflow, or worklet in the Workflow Monitor at any time. When you stop a task
in the workflow, the Integration Service stops processing the task and all other tasks in its path. The Integration
Service continues running concurrent tasks. If the Integration Service cannot stop processing the task, you need
to abort the task. When the Integration Service aborts a task, it kills the DTM process and terminates the task.
Behavior for real-time sessions depends on the real-time source.

To stop or abort workflows, tasks, or worklets in the Workflow Monitor:

1. In the Navigator, select the task, workflow, or worklet you want to stop or abort.
2. Click Tasks > Stop or Tasks > Abort.
-or-
Right-click the task, workflow, or worklet in the Navigator and select Stop or Abort.
The Workflow Monitor displays the status of the stop or abort command in the Output window.



RELATED TOPICS:
♦ “How the Integration Service Handles Stop and Abort” on page 82
♦ “Stopping Real-time Sessions” on page 202

Scheduling and Unscheduling Workflows


You can schedule and unschedule workflows in the Workflow Monitor. You can schedule any workflow that is
not configured to run on demand. When you try to schedule a run on demand workflow, the Workflow
Monitor displays an error message in the Output window.
When you schedule an unscheduled workflow, the workflow uses its original schedule specified in the workflow
properties. If you want to specify a different schedule for the workflow, you must edit the scheduler in the
Workflow Manager.

To schedule an unscheduled workflow in the Workflow Monitor:

1. Right-click the workflow and select Schedule.


2. The Workflow Monitor displays the workflow status as Scheduled, and displays a message in the Output
window.

To unschedule a scheduled workflow in the Workflow Monitor:

1. Right-click the workflow and select Unschedule.


2. The Workflow Monitor displays the workflow status as Unscheduled and displays a message in the Output
window.
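Workflows can also be scheduled and unscheduled from the command line with pmcmd. A minimal sketch, with placeholder connection, folder, and workflow names (verify the option names in the Command Reference):

pmcmd scheduleworkflow -sv Int_Svc -d Domain_Sales -u Administrator -p password -f Sales wf_store_sales
pmcmd unscheduleworkflow -sv Int_Svc -d Domain_Sales -u Administrator -p password -f Sales wf_store_sales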

RELATED TOPICS:
♦ “Scheduling a Workflow” on page 73

Viewing Session Logs and Workflow Logs


You can view session and workflow logs from the Workflow Monitor. To view workflow or session logs, right-
click the session or workflow in the Workflow Monitor and select Get Session Log or Get Workflow Log. You
can view the most recent session or workflow log.
When you open a session or workflow log, the Log Events window sends a request to the Log Agent. The Log
Agent retrieves logs from each node that ran the session or workflow. The Log Events window displays the logs
by node.
If you want to view past session or workflow logs, you can configure the session or workflow to save log files.
When you configure the workflow to save log files, the workflow creates a text file and the binary file that
displays in the Log Events window. You can save log files by timestamp or by workflow or session runs. You can
configure how many workflow or session runs to save.
To view past session or workflow log files, configure the session or workflow to save logs by timestamp.

To view a session or workflow log file:

1. Right-click a session or workflow in the Navigator or Time window.


2. Select Get Session Log or Get Workflow Log.
The most recent session or workflow log file opens in the Log Events window.
Tip: When the Workflow Monitor retrieves the session or workflow log, you can press Esc to cancel the
process.

RELATED TOPICS:
♦ “Session and Workflow Logs” on page 375



Viewing History Names
If you rename a task, workflow, or worklet, the Workflow Monitor can show a history of names. When you start
a renamed task, workflow, or worklet, the Workflow Monitor displays the current name. To view a list of
historical names, select the task, workflow, or worklet in the Navigator. Right-click and select Show History
Names.

Workflow and Task Status


The Workflow Monitor displays the status of workflows and tasks.
Table 21-4 describes the different statuses for workflows and tasks:

Table 21-4. Workflow and Task Status

♦ Aborted (workflows and tasks). The Integration Service aborted the workflow or task. The Integration Service kills the DTM process when you abort a workflow or task.
♦ Aborting (workflows and tasks). The Integration Service is in the process of aborting the workflow or task.
♦ Disabled (workflows and tasks). You select the Disabled option in the workflow or task properties. The Integration Service does not run the disabled workflow or task until you clear the Disabled option.
♦ Failed (workflows and tasks). The Integration Service failed the workflow or task due to errors.
♦ Preparing to Run (workflows). The Integration Service is waiting for an execution lock for the workflow.
♦ Running (workflows and tasks). The Integration Service is running the workflow or task.
♦ Scheduled (workflows). You schedule the workflow to run at a future date. The Integration Service runs the workflow for the duration of the schedule.
♦ Stopped (workflows and tasks). You choose to stop the workflow or task in the Workflow Monitor. The Integration Service stopped the workflow or task.
♦ Stopping (workflows and tasks). The Integration Service is in the process of stopping a workflow or task.
♦ Succeeded (workflows and tasks). The Integration Service successfully completed the workflow or task.
♦ Suspended (workflows and worklets). The Integration Service suspends the workflow because a task fails and no other tasks are running in the workflow. This status is available when you select the Suspend on Error option.
♦ Suspending (workflows and worklets). A task fails in the workflow while other tasks are still running. The Integration Service stops executing the failed task and continues executing tasks in other paths. This status is available when you select the Suspend on Error option.
♦ Terminated (workflows and tasks). The Integration Service terminated unexpectedly when it was running this workflow or task.
♦ Terminating (workflows and tasks). The Integration Service is stopping or aborting the workflow or task.
♦ Unknown Status (workflows and tasks). The Integration Service cannot determine the status of the workflow or task, or the Integration Service does not respond to a ping from the Workflow Monitor. The status may be changing. For example, the status may have been Running but is changing to Stopping.
♦ Unscheduled (workflows). You removed a workflow from the schedule.
♦ Waiting (workflows and tasks). The Integration Service is waiting for available resources so it can run the workflow or task. For example, you may set the maximum number of running Session and Command tasks allowed for each Integration Service process on the node to 10. If the Integration Service is already running 10 concurrent sessions, all other workflows and tasks have the Waiting status until the Integration Service is free to run more tasks.

To see a list of tasks by status, view the workflow in Task view and sort by status. Or, click Edit > List Tasks in
Gantt Chart view.

Using the Gantt Chart View


You can view chronological details of workflow runs with the Gantt Chart view. The Gantt Chart view displays
the following information:
♦ Task name. Name of the task in the workflow.
♦ Duration. The length of time the Integration Service spends running the most recent task or workflow.
♦ Status. The status of the most recent task or workflow.
♦ Connection between objects. The Workflow Monitor shows links between objects in the Time window.

RELATED TOPICS:
♦ “Workflow and Task Status” on page 346

Listing Tasks and Workflows


The Workflow Monitor lists tasks and workflows in all repositories you connect to. You can view tasks and
workflows by status, such as failed or succeeded. You can highlight the task in Gantt Chart view by double-
clicking the task in the list.

To view a list of tasks and workflows by status:

1. Open the Gantt Chart view and click Edit > List Tasks.
2. In the List What field, select the type of task status you want to list.
For example, select Failed to view a list of failed tasks and workflows.
3. Click List to view the list.
Tip: Double-click the task name in the List Tasks dialog box to highlight the task in Gantt Chart view.

Navigating the Time Window in Gantt Chart View


You can scroll through the Time window in Gantt Chart view to monitor the workflow runs. To scroll the Time
window, use one of the following methods:
♦ Use the scroll bars.
♦ Right-click the task or workflow and click Go To Next Run or Go To Previous Run.



♦ Click View > Organize to select the date you want to display.
When you click View > Organize, the Go To field appears above the Time window. Click the Go To field to
view a calendar and select the date you want to display. When you select a date, the Workflow Monitor displays
that date beginning at 12:00 a.m.

Zooming the Gantt Chart View


You can change the zoom settings in Gantt Chart view. By default, the Workflow Monitor shows the Time
window in increments of one hour. You can change the time increments to zoom the Time window.
Figure 21-2 shows the Time window in 30 minute increments:

Figure 21-2. Zooming the Gantt Chart View. The figure labels the Zoom control, the 30-minute increments, the solid lines for hour increments, and the dotted lines for half-hour increments.

To zoom the Time window in Gantt Chart view, click View > Zoom, and then select the time increment. You
can also select the time increment in the Zoom button on the toolbar.

Performing a Search
Use the search tool in the Gantt Chart view to search for tasks, workflows, and worklets in all repositories you
connect to. The Workflow Monitor searches for the word you specify in task names, workflow names, and
worklet names. You can highlight the task in Gantt Chart view by double-clicking the task after searching.

To perform a search:

1. Open the Gantt Chart view and click Edit > Find.
The Find Object dialog box appears.
2. In the Find What field, enter the keyword you want to find.
3. Click Find Now.
The Workflow Monitor displays a list of tasks, workflows, and worklets that match the keyword.
Tip: Double-click the task name in the Find Object dialog box to highlight the task in Gantt Chart view.



Opening All Folders
You can open all folders that you have permission for in a repository. To open all the folders in the Gantt Chart
view, right-click the Integration Service you want to view, and select Open All Folders. The Workflow Monitor
displays workflows and tasks in the folders.

Using the Task View


The Task view displays information about workflow runs in a report format. The Task view provides a
convenient way to compare and filter details of workflow runs. Task view displays the following information:
♦ Workflow run list. The list of workflow runs. The workflow run list contains folder, workflow, worklet, and
task names. The Workflow Monitor displays workflow runs chronologically with the most recent run at the
top. It displays folders and Integration Services alphabetically.
♦ Status message. Message from the Integration Service regarding the status of the task or workflow.
♦ Run type. The method you used to start the workflow. You might manually start the workflow or schedule
the workflow to start.
♦ Node. Node of the Integration Service that ran the task.
♦ Start time. The time that the Integration Service starts running the task or workflow.
♦ Completion time. The time that the Integration Service finishes executing the task or workflow.
♦ Status. The status of the task or workflow.
You can perform the following tasks in Task view:
♦ Filter tasks. Use the Filter menu to select the tasks you want to display or hide.
♦ Hide and view columns. Hide or view an entire column in Task view.
♦ Hide and view the Navigator. You can hide the Navigator in Task view. Click View > Navigator to hide or
view the Navigator.
To view the tasks in Task view, select the Integration Service you want to monitor in the Navigator.

Filtering in Task View


In Task view, you can view all or some workflow tasks. You can filter tasks in the following ways:
♦ By task type. You can filter out tasks you do not want to view. For example, if you want to view only Session
tasks, you can filter out all other tasks.
♦ By nodes in the Navigator. You can filter the workflow runs in the Time window by selecting different
nodes in the Navigator. For example, when you select a repository name in the Navigator, the Time window
displays all workflow runs that ran on the Integration Services registered to that repository. When you select
a folder name in the Navigator, the Time window displays all workflow runs in that folder.
♦ By the most recent runs. To display by the most recent runs, click Filters > Most Recent Runs and select the
number of runs you want to display.
♦ By Time window columns. You can click Filters > Auto Filter and filter by properties you specify in the
Time window columns.



To filter by Time window columns:

1. Click Filters > Auto Filter.


The Filter button appears in some columns of the Time window in Task view.

2. Click the Filter button in a column in the Time window.


3. Select the properties you want to filter.
Tip: If you want to view all tasks, select All.

When you click the Filter button in either the Start Time or Completion Time column, you can select a
custom time to filter.
4. Select Custom for either Start Time or Completion Time.
The Filter Start Time or Custom Completion Time dialog box appears.
5. Choose to show tasks before, after, or between the time you specify.
6. Select the date and time. Click OK.

RELATED TOPICS:
♦ “Filtering Tasks and Integration Services” on page 338

Opening All Folders


You can open all folders that you have permission for in a repository. To open all folders in the Task view, right-
click the Integration Service with the folders you want to view, and select Open All Folders. The Workflow
Monitor displays workflows and tasks in the folders.

Tips
Reduce the size of the Time window.
When you reduce the size of the Time window, the Workflow Monitor refreshes the screen faster, reducing
flicker.



Use the Repository Manager to truncate the list of workflow logs.
If the Workflow Monitor takes a long time to refresh from the repository or to open folders, truncate the list of
workflow logs. When you configure a session or workflow to archive session logs or workflow logs, the
Integration Service saves those logs in local directories. The repository also creates an entry for each saved
workflow log and session log. If you move or delete a session log or workflow log from the workflow log
directory or session log directory, truncate the lists of workflow and session logs to remove the entries from the
repository. The repository always retains the most recent workflow log entry for each workflow.

CHAPTER 22

Viewing Workflow Monitor Details


This chapter includes the following topics:
♦ Overview, 353
♦ Viewing Repository Service Details, 354
♦ Viewing Integration Service Properties, 354
♦ Viewing Repository Folder Details, 355
♦ Viewing Workflow Run Properties, 356
♦ Viewing Worklet Run Properties, 357
♦ Viewing Command Task Run Properties, 358
♦ Viewing Session Task Run Properties, 359
♦ Viewing Performance Details, 361

Overview
The Workflow Monitor displays information that you can use to troubleshoot and analyze workflows. You can
view details about services, workflows, worklets, and tasks in the Properties window of the Workflow Monitor.
You can view the following details in the Workflow Monitor:
♦ Repository Service details. View information about repositories, such as the number of connected
Integration Services. For more information, see “Viewing Repository Service Details” on page 354.
♦ Integration Service properties. View information about the Integration Service, such as the Integration
Service Version. You can also view system resources that running workflows consume, such as the system
swap usage at the time of the running workflow. For more information, see “Viewing Integration Service
Properties” on page 354.
♦ Repository folder details. View information about a repository folder, such as the folder owner. For more
information, see “Viewing Repository Folder Details” on page 355.
♦ Workflow run properties. View information about a workflow, such as the start and end time. For more
information, see “Viewing Workflow Run Properties” on page 356.
♦ Worklet run properties. View information about a worklet, such as the execution nodes on which the
worklet is run. For more information, see “Viewing Worklet Run Properties” on page 357.
♦ Command task run properties. View the information about Command tasks in a running workflow, such as
the start and end time. For more information, see “Viewing Command Task Run Properties” on page 358.
♦ Session task run properties. View information about Session tasks in a running workflow, such as details on
session failures. For more information, see “Viewing Session Task Run Properties” on page 359.

♦ Performance details. View counters that help you understand the session and mapping efficiency, such as
information on the data cache size for an Aggregator transformation. For more information, see “Viewing
Performance Details” on page 361.

Viewing Repository Service Details


To view details about a repository, right-click on the repository and choose Properties.
Table 22-1 describes the attributes that appear in the Repository Details area:

Table 22-1. Workflow Monitor Repository Details Attributes

♦ Repository Name. Name of the repository.
♦ Is Opened. Yes, if you are connected to the repository. Otherwise, value is No.
♦ User Name. Name of the user connected to the repository. Attribute appears if you are connected to the repository.
♦ Number of Connected Integration Services. Number of Integration Services you are connected to in the Workflow Monitor. Attribute appears if you are connected to the repository.
♦ Is Versioning Enabled. Indicates whether repository versioning is enabled.

Viewing Integration Service Properties


When you view Integration Service properties, the following areas appear in the Properties window:
♦ Integration Service Details. Displays information about the Integration Service.
♦ Integration Service Monitor. Displays system resource usage information about nodes associated with the
Integration Service.

Viewing Integration Service Details


To view details about the Integration Service, right-click an Integration Service and choose Properties.
Table 22-2 describes the attributes that appear in the Integration Service Details area:

Table 22-2. Workflow Monitor Integration Service Details Attributes

♦ Integration Service Name. Name of the Integration Service.
♦ Integration Service Version. PowerCenter version and build. Appears if you are connected to the Integration Service in the Workflow Monitor.
♦ Integration Service Mode. Data movement mode of the Integration Service. Appears if you are connected to the Integration Service in the Workflow Monitor.
♦ Integration Service Operating Mode. The operating mode of the Integration Service. Appears if you are connected to the Integration Service in the Workflow Monitor.
♦ Startup Time. Time the Integration Service started. Startup Time appears in the following format: MM/DD/YYYY HH:MM:SS AM|PM. Appears if you are connected to the Integration Service in the Workflow Monitor.
♦ Current Time. Current time of the Integration Service.
♦ Last Updated Time. Time the Integration Service was last updated. Last Updated Time appears in the following format: MM/DD/YYYY HH:MM:SS AM|PM. Appears if you are connected to the Integration Service in the Workflow Monitor.
♦ Grid Assigned. Grid the Integration Service is assigned to. Attribute appears if the Integration Service is assigned to a grid. Appears if you are connected to the Integration Service in the Workflow Monitor.
♦ Node(s). Names of nodes configured to run Integration Service processes. Appears if you are connected to the Integration Service in the Workflow Monitor.
♦ Is Connected. Appears if you are not connected to the Integration Service.
♦ Is Registered. Displays one of the following values: Yes if the Integration Service is associated with a repository; No if the Integration Service is not associated with a repository. Appears if you are not connected to the Integration Service.

Viewing Integration Service Monitor


The Integration Service Monitor displays system resource usage information about nodes associated with the
Integration Service. This window also displays system resource usage information about tasks running on the
node.
To view the Integration Service Monitor, right-click an Integration Service and choose Properties. The
Integration Service Monitor area appears if you are connected to an Integration Service. You can view the
Integration Service type and code page for each node the Integration Service is running on. To view the tool tip
for the Integration Service type and code page, move the pointer over the node name.
Table 22-3 describes the attributes that appear in the Integration Service Monitor area:

Table 22-3. Workflow Monitor Integration Service Monitor Attributes

♦ Node Name. Name of the node on which the Integration Service is running.
♦ Folder. Folder that contains the running workflow.
♦ Workflow. Name of the running workflow.
♦ Task/Partition. Name of the session and partition that is running. Or, name of the Command task that is running.
♦ Status. Status of the task.
♦ Process ID. Process ID of the task.
♦ CPU %. For a node, this is the percent of CPU usage of processes running on the node. For a task, this is the percent of CPU usage by the task process.
♦ Memory Usage. For a node, this is the memory usage of processes running on the node. For a task, this is the memory usage of the task process.
♦ Swap Usage. Amount of swap space usage of processes running on the node.

Viewing Repository Folder Details


To view information about a repository folder, right-click the folder and choose Properties.



Table 22-4 describes the attributes that appear in the Folder Details area:

Table 22-4. Workflow Monitor Folder Details Attributes

♦ Folder Name. Name of the repository folder.
♦ Is Opened. Indicates if the folder is open.
♦ Number of Workflow Runs Within Time Window. Number of workflows that have run in the time window during which the Workflow Monitor displays workflow statistics. For more information about configuring a time window for workflows, see "Configuring General Options" on page 340.
♦ Number of Fetched Workflow Runs. Number of workflow runs displayed during the time window.
♦ Workflows Fetched Between. Time period during which the Integration Service fetched the workflows. Appears as DD/MM/YYYY HH:MM:SS and DD/MM/YYYY HH:MM:SS.
♦ Deleted. Indicates if the folder is deleted.
♦ Owner. Repository folder owner.

Viewing Workflow Run Properties


The Workflow Run Properties window displays information about workflows, such as the name of the
Integration Service assigned to the workflow and workflow run details.
When you view workflow properties, the following areas appear in the Properties window:
♦ Workflow details. View information about the workflow.
♦ Task progress details. View information about the tasks in the workflow.
♦ Session statistics. View information about the session.

Viewing Workflow Details


To view workflow details in the Properties window, right-click on a workflow and choose Get Run Properties.
In the Properties window, you can click Get Workflow Log to view the Log Events window for the workflow.
Table 22-5 describes the attributes that appear in the Workflow Details area:

Table 22-5. Workflow Monitor Workflow Details Attributes

Attribute Name Description

Task Name Name of the workflow.

Concurrent Type

OS Profile Name of the operating system profile assigned to the workflow. Value is empty if an
operating system profile is not assigned to the workflow.

Task Type Task type is Workflow.

Integration Service Name   Name of the Integration Service assigned to the workflow.

User Name Name of the user running the workflow.

Start Time Start time of the workflow.

End Time End time of the workflow.

Recovery Time(s) Times of recovery workflows.

Status Status of the workflow.

Status Message Message about the workflow status.

Run Type Method used to start the workflow.

Deleted Yes if the workflow is deleted from the repository. Otherwise, value is No.

Version Number Version number of the workflow.

Execution Node(s) Nodes on which workflow tasks run.

Viewing Task Progress Details


The Task Progress Details area displays the Gantt Chart view of Session and Command tasks in a running
workflow.

Viewing Session Statistics


The Session Statistics area displays information about sessions, such as the session run time and the number of
rows loaded to the targets.
Table 22-6 describes the attributes that appear in the Session Statistics area:

Table 22-6. Workflow Monitor Session Statistics Attributes

Attribute Name Description

Session Name of the session.

Source Success Rows Number of rows the Integration Service successfully read from the source.

Source Failed Rows Number of rows the Integration Service failed to read from the source.

Target Success Rows Number of rows the Integration Service wrote to the target.

Target Failed Rows Number of rows the Integration Service failed to write to the target.

Total Transformation Errors Number of transformation errors in the session.

Start Time Start time of the session.

End Time End time of the session.

Viewing Worklet Run Properties


The Worklet Run Properties window displays information about worklets, such as the name of the
Integration Service assigned to the workflow and worklet run details.
When you view worklet properties, the following areas appear in the Properties window:
♦ Worklet details. View information about the worklet.
♦ Session statistics. View information about the session.

Viewing Worklet Details


To view worklet details in the Properties window, right-click on a worklet and choose Get Run Properties.

Table 22-7 describes the attributes that appear in the Worklet Details area:

Table 22-7. Workflow Monitor Worklet Details Attributes

Attribute Name Description

Instance Name Name of the worklet instance in the workflow.

Task Type Task type is Worklet.

Integration Service Name Name of the Integration Service assigned to the workflow associated with the
worklet.

Start Time Start time of the worklet.

End Time End time of the worklet.

Recovery Time(s) Time of the recovery worklet run.

Status Status of the worklet.

Status Message Message about the worklet status.

Deleted Indicates if the worklet is deleted from the repository.

Version Number Version number of the worklet.

Execution Node(s) Nodes on which worklet tasks run.

Viewing Command Task Run Properties


The Task Run Properties window for Command tasks displays information about Command tasks, such as the
start time and end time. To view command task details in the Properties window, right-click on a Command
task and choose Get Run Properties.
Table 22-8 describes the attributes that appear in the Task Details area:

Table 22-8. Workflow Monitor Command Task Details Attributes

Attribute Name Description

Instance Name Command task name.

Task Type Task type is Command.

Integration Service Name Name of the Integration Service assigned to the workflow associated with the
Command task.

Node(s) Nodes on which the commands in the Command task run.

Start Time Start time of the Command task.

End Time End time of the Command task.

Recovery Time(s) Time of the recovery run.

Status Status of the Command task.

Status Message Message about the Command task status.

Deleted Indicates if the Command task is deleted.

Version Number Version number of the Command task.

Viewing Session Task Run Properties
When the Integration Service runs a session, the Workflow Monitor creates session details that provide load
statistics for each target in the mapping. You can view session details when the session runs or after the session
completes.
When you view session task properties, the following areas display in the Properties window:
♦ Failure information. View information about session failures.
♦ Task details. View information about the session.
♦ Source and target statistics. View information about the number of rows the Integration Service read from
the source and wrote to the target.
♦ Partition details. View information about partitions in a session.
♦ Performance. View information about session performance.
To view session details, right-click a session in the Workflow Monitor and choose Get Run Properties.
When you load data to a target with multiple groups, such as an XML target, the Integration Service provides
session details for each group.

Viewing Failure Information


The Failure Information area displays information about session errors.
Table 22-9 describes the attributes that appear in the Failure Information area:

Table 22-9. Workflow Monitor Failure Information Attributes

Attribute Name Description

First Error Code Error code for the first error.

First Error First error message.

Viewing Session Task Details


The Task Details area displays information about the session task.
Table 22-10 describes the attributes that appear in the Task Details area for Session tasks:

Table 22-10. Workflow Monitor Task Details Attributes

Attribute Name Description

Instance Name Name of the session.

Task Type Task type is Session.

Integration Service Name   Name of the Integration Service assigned to the workflow associated with the session.

Node(s) Node on which the session is running.

Start Time Start time of the session.

End Time End time of the session.

Recovery Time(s) Time of the recovery session run.

Status Status of the session.

Status Message Message about the session status.

Deleted Indicates if the session is deleted from the repository.

Version Number Version number of the session.

Mapping Name Name of the mapping associated with the session.

Source Success Rows   Number of rows the Integration Service successfully read from the source.

Source Failed Rows   Number of rows the Integration Service failed to read from the source.

Target Success Rows*   Number of rows the Integration Service wrote to the target.

Target Failed Rows   Number of rows the Integration Service failed to write to the target.

Total Transformation Errors   Number of transformation errors in the session.
* For a recovery session, this value lists the number of rows the Integration Service processed after recovery. To determine
the number of rows processed before recovery, see the session log.

Viewing Source and Target Statistics


The Source/Target Statistics area displays information about the rows the Integration Service read from the
sources and the rows it loaded to the target.
Table 22-11 describes the attributes that appear in the Source/Target Statistics area:

Table 22-11. Workflow Monitor Source and Target Statistics Attributes

Attribute Name Description

Transformation Name Name of the source qualifier instance or the target instance in the mapping. If you
create multiple partitions in the source or target, the Instance Name displays the
partition number. If the source or target contains multiple groups, the Instance Name
displays the group name.

Node Node running the transformation.

Applied Rows* For sources, shows the number of rows the Integration Service successfully read
from the source. For targets, shows the number of rows the Integration Service
successfully applied to the target.

Affected Rows* For sources, shows the number of rows the Integration Service successfully read
from the source.
For targets, shows the number of rows affected by the specified operation. For
example, you have a table with one column called SALES_ID and five rows contain
the values 1, 2, 3, 2, and 2. You mark rows for update where SALES_ID is 2. The
writer affects three rows, even though there was one update request. Or, if you mark
rows for update where SALES_ID is 4, the writer affects 0 rows.

Rejected Rows Number of rows the Integration Service dropped when reading from the source, or
the number of rows the Integration Service rejected when writing to the target.

Throughput (Rows/Sec)   Rate at which the Integration Service read rows from the source or wrote data into
the target per second.

Throughput (Bytes/Sec)   Estimated rate at which the Integration Service read data from the source and wrote
data to the target in bytes per second. Throughput (Bytes/Sec) is based on the Throughput (Rows/Sec) and the
row size. The row size is based on the number of columns the Integration Service read from the source and
wrote to the target, the data movement mode, column metadata, and whether you enabled high precision for the
session. The calculation is not based on the actual data size in each row.

Bytes   Total bytes the Integration Service read from the source or wrote to the target. It is calculated by
multiplying the row count by the row size.

Last Error Code Error message code of the most recent error message written to the session log. If
you view details after the session completes, this field displays the last error code.

Last Error Message Most recent error message written to the session log. If you view details after the
session completes, this field displays the last error message.

Start Time Time the Integration Service started to read from the source or write to the target.
The Workflow Monitor displays time relative to the Integration Service.

End Time Time the Integration Service finished reading from the source or writing to the target.
The Workflow Monitor displays time relative to the Integration Service.
* For a recovery session, this value lists the number of rows the Integration Service affected or applied to the target after
recovery. To determine the number of rows processed before recovery, see the session log.
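
The byte-based statistics above are derived from the row-based statistics. The following sketch, which is not part of PowerCenter, illustrates the arithmetic: it estimates Throughput (Bytes/Sec) and total Bytes from a row count, a run time, and an estimated row size. The 200-byte row size in the example is a hypothetical value; the Integration Service derives the actual row size from column metadata, the data movement mode, and the high precision setting.

def estimate_byte_statistics(rows, seconds, estimated_row_size_bytes):
    """Estimate Throughput (Bytes/Sec) and total Bytes from row statistics."""
    rows_per_sec = rows / seconds if seconds else 0
    bytes_per_sec = rows_per_sec * estimated_row_size_bytes
    total_bytes = rows * estimated_row_size_bytes
    return rows_per_sec, bytes_per_sec, total_bytes

# Example: 120,000 rows loaded in 60 seconds with an estimated 200-byte row.
print(estimate_byte_statistics(120_000, 60, 200))
# (2000.0, 400000.0, 24000000)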

Viewing Partition Details


The Partition Details area displays information about partitions in a session. When you create multiple
partitions in a session, the Integration Service provides session details for each partition. Use these details to
determine if the data is evenly distributed among the partitions. For example, if the Integration Service moves
more rows through one target partition than another, or if the throughput is not evenly distributed, you might
want to adjust the data range for the partitions.
Table 22-12 describes the attributes that appear in the Partition Details area:

Table 22-12. Workflow Monitor Partition Details Attributes

Attribute Name Description

Partition Name Name of the partition.

Node Node running the partition.

Transformations Transformations in the partition pipeline.

Process ID Process ID of the partition.

CPU % Percent of the CPU the partition is consuming during the current session run.

CPU Seconds Amount of process time in seconds the CPU is taking to process the data in the
partition during the current session run.

Memory Usage Amount of memory the partition consumes during the current session run.

Input Rows Number of input rows for the partition.

Output Rows* Number of output rows for the partition.


* For a recovery session, this value lists the number of rows the Integration Service processed after recovery. To determine the
number of rows processed before recovery, see the session log.
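
As noted above, you can use the partition details to check whether data is evenly distributed among the partitions. The following sketch, which is not a PowerCenter API, shows one way to flag skewed partitions from their Output Rows values; the partition names and the 50 percent tolerance are hypothetical examples.

def find_skewed_partitions(output_rows_by_partition, tolerance=0.5):
    """Return partitions whose row counts deviate from the average by more than the tolerance."""
    counts = list(output_rows_by_partition.values())
    average = sum(counts) / len(counts)
    return {
        name: rows
        for name, rows in output_rows_by_partition.items()
        if abs(rows - average) > tolerance * average
    }

# Example: Partition 2 moves far more rows than the others, so you might
# adjust the data range or key range for the partitions.
print(find_skewed_partitions({"Partition 1": 9800, "Partition 2": 31000, "Partition 3": 10200}))
# {'Partition 2': 31000}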

Viewing Performance Details


The performance details provide counters that help you understand the session and mapping efficiency. Each
source qualifier and target definition appears in the performance details, along with counters that display
performance information about each transformation.
You can view session performance details in the following formats:
♦ Workflow Monitor. When you configure the session to collect performance details, you can view them in
the Workflow Monitor. When you configure the session to store performance details, you can view the
details for previous sessions.
♦ Performance details file. The Integration Service creates a performance detail file for the session when it
completes. Use a text editor to view the performance details file.
By evaluating the final performance details, you can determine where session performance slows down. The
Workflow Monitor also provides session-specific details that can help tune the following memory settings:

♦ Buffer block size
♦ Index and data cache size for Aggregator, Rank, Lookup, and Joiner transformations

To view performance details in the Workflow Monitor:

1. Right-click a session in the Workflow Monitor and choose Get Run Properties.
2. Click the Performance area in the Properties window.
The following table describes the attributes that appear in the Performance area:

Attribute Name Description

Performance Counter Name of the performance counter.

Counter Value Value of the performance counter.

When you create multiple partitions, the Performance Area displays a column for each partition. The
columns display the counter values for each partition.
3. Click OK.

To view the performance details file:

1. Locate the performance details file.


The Integration Service names the file session_name.perf, and stores it in the same directory as the session
log. If there is no session-specific directory for the session log, the Integration Service saves the file in the
default log files directory.
2. Open the file in any text editor.

RELATED TOPICS:
♦ “Configuring Performance Details” on page 126

Understanding Performance Counters


All transformations have some basic counters that indicate the number of input rows, output rows, and error
rows.
Source Qualifier, Normalizer, and target transformations have additional counters that indicate the efficiency of
data moving into and out of buffers. Use these counters to locate performance bottlenecks.
Some transformations have counters specific to their functionality. For example, each Lookup transformation
has a counter that indicates the number of rows stored in the lookup cache.
When you view the performance details file, the first column displays the transformation name as it appears in
the mapping, the second column contains the counter name, and the third column holds the resulting number
or efficiency percentage. If you use a Joiner transformation, the first column shows two instances of the Joiner
transformation:
♦ <Joiner transformation> [M]. Displays performance details about the master pipeline of the Joiner
transformation.
♦ <Joiner transformation> [D]. Displays performance details about the detail pipeline of the Joiner
transformation.
When you create multiple partitions, the Integration Service generates one set of counters for each partition.
The following performance counters illustrate two partitions for an Expression transformation:
Transformation Name   Counter Name              Counter Value
EXPTRANS [1]          Expression_input rows     8
                      Expression_output rows    8
EXPTRANS [2]          Expression_input rows     16
                      Expression_output rows    16

Note: When you increase the number of partitions, the number of aggregate or rank input rows may be different
from the number of output rows from the previous transformation.
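
Because the performance details file uses the three-column layout described above (transformation name, counter name, counter value), you can post-process it with a simple script. The following is a minimal sketch, not a supported utility; the whitespace-based parsing and the file name are assumptions, so adjust them to match the files your Integration Service produces.

def read_performance_details(path):
    """Collect counter values from a performance details file into a dictionary."""
    counters = {}
    with open(path) as perf_file:
        for line in perf_file:
            parts = line.split()
            if len(parts) < 3:
                continue  # skip blank lines and header rows
            # Assume the first token names the transformation, the last token is
            # the value, and everything in between is the counter name.
            transformation, value = parts[0], parts[-1]
            counter_name = " ".join(parts[1:-1])
            counters.setdefault(transformation, {})[counter_name] = value
    return counters

# Example with a hypothetical file name that follows the session_name.perf convention.
print(read_performance_details("s_m_PhoneList.perf"))
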
Table 22-13 describes the counters that may appear in the Session Performance Details area or in the
performance details file:

Table 22-13. Performance Counters

Aggregator and Rank Transformations
  Aggregator/Rank_inputrows       Number of rows passed into the transformation.
  Aggregator/Rank_outputrows      Number of rows sent out of the transformation.
  Aggregator/Rank_errorrows       Number of rows in which the Integration Service encountered an error.
  Aggregator/Rank_readfromcache   Number of times the Integration Service read from the index or data cache.
  Aggregator/Rank_writetocache    Number of times the Integration Service wrote to the index or data cache.
  Aggregator/Rank_readfromdisk    Number of times the Integration Service read from the index or data file on
                                  the local disk, instead of using cached data.
  Aggregator/Rank_writetodisk     Number of times the Integration Service wrote to the index or data file on
                                  the local disk, instead of using cached data.
  Aggregator/Rank_newgroupkey     Number of new groups the Integration Service created.
  Aggregator/Rank_oldgroupkey     Number of times the Integration Service used existing groups.

Lookup Transformation
  Lookup_inputrows                Number of rows passed into the transformation.
  Lookup_outputrows               Number of rows sent out of the transformation.
  Lookup_errorrows                Number of rows in which the Integration Service encountered an error.
  Lookup_rowsinlookupcache        Number of rows stored in the lookup cache.

Joiner Transformation (Master and Detail)
  Joiner_inputMasterRows          Number of rows the master source passed into the transformation.
  Joiner_inputDetailRows          Number of rows the detail source passed into the transformation.
  Joiner_outputrows               Number of rows sent out of the transformation.
  Joiner_errorrows                Number of rows in which the Integration Service encountered an error.
  Joiner_readfromcache            Number of times the Integration Service read from the index or data cache.
  Joiner_writetocache             Number of times the Integration Service wrote to the index or data cache.
  Joiner_readfromdisk*            Number of times the Integration Service read from the index or data files on
                                  the local disk, instead of using cached data.
  Joiner_writetodisk*             Number of times the Integration Service wrote to the index or data files on
                                  the local disk, instead of using cached data.
  Joiner_readBlockFromDisk**      Number of times the Integration Service read from the index or data files on
                                  the local disk, instead of using cached data.
  Joiner_writeBlockToDisk**       Number of times the Integration Service wrote to the index or data cache.
  Joiner_seekToBlockInDisk**      Number of times the Integration Service accessed the index or data files on
                                  the local disk.
  Joiner_insertInDetailCache*     Number of times the Integration Service wrote to the detail cache. The
                                  Integration Service generates this counter if you join data from a single source.
  Joiner_duplicaterows            Number of duplicate rows the Integration Service found in the master relation.
  Joiner_duplicaterowsused        Number of times the Integration Service used the duplicate rows in the master
                                  relation.

All Other Transformations
  Transformation_inputrows        Number of rows passed into the transformation.
  Transformation_outputrows       Number of rows sent out of the transformation.
  Transformation_errorrows        Number of rows in which the Integration Service encountered an error.

*The Integration Service generates this counter when you use sorted input for the Joiner transformation.
**The Integration Service generates this counter when you do not use sorted input for the Joiner transformation.

If you have multiple source qualifiers and targets, evaluate them as a whole. For source qualifiers and targets, a
high value is considered 80-100 percent. Low is considered 0-20 percent.

CHAPTER 23

Running Workflows and Sessions on a Grid

This chapter includes the following topics:
♦ Overview, 365
♦ Running Workflows on a Grid, 366
♦ Running Sessions on a Grid, 366
♦ Working with Partition Groups, 367
♦ Grid Connectivity and Recovery, 369
♦ Configuring a Workflow or Session to Run on a Grid, 369

Overview
When a PowerCenter domain contains multiple nodes, you can configure workflows and sessions to run on a
grid. When you run a workflow on a grid, the Integration Service runs a service process on each available node
of the grid to increase performance and scalability. When you run a session on a grid, the Integration Service
distributes session threads to multiple DTM processes on nodes in the grid to increase performance and
scalability.
You create the grid and configure the Integration Service in the Administration Console. To run a workflow on
a grid, you configure the workflow to run on the Integration Service associated with the grid. To run a session
on a grid, configure the session to run on the grid.
Figure 23-1 shows the relationship between the workflow and nodes when you run a workflow on a grid:

Figure 23-1. Running a Workflow on a Grid

You assign a workflow to run on an Integration Service. The Integration Service is associated with a grid. The
grid is assigned to multiple nodes, and the workflow runs on the nodes in the grid.

The Integration Service distributes workflow tasks and session threads based on how you configure the
workflow or session to run:
♦ Running workflows on a grid. The Integration Service distributes workflows across the nodes in a grid. It
also distributes the Session, Command, and predefined Event-Wait tasks within workflows across the nodes
in a grid.
♦ Running sessions on a grid. The Integration Service distributes session threads across nodes in a grid.
Note: To run workflows on a grid, you must have the Server grid option. To run sessions on a grid, you must
have the Session on Grid option.

Running Workflows on a Grid


When you run a workflow on a grid, the master service process runs the workflow and all tasks except Session,
Command, and predefined Event-Wait tasks, which it may distribute to other nodes. The master service process
is the Integration Service process that runs the workflow, monitors service processes running on other nodes,
and runs the Load Balancer. The Scheduler runs on the master service process node, so it uses the date and time
for the master service process node to start scheduled workflows.
The Load Balancer is the component of the Integration Service that dispatches Session, Command, and
predefined Event-Wait tasks to the nodes in the grid. The Load Balancer distributes tasks based on node
availability. If the Integration Service is configured to check resources, the Load Balancer also distributes tasks
based on resource availability.
For example, a workflow contains a Session task, a Decision task, and a Command task. You specify a resource
requirement for the Session task. The grid contains four nodes, and Node 4 is unavailable. The master service
process runs the Start and Decision tasks. The Load Balancer distributes the Session and Command tasks to
nodes on the grid based on resource availability and node availability.
Figure 23-2 shows a workflow distributed to the nodes in a grid:

Figure 23-2. Workflow Distributed to the Nodes in a Grid

The Start task runs on the master service process node (Node 1). The Session task runs on a node where
resources are available (Node 2). The Decision task runs on the master service process node (Node 1). The
Command task runs on a node where resources are available (Node 3).

Running Sessions on a Grid


When you run a session on a grid, the master service process runs the workflow and all tasks except Session,
Command, and predefined Event-Wait tasks as it does when you run a workflow on a grid. The Scheduler runs
on the master service process node, so it uses the date and time for the master service process node to start
scheduled workflows. In addition, the Load Balancer distributes session threads to DTM processes running on
different nodes.

When you run a session on a grid, the Load Balancer distributes session threads based on the following factors:
♦ Node availability. The Load Balancer verifies which nodes are currently running, enabled, and available for
task dispatch.
♦ Resource availability. If the Integration Service is configured to check resources, it identifies nodes that have
resources required by mapping objects in the session.
♦ Partitioning configuration. The Load Balancer dispatches groups of session threads to separate nodes based
on the partitioning configuration.
You might want to configure a session to run on a grid when the workflow contains a session that takes a long
time to run.
For example, a workflow contains a session with one partition. To balance the load, you configure the session to
run on a grid and configure the Integration Service to check resources. The Load Balancer distributes the
reader, writer, and transformation threads to DTM processes running on the nodes in the grid. The reader
threads require a resource, so the Load Balancer distributes them to a DTM process on the node where
resources are available.
Figure 23-3 shows session threads distributed to DTM processes running on nodes in a grid:

Figure 23-3. Session Threads Distributed to DTM Processes Running on Nodes in a Grid

Transformation threads run on an available node (Node 1). Writer threads run on an available node (Node 2).
Reader threads run on the node where resources are available (Node 3). Node 4 is unavailable, so no threads
run on it.

RELATED TOPICS:
♦ “Assigning Resources to Tasks” on page 372

Working with Partition Groups


When you run a session on a grid, the Data Transformation Manager process (DTM) forms groups of session
threads called partition groups. A partition group is a group of reader, writer, or transformation threads that run
in a single DTM process. A partition group might include one or more pipeline stages. A pipeline stage is the
section of a pipeline executed between any two partition points. Some transformations are not partitionable
across a grid. When a transformation is not partitionable across a grid, the DTM creates a single partition group
for the transformation threads and runs those threads on a single node.

Forming Partition Groups Without Resource Requirements


If the session has more than one partition, the DTM forms partition groups based on the partitioning
configuration. For example, a session is configured with two partitions. The DTM creates partition groups for
the threads in each partition, and the Load Balancer distributes the groups to two nodes.

Figure 23-4 shows partition groups distributed for a session containing two partitions:

Figure 23-4. Partition Groups Distributed Based on Partitioning Configuration

Partition 1 (Reader 1, Transformation 1, Writer 1) forms Partition Group 1, which runs on Node 1. Partition 2
(Reader 2, Transformation 2, Writer 2) forms Partition Group 2, which runs on Node 2.

Forming Partition Groups With Resource Requirements


When you specify resource requirements for a mapping object, the DTM process creates partition groups based
on the resources available on a particular node. For example, if the source files for the session are available on a
particular node and you specified a resource requirement for the Source Qualifier transformation, the DTM
process forms partition groups based on this requirement.
To meet the resource requirements of the Source Qualifier transformation, the DTM process creates a partition
group from the reader threads. The Load Balancer distributes the reader threads to the node where the resource
is available.
Note: To cause the Load Balancer to distribute threads to nodes where required resources are available, you must
configure the Integration Service to check resources.
Figure 23-5 shows partition groups distributed based on partitioning configuration and resource availability:

Figure 23-5. Partition Groups Distributed Based on Resource Availability

The reader threads for both partitions form Partition Group 1, which runs on Node 1 where resources are
available. The remaining transformation and writer threads form Partition Group 2, which runs on Node 2.

Rules and Guidelines


The Integration Service uses the following rules and guidelines to create partition groups:
♦ The Integration Service limits the number of partition groups to the number of nodes in a grid.
♦ When a transformation is partitionable locally, the DTM process forms one partition group for the
transformation threads, and runs that group in one DTM process. The following transformations are
partitioned locally:
− Custom transformation configured to partition locally
− External Procedure transformation
− Cached Lookup transformation
− Unsorted Joiner transformation
− SDK Reader or Writer transformation configured to partition locally

Working with Caches


The Integration Service creates index and data caches for the Aggregator, Rank, Joiner, Sorter, and Lookup
transformations. When the session contains more than one partition, the transformation threads may be
distributed to more than one node in the grid. To create a single data and index cache for these transformation
threads, verify that the root directory and cache directory point to the same location for all nodes in the grid.
When the Integration Service creates a cache for a Lookup transformation in a shared location, it builds a cache
for the first partition group, and subsequent partition groups use this cache. When you do not configure a
shared location for the Lookup transformation cache files, each service process on a separate node fetches data
from the database or source files to create a cache. If the source data changes frequently, the caches created on
separate nodes can be inconsistent.

RELATED TOPICS:
♦ “Session Caches” on page 455

Grid Connectivity and Recovery


When you run a workflow or session on a grid, service processes and DTM processes run on different nodes.
Network failures can cause connectivity loss between processes running on separate nodes. Services may shut
down unexpectedly, or you may disable the Integration Service or service processes while a workflow or session
is running. The Integration Service failover and recovery behavior in these situations depends on the service
process that is disabled, shuts down, or loses connectivity. Recovery behavior also depends on the following
factors:
♦ High availability option. When you have high availability, workflows fail over to another node if the node or
service shuts down. If you do not have high availability, you can manually restart a workflow on another
node to recover it.
♦ Recovery strategy. You can configure a workflow to suspend on error. You configure a recovery strategy for
tasks within the workflow. When a workflow suspends, the recovery behavior depends on the recovery
strategy you configure for each task in the workflow.
♦ Shutdown mode. When you disable an Integration Service or service process, you can specify that the service
completes, aborts, or stops processes running on the service. Behavior differs when you disable the
Integration Service or you disable a service process. Behavior also differs when you disable a master service
process or a worker service process. The Integration Service or service process may also shut down
unexpectedly. In this case, the failover and recovery behavior depend on which service process shuts down
and the configured recovery strategy.
♦ Running mode. If the workflow runs on a grid, the Integration Service can recover workflows and tasks on
another node. If a session runs on a grid, you cannot configure a resume recovery strategy.
♦ Operating mode. If the Integration Service runs in safe mode, recovery is disabled for sessions and
workflows.
Note: You cannot configure an Integration Service to fail over in safe mode if it runs on a grid.

RELATED TOPICS:
♦ “Recovering Workflows” on page 221

Configuring a Workflow or Session to Run on a Grid


Before you can run a session or workflow on a grid, the grid must be assigned to multiple nodes, and the
Integration Service must be configured to run on the grid. You create the grid and assign the Integration Service
in the PowerCenter Administration Console. You may need to verify these settings with the domain
administrator.

To run a workflow or session on a grid, configure the following properties and settings:
♦ Workflow properties. On the General tab of the workflow properties, assign an Integration Service to run
the workflow. Verify that the Integration Service is configured to run on a grid.
♦ Session properties. To run a session on a grid, enable the session to run on a grid in the Config Object tab of
the session properties.
♦ Resource requirements. You configure resource requirements on the General tab of the Session, Command,
and predefined Event-Wait tasks.

RELATED TOPICS:
♦ “Assigning Resources to Tasks” on page 372

Rules and Guidelines


Use the following rules and guidelines when you configure a session or workflow to run on a grid:
♦ If you override a service process variable, ensure that the Integration Service can access input files, caches,
logs, storage and temporary directories, and source and target file directories.
♦ To ensure that a Session, Command, or predefined Event-Wait task runs on a particular node, configure the
Integration Service to check resources and specify a resource requirement for the task.
♦ To ensure that session threads for a mapping object run on a particular node, configure the Integration
Service to check resources and specify a resource requirement for the object.
♦ When you run a session that creates cache files, configure the root and cache directory to use a shared
location to ensure consistency between cache files.
♦ Ensure the Integration Service builds the cache in a shared location when you add a partition point at a
Joiner transformation and the transformation is configured for 1:n partitioning. The cache for the Detail
pipeline must be shared.
♦ Ensure the Integration Service builds the cache in a shared location when you add a partition point at a
Lookup transformation, and the partition type is not hash auto-keys.
♦ When you run a session that uses dynamic partitioning, and you want to distribute session threads across all
nodes in the grid, configure dynamic partitioning for the session to use the “Based on number of nodes in
the grid” method.
♦ You cannot run a debug session on a grid.
♦ You cannot configure a resume recovery strategy for a session that you run on a grid.
♦ Configure the session to run on a grid when you work with sessions that take a long time to run.
♦ Configure the workflow to run on a grid when you have multiple concurrent sessions.
♦ You can run a persistent profile session on a grid, but you cannot run a temporary profile session on a grid.
♦ When you use a Sequence Generator transformation, increase the number of cached values to reduce the
communication required between the master and worker DTM processes and the repository.
♦ To ensure that the Log Viewer can accurately order log events when you run a workflow or session on a grid,
use time synchronization software to ensure that the nodes of a grid use a synchronized date/time.
♦ If the workflow uses an Email task in a Windows environment, configure the same Microsoft Outlook
profile on each node to ensure the Email task can run.

CHAPTER 24

Working with the Load Balancer


This chapter includes the following topics:
♦ Overview, 371
♦ Assigning Service Levels to Workflows, 371
♦ Assigning Resources to Tasks, 372

Overview
The Load Balancer dispatches tasks to Integration Service processes running on nodes. When you run a
workflow, the Load Balancer dispatches the Session, Command, and predefined Event-Wait tasks within the
workflow. If the Integration Service is configured to check resources, the Load Balancer matches task
requirements with resource availability to identify the best node to run a task. It may dispatch tasks to a single
node or across nodes.
To identify the nodes that can run a task, the Load Balancer matches the resources required by the task with the
resources available on each node. It dispatches tasks in the order it receives them. When the Load Balancer has
more Session and Command tasks to dispatch than the Integration Service can run at the time, the Load
Balancer places the tasks in the dispatch queue. When nodes become available, the Load Balancer dispatches the
waiting tasks from the queue in the order determined by the workflow service level.
You assign resources and service levels using the Workflow Manager. You can perform the following tasks:
♦ Assign service levels. You assign service levels to workflows. Service levels establish priority among workflow
tasks that are waiting to be dispatched.
♦ Assign resources. You assign resources to tasks. Session, Command, and predefined Event-Wait tasks require
PowerCenter resources to succeed. If the Integration Service is configured to check resources, the Load
Balancer dispatches these tasks to nodes where the resources are available.

Assigning Service Levels to Workflows


Service levels determine the order in which the Load Balancer dispatches tasks from the dispatch queue. When
multiple tasks are waiting to be dispatched, the Load Balancer dispatches high priority tasks before low priority
tasks. You create service levels and configure the dispatch priorities in the Administration Console.
You assign service levels to workflows on the General tab of the workflow properties.
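
The following is a conceptual sketch, not PowerCenter's implementation, of the behavior described above: waiting tasks are released from the dispatch queue in order of their workflow's service level. The service level names and dispatch priority numbers are hypothetical; you define the real ones in the Administration Console.

import heapq

class DispatchQueue:
    """Toy model of a queue that releases higher-priority tasks first."""

    def __init__(self):
        self._heap = []
        self._arrival = 0  # preserves arrival order within a priority

    def add(self, task_name, dispatch_priority):
        # Lower dispatch priority numbers are released first in this sketch.
        heapq.heappush(self._heap, (dispatch_priority, self._arrival, task_name))
        self._arrival += 1

    def dispatch_next(self):
        _, _, task_name = heapq.heappop(self._heap)
        return task_name

queue = DispatchQueue()
queue.add("s_LoadOrders", dispatch_priority=5)     # default service level
queue.add("s_LoadCustomers", dispatch_priority=1)  # higher-priority service level
queue.add("cmd_ArchiveFiles", dispatch_priority=5)
print(queue.dispatch_next())  # s_LoadCustomers is dispatched first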

Assigning Resources to Tasks
PowerCenter resources are the database connections, files, directories, node names, and operating system types
required for a task to succeed. The Load Balancer may use resources to dispatch tasks. If the
Integration Service is not configured to run on a grid or check resources, the Load Balancer ignores resource
requirements. It dispatches all tasks to the master Integration Service process running on the node.
If the Integration Service runs on a grid and is configured to check resources, the Load Balancer uses resources
to dispatch tasks. The Integration Service matches the resources required by tasks in a workflow with the
resources available on each node in the grid to determine which nodes can run the tasks. The Load Balancer
distributes the Session, Command, and predefined Event-Wait tasks to nodes with available resources. For
example, if a session requires a file resource for a reserved words file, the Load Balancer dispatches the session to
nodes that have access to the file. A task fails if the Integration Service cannot identify a node where the
required resource is available.
In the Administration Console, you define the resources that are available to each node. Resources are either
predefined or user-defined. Predefined resources include connections available to a node, node name, and
operating system type. User-defined resources include file/directory resources and custom resources.
In the task properties, you assign PowerCenter resources to nonreusable tasks that require those resources. You
cannot assign resources to reusable tasks.
Table 24-1 lists resource types and the repository objects to which you can assign them:

Table 24-1. Resource Types and Associated Repository Objects

Resource Type   Predefined/User-Defined   Repository Objects that Use Resources

Custom User-defined Session, Command, and predefined Event-Wait task instances and all
mapping objects within a session.

File/Directory User-defined Session, Command, and predefined Event-Wait task instances, and the
following mapping objects within a session:
- Source qualifiers
- Aggregator transformation
- Custom transformation
- External Procedure transformation
- Joiner transformation
- Lookup transformation
- Sorter transformation
- Java transformation
- HTTP transformation
- SQL transformation
- Union transformation
- Targets

Node Name Predefined Session, Command, and predefined Event-Wait task instances and all
mapping objects within a session.

Operating System Type   Predefined   Session, Command, and predefined Event-Wait task instances and all
mapping objects within a session.

If you try to assign a resource type that does not apply to a repository object, the Workflow Manager displays
the following error message:
The selected resource cannot be applied to this type of object. Please select a different
resource.

The Workflow Manager assigns connection resources. When you use a relational, FTP, or external loader
connection, the Workflow Manager assigns the connection resource to sources, targets, and transformations in a
session instance. You cannot manually assign a connection resource in the Workflow Manager.

To assign resources to a task instance:

1. Open the task properties in the Worklet or Workflow Designer.


If the task is an Event-Wait task, you can assign resources only if the task waits for a predefined event.
2. On the General tab, click Edit.
3. In the Edit Resources dialog box, click the Add button to add a resource.
4. In the Select Resource dialog box, choose an object you want to assign a resource to. The Resources list
shows the resources available to the nodes where the Integration Service runs.
5. Select the resource to assign and click Select.
6. In the Edit Resources dialog box, click OK.

CHAPTER 25

Session and Workflow Logs


This chapter includes the following topics:
♦ Overview, 375
♦ Log Events, 376
♦ Log Events Window, 377
♦ Working with Log Files, 378
♦ Workflow Logs, 382
♦ Session Logs, 383
♦ Viewing Log Events, 384

Overview
The Service Manager provides accumulated log events from each service in the domain and for sessions and
workflows. To perform the logging function, the Service Manager runs a Log Manager and a Log Agent. The
Log Manager runs on the master gateway node. The Integration Service generates log events for workflows and
sessions. The Log Agent runs on the nodes to collect and process log events for sessions and workflows.
Log events for workflows include information about tasks performed by the Integration Service, workflow
processing, and workflow errors. Log events for sessions include information about the tasks performed by the
Integration Service, session errors, and load summary and transformation statistics for the session.
You can view log events for workflows with the Log Events window in the Workflow Monitor. The Log Events
window displays information about log events including severity level, message code, run time, workflow name,
and session name. For session logs, you can set the tracing level to log more information. All log events display
severity regardless of tracing level.
The following steps describe how the Log Manager processes session and workflow logs:
1. The Integration Service writes binary log files on the node. It sends information about the sessions and
workflows to the Log Manager.
2. The Log Manager stores information about workflow and session logs in the domain configuration
database. The domain configuration database stores information such as the path to the log file location,
the node that contains the log, and the Integration Service that created the log.
3. When you view a session or workflow in the Log Events window, the Log Manager retrieves the
information from the domain configuration database to determine the location of the session or workflow
logs.

4. The Log Manager dispatches a Log Agent to retrieve the log events on each node to display in the Log
Events window.
To access log events for more than the last workflow run, you can configure sessions and workflows to archive
logs by time stamp. You can also configure a workflow to produce text log files. You can archive text log files by
run or by time stamp. When you configure the workflow or session to produce text log files, the Integration
Service creates the binary log and the text log file.

Log Events
You can view log events in the Workflow Monitor Log Events window and you can view them as text files. The
Log Events window displays log events in a tabular format.

Log Codes
Use log events to determine the cause of workflow or session problems. To resolve problems, locate the relevant
log codes and text prefixes in the workflow and session log. For more information about the error, see the
PowerCenter Message Reference.
The Integration Service precedes each workflow and session log event with a thread identification, a code, and a
number. The code defines a group of messages for a process. The number defines a message. The message can
provide general information or it can be an error message.
Some log events are embedded within other log events. For example, a code CMN_1039 might contain
informational messages from Microsoft SQL Server.

Message Severity
The Log Events window categorizes workflow and session log events into severity levels. It prioritizes error
severity based on the embedded message type. The error severity level appears with log events in the Log Events
window in the Workflow Monitor. It also appears with messages in the workflow and session log files.
Table 25-1 describes message severity levels:

Table 25-1. Message Severity Levels

Severity Level Description

FATAL Fatal error occurred. Fatal error messages have the highest severity level.

ERROR Indicates the service failed to perform an operation or respond to a request from a client
application. Error messages have the second highest severity level.

WARNING Indicates the service is performing an operation that may cause an error. This can cause
repository inconsistencies. Warning messages have the third highest severity level.

INFO Indicates the service is performing an operation that does not indicate errors or problems.
Information messages have the third lowest severity level.

TRACE Indicates service operations at a more specific level than Information. Tracing messages
generally record message sizes. Trace messages have the second lowest severity
level.

DEBUG Indicates service operations at the thread level. Debug messages generally record the
success or failure of service operations. Debug messages have the lowest severity level.
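
If you export log events to a text or XML file from the Log Events window, you can filter them by the severity levels in Table 25-1. The following sketch is for illustration only; the numeric ranking and the (severity, message) event structure are assumptions, not a PowerCenter API, and the sample messages are hypothetical.

SEVERITY_RANK = {"FATAL": 6, "ERROR": 5, "WARNING": 4, "INFO": 3, "TRACE": 2, "DEBUG": 1}

def filter_by_severity(events, minimum="WARNING"):
    """Keep only events at or above the given severity level."""
    threshold = SEVERITY_RANK[minimum]
    return [event for event in events if SEVERITY_RANK.get(event[0], 0) >= threshold]

# Hypothetical exported events as (severity, message) pairs.
events = [("INFO", "Session run completed"), ("ERROR", "Writer failed to connect to the target")]
print(filter_by_severity(events))  # keeps only the ERROR event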

Writing Logs
The Integration Service writes the workflow and session logs as binary files on the node where the service
process runs. It adds a .bin extension to the log file name you configure in the session and workflow properties.

When you run a session on a grid, the Integration Service creates one session log for each DTM process. The
log file on the primary node has the configured log file name. The log file on a worker node has a .w<Partition
Group Id> extension:
<session or workflow name>.w<Partition Group ID>.bin

For example, if you run the session s_m_PhoneList on a grid with three nodes, the session log files use the
names, s_m_PhoneList.bin, s_m_PhoneList.w1.bin, and s_m_PhoneList.w2.bin.
When you rerun a session or workflow, the Integration Service overwrites the binary log file unless you choose
to save workflow logs by time stamp. When you save workflow logs by time stamp, the Integration Service adds
a time stamp to the log file name and archives them.
To view log files for more than one run, configure the workflow or session to create log files.
A workflow or session continues to run even if there are errors while writing to the log file after the workflow or
session initializes. If the log file is incomplete, the Log Events window cannot display all the log events.
The Integration Service starts a new log file for each workflow and session run. When you recover a workflow or
session, the Integration Service appends a recovery.time stamp extension to the file name for the recovery run.
For real-time sessions, the Integration Service overwrites the log file when you restart a session in cold start
mode or when you restart a JMS or WebSphere MQ session that does not have recovery data. The Integration
Service appends the log file when you restart a JMS or WebSphere MQ session that has recovery data.
To convert the binary file to a text file, use the infacmd convertLog or the infacmd GetLog command.

Passing Session Events to an External Library


You can pass session event messages to an external procedure to handle. You write the procedure according to
how you want to handle the events and compile it into a shared library. The shared library must implement a set
of functions in the Session Log Interface provided by PowerCenter. In the Administration Console, you
configure the Integration Service to use the shared library to handle the session logs.
The Session Log Interface lets you pass session event messages, but not workflow event messages, to an external
shared library.

Log Events Window


View log events in the Log Events window. The Log Events window displays the following information for each
session and workflow:
♦ Severity. Lists the type of message, such as informational or error.
♦ Time stamp. Date and time the log event reached the Log Agent.
♦ Node. Node on which the Integration Service process is running.
♦ Thread. Thread ID for the workflow or session.
♦ Process ID. Windows or UNIX process identification numbers. Displays in the Output window only.
♦ Message Code. Message code and number.
♦ Message. Message associated with the log event.
By default, the Log Events window displays log events according to the date and time the Integration Service
writes the log event on the node. The Log Events window displays logs consisting of multiple log files by node
name. When you run a session on a grid, log events for the partition groups are ordered by node name and
grouped by log file.
You can perform the following tasks in the Log Events window:
♦ Save log events to file. Click Save As to save log events as a binary, text, or XML file.
♦ Copy log event text to a file. Click Copy to copy one or more log events and paste them into a text file.
♦ Sort log events. Click a column heading to sort log events.
♦ Search for log events. Click Find to search for text in log events.
♦ Refresh log events. Click Refresh to view updated log events during a workflow or session run.
Note: When you view a log larger than 2 GB, the Log Events window displays a warning that the file might be
too large for system memory. If you continue, the Log Events window might shut down unexpectedly.

Searching for Log Events


Search for log events based on any information in the Log Events window. For example, you can search for text
in a message or search for messages based on the date and time of the log event.

To search for log events:

1. Open the Workflow Monitor.


2. Connect to a repository in the Navigator.
3. Select an Integration Service.
4. Right-click a workflow and select Get Workflow Log.
The Log Events window displays.
5. In the Log Events window, click Find.
The Query Area appears.
6. Enter the text you want to find.
7. Optionally, click Match Case if you want the query to be case sensitive.
8. Select one of the following options:
♦ Message. Select to search text in the Message field.
♦ All Fields. Select to search text in all fields.
9. Click Find Next to search for the next instance of the text in the Log Events. Or, click Find Previous to
search for the previous instance of the text in the Log Events.

Keyboard Shortcuts for the Log Events Window


The following table lists shortcuts that you can use to search for Log Events:

To Press

Open the Query Area. Ctrl+F

Find next instance of the text. F3

Find the previous instance of the text. Shift+F3

Working with Log Files


Configure a workflow or session to write log events to log files in the workflow or session properties. The
Integration Service writes information about the workflow or session run to a text file in addition to writing log
events to a binary file. If you configure workflow or session properties to create log files, you can open the text
files with any text editor or import the binary files to view logs in the Log Events window.

By default, the Integration Service writes log files based on the Integration Service code page. If you enable the
LogInUTF8 option in the Advanced Properties for the Integration Service, the Integration Service writes to the
logs using the UTF-8 character set. If you configure the Integration Service to run in ASCII mode, it sorts all
character data using a binary sort order even if you select a different sort order in the session properties.
Optimize performance by disabling the option to create text log files.

Writing to Log Files


When you create a workflow or session log, you can configure log options in the workflow or session properties.
You can configure the following information for a workflow or session log:
♦ Write Backward Compatible Log File. Select this option to create a text file for workflow or session logs. If
you do not select the option, the Integration Service creates the binary log only.
♦ Log File Directory. The directory where you want the log file created. By default, the Integration Service
writes the workflow log file in the directory specified in the service process variable, $PMWorkflowLogDir.
It writes the session log file in the directory specified in the service process variable, $PMSessionLogDir. If
you enter a directory name that the Integration Service cannot access, the workflow or session fails.
Table 25-2 shows the default location for each type of log file and the associated service process variables:

Table 25-2. Log File Default Locations and Associated Service Process Variables

Log File Type   Default Directory (Service Process Variable)   Default Value for Service Process Variable

Workflow logs $PMWorkflowLogDir $PMRootDir/WorkflowLogs

Session logs $PMSessionLogDir $PMRootDir/SessLogs

♦ Name. The name of the log file. You must configure a name for the log file or the workflow or session is
invalid. You can use a service, service process, or user-defined workflow or worklet variable for the log file
name.
Note: The Integration Service stores the workflow and session log names in the domain configuration database.
If you want to use Unicode characters in the workflow or session log file names, the domain configuration
database must be a Unicode database.

Archiving Log Files


By default, when you configure a workflow or session to create log files, the Integration Service creates one log
file for the workflow or session. The Integration Service overwrites the log file when you run the workflow
again.
To create a log file for more than one workflow or session run, configure the workflow or session to archive logs
in the following ways:
♦ By run. Archive text log files by run. Configure a number of text logs to save.
♦ By time stamp. Archive binary logs and text files by time stamp. The Integration Service saves an unlimited
number of logs and labels them by time stamp. When you configure the workflow or session to archive by
time stamp, the Integration Service always archives binary logs.
Note: When you run concurrent workflows with the same instance name, the Integration Service appends a
timestamp to the log file name, even if you configure the workflow to archive logs by run.

Archiving Logs by Run


If you archive log files by run, specify the number of text log files you want the Integration Service to create.
The Integration Service creates the number of historical log files you specify, plus the most recent log file. If you
specify five runs, the Integration Service creates the most recent workflow log, plus historical logs zero to four,
for a total of six logs. You can specify up to 2,147,483,647 historical logs. If you specify zero logs, the
Integration Service creates only the most recent workflow log file.

The Integration Service uses the following naming convention to create historical logs:
<session or workflow name>.n

where n=0 for the first historical log. The variable increments by one for each workflow or session run.
If you run a session on a grid, the worker service processes use the following naming convention for a session:
<session name>.n.w<DTM ID>
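
As a quick illustration of the run-based convention, the following sketch, which is not an Informatica utility, lists the file names you would expect when you configure a workflow log to keep a given number of historical runs. The log name is a hypothetical example.

def expected_run_archive_names(log_name, historical_logs):
    """List the current log plus the numbered historical log names."""
    # The most recent run keeps the configured name; historical logs are
    # numbered starting at 0, so saving five runs yields six files in total.
    return [log_name] + [f"{log_name}.{n}" for n in range(historical_logs)]

print(expected_run_archive_names("wf_LoadOrders.log", 5))
# ['wf_LoadOrders.log', 'wf_LoadOrders.log.0', ..., 'wf_LoadOrders.log.4']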

Archiving Log Files by Time Stamp


When you archive logs by time stamp, the Integration Service creates an unlimited number of binary and text
file logs. The Integration Service adds a time stamp to the text and binary log file names. It appends the year,
month, day, hour, and minute of the workflow or session completion to the log file. The resulting log file name
is <session or workflow log name>.yyyymmddhhmi, where:
♦ yyyy = year
♦ mm = month, ranging from 01-12
♦ dd = day, ranging from 01-31
♦ hh = hour, ranging from 00-23
♦ mi = minute, ranging from 00-59
Binary logs use the .bin suffix.
To prevent filling the log directory, periodically purge or back up log files when using the time stamp option.
If you run a session on a grid, the worker service processes use the following naming convention for sessions:
<session name>.yyyymmddhhmi.w<DTM ID>
<session name>.yyyymmddhhmi.w<DTM ID>.bin

When you archive text log files, view the logs by navigating to the workflow or session log folder and viewing
the files in a text reader. When you archive binary log files, you can view the logs by navigating to the workflow
or session log folder and importing the files in the Log Events window. You can archive binary files when you
configure the workflow or session to archive logs by time stamp. You do not have to create text log files to
archive binary files. You might need to archive binary files to send to Informatica Global Customer Support for
review.

Configuring Workflow Log File Information


You can configure workflow log information on the workflow Properties tab.

To configure workflow log information:

1. Select the Properties tab of a workflow.


2. Enter the following workflow log options:

Option Name                    Description

Write Backward Compatible      Writes workflow logs to a text log file. Select this option if you want to
Workflow Log File              create a log file in addition to the binary log for the Log Events window.

Workflow Log File Name         Enter a file name or a file name and directory. You can use a service,
                               service process, or user-defined workflow or worklet variable for the
                               workflow log file name.
                               The Integration Service appends this value to the value entered in the
                               Workflow Log File Directory field. For example, if the Workflow Log File
                               Directory field contains $PMWorkflowLogDir\ and you enter “logname.txt” in
                               the Workflow Log File Name field, the Integration Service writes
                               logname.txt to the $PMWorkflowLogDir\ directory.

Workflow Log File Directory    Location for the workflow log file. By default, the Integration Service
                               writes the log file in the process variable directory, $PMWorkflowLogDir.
                               If you enter a full directory and file name in the Workflow Log File Name
                               field, clear this field.

Save Workflow Log By           You can create workflow logs according to the following options:
                               - By runs. The Integration Service creates a designated number of workflow
                                 logs. Configure the number of workflow logs in the Save Workflow Log for
                                 These Runs option. The Integration Service does not archive binary logs.
                               - By time stamp. The Integration Service creates a log for all workflows,
                                 appending a time stamp to each log. When you save workflow logs by time
                                 stamp, the Integration Service archives binary logs and workflow log
                                 files.
                               You can also use the $PMWorkflowLogCount service variable to create the
                               configured number of workflow logs for the Integration Service.

Save Workflow Log for          Number of historical workflow logs you want the Integration Service to
These Runs                     create. The Integration Service creates the number of historical logs you
                               specify, plus the most recent workflow log.

3. Click OK.

Configuring Session Log File Information


You can configure session log information on the session Properties tab and the Config Object tab.

To configure session log information:

1. Select the Properties tab of a session.


2. Enter the following session log options:

Option Name                    Description

Write Backward Compatible      Writes session logs to a log file. Select this option if you want to create
Session Log File               a log file in addition to the binary log for the Log Events window.

Session Log File Name          Enter a file name or a file name and directory. The Integration Service
                               appends this value to the value entered in the Session Log File Directory
                               field. For example, if the Session Log File Directory field contains
                               $PMSessionLogDir\ and you enter “logname.txt” in the Session Log File Name
                               field, the Integration Service writes logname.txt to the $PMSessionLogDir\
                               directory.

Session Log File Directory     Location for the session log file. By default, the Integration Service
                               writes the log file in the process variable directory, $PMSessionLogDir.
                               If you enter a full directory and file name in the Session Log File Name
                               field, clear this field.

3. Click the Config Object tab.



4. Enter the following session log options:

Option Name                    Description

Save Session Log By            You can create session logs according to the following options:
                               - Session runs. The Integration Service creates a designated number of
                                 session log files. Configure the number of session logs in the Save
                                 Session Log for These Runs option. The Integration Service does not
                                 archive binary logs.
                               - Session time stamp. The Integration Service creates a log for all
                                 sessions, appending a time stamp to each log. When you save a session log
                                 by time stamp, the Integration Service archives the binary logs and text
                                 log files.
                               You can also use the $PMSessionLogCount service variable to create the
                               configured number of session logs for the Integration Service.

Save Session Log for           Number of historical session logs you want the Integration Service to
These Runs                     create. The Integration Service creates the number of historical logs you
                               specify, plus the most recent session log.

5. Click OK.

Workflow Logs
Workflow logs contain information about the workflow runs. You can view workflow log events in the Log
Events window of the Workflow Monitor. You can also create an XML, text, or binary log file for workflow log
events.
A workflow log contains the following information:
♦ Workflow name
♦ Workflow status
♦ Status of tasks and worklets in the workflow
♦ Start and end times for tasks and worklets
♦ Results of link conditions
♦ Errors encountered during the workflow and general information
♦ Some session messages and errors

Workflow Log Events Window


Use the Log Events window in the Workflow Monitor to view log events for a workflow. The Log Events
window displays all log events for a workflow. Select a log event to view more information about the log event.

Workflow Log Sample


A workflow log file provides the same information as the Log Events window for a workflow. You can view a
workflow log file in a text editor.
The following sample shows a section of a workflow log file:
INFO : LM_36435 [Mon Apr 03 15:10:20 2006] : (3060|3184) Starting execution of workflow [Wk_Java] in folder
[EmployeeData] last saved by user [ellen].
INFO : LM_36330 [Mon Apr 03 15:10:20 2006] : (3060|3184) Start task instance [Start]: Execution started.
INFO : LM_36318 [Mon Apr 03 15:10:20 2006] : (3060|3184) Start task instance [Start]: Execution succeeded.
INFO : LM_36505 : (3060|3184) Link [Start --> s_m_jtx_hier_useCase]: empty expression string, evaluated to
TRUE.
INFO : LM_36388 [Mon Apr 03 15:10:20 2006] : (3060|3184) Session task instance [s_m_jtx_hier_useCase] is
waiting to be started.
INFO : LM_36682 [Mon Apr 03 15:10:20 2006] : (3060|3184) Session task instance [s_m_jtx_hier_useCase]:
started a process with pid [148] on node [garnet].
INFO : LM_36330 [Mon Apr 03 15:10:20 2006] : (3060|3184) Session task instance [s_m_jtx_hier_useCase]:
Execution started.

INFO : LM_36488 [Mon Apr 03 15:10:22 2006] : (3060|3180) Session task instance [s_m_jtx_hier_useCase] :
[TM_6793 Fetching initialization properties from the Integration Service. : (Mon Apr 03 15:10:21 2006)]
INFO : LM_36488 [Mon Apr 03 15:10:22 2006] : (3060|3180) Session task instance [s_m_jtx_hier_useCase] :
[DISP_20305 The [Preparer] DTM with process id [148] is running on node [garnet].
: (Mon Apr 03 15:10:21 2006)]
INFO : LM_36488 [Mon Apr 03 15:10:22 2006] : (3060|3180) Session task instance [s_m_jtx_hier_useCase] :
[PETL_24036 Beginning the prepare phase for the session.]
INFO : LM_36488 [Mon Apr 03 15:10:22 2006] : (3060|3180) Session task instance [s_m_jtx_hier_useCase] :
[TM_6721 Started [Connect to Repository].]

Session Logs
Session logs contain information about the tasks that the Integration Service performs during a session, plus
load summary and transformation statistics. By default, the Integration Service creates one session log for each
session it runs. If a workflow contains multiple sessions, the Integration Service creates a separate session log for
each session in the workflow. When you run a session on a grid, the Integration Service creates one session log
for each DTM process.
In general, a session log contains the following information:
♦ Allocation of heap memory
♦ Execution of pre-session commands
♦ Creation of SQL commands for reader and writer threads
♦ Start and end times for target loading
♦ Errors encountered during the session and general information
♦ Execution of post-session commands
♦ Load summary of reader, writer, and DTM statistics
♦ Integration Service version and build number

Log Events Window


Use the Log Events window in the Workflow Monitor to view log events for a session. The Log Events window
displays all log events for a session. Select a log event to view more information about the log event.

Session Log File Sample


A session log file provides most of the same information as the Log Events window for a session. The session log
file does not include severity or DTM prepare messages.
The following sample shows a section of a session log file:
DIRECTOR> PETL_24044 The Master DTM will now connect and fetch the prepared session from the Preparer DTM.
DIRECTOR> PETL_24047 The Master DTM has successfully fetched the prepared session from the Preparer DTM.
DIRECTOR> DISP_20305 The [Master] DTM with process id [2968] is running on node [sapphire].
: (Mon Apr 03 16:19:47 2006)
DIRECTOR> TM_6721 Started [Connect to Repository].
DIRECTOR> TM_6722 Finished [Connect to Repository]. It took [0.656233] seconds.
DIRECTOR> TM_6794 Connected to repository [HR_80] in domain [StonesDomain] user [ellen]
DIRECTOR> TM_6014 Initializing session [s_PromoItems] at [Mon Apr 03 16:19:48 2006]
DIRECTOR> TM_6683 Repository Name: [HR_80]
DIRECTOR> TM_6684 Server Name: [Copper]
DIRECTOR> TM_6686 Folder: [Snaps]
DIRECTOR> TM_6685 Workflow: [wf_PromoItems]
DIRECTOR> TM_6101 Mapping name: m_PromoItems [version 1]
DIRECTOR> SDK_1805 Recovery cache will be deleted when running in normal mode.
DIRECTOR> SDK_1802 Session recovery cache initialization is complete.

The session log file includes the Integration Service version and build number.
DIRECTOR> TM_6703 Session [s_PromoItems] is run by 32-bit Integration Service [sapphire], version [8.1.0],
build [0329].



Setting Tracing Levels
The amount of detail that logs contain depends on the tracing level that you set. You can configure tracing
levels for each transformation or for the entire session. By default, the Integration Service uses tracing levels
configured in the mapping.
Setting a tracing level for the session overrides the tracing levels configured for each transformation in the
mapping. If you select a normal tracing level or higher, the Integration Service writes row errors into the session
log, including the transformation in which the error occurred and complete row data. If you configure the
session for row error logging, the Integration Service writes row errors to the error log instead of the session log.
If you want the Integration Service to write dropped rows to the session log also, configure the session for
verbose data tracing.
Set the tracing level on the Config Object tab in the session properties.
Table 25-3 describes the session log tracing levels:

Table 25-3. Session Log Tracing Levels

Tracing Level            Description

None                     Integration Service uses the tracing level set in the mapping.

Terse                    Integration Service logs initialization information, error messages, and
                         notification of rejected data.

Normal                   Integration Service logs initialization and status information, errors
                         encountered, and skipped rows due to transformation row errors. Summarizes
                         session results, but not at the level of individual rows.

Verbose Initialization   In addition to normal tracing, the Integration Service logs additional
                         initialization details, names of index and data files used, and detailed
                         transformation statistics.

Verbose Data             In addition to verbose initialization tracing, the Integration Service logs
                         each row that passes into the mapping. Also notes where the Integration
                         Service truncates string data to fit the precision of a column and provides
                         detailed transformation statistics.
                         When you configure the tracing level to verbose data, the Integration
                         Service writes row data for all rows in a block when it processes a
                         transformation.

You can also enter tracing levels for individual transformations in the mapping. When you enter a tracing level
in the session properties, you override tracing levels configured for transformations in the mapping.

Viewing Log Events


You can view the following types of log files:
♦ Most recent session or workflow log. View the session or workflow log in the Log Events window for the
last run workflow.
♦ Archived binary log files. View archived binary log files in the Log Events window.
♦ Archived text log files. View archived text log files in any text editor.

To view the Log Events window for a session or workflow:

1. In the Workflow Monitor, right-click the workflow or session.


2. Select Get Session Log or Get Workflow Log.



To view an archived binary log file in the Log Events window:

1. If you do not know the session or workflow log file name and location, check the Log File Name and Log
File Directory attributes on the Session or Workflow Properties tab.
If you are running the Integration Service on UNIX and the binary log file is not accessible on the
Windows machine where the PowerCenter client is running, you can transfer the binary log file to the
Windows machine using FTP.
2. In the Workflow Monitor, click Tools > Import Log.
3. Navigate to the session or workflow log file directory.
4. Select the binary log file you want to view.
5. Click Open.

To view a text log file:

1. If you do not know the session or workflow log file name and location, check the Log File Name and Log
File Directory attributes on the Session or Workflow Properties tab.
2. Navigate to the session or workflow log file directory.
The session and workflow log file directory contains the text log files and the binary log files. If you archive
log files, check the file date to find the latest log file for the session.
3. Open the log file in any text editor.

CHAPTER 26

Working with the Session Log Interface
This chapter includes the following topics:
♦ Overview, 387
♦ Implementing the Session Log Interface, 387
♦ Functions in the Session Log Interface, 388
♦ Session Log Interface Example, 391

Overview
By default, the Integration Service writes session events to binary log files on the node where the service process
runs. In addition, the Integration Service can pass the session event information to an external library. In the
external shared library, you can provide the procedure for how the Integration Service writes the log events.
PowerCenter provides access to the session event information through the Session Log Interface. When you
create the shared library, you implement the functions provided in the Session Log Interface.
When the Integration Service writes the session events, it calls the functions specified in the Session Log
Interface. The functions in the shared library you create must match the function signatures defined in the
Session Log Interface.

Implementing the Session Log Interface


To configure the Integration Service to use a custom procedure for handling session event information,
complete the following steps:
1. Create a shared library that implements the Session Log Interface.
2. When you configure the Integration Service properties on the Administration Console, set the
ExportSessionLogLibName property to the name of the shared library that you create.

The Integration Service and the Session Log Interface


When you set the ExportSessionLogLibName property of the Integration Service to the name of a shared
library, the Integration Service performs the procedures defined in the shared library in addition to creating the
event log files.

The Integration Service uses the shared library in the following manner:
1. The Integration Service loads the shared library and calls the INFA_InitSessionLog() function before it logs
the first event in the session.
2. Each time the Integration Service logs an event to the session log file, it calls the
INFA_OutputSessionLogMsg() function to pass the message, codes, and session information to the shared
library.
3. When the session completes and the last event is logged, the Integration Service calls the
INFA_EndSessionLog() and then unloads the shared library.
To ensure that the shared library can be correctly called by the Integration Service, follow the guidelines for
writing the shared library.

Rules and Guidelines for Implementing the Session Log Interface


Use the following rules and guidelines when you write the code to implement the Session Log Interface:
♦ You must implement all the functions in the Session Log Interface.
♦ All calls from the Integration Service to the functions in the Session Log Interface are serialized except for
abnormal termination. The Integration Service makes the calls to the functions as it logs events to the
session log. Therefore, when you implement the functions in the Session Log Interface, you do not need to
use mutex objects to ensure that only one thread executes a section of code at a time.
♦ When you implement the Session Log Interface in UNIX, do not perform any signal handling within the
functions. This ensures that the functions do not interfere with the way that PowerCenter handles signals.
Do not register or unregister any signal handlers.
♦ Since the Integration Service is a multi-threaded process, you must compile the shared library as a multi-
threaded library so that it can be loaded correctly.

Functions in the Session Log Interface


The functions in the Session Log Interface do not return values. Therefore, a session cannot fail because of an
Integration Service call to a function in the Session Log Interface.
Table 26-1 describes the functions in the Session Log Interface:

Table 26-1. Functions in the Session Log Interface

Function Description

INFA_InitSessionLog Provides information about the session for which the Integration Service
will write the event logs.

INFA_OutputSessionLogMsg Called each time an event is logged. Passes the information about the
event.

INFA_OutputSessionLogFatalMsg Called when the last event is logged before an abnormal termination.

INFA_EndSessionLog Called after the last message is sent to the session log and the session
terminates normally.

INFA_AbnormalSessionTermination Called after the last message is sent to the session log and the session
terminates abnormally.

The functions described in this section use the time structures declared in the standard C header file time.h.
The functions also assume the following declarations:
typedef int INFA_INT32;
typedef unsigned int INFA_UINT32;
typedef unsigned short INFA_UNICHAR;

typedef char INFA_MBCSCHAR;
typedef int INFA_MBCS_CODEPAGE_ID;

INFA_InitSessionLog
void INFA_InitSessionLog(void ** dllContext,
const INFA_UNICHAR * sServerName,
const INFA_UNICHAR * sFolderName,
const INFA_UNICHAR * sWorkflowName,
const INFA_UNICHAR * sessionHierName[]);

The Integration Service calls the INFA_InitSessionLog function before it writes any session log event. The
parameters passed to this function provide information about the session for which the Integration Service will
write the event logs.
INFA_InitSessionLog has the following parameters:

Parameter            Data Type              Description

dllContext           Unspecified            User-defined information specific to the shared library. This
                                            parameter is passed to all functions in subsequent function calls.
                                            You can use this parameter to store information related to network
                                            connection or to allocate memory needed during the course of
                                            handling the session log output. The shared library must allocate
                                            and deallocate any memory associated with this parameter.

sServerName          unsigned short         Name of the Integration Service running the session.

sFolderName          unsigned short         Name of the folder that contains the session.

sWorkflowName        unsigned short         Name of the workflow associated with the session.

sessionHierName[]    unsigned short array   Array that contains the session hierarchy. The array includes the
                                            repository, workflow, and worklet (if any) to which the session
                                            belongs. The size of the array divided by the size of the pointer
                                            equals the number of array elements.
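
For illustration, a shared library might use the dllContext parameter to carry a file handle that it opens here and reuses in later calls. The following C++ sketch is not the Informatica sample program; the file name and the context structure are invented for the example, and error handling is omitted.

#include <cstdio>

typedef unsigned short INFA_UNICHAR;   // as declared earlier in this chapter

// Illustrative context owned by the shared library.
struct SessionLogContext
{
    FILE *logFile;
};

extern "C" void INFA_InitSessionLog(void **dllContext,
                                    const INFA_UNICHAR *sServerName,
                                    const INFA_UNICHAR *sFolderName,
                                    const INFA_UNICHAR *sWorkflowName,
                                    const INFA_UNICHAR *sessionHierName[])
{
    // Open the output file once per session. A real implementation might
    // build the file name from the workflow or session names passed in;
    // this sketch ignores the name parameters and uses a fixed file.
    SessionLogContext *ctx = new SessionLogContext();
    ctx->logFile = std::fopen("custom_sesslog.log", "a");

    // Return the context so the Integration Service passes it to every
    // subsequent call for this session.
    *dllContext = ctx;
}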

INFA_OutputSessionLogMsg
void INFA_OutputSessionLogMsg(
void * dllContext,
time_t curTime,
INFA_UINT32 severity,
const INFA_UNICHAR * msgCategoryName,
INFA_UINT32 msgCode,
const INFA_UNICHAR * msg,
const INFA_UNICHAR * threadDescription);

The Integration Service calls this function each time it logs an event. The parameters passed to the function
include the different elements of the log event message. You can use the parameters to customize the format for
the log output or to filter out messages.
INFA_OutputSessionLogMsg has the following parameters:

Parameter           Data Type                 Description

dllContext          Unspecified               User-defined information specific to the shared library. You can
                                              use this parameter to store information related to network
                                              connection or to allocate memory needed during the course of
                                              handling the session log output. The shared library must allocate
                                              and deallocate any memory associated with this parameter.

curTime             time_t                    Time that the Integration Service logs the event.

severity            unsigned int              Code that indicates the type of the log event message. The event
                                              logs use the following severity codes:
                                              32: Debug Messages
                                              8: Informational Messages
                                              2: Error Messages

msgCategoryName     constant unsigned short   Code prefix that indicates the category of the log event message.
                                              In the following example message, the string BLKR is the value
                                              passed in the msgCategoryName parameter.
                                              READER_1_1_1> BLKR_16003 Initialization completed successfully.

msgCode             unsigned int              Number that identifies the log event message.
                                              In the following example message, the string 16003 is the value
                                              passed in the msgCode parameter.
                                              READER_1_1_1> BLKR_16003 Initialization completed successfully.

msg                 constant unsigned short   Text of the log event message.
                                              In the following example message, the string Initialization
                                              completed successfully is the value passed in the msg parameter.
                                              READER_1_1_1> BLKR_16003 Initialization completed successfully.

threadDescription   constant unsigned short   Code that indicates which thread is generating the event log.
                                              In the following example message, the string READER_1_1_1 is the
                                              value passed in the threadDescription parameter.
                                              READER_1_1_1> BLKR_16003 Initialization completed successfully.
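
The following C++ sketch shows one way an implementation of this function might format an event, using the severity codes listed above and the illustrative context structure from the earlier INFA_InitSessionLog example. It is a sketch only; in particular, the narrowing of INFA_UNICHAR strings to single-byte characters is a placeholder for a proper UTF-16 conversion.

#include <cstdio>
#include <ctime>
#include <string>

typedef unsigned int INFA_UINT32;      // as declared earlier in this chapter
typedef unsigned short INFA_UNICHAR;

struct SessionLogContext { FILE *logFile; };   // same illustrative context as before

// Placeholder conversion: keeps only the low byte of each character.
static std::string narrow(const INFA_UNICHAR *s)
{
    std::string out;
    for (; s != 0 && *s != 0; ++s)
        out.push_back(static_cast<char>(*s & 0xFF));
    return out;
}

extern "C" void INFA_OutputSessionLogMsg(void *dllContext,
                                         time_t curTime,
                                         INFA_UINT32 severity,
                                         const INFA_UNICHAR *msgCategoryName,
                                         INFA_UINT32 msgCode,
                                         const INFA_UNICHAR *msg,
                                         const INFA_UNICHAR *threadDescription)
{
    SessionLogContext *ctx = static_cast<SessionLogContext *>(dllContext);
    if (ctx == 0 || ctx->logFile == 0)
        return;

    // Map the documented severity codes to labels.
    const char *level = "OTHER";
    if (severity == 2)  level = "ERROR";
    if (severity == 8)  level = "INFO";
    if (severity == 32) level = "DEBUG";

    // Format the event time.
    char timeBuf[32] = "";
    if (std::tm *t = std::localtime(&curTime))
        std::strftime(timeBuf, sizeof(timeBuf), "%Y-%m-%d %H:%M:%S", t);

    // One line per event: thread> CATEGORY_code level [time] message text
    std::fprintf(ctx->logFile, "%s> %s_%u %s [%s] %s\n",
                 narrow(threadDescription).c_str(),
                 narrow(msgCategoryName).c_str(),
                 static_cast<unsigned int>(msgCode),
                 level,
                 timeBuf,
                 narrow(msg).c_str());
    std::fflush(ctx->logFile);
}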

INFA_OutputSessionLogFatalMsg
void INFA_OutputSessionLogFatalMsg(void * dllContext, const char * msg);

The Integration Service calls this function to log the last event before an abnormal termination. The parameter
msg is MBCS characters in the Integration Service code page.
When you implement this function in UNIX, make sure that you call only asynchronous signal safe functions
from within this function.
INFA_OutputSessionLogFatalMsg has the following parameters:

Parameter Data Type Description

dllContext Unspecified User-defined information specific to the shared library. You can use
this parameter to store information related to network connection
or to allocate memory needed during the course of handling the
session log output. The shared library must allocate and deallocate
any memory associated with this parameter.

msg constant char Text of the error message. Typically, these messages are assertion
error messages or operating system error messages.

INFA_EndSessionLog
void INFA_EndSessionLog(void * dllContext);

The Integration Service calls this function after the last message is sent to the session log and the session
terminates normally. You can use this function to perform clean up operations and release memory and
resources.



INFA_EndSessionLog has the following parameter:

Parameter Data Type Description

dllContext Unspecified User-defined information specific to the shared library. You can
use this parameter to store information related to network
connection or to allocate memory needed during the course of
handling the session log output. The shared library must
allocate and deallocate any memory associated with this
parameter.

INFA_AbnormalSessionTermination
void INFA_AbnormalSessionTermination(void * dllContext);

The Integration Service calls this function after the last message is sent to the session log and the session
terminates abnormally. The Integration Service calls this function after it calls the
INFA_OutputSessionLogFatalMsg function. If the Integration Service calls this function, then it does not call
INFA_EndSessionLog.
For example, the Integration Service calls this function when the DTM aborts or times out. In UNIX, the
Integration Service calls this function when a signal exception occurs.
Include only minimal functionality when you implement this function. In UNIX, make sure that you call only
asynchronous signal safe functions from within this function.
INFA_AbnormalSessionTermination has the following parameter:

Parameter Data Type Description

dllContext Unspecified User-defined information specific to the shared library. You can
use this parameter to store information related to network
connection or to allocate memory needed during the course of
handling the session log output. The shared library must
allocate and deallocate any memory associated with this
parameter.
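
To round out the example begun earlier in this chapter, the termination functions might flush and close the file and release the context. This is a sketch under the same assumptions; note that the guidelines earlier in this chapter limit what you should safely do in the fatal-message and abnormal-termination paths on UNIX, and buffered stdio calls are used here only to keep the example short.

#include <cstdio>

struct SessionLogContext { FILE *logFile; };   // same illustrative context as before

extern "C" void INFA_OutputSessionLogFatalMsg(void *dllContext, const char *msg)
{
    // msg arrives as MBCS text in the Integration Service code page.
    SessionLogContext *ctx = static_cast<SessionLogContext *>(dllContext);
    if (ctx != 0 && ctx->logFile != 0 && msg != 0)
    {
        std::fputs(msg, ctx->logFile);
        std::fputc('\n', ctx->logFile);
        std::fflush(ctx->logFile);
    }
}

extern "C" void INFA_EndSessionLog(void *dllContext)
{
    // Normal termination: release everything acquired in INFA_InitSessionLog.
    SessionLogContext *ctx = static_cast<SessionLogContext *>(dllContext);
    if (ctx != 0)
    {
        if (ctx->logFile != 0)
            std::fclose(ctx->logFile);
        delete ctx;
    }
}

extern "C" void INFA_AbnormalSessionTermination(void *dllContext)
{
    // Called instead of INFA_EndSessionLog after a fatal error; keep it minimal.
    SessionLogContext *ctx = static_cast<SessionLogContext *>(dllContext);
    if (ctx != 0 && ctx->logFile != 0)
        std::fflush(ctx->logFile);
}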

Session Log Interface Example


Informatica provides a sample program that uses the Session Log Interface. The sample program sends session
log events to a text file called sesslog.log. You can view the sample program to gain more understanding about
how to use the Session Log Interface to handle session log events based on your requirements. You can also
compile the sample program and build an external library to send session log events to a text file.
The session log sample program is available when you install the PowerCenter SDK files from the Informatica
Development Platform installer. By default, the session log sample program is installed in the following
directory:
<SDKInstallationDir>/SessionLog_API/samples

Building the External Session Log Library


Use the make files provided with the sample program demo_sesslog.cpp to build the external library. The
command to compile the library depends on the platform on which you build it.



Building the Library in UNIX
The following table shows the command to build the library on the different platforms:

Platform        Compiler    Command

Solaris         CC          make -f makefile_sol
HP-UX           aCC         make -f makefile_hpux
HP-UX 64 bit    aCC         make -f makefile_hpux64
AIX             xlC_r       make -f makefile_aix
AIX 64 bit      xlC_r       make -f makefile_aix64
Linux           g++         make -f makefile_linux

Building the Library in Windows


Use Microsoft Visual C++ 6.0 to build the sample session log library in Windows. Open the sample program
demo_sesslog.dsw in Visual C++ 6.0 and build the project.

Using the External Session Log Library


After you build the library, you can use it to write the output of the session log into a file.
To use the sample external session log library, complete the following steps:
1. Log in to the Administration Console and select the Integration Service for which you want to set up the
session log file.
2. On the Properties tab of the Integration Service, edit the configuration properties.
3. Set the ExportSessionLogLibName property to the path and file name of the session log library you created
from the session log sample program.



CHAPTER 27

Row Error Logging


This chapter includes the following topics:
♦ Overview, 393
♦ Understanding the Error Log Tables, 394
♦ Understanding the Error Log File, 399
♦ Configuring Error Log Options, 401

Overview
When you configure a session, you can log row errors in a central location. When a row error occurs, the
Integration Service logs error information that lets you determine the cause and source of the error. The
Integration Service logs information such as source name, row ID, current row data, transformation,
timestamp, error code, error message, repository name, folder name, session name, and mapping information.
You can log row errors into relational tables or flat files. When you enable error logging, the Integration Service
creates the error tables or an error log file the first time it runs the session. Error logs are cumulative. If the error
logs exist, the Integration Service appends error data to the existing error logs.
You can log source row data. Source row data includes row data, source row ID, and source row type from the
source qualifier where an error occurs. The Integration Service cannot identify the row in the source qualifier
that contains an error if the error occurs after a non pass-through partition point with more than one partition
or one of the following active sources:
♦ Aggregator
♦ Custom, configured as an active transformation
♦ Joiner
♦ Normalizer (pipeline)
♦ Rank
♦ Sorter
By default, the Integration Service logs transformation errors in the session log and reject rows in the reject file.
When you enable error logging, the Integration Service does not generate a reject file or write dropped rows to
the session log. Without a reject file, the Integration Service does not log Transaction Control transformation
rollback or commit errors. If you want to write rows to the session log in addition to the row error log, you can
enable verbose data tracing.
Note: When you log row errors, session performance may decrease because the Integration Service processes one
row at a time instead of a block of rows at once.

Error Log Code Pages
The Integration Service writes data to the error log file differently depending on the Integration Service process
operating system:
♦ UNIX. The Integration Service writes data to the error log file using the Integration Service process code
page. However, you can configure the Integration Service to write to the error log file using UTF-8 by
enabling the LogsInUTF8 Integration Service property.
♦ Windows. The Integration Service writes all characters in the error log file using the UTF-8 encoding
format.
The code page for the relational database where the error tables exist must be a subset of the target code page. If
the error log table code page is not a subset of the target code page, the Integration Service might write
inconsistent data in the error log tables.

Understanding the Error Log Tables


When you choose relational database error logging, the Integration Service creates the following error tables the
first time you run a session:
♦ PMERR_DATA. Stores data and metadata about a transformation row error and its corresponding source
row.
♦ PMERR_MSG. Stores metadata about an error and the error message.
♦ PMERR_SESS. Stores metadata about the session.
♦ PMERR_TRANS. Stores metadata about the source and transformation ports, such as name and datatype,
when a transformation error occurs.
You specify the database connection to the database where the Integration Service creates these tables. If the
error tables exist for a session, the Integration Service appends row errors to these tables.
Relational database error logging lets you collect row errors from multiple sessions in one set of error tables. To
do this, you specify the same error log table name prefix for all sessions. You can issue select statements on the
generated error tables to retrieve error data for a particular session.
You can specify a prefix for the error tables. The error table names can have up to eleven characters. Do not
specify a prefix that exceeds 19 characters when naming Oracle, Sybase, or Teradata error log tables, as these
databases have a maximum length of 30 characters for table names. You can use a parameter or variable for the
table name prefix. Use any parameter or variable type that you can define in the parameter file. For example,
you can use a session parameter, $ParamMyErrPrefix, as the error log table name prefix, and set
$ParamMyErrPrefix to the table prefix in a parameter file.
The Integration Service creates the error tables without specifying primary and foreign keys. However, you can
specify key columns.

PMERR_DATA
When the Integration Service encounters a row error, it inserts an entry into the PMERR_DATA table. This
table stores data and metadata about a transformation row error and its corresponding source row.
The following table describes the structure of the PMERR_DATA table:

Column Name Datatype Description

REPOSITORY_GID Varchar Unique identifier for the repository.

WORKFLOW_RUN_ID Integer Unique identifier for the workflow.

WORKLET_RUN_ID Integer Unique identifier for the worklet. If a session is not part of a
worklet, this value is “0”.

SESS_INST_ID Integer Unique identifier for the session.

TRANS_MAPPLET_INST Varchar Name of the mapplet where an error occurred.

TRANS_NAME Varchar Name of the transformation where an error occurred.

TRANS_GROUP Varchar Name of the input group or output group where an error
occurred. Defaults to either “input” or “output” if the
transformation does not have a group.

TRANS_PART_INDEX Integer Specifies the partition number of the transformation where an error occurred.

TRANS_ROW_ID Integer Specifies the row ID generated by the last active source.

TRANS_ROW_DATA Long Varchar Delimited string containing all column data, including the
column indicator. Column indicators are:
D - valid
N - null
T - truncated
B - binary
U - data unavailable
The fixed delimiter between column data and column
indicator is colon ( : ). The delimiter between the columns is
pipe ( | ). You can override the column delimiter in the error
handling settings.

The Integration Service converts all column data to text string in the error table.
For binary data, the Integration Service uses only the column indicator.

This value can span multiple rows. When the data exceeds
2000 bytes, the Integration Service creates a new row. The
line number for each row error entry is stored in the
LINE_NO column.

SOURCE_ROW_ID Integer Value that the source qualifier assigns to each row it reads. If
the Integration Service cannot identify the row, the value is
-1.

SOURCE_ROW_TYPE Integer Row indicator that tells whether the row was marked for
insert, update, delete, or reject.
0 - Insert
1 - Update
2 - Delete
3 - Reject

SOURCE_ROW_DATA Long Varchar Delimited string containing all column data, including the
column indicator. Column indicators are:
D - valid
O - overflow
N - null
T - truncated
B - binary
U - data unavailable
The fixed delimiter between column data and column
indicator is colon ( : ). The delimiter between the columns is
pipe ( | ). You can override the column delimiter in the error
handling settings.

The Integration Service converts all column data to text string in the error table or error file.
For binary data, the Integration Service uses only the column indicator.

This value can span multiple rows. When the data exceeds
2000 bytes, the Integration Service creates a new row. The
line number for each row error entry is stored in the
LINE_NO column.

LINE_NO Integer Specifies the line number for each row error entry in
SOURCE_ROW_DATA and TRANS_ROW_DATA that spans
multiple rows.
Note: Use the column names in bold to join tables.

PMERR_MSG
When the Integration Service encounters a row error, it inserts an entry into the PMERR_MSG table. This
table stores metadata about the error and the error message.
The following table describes the structure of the PMERR_MSG table:

Column Name Datatype Description

REPOSITORY_GID Varchar Unique identifier for the repository.

WORKFLOW_RUN_ID Integer Unique identifier for the workflow.

WORKLET_RUN_ID Integer Unique identifier for the worklet. If a session is not part of a
worklet, this value is “0”.

SESS_INST_ID Integer Unique identifier for the session.

MAPPLET_INST_NAME Varchar Mapplet to which the transformation belongs. If the
transformation is not part of a mapplet, this value is n/a.

TRANS_NAME Varchar Name of the transformation where an error occurred.

TRANS_GROUP Varchar Name of the input group or output group where an error
occurred. Defaults to either “input” or “output” if the
transformation does not have a group.

TRANS_PART_INDEX Integer Specifies the partition number of the transformation where an error occurred.

TRANS_ROW_ID Integer Specifies the row ID generated by the last active source.

ERROR_SEQ_NUM Integer Counter for the number of errors per row in each
transformation group. If a session has multiple partitions, the
Integration Service maintains this counter for each partition.
For example, if a transformation generates three errors in
partition 1 and two errors in partition 2, ERROR_SEQ_NUM
generates the values 1, 2, and 3 for partition 1, and values 1
and 2 for partition 2.

ERROR_TIMESTAMP Date/Time Timestamp of the Integration Service when the error occurred.

ERROR_UTC_TIME Integer Coordinated Universal Time, called Greenwich Mean Time, of when
an error occurred.

ERROR_CODE Integer Error code that the error generates.

ERROR_MSG Long Varchar Error message, which can span multiple rows. When the data
exceeds 2000 bytes, the Integration Service creates a new
row. The line number for each row error entry is stored in the
LINE_NO column.

ERROR_TYPE Integer Type of error that occurred. The Integration Service uses the
following values:
1 - Reader error
2 - Writer error
3 - Transformation error

LINE_NO Integer Specifies the line number for each row error entry in
ERROR_MSG that spans multiple rows.
Note: Use the column names in bold to join tables.

PMERR_SESS
When you choose relational database error logging, the Integration Service inserts entries into the
PMERR_SESS table. This table stores metadata about the session where an error occurred.
The following table describes the structure of the PMERR_SESS table:

Column Name Datatype Description

REPOSITORY_GID Varchar Unique identifier for the repository.

WORKFLOW_RUN_ID Integer Unique identifier for the workflow.

WORKLET_RUN_ID Integer Unique identifier for the worklet. If a session is not part of a
worklet, this value is “0”.

SESS_INST_ID Integer Unique identifier for the session.

SESS_START_TIME Date/Time Timestamp of the Integration Service when a session starts.

SESS_START_UTC_TIME Integer Coordinated Universal Time, called Greenwich Mean Time, of
when the session starts.

REPOSITORY_NAME Varchar Repository name where sessions are stored.

FOLDER_NAME Varchar Specifies the folder where the mapping and session are
located.

WORKFLOW_NAME Varchar Specifies the workflow that runs the session being logged.

TASK_INST_PATH Varchar Fully qualified session name that can span multiple rows.
The Integration Service creates a new line for the session
name. The Integration Service also creates a new line for
each worklet in the qualified session name. For example, you
have a session named WL1.WL2.S1. Each component of the
name appears on a new line:
WL1
WL2
S1
The Integration Service writes the line number in the
LINE_NO column.

MAPPING_NAME Varchar Specifies the mapping that the session uses.

LINE_NO Integer Specifies the line number for each row error entry in
TASK_INST_PATH that spans multiple rows.
Note: Use the column names in bold to join tables.

PMERR_TRANS
When the Integration Service encounters a transformation error, it inserts an entry into the PMERR_TRANS
table. This table stores metadata, such as the name and datatype of the source and transformation ports.
The following table describes the structure of the PMERR_TRANS table:

Column Name Datatype Description

REPOSITORY_GID Varchar Unique identifier for the repository.

WORKFLOW_RUN_ID Integer Unique identifier for the workflow.

WORKLET_RUN_ID Integer Unique identifier for the worklet. If a session is not part of a
worklet, this value is “0”.

SESS_INST_ID Integer Unique identifier for the session.

TRANS_MAPPLET_INST Varchar Specifies the instance of a mapplet.

TRANS_NAME Varchar Name of the transformation where an error occurred.

TRANS_GROUP Varchar Name of the input group or output group where an error
occurred. Defaults to either “input” or “output” if the
transformation does not have a group.

TRANS_ATTR Varchar Lists the port names and datatypes of the input or output
group where the error occurred. Port name and datatype
pairs are separated by commas, for example:
portname1:datatype, portname2:datatype.

This value can span multiple rows. When the data exceeds
2000 bytes, the Integration Service creates a new row for the
transformation attributes and writes the line number in the
LINE_NO column.

SOURCE_MAPPLET_INST Varchar Name of the mapplet in which the source resides.

SOURCE_NAME Varchar Name of the source qualifier. n/a appears when a row error
occurs downstream of an active source that is not a source
qualifier or a non pass-through partition point with more than
one partition.

SOURCE_ATTR Varchar Lists the connected field(s) in the source qualifier where an
error occurred. When an error occurs in multiple fields, each
field name is entered on a new line. Writes the line number in
the LINE_NO column.

LINE_NO Integer Specifies the line number for each row error entry in
TRANS_ATTR and SOURCE_ATTR that spans multiple
rows.
Note: Use the column names in bold to join tables.



Understanding the Error Log File
You can create an error log file to collect all errors that occur in a session. This error log file is a column
delimited line sequential file. By specifying a unique error log file name, you can create a separate log file for
each session in a workflow. When you want to analyze the row errors for one session, use an error log file.
In an error log file, double pipes “||” delimit error logging columns. By default, pipe “|” delimits row data. You
can change this row data delimiter by setting the Data Column Delimiter error log option.
Error log files have the following structure:
[Session Header]
[Column Header]
[Column Data]

Session header contains session run information similar to the information stored in the PMERR_SESS table.
Column header contains data column names. Column data contains row data and error message information.
The following sample error log file contains a session header, column header, and column data:
**********************************************************************
Repository GID: fe4817ab-7d87-465f-9110-354222424df0
Repository: CustomerInfo
Folder: Row_Error_Logging
Workflow: wf_basic_REL_errors_AGG_case
Session: s_m_basic_REL_errors_AGG_case
Mapping: m_basic_REL_errors_AGG_case
Workflow Run ID: 1310
Worklet Run ID: 0
Session Instance ID: 19
Session Start Time: 08/03/2004 16:57:01
Session Start Time (UTC): 1067126221
**********************************************************************

Transformation||Transformation Mapplet Name||Transformation Group||Partition Index||Transformation Row


ID||Error Sequence||Error Timestamp||Error UTC Time||Error Code||Error Message||Error Type||Transformation
Data||Source Mapplet Name||Source Name||Source Row ID||Source Row Type||Source Data
agg_REL_basic||N/A||Input||1||1||1||08/03/2004 16:57:03||1067126223||11019||Port [CUST_ID_NULL]: Default
value is: ERROR(<<Expression Error>> [ERROR]: [AGG] CUST_ID - NULL detected on input.\n... nl:ERROR(s:'[AGG]
CUST_ID - NULL detected on input.')).||3||D:1221|N:|N:|N:|D:Kauai Dive Shoppe|D:4-976 Sugarloaf Hwy|D:Kapaa
Kauai|D:HI|D:94766|D:[AGG] DEFAULT SID VALUE.|D:01/01/2001
00:00:00||mplt_add_NULLs_to_QACUST3||SQ_QACUST3||1||0||D:1221|D:Kauai Dive Shoppe|D:4-976 Sugarloaf
Hwy|D:Kapaa Kauai|D:HI|D:94766
agg_REL_basic||N/A||Input||1||4||1||08/03/2004 16:57:03||1067126223||11019||Port [CITY_IN]: Default value
is: ERROR(<<Expression Error>> [ERROR]: [AGG] Null detected for City_IN.\n... nl:ERROR(s:'[AGG] Null
detected for City_IN.')).||3||D:1354|N:|N:|D:1354|T:Cayman Divers World|D:PO Box 541|N:|D:Gr|N:|D:[AGG]
DEFAULT SID VALUE.|D:01/01/2001 00:00:00||mplt_add_NULLs_to_QACUST3||SQ_QACUST3||4||0||D:1354|D:Cayman
Divers World Unlim|D:PO Box 541|N:|D:Gr|N:
agg_REL_basic||N/A||Input||1||5||1||08/03/2004 16:57:03||1067126223||11131||Transformation [agg_REL_basic]
had an error evaluating variable column [Var_Divide_by_Price]. Error message is [<<Expression Error>> [/]:
divisor is zero\n... f:(f:2 / f:(f:1 - f:TO_FLOAT(i:1)))].||3||D:1356|N:|N:|D:1356|T:Tom Sawyer Diving
C|T:632-1 Third Frydenh|D:Christiansted|D:St|D:00820|D:[AGG] DEFAULT SID VALUE.|D:01/01/2001
00:00:00||mplt_add_NULLs_to_QACUST3||SQ_QACUST3||5||0||D:1356|D:Tom Sawyer Diving Centre|D:632-1 Third
Frydenho|D:Christiansted|D:St|D:00820

The following table describes the columns in an error log file:

Log File Column Header      Description

Transformation Name of the transformation used by a mapping where an error occurred.

Transformation Mapplet Name of the mapplet that contains the transformation. n/a appears when this
Name information is not available.

Transformation Group Name of the input or output group where an error occurred. Defaults to either
“input” or “output” if the transformation does not have a group.

Partition Index Specifies the partition number of the transformation partition where an error
occurred.

Transformation Row ID Specifies the row ID for the error row.


Error Sequence Counter for the number of errors per row in each transformation group. If a session
has multiple partitions, the Integration Service maintains this counter for each
partition.
For example, if a transformation generates three errors in partition 1 and two
errors in partition 2, ERROR_SEQ_NUM generates the values 1, 2, and 3 for
partition 1, and values 1 and 2 for partition 2.

Error Timestamp Timestamp of the Integration Service when the error occurred.

Error UTC Time Coordinated Universal Time, called Greenwich Mean Time, when the error
occurred.

Error Code Error code that corresponds to the error message.

Error Message Error message.

Error Type Type of error that occurred. The Integration Service uses the following values:
1 - Reader error
2 - Writer error
3 - Transformation error

Transformation Data Delimited string containing all column data, including the column indicator. Column
indicators are:
D - valid
O - overflow
N - null
T - truncated
B - binary
U - data unavailable
The fixed delimiter between column data and column indicator is a colon ( : ). The
delimiter between the columns is a pipe ( | ). You can override the column delimiter
in the error handling settings.

The Integration Service converts all column data to text string in the error file. For
binary data, the Integration Service uses only the column indicator.

Source Name Name of the source qualifier. N/A appears when a row error occurs downstream of
an active source that is not a source qualifier or a non pass-through partition point
with more than one partition.

Source Row ID Value that the source qualifier assigns to each row it reads. If the Integration
Service cannot identify the row, the value is -1.

Source Row Type Row indicator that tells whether the row was marked for insert, update, delete, or
reject.
0 - Insert
1 - Update
2 - Delete
3 - Reject

Source Data Delimited string containing all column data, including the column indicator. Column
indicators are:
D - valid
O - overflow
N - null
T - truncated
B - binary
U - data unavailable
The fixed delimiter between column data and column indicator is a colon ( : ). The
delimiter between the columns is a pipe ( | ). You can override the column delimiter
in the error handling settings.

The Integration Service converts all column data to text string in the error table or
error file. For binary data, the Integration Service uses only the column indicator.
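
Because double pipes delimit the error logging columns, a small program can split each data line of the error log file for ad hoc analysis. The following C++ sketch is not part of PowerCenter; it assumes the default PMError.log file name, skips the session header block and the column header line, and prints the transformation name and error message columns described in the table above.

#include <fstream>
#include <iostream>
#include <string>
#include <vector>

// Split one line on the "||" error logging column delimiter.
static std::vector<std::string> splitColumns(const std::string &line)
{
    std::vector<std::string> columns;
    std::string::size_type start = 0;
    while (true)
    {
        std::string::size_type pos = line.find("||", start);
        if (pos == std::string::npos)
        {
            columns.push_back(line.substr(start));
            break;
        }
        columns.push_back(line.substr(start, pos - start));
        start = pos + 2;
    }
    return columns;
}

int main()
{
    std::ifstream in("PMError.log");            // default error log file name
    std::string line;
    while (std::getline(in, line))
    {
        // Skip the session header block, blank lines, and the column header line.
        if (line.find("||") == std::string::npos ||
            line.compare(0, 16, "Transformation||") == 0)
            continue;

        std::vector<std::string> columns = splitColumns(line);

        // Per the table above, column 0 is Transformation and column 9 is Error Message.
        if (columns.size() > 9)
            std::cout << columns[0] << ": " << columns[9] << "\n";
    }
    return 0;
}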



Configuring Error Log Options
You configure error logging for each session on the Config Object tab of the session properties. When you
enable error logging, you can choose to create the error log in a relational database or as a flat file. If you do
not enable error logging, the Integration Service does not create an error log.
Tip: Use the Workflow Manager to create a reusable set of attributes for the Config Object tab.

To configure error logging options:

1. Double-click the Session task to open the session properties.


2. Select the Config Object tab.
3. Specify the error log type.
The following table describes the error logging settings of the Config Object tab:

Error Log Options           Description

Error Log Type              Specifies the type of error log to create. You can specify relational
                            database, flat file, or none. By default, the Integration Service does not
                            create an error log.

Error Log DB Connection     Specifies the database connection for a relational log. This option is
                            required when you enable relational database logging.

Error Log Table Name        Specifies the table name prefix for relational logs. The Integration Service
Prefix                      appends 11 characters to the prefix name. Oracle and Sybase have a 30
                            character limit for table names. If a table name exceeds 30 characters, the
                            session fails.
                            You can use a parameter or variable for the error log table name prefix. Use
                            any parameter or variable type that you can define in the parameter file.

Error Log File Directory    Specifies the directory where errors are logged. By default, the error log
                            file directory is $PMBadFilesDir\. This option is required when you enable
                            flat file logging.

Error Log File Name         Specifies the error log file name. The character limit for the error log file
                            name is 255. By default, the error log file name is PMError.log. This option
                            is required when you enable flat file logging.

Log Row Data                Specifies whether or not to log transformation row data. When you enable
                            error logging, the Integration Service logs transformation row data by
                            default. If you disable this property, n/a or -1 appears in transformation
                            row data fields.

Log Source Row Data         Specifies whether or not to log source row data. If you choose not to log
                            source row data, or if source row data is unavailable, the Integration
                            Service writes an indicator such as n/a or -1, depending on the column
                            datatype.
                            If you do not need to capture source row data, consider disabling this
                            option to increase Integration Service performance.

Data Column Delimiter       Delimiter for string type source row data and transformation group row data.
                            By default, the Integration Service uses a pipe ( | ) delimiter. Verify that
                            you do not use the same delimiter for the row data as the error logging
                            columns. If you use the same delimiter, you may find it difficult to read
                            the error log file.

4. Click OK.

CHAPTER 28

Parameter Files
This chapter includes the following topics:
♦ Overview, 403
♦ Parameter and Variable Types, 404
♦ Where to Use Parameters and Variables, 405
♦ Parameter File Structure, 410
♦ Configuring the Parameter File Name and Location, 412
♦ Parameter File Example, 415
♦ Guidelines for Creating Parameter Files, 415
♦ Troubleshooting, 416
♦ Tips, 417

Overview
A parameter file is a list of parameters and variables and their associated values. These values define properties
for a service, service process, workflow, worklet, or session. The Integration Service applies these values when
you run a workflow or session that uses the parameter file.
Parameter files provide you with the flexibility to change parameter and variable values each time you run a
session or workflow. You can include information for multiple services, service processes, workflows, worklets,
and sessions in a single parameter file. You can also create multiple parameter files and use a different file each
time you run a session or workflow. The Integration Service reads the parameter file at the start of the workflow
or session to determine the start values for the parameters and variables defined in the file. You can create a
parameter file using a text editor such as WordPad or Notepad.
Consider the following information when you use parameter files:
♦ Types of parameters and variables. You can define different types of parameters and variables in a parameter
file. These include service variables, service process variables, workflow and worklet variables, session
parameters, and mapping parameters and variables. For more information, see “Parameter and Variable
Types” on page 404.
♦ Properties you can set in parameter files. Use parameters and variables to define many properties in the
Designer and Workflow Manager. For example, you can enter a session parameter as the update override for
a relational target instance, and set this parameter to the UPDATE statement in the parameter file. The
Integration Service expands the parameter when the session runs. For information about the properties you
can set in a parameter file, see “Where to Use Parameters and Variables” on page 405.

♦ Parameter file structure. Assign a value for a parameter or variable in the parameter file by entering the
parameter or variable name and value on a single line in the form name=value. Groups of parameters and
variables must be preceded by a heading that identifies the service, service process, workflow, worklet, or
session to which the parameters or variables apply. For more information about the structure of a parameter
file, see “Parameter File Structure” on page 410.
♦ Parameter file location. Specify the parameter file to use for a workflow or session. You can enter the
parameter file name and directory in the workflow or session properties or in the pmcmd command line. For
more information, see “Configuring the Parameter File Name and Location” on page 412.

Parameter and Variable Types


A parameter file can contain different types of parameters and variables. When you run a session or workflow
that uses a parameter file, the Integration Service reads the parameter file and expands the parameters and
variables defined in the file.
You can define the following types of parameter and variable in a parameter file:
♦ Service variables. Define general properties for the Integration Service such as email addresses, log file
counts, and error thresholds. $PMSuccessEmailUser, $PMSessionLogCount, and
$PMSessionErrorThreshold are examples of service variables. The service variable values you define in the
parameter file override the values that are set in the Administration Console.
♦ Service process variables. Define the directories for Integration Service files for each Integration Service
process. $PMRootDir, $PMSessionLogDir, and $PMBadFileDir are examples of service process variables.
The service process variable values you define in the parameter file override the values that are set in the
Administration Console. If the Integration Service uses operating system profiles, the operating system user
specified in the operating system profile must have access to the directories you define for the service process
variables.
♦ Workflow variables. Evaluate task conditions and record information in a workflow. For example, you can
use a workflow variable in a Decision task to determine whether the previous task ran properly. In a
workflow, $TaskName.PrevTaskStatus is a predefined workflow variable and $$VariableName is a user-
defined workflow variable.
♦ Worklet variables. Evaluate task conditions and record information in a worklet. You can use predefined
worklet variables in a parent workflow, but you cannot use workflow variables from the parent workflow in
a worklet. In a worklet, $TaskName.PrevTaskStatus is a predefined worklet variable and $$VariableName is
a user-defined worklet variable.
♦ Session parameters. Define values that can change from session to session, such as database connections or
file names. $PMSessionLogFile is a built-in session parameter, and $ParamName is a user-defined session parameter.
♦ Mapping parameters. Define values that remain constant throughout a session, such as state sales tax rates.
When declared in a mapping or mapplet, $$ParameterName is a user-defined mapping parameter.
♦ Mapping variables. Define values that can change during a session. The Integration Service saves the value
of a mapping variable to the repository at the end of each successful session run and uses that value the next
time you run the session. When declared in a mapping or mapplet, $$VariableName is a mapping variable.
You cannot define the following types of variables in a parameter file:
♦ $Source and $Target connection variables. Define the database location for a relational source, relational
target, lookup table, or stored procedure.
♦ Email variables. Define session information in an email message such as the number of rows loaded, the
session completion time, and read and write statistics.
♦ Local variables. Temporarily store data in variable ports in Aggregator, Expression, and Rank
transformations.
♦ Built-in variables. Variables that return run-time or system information, such as Integration Service name or
system date.



♦ Transaction control variables. Define conditions to commit or roll back transactions during the processing
of database rows.
♦ ABAP program variables. Represent SAP structures, fields in SAP structures, or values in the ABAP
program.
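To make the distinction between the definable types concrete, the following lines sketch a parameter file that assigns a value to each type you can define. The service, node, folder, workflow, worklet, and session names, and all of the values, are hypothetical examples rather than required names:
[Service:IntSvc_Example]
$PMSuccessEmailUser=admin@company.com
[Service:IntSvc_Example.ND:node01]
$PMSessionLogDir=/home/infa/sesslogs
[MyFolder.WF:wf_Example]
$$WorkflowRunCount=0
[MyFolder.WF:wf_Example.WT:wl_Example]
$$WorkletThreshold=100
[MyFolder.WF:wf_Example.ST:s_Example]
$DBConnection_Source=Oracle_Dev
$$State=MA
$$LastRunDate=01/01/2005 00:00:00
In this sketch, $PMSuccessEmailUser is a service variable, $PMSessionLogDir is a service process variable, $$WorkflowRunCount and $$WorkletThreshold are user-defined workflow and worklet variables, $DBConnection_Source is a session parameter, $$State is a mapping parameter, and $$LastRunDate is a mapping variable.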

Where to Use Parameters and Variables


You can use parameters and variables to assign values to properties in the Designer and Workflow Manager and
to override some service and service process properties. For example, you can use a parameter to specify the
Lookup cache file name prefix or the default remote directory for an FTP connection.
If the property is a SQL statement or command, you can either use parameters and variables within the
statement or command, or you can enter a parameter or variable in the input field for the property, and set the
parameter or variable to the entire statement or command in the parameter file.
For example, you want to use a parameter or variable in a relational target override. You can enter a parameter
or variable within the UPDATE statement of a relational target override and define the parameter or variable
below the appropriate heading in the parameter file. Or, to define the UPDATE statement in a parameter file,
complete the following steps:
1. In the Designer, edit the target instance, enter session parameter $ParamMyOverride in the Update
Override field, and save the mapping.
2. In the Workflow Manager, configure the workflow or session to use a parameter file. For more
information, see “Configuring the Parameter File Name and Location” on page 412.
3. Set $ParamMyOverride to the SQL UPDATE statement below the appropriate heading in the parameter
file. For more information about parameter file headings, see “Parameter File Structure” on page 410.
You can also use a parameter file to override service and service process properties defined in the Administration
Console. For example, you can override the session log directory, $PMSessionLogDir. To do this, configure the
workflow or session to use a parameter file and set $PMSessionLogDir to the new file path in the parameter file.
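As a minimal sketch, the relevant section of such a parameter file might look like the following. The folder, workflow, and session names, the directory, and the UPDATE statement are placeholders for your own values:
[MyFolder.WF:wf_LoadSales.ST:s_LoadSales]
$ParamMyOverride=UPDATE T_SALES SET TOTAL_SALES = :TU.TOTAL_SALES WHERE EMP_NAME = :TU.EMP_NAME
$PMSessionLogDir=/home/infa/custom_sesslogs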
You can specify parameters and variables for the following PowerCenter objects:
♦ Sources. You can use parameters and variables in input fields related to sources. For more information, see
Table 28-1 on page 406.
♦ Targets. You can use parameters and variables in input fields related to targets. For more information, see
Table 28-2 on page 406.
♦ Transformations. You can use parameters and variables in input fields related to transformations. For more
information, see Table 28-3 on page 406.
♦ Tasks. You can use parameters and variables in input fields related to tasks in the Workflow Manager. For
more information, see Table 28-4 on page 407.
♦ Sessions. You can use parameters and variables in input fields related to Session tasks. For more
information, see Table 28-5 on page 407.
♦ Workflows. You can use parameters and variables in input fields related to workflows. For more
information, see Table 28-6 on page 409.
♦ Connections. You can use parameters and variables in input fields related to connection objects. For more
information, see Table 28-7 on page 409.
♦ Data profiling objects. You can use parameters and variables in input fields related to data profiling. For
more information, see Table 28-8 on page 410.



Table 28-1 lists the input fields related to sources where you can specify parameters and variables:

Table 28-1. Source Input Fields that Accept Parameters and Variables

Source Type | Field | Valid Parameter and Variable Types
Relational | Source Table Name* | Workflow variables, worklet variables, session parameters, mapping parameters, and mapping variables
PeopleSoft | SetID, Effective date, Tree name, Set control value, Extract date | All
TIBCO | TIB/Adapter SDK repository URL | Service and service process variables
Web Service | Endpoint URL* | Mapping parameters and variables

* You can specify parameters and variables in this field when you override it in the session properties (Mapping tab) in the Workflow Manager.

Table 28-2 lists the input fields related to targets where you can specify parameters and variables:

Table 28-2. Target Input Fields that Accept Parameters and Variables

Target Type | Field | Valid Parameter and Variable Types
Relational | Update override*, Pre- and post-session SQL commands* | All
Relational | Target Table Name* | Workflow variables, worklet variables, session parameters, mapping parameters, and mapping variables
XML | Cache directory* | Service and service process variables
TIBCO | TIB/Adapter SDK repository URL | Service and service process variables
Web Service | Endpoint URL* | Mapping parameters and variables

* You can specify parameters and variables in this field when you override it in the session properties (Mapping tab) in the Workflow Manager.

Table 28-3 lists the input fields related to transformations where you can specify parameters and variables:

Table 28-3. Transformation Input Fields that Accept Parameters and Variables

Transformation Type | Field | Valid Parameter and Variable Types
Transformations that allow you to use the Expression Editor | Transformation expressions | Mapping parameters and variables
Aggregator, Joiner, Lookup, Rank, XML Generator | Cache directory* | Service and service process variables
Custom, External Procedure, HTTP, XML Parser | Runtime location* | Service and service process variables
External Procedure | Initialization properties | Service and service process variables
HTTP | Base URL | Mapping parameters and variables
Lookup | SQL override*, Cache file name prefix* | All
Lookup | Connection information* | Session parameters $DBConnectionName and $AppConnectionName, connection variables $Source and $Target
Sorter | Default work directory* | Service and service process variables
Source Qualifier (relational source) | SQL query*, User-defined join*, Source filter condition*, Pre- and post-session SQL commands* | All
SQL | Script file name* | Mapping parameters and variables
Stored Procedure | Call text (unconnected Stored Procedure)* | All
Stored Procedure | Connection information* | Session parameter $DBConnectionName, connection variables $Source and $Target
Web Services Consumer | Endpoint URL* | Mapping parameters and variables

* You can specify parameters and variables in this field when you override it in the session properties (Mapping tab) in the Workflow Manager.

Table 28-4 lists the input fields related to Workflow Manager tasks where you can specify parameters and
variables:

Table 28-4. Task Input Fields that Accept Parameters and Variables

Task Type | Field | Valid Parameter and Variable Types
Assignment task | Assignment (user defined variables and expression) | Workflow and worklet variables
Command task | Command (name and command) | Service, service process, workflow, and worklet variables
Command task | Pre- and post-session shell commands | All
Decision task | Decision name (condition to be evaluated) | Workflow and worklet variables
Email task | Email user name, subject, and text | Service, service process, workflow, and worklet variables
Event-Wait task | File watch name (predefined events) | Service, service process, workflow, and worklet variables
Link | Link condition | Service, service process, workflow, and worklet variables
Session | See Table 28-5.
Timer task | Absolute time: Workflow date-time variable to calculate the wait | Workflow and worklet variables

Table 28-5 lists the input fields related to sessions where you can specify parameters and variables:

Table 28-5. Session Input Fields that Accept Parameters and Variables

Tab | Field | Valid Parameter and Variable Types
Properties tab | Session log file name | Built-in session parameter $PMSessionLogFile
Properties tab | Session log file directory | Service and service process variables
Properties tab | Parameter file name | Workflow and worklet variables
Properties tab | $Source and $Target connection values | Session parameters $DBConnectionName and $AppConnectionName, connection variables $Source and $Target
Properties tab | Pushdown optimization session property | Mapping parameter $$PushdownConfig
Config Object tab | Session log count | Service variable $PMSessionLogCount
Config Object tab | Session error threshold | Service variable $PMSessionErrorThreshold
Config Object tab | Table name prefix for relational error logs | All
Config Object tab | Error log file name and directory | Service variables, service process variables, workflow variables, worklet variables, session parameters
Config Object tab | Number of partitions for dynamic partitioning | Built-in session parameter $DynamicPartitionCount
Mapping tab | Transformation properties that override properties you configure in a mapping | Varies according to property. For more information, see Table 28-2 on page 406 and Table 28-3 on page 406.
Mapping tab | Relational connection values | Session parameter $DBConnectionName, connection variables $Source and $Target
Mapping tab | Queue connection values* | Session parameter $QueueConnectionName
Mapping tab | FTP connection values* | Session parameter $FTPConnectionName
Mapping tab | Application connection values* | Session parameter $AppConnectionName
Mapping tab | External loader connection values* | Session parameter $LoaderConnectionName
Mapping tab | FTP remote file name | All
Mapping tab | Lookup source file name and directory | Service variables, service process variables, workflow variables, worklet variables, session parameters
Mapping tab | Pre- and post-session SQL commands (source and target) | All
Mapping tab | Code page for file sources and targets | Workflow variables, worklet variables, session parameter $ParamName
Mapping tab | Source input file name and directory | Service variables, service process variables, workflow variables, worklet variables, session parameters
Mapping tab | Source input file command | Service variables, service process variables, workflow variables, worklet variables, session parameters
Mapping tab | Table owner name for relational sources | All
Mapping tab | Target merge file name and directory | Service variables, service process variables, workflow variables, worklet variables, session parameters
Mapping tab | Target merge command | Service variables, service process variables, workflow variables, worklet variables, session parameters
Mapping tab | Target header and footer commands | Service variables, service process variables, workflow variables, worklet variables, session parameters
Mapping tab | Target output file name and directory | Service variables, service process variables, workflow variables, worklet variables, session parameters
Mapping tab | Target reject file name and directory | Service variables, service process variables, workflow variables, worklet variables, session parameters
Mapping tab | Target table name prefix | All
Mapping tab | Teradata FastExport temporary file | Service and service process variables
Mapping tab | Control file content override for Teradata external loaders | All
Mapping tab | Recovery cache directory for WebSphere MQ, JMS, SAP ALE IDoc, TIBCO, webMethods, Web Service Provider sources | Service and service process variables
Mapping tab | Durable Subscription Name | Session parameter $ParamName
Mapping tab | MQ Source Qualifier filter condition | All
Mapping tab | SAP stage file name and directory | Service and service process variables
Mapping tab | SAP source file directory | Service and service process variables
Components tab | Post-session email (user name, subject, and text) | All
Components tab | Post-session email attachment file name | All

* You can override connection attributes for this connection type in the parameter file.

Table 28-6 lists the input fields related to workflows where you can specify parameters and variables:

Table 28-6. Workflow Input Fields that Accept Parameters and Variables

Tab | Field | Valid Parameter and Variable Types
Properties tab | Workflow log file name and directory | Service, service process, workflow, and worklet variables
Properties tab | Workflow log count | Service variable $PMWorkflowLogCount
General tab | Suspension email (user name, subject, and text) | Service, service process, workflow, and worklet variables

Table 28-7 lists the input fields related to connection objects in the Workflow Manager where you can specify
parameters and variables:

Table 28-7. Connection Object Input Fields that Accept Parameters and Variables

Connection Type | Field | Valid Parameter and Variable Types
Relational | Database user name, password* | Session parameter $ParamName
Relational: Source, Target, Lookup, Stored Procedure | Connection and transaction environment SQL | All
FTP | User name, password* for host machine | Session parameter $ParamName
FTP | Default remote directory | All
Application | Application user name, password* | Session parameter $ParamName
Application: Web Services Consumer | Endpoint URL | Session parameter $ParamName, mapping parameters and variables
Application: HTTP | Base URL | Session parameter $ParamName
Application: JMS | JMS Destination | Session parameter $ParamName
Loader | Database user name, password* | Session parameter $ParamName

* Encrypt the password in the parameter file using the pmpasswd command line program with the CRYPT_DATA encryption type.

Table 28-8 lists the input fields related to data profiling where you can specify parameters and variables:

Table 28-8. Data Profiling Input Fields that Accept Parameters and Variables

Object | Field | Valid Parameter and Variable Types
Data Profile domain | Data profiling domain value | Service and service process variables

Parameter File Structure


A parameter file contains a list of parameters and variables with assigned values. You group parameters and
variables in different sections of the parameter file. Each section is preceded by a heading that identifies the
Integration Service, Integration Service process, workflow, worklet, or session to which you want to define
parameters or variables. You define parameters and variables directly below the heading, entering each
parameter or variable on a new line. You can list parameters and variables in any order within a section.
Enter the parameter or variable definition in the form name=value. For example, the following lines assign a
value to service variable $PMSuccessEmailUser and session parameter $ParamTgtOverride:
$PMSuccessEmailUser=rsmith@mail.com
$ParamTgtOverride=UPDATE T_SALES SET DATE_SHIPPED = :TU.DATE_SHIPPED, TOTAL_SALES =
:TU.TOTAL_SALES WHERE :TU.EMP_NAME = EMP_NAME and EMP_NAME = 'MIKE SMITH'

The Integration Service interprets all characters between the beginning of the line and the first equals sign as
the parameter name and all characters between the first equals sign and the end of the line as the parameter
value. Therefore, if you enter a space between the parameter name and the equals sign, the Integration Service
interprets the space as part of the parameter name. If a line contains multiple equals signs, the Integration
Service interprets all equals signs after the first one as part of the parameter value.
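The following two hypothetical lines illustrate these rules. In the first line, the space before the equals sign becomes part of the parameter name, so a parameter named “$ParamSrcDir” (without the trailing space) is not set. In the second line, the equals signs after the first one are simply part of the value:
$ParamSrcDir =/data/source/files
$ParamFilterCondition=STATUS=5 AND REGION_ID=2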

Parameter File Sections


You can define parameters and variables in any section in the parameter file. If you define a service or service
process variable in a workflow, worklet, or session section, the variable applies to the service process that runs
the task. Similarly, if you define a workflow variable in a session section, the value of the workflow variable
applies only when the session runs.



Table 28-9 describes the parameter file headings that define each section in the parameter file and the scope of
the parameters and variables that you define in each section:

Table 28-9. Parameter File Headings

Heading | Scope
[Global] | All Integration Services, Integration Service processes, workflows, worklets, and sessions.
[Service:service name] | The named Integration Service and workflows, worklets, and sessions that this service runs.
[Service:service name.ND:node name] | The named Integration Service process and workflows, worklets, and sessions that this service process runs.
[folder name.WF:workflow name] | The named workflow and all sessions within the workflow.
[folder name.WF:workflow name.WT:worklet name] | The named worklet and all sessions within the worklet.
[folder name.WF:workflow name.WT:worklet name.WT:worklet name...] | The nested worklet and all sessions within the nested worklet.
[folder name.WF:workflow name.ST:session name] -or- [folder name.WF:workflow name.WT:worklet name.ST:session name] -or- [folder name.session name] -or- [session name] | The named session.

Create each heading only once in the parameter file. If you specify the same heading more than once in a
parameter file, the Integration Service uses the information in the section below the first heading and ignores
the information in the sections below subsequent identical headings. For example, a parameter file contains the
following identical headings:
[HET_TGTS.WF:wf_TCOMMIT1]
$$platform=windows
...
[HET_TGTS.WF:wf_TCOMMIT1]
$$platform=unix
$DBConnection_ora=Ora2

In workflow wf_TCOMMIT1, the value for mapping parameter $$platform is “windows,” not “unix,” and
session parameter $DBConnection_ora is not defined.
If you define the same parameter or variable in multiple sections in the parameter file, the parameter or variable
with the smallest scope takes precedence over parameters or variables with larger scope. For example, a
parameter file contains the following sections:
[HET_TGTS.WF:wf_TGTS_ASC_ORDR]
$DBConnection_ora=Ora2
[HET_TGTS.WF:wf_TGTS_ASC_ORDR.ST:s_TGTS_ASC_ORDR]
$DBConnection_ora=Ora3

In session s_TGTS_ASC_ORDR, the value for session parameter $DBConnection_ora is “Ora3.” In all other
sessions in the workflow, it is “Ora2.”

Comments
You can include comments in parameter files. The Integration Service ignores lines that are not valid headings
and do not contain an equals sign character (=). The following lines are examples of parameter file comments:
---------------------------------------
Created 10/11/06 by JSmith.



*** Update the parameters below this line when you run this workflow on Integration Service Int_01. ***
; This is a valid comment because this line contains no equals sign.
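Note that a line intended as a comment is ignored only if it contains no equals sign and is not a valid heading. For example, a hypothetical line such as the following would be read as a parameter assignment rather than ignored, because it contains an equals sign:
*** Last updated=10/11/06 ***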

Null Values
You can assign null values to parameters and variables in the parameter file. When you assign null values to
parameters and variables, the Integration Service obtains the value from the following places, depending on the
parameter or variable type:
♦ Service and service process variables. The Integration Service uses the value set in the Administration
Console.
♦ Workflow and worklet variables. The Integration Service uses the value saved in the repository (if the
variable is persistent), the user-specified default value, or the datatype default value.
♦ Session parameters. Session parameters do not have default values. If the Integration Service cannot find a
value for a session parameter, it may fail the session, take an empty string as the default value, or fail to
expand the parameter at run time. For example, the Integration Service fails a session where the session
parameter $DBConnectionName is not defined.
♦ Mapping parameters and variables. The Integration Service uses the value saved in the repository (mapping
variables only), the configured initial value, or the datatype default value.
To assign a null value, set the parameter or variable value to “<null>” or leave the value blank. For example, the
following lines assign null values to service process variables $PMBadFileDir and $PMCacheDir:
$PMBadFileDir=<null>
$PMCacheDir=

Sample Parameter File


The following text is an excerpt from a parameter file that contains service variables for one Integration Service
and parameters for four workflows:
----------------------------------------
File created by RSmith 11/12/2005
----------------------------------------
[Service:IntSvs_01]
$PMSuccessEmailUser=pcadmin@mail.com
$PMFailureEmailUser=pcadmin@mail.com
[HET_TGTS.WF:wf_TCOMMIT_INST_ALIAS]
$$platform=unix
[HET_TGTS.WF:wf_TGTS_ASC_ORDR.ST:s_TGTS_ASC_ORDR]
$$platform=unix
$DBConnection_ora=Ora2
$ParamAscOrderOverride=UPDATE T_SALES SET CUST_NAME = :TU.CUST_NAME, DATE_SHIPPED = :TU.DATE_SHIPPED, TOTAL_SALES = :TU.TOTAL_SALES WHERE CUST_ID = :TU.CUST_ID
[ORDERS.WF:wf_PARAM_FILE.WT:WL_PARAM_Lvl_1]
$$DT_WL_lvl_1=02/01/2005 01:05:11
$$Double_WL_lvl_1=2.2
[ORDERS.WF:wf_PARAM_FILE.WT:WL_PARAM_Lvl_1.WT:NWL_PARAM_Lvl_2]
$$DT_WL_lvl_2=03/01/2005 01:01:01
$$Int_WL_lvl_2=3
$$String_WL_lvl_2=ccccc

Configuring the Parameter File Name and Location


When you start a workflow or session, use a parameter file to pass parameter and variable values to the
Integration Service. You can specify the parameter file name and directory in the workflow or session properties
or in the pmcmd command line. If the Integration Service uses operating system profiles, the operating system
user specified in the operating system profile must have access to the parameter file.



The parameter file you use with pmcmd overrides the parameter file in the workflow or session properties. If you
do not enter a parameter file name in the pmcmd command line, the Integration Service uses the parameter file
you specify in the workflow properties for the workflow and all sessions in the workflow. If you do not enter a
parameter file name in the pmcmd command line or the workflow properties, the Integration Service uses the
parameter file you specify in the session properties.

Using a Parameter File with Workflows or Sessions


You can specify the parameter file name and directory in the workflow or session properties. Specify a workflow
or worklet variable as the session parameter file name if you configure a workflow to run concurrently, and you
want to use different parameter files for the sessions in each workflow run instance.
If you specify a parameter file for a workflow or session, and the Integration Service cannot locate the parameter
file, it fails the workflow or session.

To enter a parameter file in the workflow properties:

1. Open a Workflow in the Workflow Manager.


2. Click Workflows > Edit.
The Edit Workflow dialog box appears.
3. Click the Properties tab.
4. Enter the parameter file location and name in the Parameter Filename field.
You can enter either a direct path or a service process variable. Use the appropriate delimiter for the
Integration Service operating system. If you configured the PowerCenter environment for high availability,
include the service process variable in the path.
5. Click OK.

To enter a parameter file in the session properties:

1. Open a session in the Workflow Manager.


The Edit Tasks dialog box appears.
2. Click the Properties tab, and open the General Options settings.
3. Enter the parameter file location and name in the Parameter Filename field.
You can enter a direct path or a service process variable. Use the appropriate delimiter for the Integration
Service operating system. If you configured the PowerCenter environment for high availability, include the
service process variable in the path.
You can also enter a user-defined workflow or worklet variable. Enter a workflow or worklet variable to
define the session parameter file name in the workflow parameter file.
4. Click OK.

Using Variables to Specify Session Parameter Files


You can enter a workflow or worklet variable as the session parameter file name. Enter a workflow or worklet
variable as the session parameter file name if you configure a workflow to run concurrently and you want to
define different parameter and variable values for the sessions in each workflow run instance.
When you define a workflow parameter file and a session parameter file for a session within the workflow, the
Integration Service uses the workflow parameter file, and ignores the session parameter file. To use a variable to
define the session parameter file name, you must define the session parameter file name and set
$PMMergeSessParamFile=TRUE in the workflow parameter file. The $PMMergeSessParamFile property
causes the Integration Service to read both the session and workflow parameter files.



For example, you configured a workflow that contains three sessions to run two concurrent instances.

For the first and second workflow instances, you want the sessions to use the following session parameter files:

Session | Parameter File Name (First workflow run instance) | Parameter File Name (Second workflow run instance)
s_1 | s_1Inst1.txt | s_1Inst2.txt
s_2 | s_2Inst1.txt | s_2Inst2.txt
s_3 | s_3Inst1.txt | s_3Inst2.txt

Create workflow variables to store the session parameter file names. For example, you create user-defined
workflow variables $$s_1ParamFileName, $$s_2ParamFileName, and $$s_3ParamFileName. In the session
properties for each session, set the parameter file name to a workflow variable:

Session | Parameter File Name in Session Properties
s_1 | $$s_1ParamFileName
s_2 | $$s_2ParamFileName
s_3 | $$s_3ParamFileName

In the workflow parameter file for each workflow instance, set each workflow variable to the correct session
parameter file name, and set $PMMergeSessParamFile=TRUE.
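For example, one way to structure the workflow parameter file for the first run instance is shown in the following sketch. The folder name, workflow name, and directory are hypothetical placeholders, and the session parameter file names match the table above:
[MyFolder.WF:wf_Concurrent]
$PMMergeSessParamFile=TRUE
$$s_1ParamFileName=/home/infa/parmfiles/s_1Inst1.txt
$$s_2ParamFileName=/home/infa/parmfiles/s_2Inst1.txt
$$s_3ParamFileName=/home/infa/parmfiles/s_3Inst1.txt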
If you use a variable as the session parameter file name, and you define the same parameter or variable in both
the session and workflow parameter files, the Integration Service sets parameter and variable values according to
the following rules:
♦ When a parameter or variable is defined in the same section of the workflow and session parameter files, the
Integration Service uses the value in the workflow parameter file.
♦ When a parameter or variable is defined in both the session section of the session parameter file and the
workflow section of the workflow parameter file, the Integration Service uses the value in the session
parameter file.

Using a Parameter File with pmcmd


Use parameter files with the pmcmd startworkflow or starttask commands. These commands allow you to
specify the parameter file to use when you start a workflow or session.
The pmcmd -paramfile option defines which parameter file to use when a session or workflow runs. The
-localparamfile option defines a parameter file on a local machine that you can reference when you do not have
access to parameter files on the Integration Service machine.
The following command starts workflowA using the parameter file, myfile.txt:
pmcmd startworkflow -uv USERNAME -pv PASSWORD -s SALES:6258 -f east -w wSalesAvg -paramfile '\$PMRootDir/myfile.txt' workflowA

The following command starts taskA using the parameter file, myfile.txt:
pmcmd starttask -uv USERNAME -pv PASSWORD -s SALES:6258 -f east -w wSalesAvg -paramfile '\$PMRootDir/myfile.txt' taskA



Parameter File Example
The example in this section describes a session in which you may want to use a parameter file. The session can
be rerun with different state and time values. The example shows the parameters and variables you may want to
set, lists the parameter and variable values, and explains what to change when you rerun the session.
You have a session, s_MonthlyCalculations, in the Production folder. The session uses session parameters to
connect to source files and target databases and to write a session log file. If the session fails, the Integration
Service sends an email message to pcadmin@mail.com. The session uses a string mapping parameter, $$State,
that you set to “MA,” and a datetime mapping variable, $$Time. $$Time has an initial value of “9/30/2005
05:04:00” in the repository, but you override this value to “10/1/2005 05:04:11.”
The following table describes the parameters and variables for the s_MonthlyCalculations session:

Parameter or Variable Type | Parameter or Variable Name | Definition
Service Variable | $PMFailureEmailUser | pcadmin@mail.com
String Mapping Parameter | $$State | MA
Datetime Mapping Variable | $$Time | 10/1/2005 05:04:11
Source File (Session Parameter) | $InputFile1 | Sales.txt
Database Connection (Session Parameter) | $DBConnection_Target | Sales
Session Log File (Session Parameter) | $PMSessionLogFile | d:/session logs/firstrun.txt

The parameter file for the session includes the folder and session name and each parameter and variable:
[Production.s_MonthlyCalculations]
$PMFailureEmailUser=pcadmin@mail.com
$$State=MA
$$Time=10/1/2005 05:04:11
$InputFile1=sales.txt
$DBConnection_target=sales
$PMSessionLogFile=D:/session logs/firstrun.txt

The next time you run the session, you might edit the parameter file to change the state to MD and delete the
$$Time variable. This allows the Integration Service to use the value for the variable that the previous session
stored in the repository.

Guidelines for Creating Parameter Files


Use the following rules and guidelines when you create parameter files:
♦ List all session parameters. Session parameters do not have default values. If the Integration Service cannot
find a value for a session parameter, it may fail the session, take an empty string as the default value, or fail
to expand the parameter at run time. Session parameter names are not case sensitive.
♦ List all necessary mapping parameters and variables. Mapping parameter and variable values become start
values for parameters and variables in a mapping. Mapping parameter and variable names are not case
sensitive.
♦ Enter folder names for non-unique session names. When a session name exists more than once in a
repository, enter the folder name to indicate the location of the session.
♦ Precede parameters and variables in mapplets with the mapplet name. Use the following format:
mapplet_name.parameter_name=value
mapplet2_name.variable_name=value



♦ Use multiple parameter files. You assign parameter files to workflows, worklets, and sessions individually.
You can specify the same parameter file for all of these tasks or create multiple parameter files.
♦ When defining parameter values, do not use unnecessary line breaks or spaces. The Integration Service
interprets additional spaces as part of a parameter name or value.
♦ Use correct date formats for datetime values. Use the following date formats for datetime values:
− MM/DD/RR
− MM/DD/YYYY
− MM/DD/RR HH24:MI
− MM/DD/YYYY HH24:MI
− MM/DD/RR HH24:MI:SS
− MM/DD/YYYY HH24:MI:SS
− MM/DD/RR HH24:MI:SS.MS
− MM/DD/YYYY HH24:MI:SS.MS
− MM/DD/RR HH24:MI:SS.US
− MM/DD/YYYY HH24:MI:SS.US
− MM/DD/RR HH24:MI:SS.NS
− MM/DD/YYYY HH24:MI:SS.NS
You can use the following separators: dash (-), slash (/), backslash (\), colon (:), period (.), and space. The
Integration Service ignores extra spaces. You cannot use one- or three-digit values for year or the “HH12”
format for hour.
♦ Do not enclose parameter or variable values in quotes. The Integration Service interprets everything after
the first equals sign as part of the value.
♦ Use a parameter or variable value of the proper length for the error log table name prefix. If you use a
parameter or variable for the error log table name prefix, do not specify a prefix that exceeds 19 characters
when naming Oracle, Sybase, or Teradata error log tables. The error table names can have up to 11
characters, and Oracle, Sybase, and Teradata databases have a maximum length of 30 characters for table
names. The parameter or variable name can exceed 19 characters.

Troubleshooting
I have a section in a parameter file for a session, but the Integration Service does not seem to read it.
Make sure to enter folder and session names as they appear in the Workflow Manager. Also, use the appropriate
prefix for all user-defined session parameters.

I am trying to use a source file parameter to specify a source file and location, but the Integration Service cannot find
the source file.
Make sure to clear the source file directory in the session properties. The Integration Service concatenates the
source file directory with the source file name to locate the source file.
Also, make sure to enter a directory local to the Integration Service and to use the appropriate delimiter for the
operating system.

I am trying to run a workflow with a parameter file and one of the sessions keeps failing.
The session might contain a parameter that is not listed in the parameter file. The Integration Service uses the
parameter file to start all sessions in the workflow. Check the session properties, and then verify that all session
parameters are defined correctly in the parameter file.



I ran a workflow or session that uses a parameter file, and it failed. What parameter and variable values does the
Integration Service use during the recovery run?
For service variables, service process variables, session parameters, and mapping parameters, the Integration
Service uses the values specified in the parameter file, if they exist. If values are not specified in the parameter
file, then the Integration Service uses the value stored in the recovery storage file. For workflow, worklet, and
mapping variables, the Integration Service always uses the value stored in the recovery storage file.

Tips
Use a single parameter file to group parameter information for related sessions.
When sessions are likely to use the same database connection or directory, you might want to include them in
the same parameter file. When connections or directories change, you can update information for all sessions by
editing one parameter file.

Use pmcmd and multiple parameter files for sessions with regular cycles.
Sometimes you reuse session parameters in a cycle. For example, you might run a session against a sales database
every day, but run the same session against sales and marketing databases once a week. You can create separate
parameter files for each session run. Instead of changing the parameter file in the session properties each time
you run the weekly session, use pmcmd to specify the parameter file to use when you start the session.
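For example, using the pmcmd syntax shown earlier in this chapter, you might keep two parameter files, sales_daily.txt and sales_weekly.txt (hypothetical names), and pass the appropriate file when you start the session:
pmcmd starttask -uv USERNAME -pv PASSWORD -s SALES:6258 -f east -w wSalesAvg -paramfile '\$PMRootDir/sales_weekly.txt' taskA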

Use reject file and session log parameters in conjunction with target file or target database connection parameters.
When you use a target file or target database connection parameter with a session, you can keep track of reject
files by using a reject file parameter. You can also use the session log parameter to write the session log to the
target machine.

Use a resource to verify the session runs on a node that has access to the parameter file.
In the Administration Console, you can define a file resource for each node that has access to the parameter file
and configure the Integration Service to check resources. Then, edit the session that uses the parameter file and
assign the resource. When you run the workflow, the Integration Service runs the session with the required
resource on a node that has the resource available.
For more information about configuring the Integration Service to check resources, see the PowerCenter
Administrator Guide.

You can override initial values of workflow variables for a session by defining them in a session section.
If a workflow contains an Assignment task that changes the value of a workflow variable, the next session in the
workflow uses the latest value of the variable as the initial value for the session. To override the initial value for
the session, define a new value for the variable in the session section of the parameter file.

You can define parameters and variables using other parameters and variables.
For example, in the parameter file, you can define session parameter $PMSessionLogFile using a service process
variable as follows:
$PMSessionLogFile=$PMSessionLogDir/TestRun.txt

CHAPTER 29

External Loading
This chapter includes the following topics:
♦ Overview, 419
♦ External Loader Behavior, 420
♦ Loading to IBM DB2, 421
♦ Loading to Oracle, 425
♦ Loading to Sybase IQ, 427
♦ Loading to Teradata, 429
♦ Configuring External Loading in a Session, 439
♦ Troubleshooting, 441

Overview
You can configure a session to use IBM DB2, Oracle, Sybase IQ, and Teradata external loaders to load session
target files into their respective databases. External loaders can increase session performance by loading
information directly from a file or pipe rather than running the SQL commands to insert the same data into the
database.
You can use multiple external loaders within one session. For example, if a mapping contains two targets, you can create
a session that uses an Oracle external loader connection and a Sybase IQ external loader connection.
For information about creating external loader connections, see “External Loader Connections” on page 36.

Before You Begin


Before you run external loaders, complete the following tasks:
♦ Disable constraints. You disable constraints built into the tables receiving the data before performing the
load. For information about disabling constraints, see the database documentation.
♦ Turn off or disable database logging. To preserve high performance, you can increase commit intervals and
turn off database logging. However, to perform database recovery on failed sessions, you must have database
logging turned on.
♦ Configure code pages. IBM DB2, Oracle, Sybase IQ, and Teradata database servers must use the same code
page as the target flat file code page. The Integration Service creates the control files and target flat files
using the target flat file code page. If you use a code page other than 7-bit ASCII for the target flat file, run
the Integration Service in Unicode data movement mode.

♦ Configure the external loader connection as a resource. If the Integration Service is configured to run on a
grid, configure the external loader connection as a resource on the node where the external loader is
available. For more information, see the PowerCenter Administrator Guide.

External Loader Behavior


When you run a session that uses an external loader, the Integration Service creates a control file and a target
flat file. The control file contains information such as data format and loading instructions for the external
loader. The control file has an extension of .ctl. You can view the control file and the target flat file in the target
file directory.
When you run a session, the Integration Service deletes and recreates the target file. The external loader uses the
control file to load session output to the database. The Integration Service processes datetime data before
loading to the database in the following ways:
♦ If the session is configured to trim subseconds, the Integration Service processes datetime data with a
precision of 19.
♦ If the session is not configured to trim subseconds, the Integration Service processes datetime data based on
the precision specified in the target flat file. Precision ranges from 19 to 29. Subseconds are trimmed
according to the precision specified.
♦ If the precision specified in the target file is greater than that specified for the database, the Integration
Service limits the precision to the maximum precision specified for the database.
The Integration Service waits for all external loading to complete before it performs post-session commands,
runs external procedures, and sends post-session email.

The Integration Service writes external loader initialization and completion messages in the session log. For
more information about the external loader performance, check the external loader log. The loader saves the log
in the same directory as the target flat files. The default extension for external loader logs is .ldrlog.
The behavior of the external loader depends on how you choose to load the data. You can load data to a named
pipe or to a flat file.

Loading Data to a Named Pipe


The external loader starts to load data to the database as soon as the data appears in the pipe. The loader deletes
the named pipe as soon as it completes the load.
On UNIX, the Integration Service writes to a named pipe that is named after the configured target file name.
On Windows, the Integration Service writes data to a named pipe using the specified format:
\\.\pipe\<pipe name>

The pipe name is the same as the configured target file name.

Staging Data to a Flat File


When you stage data to a flat file on Windows or UNIX, the Integration Service writes data to a flat file, which
is named after the configured target file name. The external loader starts loading data to the target database
after the Integration Service writes all the data to the target flat file. The external loader does not delete the
target flat file after loading it to the database. Make sure the target file directory can accommodate the size of
the target flat file.
Note: The Integration Service rounds numerical values based on the scale of the port when staging data to a flat
file. It does not round results when you use an external loader that loads the data to a named pipe or if you
configure the target for a normal load.



If a session aborts or fails before the Integration Service writes all the data to the flat file target, the external
loader does not start. If a session aborts or fails after the Integration Service writes all the data to the flat file
target, the external loader completes loading data to the target database before the external loader exits.

Partitioning Sessions with External Loaders


When you configure multiple partitions in a session using a flat file target, the Integration Service creates a
separate flat file for each partition. Some external loaders cannot load data from multiple files. When you use an
external loader in a session with multiple partitions, you must configure the target partition type according to
the external loader you use.
When you use an external loader that can load data from multiple files, you can choose any partition type
available for a flat file target. You also choose an external loader connection for each partition. The Integration
Service creates an output file for each partition, and the external loader loads the output from each target file to
the database. Use any partition type for the target when you use the following loaders:
♦ Oracle, with parallel load enabled
♦ Teradata Tpump
♦ Teradata Warehouse Builder
If you use a loader that cannot load from multiple files, use round-robin partitioning to route the data to a
single target file. You choose an external loader connection for each partition. However, the Integration Service
uses the loader connection for the first partition. The Integration Service creates a single output file, and the
external loader loads the output from the target file to the database. If you choose any other partition type for
the target, the Integration Service fails the session. Use round-robin partition type for the target when you use
the following loaders:
♦ IBM DB2 EE
♦ IBM DB2 EEE Autoloader
♦ Oracle, with parallel load disabled
♦ Sybase IQ
♦ Teradata MultiLoad
♦ Teradata Fastload

Loading to IBM DB2


When you load to IBM DB2 targets, use one of the following external loaders:
♦ IBM DB2 EE external loader. Performs insert and replace operations on targets. The external loader can
also restart or terminate load operations. The IBM DB2 EE external loader invokes the db2load executable
located in the Integration Service installation directory. The IBM DB2 EE external loader can load data to
an IBM DB2 server on a machine that is remote to the Integration Service.
♦ IBM DB2 EEE Autoloader. Performs insert and replace operations on targets. The external loader can also
restart or terminate load operations. The IBM DB2 EEE external loader invokes the IBM DB2 Autoloader
program to load data. The Autoloader program uses the db2atld executable. The IBM DB2 EEE external
loader can partition data and load the partitioned data simultaneously to the corresponding database
partitions. The IBM DB2 EEE loader requires that the IBM DB2 server be on the same machine hosting the
Integration Service.
Note: If the IBM DB2 EEE server is on a machine that is remote to the Integration Service, connect to the IBM
DB2 EEE database using a relational database connection. Use database partitioning for the IBM DB2 target.
When you use database partitioning, the Integration Service queries the IBM DB2 system for table partition
information and loads partitioned data to the corresponding nodes in the target database. For more information
about database partitioning, see “Database Partitioning Partition Type” on page 284.



Rules and Guidelines
Use the following rules and guidelines when you use external loaders to load to IBM DB2:
♦ The IBM DB2 external loaders load from a delimited flat file. Verify that the target table columns are wide
enough to store all of the data.
♦ For a connection that uses IBM DB2 client authentication, enter the PmNullUser user name and
PmNullPasswd when you create the external loader connection. PowerCenter uses IBM DB2 client
authentication when the connection user name is PmNullUser and the connection is to an IBM DB2
database.
♦ For a session with multiple partitions, use the round-robin partition type to route data to a single target file.
For more information about partitioning sessions with external loaders, see “Partitioning Sessions with
External Loaders” on page 421.
♦ If you configure multiple targets in the same pipeline to use IBM DB2 external loaders, each loader must
load to a different tablespace on the target database. For more information about selecting external loaders,
see “Configuring External Loading in a Session” on page 439.
♦ You must have the correct authority levels and privileges to load data to the database tables. For more
information, see “Configuring Authorities, Privileges, and Permissions” on page 422.

Setting Operation Modes


IBM DB2 operation modes specify the type of load the external loader runs. You can configure the IBM DB2
EE or IBM DB2 EEE external loader to run in one of the following operation modes:
♦ Insert. Adds loaded data to the table without changing existing table data.
♦ Replace. Deletes all existing data from the table, and inserts the loaded data. The table and index definitions
do not change.
♦ Restart. Restarts a previously interrupted load operation.
♦ Terminate. Terminates a previously interrupted load operation and rolls back the operation to the starting
point, even if consistency points were passed. The tablespaces return to normal state, and the external loader
makes all table objects consistent.

Configuring Authorities, Privileges, and Permissions


IBM DB2 privileges allow you to create or access database resources. Authority levels allow you to group
privileges and perform higher-level database manager maintenance and utility operations. Together, these act to
control access to the database manager and its database objects. You can access objects for which you have the
required privilege or authority.
To load data into a table, you must have one of the following authorities:
♦ SYSADM authority
♦ DBADM authority
♦ LOAD authority on the database and one of the following privileges:
− INSERT privilege on the table when the load utility is invoked in insert, terminate, or restart mode.
− INSERT and DELETE privilege on the table when the load utility is invoked in replace, terminate, or
restart mode.
In addition, you must have proper read access and read/write permissions:
♦ The database instance owner must have read access to the external loader input files.
♦ If you run IBM DB2 as a service on Windows, you must configure the service start account with a user
account that has read/write permissions to use LAN resources, including drives, directories, and files.



♦ If you load to IBM DB2 EEE, the database instance owner must have write access to the load dump file and
the load temporary file.
For more information, see the IBM DB2 database documentation.

Configuring IBM DB2 EE External Loader Attributes


Table 29-1 describes attributes for IBM DB2 EE external loader connections:

Table 29-1. IBM DB2 EE External Loader Attributes

Opmode (default value: Insert). IBM DB2 external loader operation mode. Select one of the following operation modes: Insert, Replace, Restart, or Terminate. For more information about IBM DB2 operation modes, see “Setting Operation Modes” on page 422.

External Loader Executable (default value: db2load). Name of the IBM DB2 EE external loader executable file.

DB2 Server Location (default value: Remote). Location of the IBM DB2 EE database server relative to the Integration Service. Select Local if the database server resides on the machine hosting the Integration Service. Select Remote if the database server resides on another machine.

Is Staged (default value: Disabled). Method of loading data. Select Is Staged to load data to a flat file staging area before loading to the database. By default, the data is loaded to the database using a named pipe. For more information, see “External Loader Behavior” on page 420.

Recoverable (default value: Enabled). Sets tablespaces in backup pending state if forward recovery is enabled. If you disable forward recovery, the IBM DB2 tablespace is not set to backup pending state. If the IBM DB2 tablespace is in backup pending state, you must fully back up the database before you perform any other operation on the tablespace.

IBM DB2 EE External Loader Return Codes


The IBM DB2 EE external loader indicates the success or failure of a load operation with a return code. The
Integration Service writes the external loader return code to the session log. Return code (0) indicates that the
load operation succeeded. The Integration Service writes the following message to the session log if the external
loader successfully completes the load operation:
WRT_8029 External loader process <external loader name> exited successfully.

Any other return code indicates that the load operation failed. The Integration Service writes the following
error message to the session log:
WRT_8047 Error: External loader process <external loader name> exited with error <return
code>.

Table 29-2 describes the return codes for the IBM DB2 EE external loader:

Table 29-2. IBM DB2 EE External Loader Return Codes

Code | Description
0 | External loader operation completed successfully.
1 | External loader cannot locate the control file.
2 | External loader could not open the external loader log file.
3 | External loader could not access the control file because the control file is locked by another process.
4 | IBM DB2 database returned an error.

Configuring IBM DB2 EEE External Loader Attributes


You can configure the IBM DB2 EEE external loader to use different loading modes when loading to the
database. Loading modes determine how the IBM DB2 EEE external loader loads data across partitions in the
database. You can configure the IBM DB2 EEE external loader to use the following loading modes:
♦ Split and load. Partitions the data and loads it simultaneously using the corresponding database partitions.
♦ Split only. Partitions the data and writes the output to files in the specified split file directory.
♦ Load only. Does not partition the data. It loads data in existing split files using the corresponding database
partitions.
♦ Analyze. Generates an optimal partitioning map with even distribution across all database partitions. If you
run the external loader in split and load mode after you run it in analyze mode, the external loader uses the
optimal partitioning map to partition the data.
For more information about IBM DB2 loading modes, see the IBM DB2 database documentation.
The IBM DB2 EEE external loader creates multiple logs based on the number of database partitions it loads to.
For each partition, the external loader appends a number corresponding to the partition number to the external
loader log file name. The IBM DB2 EEE external loader log file format is file_name.ldrlog.partition_number.
The Integration Service does not archive or overwrite IBM DB2 EEE external loader logs. If an external loader
log of the same name exists when the external loader runs, the external loader appends new external loader log
messages to the end of the existing external loader log file. You must manually archive or delete the external
loader log files. For more information about log files generated by DB2 Autoload, see the IBM DB2
documentation.
For information about IBM DB2 EEE external loader return codes, see the IBM DB2 documentation.
Table 29-3 describes attributes for IBM DB2 EEE external loader connections:

Table 29-3. IBM DB2 EEE External Loader Attributes

Default
Attribute Description
Value

Opmode Insert IBM DB2 external loader operation mode. Select one of the following
operation modes:
- Insert
- Replace
- Restart
- Terminate
For more information about IBM DB2 operation modes, see “Setting
Operation Modes” on page 422.

External Loader db2atld Name of the IBM DB2 EEE external loader executable file.
Executable

Split File Location n/a Location of the split files. The external loader creates split files if you
configure SPLIT_ONLY loading mode.

Output Nodes n/a Database partitions on which the load operation is to be performed.

Split Nodes n/a Database partitions that determine how to split the data. If you do not
specify this attribute, the external loader determines an optimal
splitting method.

424 Chapter 29: External Loading


Table 29-3. IBM DB2 EEE External Loader Attributes

Default
Attribute Description
Value

Mode Split and Loading mode the external loader uses to load the data. Select one
load of the following loading modes:
- Split and load
- Split only
- Load only
- Analyze

Max Num Splitters 25 Maximum number of splitter processes.

Force No Forces the external loader operation to continue even if it determines


at startup time that some target partitions or tablespaces are offline.

Status Interval 100 Number of megabytes of data the external loader loads before writing
a progress message to the external loader log. Specify a value
between 1 and 4,000 MB.

Ports 6000-6063 Range of TCP ports the external loader uses to create sockets for
internal communications with the IBM DB2 server.

Check Level Nocheck Checks for record truncation during input or output.

Map File Input n/a Name of the file that specifies the partitioning map. To use a
customized partitioning map, specify this attribute. Generate a
customized partitioning map when you run the external loader in
Analyze loading mode.

Map File Output n/a Name of the partitioning map when you run the external loader in
Analyze loading mode. You must specify this attribute if you want to
run the external loader in Analyze loading mode.

Trace 0 Number of rows the external loader traces when you need to review a
dump of the data conversion process and output of hashing values.

Is Staged Disabled Method of loading data. Select Is Staged to load data to a flat file
staging area before loading to the database. Otherwise, the data is
loaded to the database using a named pipe. For more information,
see “External Loader Behavior” on page 420.

Date Format mm/dd/yyyy Date format. Must match the date format you define in the target
definition. IBM DB2 supports the following date formats:
- MM/DD/YYYY
- YYYY-MM-DD
- DD.MM.YYYY

Loading to Oracle
When you load to Oracle targets, use the Oracle SQL*Loader to perform insert, update, and delete operations
on targets.
The Oracle external loader creates a reject file for data rejected by the database. The reject file has an extension
of .ldrreject. The loader saves the reject file in the target files directory.

Rules and Guidelines


Use the following rules and guidelines when you use external loaders to load to Oracle:
♦ If you select an Oracle external loader, the default external loader executable name is sqlload. This is
accurate for most UNIX platforms, but if you use Windows, check the Oracle documentation to find the
name of the external loader executable.



♦ For a connection that uses Oracle OS Authentication, enter the PmNullUser user name and PmNullPasswd
when you create the external loader connection. PowerCenter uses Oracle OS Authentication when the
connection user name is PmNullUser and the connection is to an Oracle database.
♦ The target flat file for an Oracle external loader can be fixed-width or delimited.
♦ For optimal performance when writing to a partitioned target, select Direct Path. For more information, see
the Oracle documentation.
♦ If you configure a session to write subsecond data to a Timestamp column in an Oracle 10.x or Oracle 11.x
target, the Integration Service writes subsecond data up to microseconds by default. To ensure greater
precision, edit the control file and change the Timestamp precision. For example, specify TIMESTAMP(9)
to process nanoseconds, as shown in the sample control file fragment after this list.
♦ For optimal performance, use the following guidelines to determine settings for partitioned and non-
partitioned targets:

Target           Load Method         Parallel Load   Load Mode

Partitioned      Direct Path         enable          Append

Partitioned      Conventional Path   enable          n/a

Non-partitioned  n/a                 disable*        n/a


* If you disable parallel load, you must choose round-robin partitioning to route data to a single target file.
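
For example, to write nanosecond data to a hypothetical CREATED_AT column, you might edit the Timestamp field in
the generated control file as follows. This is an illustrative sketch only; the file, table, and column names are
placeholders, and the control file the Integration Service generates for a session will differ:
   LOAD DATA
   INFILE 'orders.out'
   APPEND
   INTO TABLE ORDERS
   FIELDS TERMINATED BY ','
   (
      ORDER_ID    CHAR,
      CREATED_AT  TIMESTAMP(9) "YYYY-MM-DD HH24:MI:SS.FF9"
   )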

Loading Multibyte Data to Oracle


When you load multibyte data to Oracle, data precision is measured in bytes for fixed-width files and in
characters for delimited files. Make sure the target table columns are wide enough to store all the data.
Oracle supports character-oriented datatypes, such as Nchar, where the precision is measured in characters. If
you use the Nchar datatype, multiply the maximum number of characters by K, where K is the maximum
number of bytes a character contains in the selected target code page. For example, if a column holds up to 20
characters and a character in the target code page can take up to 3 bytes, allow for a precision of 60 bytes. This
ensures that the Integration Service does not truncate data before loading the target file.

Configuring Oracle External Loader Attributes


Table 29-4 describes the attributes for Oracle external loader connections:

Table 29-4. Oracle External Loader Attributes

Attribute Default Value Description

Error Limit 1 Number of errors to allow before the external loader stops the
load operation.

Load Mode Append Loading mode the external loader uses to load data. Select one
of the following loading modes:
- Append
- Insert
- Replace
- Truncate

Load Method Use Conventional Path Method the external loader uses to load data. Select one of the
following load methods:
- Use Conventional Path.
- Use Direct Path (Recoverable).
- Use Direct Path (Unrecoverable).

Enable Parallel Load Enable Parallel Load Determines whether the Oracle external loader loads data in
parallel to a partitioned Oracle target table.
- Enable Parallel Load to load to partitioned targets.
- Do Not Enable Parallel Load to load to non-partitioned
targets.


Rows Per Commit 10000 For Conventional Path load method, this attribute specifies the
number of rows in the bind array for load operations. For Direct
Path load methods, this attribute specifies the number of rows
the external loader reads from the target flat file before it saves
the data to the database.

External Loader Executable sqlload Name of the external loader executable file.

Log File Name n/a Path and name of the external loader log file.

Is Staged Disabled Method of loading data. Select Is Staged to load data to a flat
file staging area before loading to the database. Otherwise, the
data is loaded to the database using a named pipe. For more
information, see “External Loader Behavior” on page 420.

Loading to Sybase IQ
When you load to Sybase IQ, use the Sybase IQ external loader to perform insert operations. The Integration
Service can load multibyte data to Sybase IQ targets. The Integration Service can write to a flat file when the
Sybase IQ server is on the same machine as the Integration Service or on a different machine. The Integration
Service can write to a named pipe if the Integration Service is local to the Sybase IQ database server.

Rules and Guidelines


Use the following rules and guidelines when you use external loaders to load to Sybase IQ:
♦ Ensure that target tables do not violate primary key constraints.
♦ Configure a Sybase IQ user with read/write access before you use a Sybase IQ external loader.
♦ Target flat files for a Sybase IQ external loader can be fixed-width or delimited.
♦ The Sybase IQ external loader cannot perform update or delete operations on targets.
♦ For a session with multiple partitions, use the round-robin partition type to route data to a single target file.
For more information about partitioning sessions with external loaders, see “Partitioning Sessions with
External Loaders” on page 421.
♦ If the Integration Service and Sybase IQ server are on different machines, map or mount a drive from the
machine hosting the Integration Service to the machine hosting the Sybase IQ server.

Loading Multibyte Data to Sybase IQ


Use the following guidelines when you load multibyte data to Sybase IQ targets.

Delimited Flat File Targets


For delimited flat files, data precision is measured in characters. When you insert multibyte character data in
the target, you do not need to allow for additional precision for multibyte data. Sybase IQ does not allow
optional quotes. You must choose None for Optional Quotes if you have a delimited target flat file.
When you load multibyte data to Sybase IQ, null characters and delimiters can be up to four bytes each. To
avoid reading the delimiter as regular characters, each byte of the delimiter must have an ASCII value of less
than 0x40.



Fixed-Width Flat File Targets
For fixed-width flat files, data precision is measured in bytes, not characters. When you load multibyte data
into a fixed-width flat file target, configure the precision to accommodate the multibyte data. The Integration
Service writes the row to the reject file if the precision is not large enough to accommodate the multibyte data.

Configuring Sybase IQ External Loader Attributes


Use the following rules and guidelines when you enter attributes for the Sybase IQ external loader connection:
♦ The connect string must contain the following attributes:
uid=user ID; pwd=password; eng=Sybase IQ database server name; links=tcpip (host=host
name; port=port number)

For example, you might use the following connect string:


uid=qasrvr65;pwd=qasrvr65;eng=SUNQA2SybaseIQ_1243;links=tcpip(host=sunqa2)

♦ The session might fail if you use quotes in the connect string.
Table 29-5 describes the attributes for Sybase IQ external loader connections:

Table 29-5. Sybase IQ External Loader Attributes

Attribute Default Value Description

Block Factor 10000 Number of records per block in the target Sybase table. The
external loader applies the Block Factor attribute to load
operations for fixed-width flat file targets only.

Block Size 50000 Size of blocks used in Sybase database operations. The
external loader applies the Block Size attribute to load
operations for delimited flat file targets only.

Checkpoint Enabled If enabled, the Sybase IQ database issues a checkpoint after
successfully loading the table. If disabled, the database issues
no checkpoints.

Notify Interval 1000 Number of rows the Sybase IQ external loader loads before it
writes a status message to the external loader log.

Server Datafile Directory n/a Location of the flat file target. You must specify this attribute
relative to the database server installation directory.
If the directory is in a Windows system, use a backslash (\) in
the directory path:
D:\mydirectory\inputfile.out
If the directory is in a UNIX system, use a forward slash (/) in
the directory path:
/mydirectory/inputfile.out
Enter the target file directory path using the syntax for the
machine hosting the database server installation. For example,
if the Integration Service is on a Windows machine and the
Sybase IQ server is on a UNIX machine, use UNIX syntax.

External Loader dbisql Name of the Sybase IQ external loader executable. When you
Executable create a Sybase IQ external loader connection, the Workflow
Manager sets the name of the external loader executable file to
dbisql by default. If you use an executable file with a different
name, for example, dbisqlc, you must update the External
Loader Executable field. If the external loader executable file
directory is not in the system path, you must enter the file path
and file name in this field.

Is Staged Enabled Method of loading data. Select Is Staged to load data to a flat
file staging area before loading to the database. Clear the
attribute to load data from a named pipe. The Integration
Service can write to a named pipe if the Integration Service is
local to the Sybase IQ database. For more information, see
“External Loader Behavior” on page 420.



Loading to Teradata
When you load to Teradata targets, use one of the following external loaders:
♦ MultiLoad. Performs insert, update, delete, and upsert operations for large volume incremental loads. Use
this loader when you run a session with a single partition. MultiLoad acquires table-level locks, making it
appropriate for offline loading. For more information about configuring the MultiLoad external loader
connection object, see “Configuring Teradata MultiLoad External Loader Attributes” on page 431.
♦ TPump. Performs insert, update, delete, and upsert operations for relatively low volume updates. Use this
loader when you run a session with multiple partitions. TPump acquires row-hash locks on the table,
allowing other users to access the table as TPump loads to it. For more information about configuring the
TPump external loader connection object, see “Configuring Teradata TPump External Loader Attributes”
on page 433.
♦ FastLoad. Performs insert operations for high volume initial loads, or for high volume truncate and reload
operations. Use this loader when you run a session with a single partition. Use this loader on empty tables
with no secondary indexes. For more information about configuring the FastLoad external loader
connection object, see “Configuring Teradata FastLoad External Loader Attributes” on page 435.
♦ Warehouse Builder. Performs insert, update, upsert, and delete operations on targets. Use this loader when
you run a session with multiple partitions. You can achieve the functionality of the other loaders based on
the operator you use. For more information about configuring the Warehouse Builder external loader
connection object, see “Configuring Teradata Warehouse Builder Attributes” on page 437.
If you use a Teradata external loader to perform update or upsert operations, use the Target Update Override
option in the Mapping Designer to override the UPDATE statement in the external loader control file. For
upsert, the INSERT statement in the external loader control file remains unchanged. For more information
about using the Target Update Override option, see the PowerCenter Designer Guide.
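
For example, for a hypothetical target named T_ORDERS that is keyed on ORDER_ID, a Target Update Override might
look like the following. The table and port names are placeholders:
   UPDATE T_ORDERS
   SET ORDER_STATUS = :TU.ORDER_STATUS, UPDATED_AT = :TU.UPDATED_AT
   WHERE ORDER_ID = :TU.ORDER_ID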

Rules and Guidelines


Use the following rules and guidelines when you use external loaders to load to Teradata:
♦ The Integration Service can use Teradata external loaders to load fixed-width and delimited flat files to a
Teradata database. Since all Teradata loaders delimit individual records using the line-feed (\n) character,
you cannot use the line-feed character as a delimiter for Teradata loaders.
♦ If a session contains one partition, the target output file name, including the file extension, must not exceed
27 characters. If the session contains multiple partitions, the target output file name, including the file
extension, must not exceed 25 characters.
♦ Use the Teradata external loaders to load multibyte data.
♦ You cannot use the Teradata external loaders to load binary data.
♦ When you load to Teradata using named pipes, set the checkpoint value to 0 to prevent external loaders
from performing checkpoint operations.
♦ You can specify error, log, or work table names, depending on the loader you use. You can also specify error,
log, or work database names.
♦ You can override the control file in the session properties.
♦ When you use Teradata, you can enter PmNullPasswd as the database password to prevent the password
from appearing in the control file. Instead, the Integration Service writes an empty string for the password in
the control file.
For more information about Teradata external loaders, see the Teradata documentation.

Overriding the Control File


When you edit the loader connection in a session, you can override the control file. You might want to override
the control file to change some loader properties that you cannot edit in the loader connection. For example,
you can specify the tracing option in the control file.



When you override the control file, the Workflow Manager saves the control file to the repository. The
Integration Service uses the saved control file when you run the session and for each subsequent session run
until you clear the control file attribute. If you change a target or loader connection setting after you edit the
control file, the control file does not include those changes. To include those changes, you must generate the
control file again and edit it.
If you do not override the control file, the Integration Service generates a new control file based on the session
and loader properties each time you run a session. The Integration Service generates the control file in the
output file directory and overwrites it each time you run the session.
Note: The Workflow Manager does not validate the control file syntax. Teradata verifies the control file syntax
when you run a session. If the control file is invalid, the session fails.
You can view the edited control file by opening the Control File Editor.

To override a control file:

1. In the Workflow Manager, open the session properties.


2. Click the Mapping tab and open the Transformations view.
3. Click the Targets node.
4. In the Connections settings, in the Value field, click Change.
5. In the Control File Content Override field, click Open.
The Control File Editor dialog box appears.
6. Click Generate.
The Workflow Manager generates the control file based on the session and loader properties.
7. Edit the generated control file and click OK to save the changes.
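
For reference, a Teradata MultiLoad control file generally has the following shape. This is an illustrative sketch
only, not the file the Integration Service generates; the database, table, layout, and field names are placeholders,
and the generated file for a session depends on the target definition and the loader connection attributes:
   .LOGTABLE mydb.ML_t_orders;
   .LOGON tdpid/load_user,load_password;
   .BEGIN IMPORT MLOAD
      TABLES mydb.t_orders
      ERRLIMIT 0
      CHECKPOINT 10000
      SESSIONS 1
      TENACITY 10000
      SLEEP 6;
   .LAYOUT INPUT_LAYOUT;
   .FIELD ORDER_ID   * CHAR(10);
   .FIELD ORDER_DATE * CHAR(10);
   .DML LABEL INSERT_ORDERS;
   INSERT INTO mydb.t_orders (ORDER_ID, ORDER_DATE)
   VALUES (:ORDER_ID, :ORDER_DATE);
   .IMPORT INFILE t_orders.out
      LAYOUT INPUT_LAYOUT
      APPLY INSERT_ORDERS;
   .END MLOAD;
   .LOGOFF;
When you override the control file, you typically adjust options in the .BEGIN block or add loader-specific
settings and leave the generated layout and DML sections intact.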

Creating User Variables in the Control File


When you configure MultiLoad or TPump external loader attributes, you can create user variables. User
variables are custom-defined substitution variables that you use in the control file. User variables capture
session specific information that may not be available in the connection object attributes. User variables are
often used for pre- or post-load processing.
You define the user variable name and substitution value in the connection object. In the control file, you add
the substitution variable prefix and the user variable name to the corresponding command. When you run the
session, the Integration Service replaces the substitution variable prefix and the user variable name in the
control file with the substitution value. If you change the substitution value after you edit the control file, the
control file uses the new value.
Use the following rules and guidelines when you create user variables:
♦ When you create the user variable, use the following syntax:
<User_Variable_Name>=<Substitution_Value>

♦ If you include spaces in the user variable name or the substitution value, the session may fail.
♦ When you add the user variable to the control file, use the following syntax:
:CF.<User_Variable_Name>

Example
After the Integration Service loads data to the target, you want to display the system date to an output file. In
the connection object, you configure the following user variable:
OutputFileName=output_file.txt



In the control file, you configure the following:
DISPLAY '&SYSDATE' TO FILE ':CF.OutputFileName'

When you run the session, the Integration Service replaces :CF.OutputFileName with output_file.txt in the
control file.

Configuring Teradata MultiLoad External Loader Attributes


Use the following rules and guidelines when you work with the MultiLoad external loader:
♦ You can perform insert, update, delete, and upsert operations on targets. You can also use data driven mode
to perform insert, update, or delete operations based on an Update Strategy or Custom transformation.
♦ For a session with multiple partitions, use the round-robin partition type to route data to a single target file.
For more information about partitioning sessions with external loaders, see “Partitioning Sessions with
External Loaders” on page 421.
♦ If you invoke a greater number of sessions than the maximum number of concurrent sessions the database
allows, the session may hang. You can set the minimum value for Tenacity and Sleep to ensure that sessions
fail rather than hang.
To configure attributes for the Teradata MultiLoad external loader, click Connections > Loader, select the
Type, and click Edit.
Table 29-6 shows the attributes that you configure for the Teradata MultiLoad external loader:

Table 29-6. Teradata MultiLoad External Loader Attributes

Attribute Default Value Description

TDPID n/a Teradata database ID.

Database Name n/a Optional database name. If you do not specify a database name, the
Integration Service uses the target table database name defined in the
mapping.

Date Format n/a Date format. The date format in the connection object must match the
date format you define in the target definition. The Integration Service
supports the following date formats:
- DD/MM/YYYY
- MM/DD/YYYY
- YYYY/DD/MM
- YYYY/MM/DD

Error Limit 0 Total number of rejected records that MultiLoad can write to the MultiLoad
error tables. Uniqueness violations do not count as rejected records.
An error limit of 0 means that there is no limit on the number of rejected
records.

Checkpoint 10,000 Interval between checkpoints. You can set the interval to the following
values:
- 60 or more. MultiLoad performs a checkpoint operation after it processes
each multiple of that number of records.
- 1–59. MultiLoad performs a checkpoint operation at the specified
interval, in minutes.
- 0. MultiLoad does not perform any checkpoint operation during the
import task.

Tenacity 10,000 Amount of time, in hours, MultiLoad tries to log in to the required sessions.
If a login fails, MultiLoad delays for the number of minutes specified in the
Sleep attribute, and then retries the login. MultiLoad keeps trying until the
login succeeds or the number of hours specified in the Tenacity attribute
elapses.


Load Mode Upsert Mode to generate SQL commands: Insert, Delete, Update, Upsert, or Data
Driven.
When you select Data Driven loading, the Integration Service follows
instructions in an Update Strategy or Custom transformation to determine
how to flag rows for insert, delete, or update. The Integration Service
writes a column in the target file or named pipe to indicate the update
strategy. The control file uses these values to determine how to load data
to the target. The Integration Service uses the following values to indicate
the update strategy:
0 - Insert
1 - Update
2 - Delete

Drop Error Tables Enabled Drops the MultiLoad error tables before beginning the next session.
Select this option to drop the tables, or clear it to keep them.

External Loader mload Name and optional file path of the Teradata external loader executable. If
Executable the external loader executable directory is not in the system path, you
must enter the full path.

Max Sessions 1 Maximum number of MultiLoad sessions per MultiLoad job. Max Sessions
must be between 1 and 32,767.
Running multiple MultiLoad sessions causes the client and database to
use more resources. Therefore, setting this value to a small number may
improve performance.

Sleep 6 Number of minutes MultiLoad waits before retrying a login. MultiLoad tries
until the login succeeds or the number of hours specified in the Tenacity
attribute elapses.
Sleep must be greater than 0. If you specify 0, MultiLoad issues an error
message and uses the default value, 6 minutes.

Is Staged Disabled Method of loading data. Select Is Staged to load data to a flat file staging
area before loading to the database. Otherwise, the data is loaded to the
database using a named pipe. For more information, see “External Loader
Behavior” on page 420.

Error Database n/a Error database name. Use this attribute to override the default error
database name. If you do not specify a database name, the Integration
Service uses the target table database.

Work Table n/a Work table database name. Use this attribute to override the default work
Database table database name. If you do not specify a database name, the
Integration Service uses the target table database.

Log Table n/a Log table database name. Use this attribute to override the default log
Database table database name. If you do not specify a database name, the
Integration Service uses the target table database.

User Variables n/a User-defined variable used in the default control file. For more
information, see “Creating User Variables in the Control File” on
page 430.



Table 29-7 shows the attributes that you configure when you override the Teradata MultiLoad external loader
connection object in the session properties:

Table 29-7. Teradata MultiLoad External Loader Attributes Defined at the Session Level

Attribute Default Value Description

Error Table 1 n/a Table name for the first error table. Use this attribute to override the
default error table name. If you do not specify an error table name,
the Integration Service uses ET_<target_table_name>.

Error Table 2 n/a Table name for the second error table. Use this attribute to override
the default error table name. If you do not specify an error table
name, the Integration Service uses UV_<target_table_name>.

Work Table n/a Work table name overrides the default work table name. If you do not
specify a work table name, the Integration Service uses
WT_<target_table_name>.

Log Table n/a Log table name overrides the default log table name. If you do not
specify a log table name, the Integration Service uses
ML_<target_table_name>.

Control File Content n/a Control file text. Use this attribute to override the control file the
Override Integration Service uses when it loads to Teradata. For more
information, see “Overriding the Control File” on page 429.

For more information about these attributes, see the Teradata documentation.

Configuring Teradata TPump External Loader Attributes


You can perform insert, update, delete, and upsert operations on targets. You can also use data driven mode to
perform insert, update, or delete operations based on an Update Strategy or Custom transformation.
If you run a session with multiple partitions, select a Teradata TPump external loader for each partition.
To configure attributes for the Teradata TPump external loader, click Connections > Loader, select the Type,
and click Edit.
Table 29-8 shows the attributes that you configure for the Teradata TPump external loader:

Table 29-8. Teradata TPump External Loader Attributes

Attribute Default Value Description

TDPID n/a Teradata database ID.

Database Name n/a Optional database name. If you do not specify a database name, the
Integration Service uses the target table database name defined in the
mapping.

Error Limit 0 Limits the number of rows rejected for errors. When the error limit is
exceeded, TPump rolls back the transaction that causes the last error.
An error limit of 0 causes TPump to stop processing after any error.

Checkpoint 15 Number of minutes between checkpoints. You must set the checkpoint
to a value between 0 and 60.

Tenacity 4 Amount of time, in hours, TPump tries to log in to the required
sessions. If a login fails, TPump delays for the number of minutes
specified in the Sleep attribute, and then retries the login. TPump
keeps trying until the login succeeds or the number of hours specified
in the Tenacity attribute elapses.
To disable Tenacity, set the value to 0.


Load Mode Upsert Mode to generate SQL commands: Insert, Delete, Update, Upsert, or
Data Driven.
When you select Data Driven loading, the Integration Service follows
instructions in an Update Strategy or Custom transformation to
determine how to flag rows for insert, delete, or update. The
Integration Service writes a column in the target file or named pipe to
indicate the update strategy. The control file uses these values to
determine how to load data to the database. The Integration Service
uses the following values to indicate the update strategy:
0 - Insert
1 - Update
2 - Delete

Drop Error Tables Enabled Drops the TPump error tables before beginning the next session.
Select this option to drop the tables, or clear it to keep them.

External Loader tpump Name and optional file path of the Teradata external loader
Executable executable. If the external loader executable directory is not in the
system path, you must enter the full path.

Max Sessions 1 Maximum number of TPump sessions per TPump job. Each partition in
a session starts its own TPump job. Running multiple TPump sessions
causes the client and database to use more resources. Therefore,
setting this value to a small number may improve performance.

Sleep 6 Number of minutes TPump waits before retrying a login. TPump tries
until the login succeeds or the number of hours specified in the
Tenacity attribute elapses.

Packing Factor 20 Number of rows that each session buffer holds. Packing improves
network/channel efficiency by reducing the number of sends and
receives between the target flat file and the Teradata database.

Statement Rate 0 Initial maximum rate, per minute, at which the TPump executable
sends statements to the Teradata database. If you set this attribute to
0, the statement rate is unspecified.

Serialize Disabled Determines whether or not operations on a given key combination
(row) occur serially.
You may want to enable this if the TPump job contains multiple
changes to one row. Sessions that contain multiple partitions with the
same key range but different filter conditions may cause multiple
changes to a single row. In this case, you may want to enable Serialize
to prevent locking conflicts in the Teradata database, especially if you
set the Pack attribute to a value greater than 1.
If you enable Serialize, the Integration Service uses the primary key
specified in the target table as the Key column. If no primary key exists
in the target table, you must either clear this option or indicate the Key
column in the data layout section of the control file.

Robust Disabled When Robust is not selected, it signals TPump to use simple restart
logic. In this case, restarts cause TPump to begin at the last
checkpoint. TPump reloads any data that was loaded after the
checkpoint. This method does not have the extra overhead of the
additional database writes in the robust logic.

No Monitor Enabled When selected, this attribute prevents TPump from checking for
statement rate changes from, or updating status information for, the
TPump monitor application.

Is Staged Disabled Method of loading data. Select Is Staged to load data to a flat file
staging area before loading to the database. Otherwise, the data is
loaded to the database using a named pipe. For more information, see
“External Loader Behavior” on page 420.


Error Database n/a Error database name. Use this attribute to override the default error
database name. If you do not specify a database name, the Integration
Service uses the target table database.

Log Table Database n/a Log table database name. Use this attribute to override the default log
table database name. If you do not specify a database name, the
Integration Service uses the target table database.

User Variables n/a User-defined variable used in the default control file. For more
information, see “Creating User Variables in the Control File” on
page 430.

Table 29-9 shows the attributes that you configure when you override the Teradata TPump external loader
connection object in the session properties:

Table 29-9. Teradata TPump External Loader Attributes Defined at the Session Level

Attribute Default Value Description

Error Table n/a Error table name. Use this attribute to override the default error table
name. If you do not specify an error table name, the Integration
Service uses ET_<target_table_name><partition_number>.

Log Table n/a Log table name. Use this attribute to override the default log table
name. If you do not specify a log table name, the Integration Service
uses TL_<target_table_name><partition_number>.

Control File Content n/a Control file text. Use this attribute to override the control file the
Override Integration Service uses when it loads to Teradata. For more
information, see “Overriding the Control File” on page 429.

For more information about these attributes, see the Teradata documentation.

Configuring Teradata FastLoad External Loader Attributes


Use the following guidelines when you work with the FastLoad external loader:
♦ Each FastLoad job loads data to one Teradata database table. If you want to load data to multiple tables
using FastLoad, you must create multiple FastLoad jobs.
♦ For a session with multiple partitions, use the round-robin partition type to route data to a single target file.
For more information about partitioning sessions with external loaders, see “Partitioning Sessions with
External Loaders” on page 421.
♦ The target table must be empty with no defined secondary indexes.
♦ FastLoad does not load duplicate rows from the output file to the target table in the Teradata database if the
target table has a primary key.
♦ If you load date values to the target table, you must configure the date format for the column in the target
table in the format YYYY-MM-DD.
♦ You cannot use FastLoad to load binary data.
♦ You can use comma (,), tab (\t), and pipe ( | ) as delimiters.
To configure attributes for the Teradata FastLoad external loader, click Connections > Loader, select the Type,
and click Edit.



Table 29-10 shows the attributes that you configure for the Teradata FastLoad external loader:

Table 29-10. Teradata FastLoad External Loader Attributes

Attribute Default Value Description

TDPID n/a Teradata database ID.

Database Name n/a Database name.

Error Limit 1,000,000 Maximum number of rows that FastLoad rejects before it stops loading
data to the database table.

Checkpoint 0 Number of rows transmitted to the Teradata database between
checkpoints. If processing stops while a FastLoad job is running, you can
restart the job at the most recent checkpoint.
If you enter 0, FastLoad does not perform checkpoint operations.

Tenacity 4 Number of hours FastLoad tries to log in to the required FastLoad
sessions when the maximum number of load jobs are already running on
the Teradata database. When FastLoad tries to log in to a new session,
and the Teradata database indicates that the maximum number of load
sessions is already running, FastLoad logs off all new sessions that were
logged in, delays for the number of minutes specified in the Sleep
attribute, and then retries the login. FastLoad keeps trying until it logs in
for the required number of sessions or exceeds the number of hours
specified in the Tenacity attribute.

Drop Error Tables Enabled Drops the FastLoad error tables before beginning the next session.
FastLoad will not run if non-empty error tables exist from a prior job.
Select this option to drop the tables, or clear it to keep them.

External Loader fastload Name and optional file path of the Teradata external loader executable. If
Executable the external loader executable directory is not in the system path, you
must enter the full path.

Max Sessions 1 Maximum number of FastLoad sessions per FastLoad job. Max Sessions
must be between 1 and the total number of access module processes
(AMPs) on the system.

Sleep 6 Number of minutes FastLoad pauses before retrying a login. FastLoad
tries until the login succeeds or the number of hours specified in the
Tenacity attribute elapses.

Truncate Target Disabled Truncates the target database table before beginning the FastLoad job.
Table FastLoad cannot load data to non-empty tables.

Is Staged Disabled Method of loading data. Select Is Staged to load data to a flat file staging
area before loading to the database. Otherwise, the data is loaded to the
database using a named pipe. For more information, see “External Loader
Behavior” on page 420.

Error Database n/a Error database name. Use this attribute to override the default error
database name. If you do not specify a database name, the Integration
Service uses the target table database.

Table 29-11 shows the attributes that you configure when you override the Teradata FastLoad external loader
connection object in the session properties:

Table 29-11. Teradata FastLoad External Loader Attributes Defined at the Session Level

Attribute Default Value Description

Error Table 1 n/a Table name for the first error table overrides the default error table name.
If you do not specify an error table name, the Integration Service uses
ET_<target_table_name>.


Error Table 2 n/a Table name for the second error table overrides the default error table
name. If you do not specify an error table name, the Integration Service
uses UV_<target_table_name>.

Control File n/a Control file text. Use this attribute to override the control file the
Content Override Integration Service uses when it loads to Teradata. For more information,
see “Overriding the Control File” on page 429.

For more information about these attributes, see the Teradata documentation.

Configuring Teradata Warehouse Builder Attributes


Teradata Warehouse Builder uses operators to load data. Operators allow the Teradata Warehouse Builder to
achieve the functionality of FastLoad, MultiLoad, or TPump.
If you run a session with multiple partitions, use a Warehouse Builder external loader to load the output files to
a Teradata database. You must select a Teradata Warehouse Builder external loader for each partition. For
information about selecting external loaders, see “Configuring External Loading in a Session” on page 439.
Table 29-12 shows the operators and protocol for each Teradata Warehouse Builder operator:

Table 29-12. Teradata Warehouse Builder Operators and Protocol

Operator Protocol

Load Uses FastLoad protocol. Load attributes are described in Table 29-13. For more
information about how FastLoad works, see “Configuring Teradata FastLoad External
Loader Attributes” on page 435.

Update Uses MultiLoad protocol. Update attributes are described in Table 29-13. For more
information about how MultiLoad works, see “Configuring Teradata MultiLoad External
Loader Attributes” on page 431.

Stream Uses TPump protocol. Stream attributes are described in Table 29-13. For more
information about how TPump works, see “Configuring Teradata TPump External Loader
Attributes” on page 433.

Each Teradata Warehouse Builder operator has associated attributes. Not all attributes available for FastLoad,
MultiLoad, and TPump external loaders are available for Teradata Warehouse Builder.
Table 29-13 shows the attributes that you configure for Teradata Warehouse Builder:

Table 29-13. Teradata Warehouse Builder External Loader Attributes

Attribute Default Value Description

TDPID n/a Teradata database ID.

Database Name n/a Database name.

Error Database Name n/a Name of the error database.

Operator Update Warehouse Builder operator used to load the data. Select Load, Update, or
Stream.

Max instances 4 Maximum number of parallel instances for the defined operator.

Error Limit 0 Maximum number of rows that Warehouse Builder rejects before it stops
loading data to the database table.


Checkpoint 0 Number of rows transmitted to the Teradata database between checkpoints.
If processing stops while a Warehouse Builder job is running, you can
restart the job at the most recent checkpoint.
If you enter 0, Warehouse Builder does not perform checkpoint operations.

Tenacity 4 Number of hours Warehouse Builder tries to log in to the Warehouse Builder
sessions when the maximum number of load jobs are already running on
the Teradata database. When Warehouse Builder tries to log in for a new
session, and the Teradata database indicates that the maximum number of
load sessions is already running, Warehouse Builder logs off all new
sessions that were logged in, delays for the number of minutes specified in
the Sleep attribute, and then retries the login. Warehouse Builder keeps
trying until it logs in for the required number of sessions or exceeds the
number of hours specified in the Tenacity attribute.
To disable Tenacity, set the value to 0.

Load Mode Upsert Mode to generate SQL commands. Select Insert, Update, Upsert, Delete, or
Data Driven.
When you use the Update or Stream operators, you can choose Data
Driven load mode. When you select data driven loading, the Integration
Service follows instructions in Update Strategy or Custom transformations
to determine how to flag rows for insert, delete, or update. The Integration
Service writes a column in the target file or named pipe to indicate the
update strategy. The control file uses these values to determine how to load
data to the database. The Integration Service uses the following values to
indicate the update strategy:
0 - Insert
1 - Update
2 - Delete

Drop Error Tables Enabled Drops the Warehouse Builder error tables before beginning the next
session. Warehouse Builder will not run if error tables containing data exist
from a prior job. Clear the option to keep error tables.

Truncate Target Disabled Specifies whether to truncate target tables. Enable this option to truncate
Table the target database table before beginning the Warehouse Builder job.

External Loader tbuild Name and optional file path of the Teradata external loader executable file.
Executable If the external loader directory is not in the system path, enter the file path
and file name.

Max Sessions 4 Maximum number of Warehouse Builder sessions per Warehouse Builder
job. Max Sessions must be between 1 and the total number of access
module processes (AMPs) on the system.

Sleep 6 Number of minutes Warehouse Builder pauses before retrying a login.
Warehouse Builder tries until the login succeeds or the number of hours
specified in the Tenacity attribute elapses.

Serialize Disabled Specifies whether operations on a column occur serially.
Available with Update and Stream operators.

Packing Factor 20 Number of rows that each session buffer holds. Packing improves
network/channel efficiency by reducing the number of sends and receives
between the target file and the Teradata database. Available with Stream
operator.

Robust Disabled Recovery or restart mode. When you disable Robust, the Stream operator
uses simple restart logic. The Stream operator reloads any data that was
loaded after the last checkpoint.
When you enable Robust, Warehouse Builder uses robust restart logic. In
robust mode, the Stream operator determines how many rows were
processed since the last checkpoint. The Stream operator processes all the
rows that were not processed after the last checkpoint. Available with
Stream operator.


Is Staged Disabled Method of loading data. Select Is Staged to load data to a flat file staging
area before loading to the database. Otherwise, the data is loaded to the
database using a named pipe. For more information, see “External Loader
Behavior” on page 420.

Error Database n/a Error database name. Use this attribute to override the default error
database name. If you do not specify a database name, the Integration
Service uses the target table database.

Work Table n/a Work table database name. Use this attribute to override the default work
Database table database name. If you do not specify a database name, the
Integration Service uses the target table database.

Log Table n/a Log table database name. Use this attribute to override the default log table
Database database name. If you do not specify a database name, the Integration
Service uses the target table database.
Note: Available attributes depend on the operator you select.

Table 29-14 shows the attributes that you configure when you override the Teradata Warehouse Builder
external loader connection object in the session properties:

Table 29-14. Teradata Warehouse Builder External Loader Attributes Defined for Sessions

Attribute Default Value Description

Error Table 1 n/a Table name for the first error table. Use this attribute to override the default
error table name. If you do not specify an error table name, the Integration
Service uses ET_<target_table_name>.

Error Table 2 n/a Table name for the second error table. Use this attribute to override the
default error table name. If you do not specify an error table name, the
Integration Service uses UV_<target_table_name>.

Work Table n/a Work table name. This attribute overrides the default work table name. If
you do not specify a work table name, the Integration Service uses
WT_<target_table_name>.

Log Table n/a Log table name. This attribute overrides the default log table name. If you
do not specify a log table name, the Integration Service uses
RL_<target_table_name>.

Control File n/a Control file text. This attribute overrides the control file the Integration
Content Override Service uses when it loads to Teradata. For more information, see “Overriding
the Control File” on page 429.
Note: Available attributes depend on the operator you select.

For more information about these attributes, see the Teradata documentation.

Configuring External Loading in a Session


Before you can configure external loading in a session, you must create an external loader connection in the
Workflow Manager and configure the external loader attributes. For more information about creating external
loader connections, see “External Loader Connections” on page 36.
Complete the following steps to use an external loader for a session:



1. Configure the session to write to flat file instead of to a relational database. For more information, see
“Configuring a Session to Write to a File” on page 440.
2. Configure the file properties. For more information, see “Configuring File Properties” on page 440.
3. Select an external loader connection in the session properties. For more information, see “Selecting an
External Loader Connection” on page 441.

Configuring a Session to Write to a File


To use an external loader, create the target definition in the mapping according to the target database type. The
session configures a relational target type by default. To select an external loader connection, you must
configure the session to write to a file instead of a relational target. To configure the session to write to a file,
change the writer type from relational writer to file writer. You change the writer type using the Writers settings
on the Mapping tab.
To change the writer type for the target, select the target instance and change the writer type from Relational
Writer to File Writer.

Configuring File Properties


After you configure the session to write to a file, you can set the file properties. You need to specify the output
file name and directory, and the reject file name and directory. You configure these properties in the Properties
settings on the Mapping tab. To set the file properties, select the target instance.
Table 29-15 shows the attributes in Properties settings:

Table 29-15. Properties Settings

Attribute Description

Output File Directory Name and path of the output file directory. Enter the directory name in this field.
By default, the Integration Service writes output files to the directory
$PMTargetFileDir.
If you enter a full directory and file name in the Output Filename field, clear this
field. External loader sessions may fail if you use double spaces in the path for
the output file.

Output Filename Name of the output file. Enter the file name, or file name and path. By default, the
Workflow Manager names the target file based on the target definition used in
the mapping: target_name.out. External loader sessions may fail if you use
double spaces in the path for the output file.

Reject File Directory Name and path of the reject file directory. By default, the Integration Service
writes all reject files to the directory $PMBadFileDir.
If you enter a full directory and file name in the Reject Filename field, clear this
field.

Reject Filename Name of the reject file. Enter the file name, or file name and directory. The
Integration Service appends information in this field to that entered in the Reject
File Directory field. For example, if you have “C:/reject_file/” in the Reject File
Directory field, and enter “filename.bad” in the Reject Filename field, the
Integration Service writes rejected rows to C:/reject_file/filename.bad.
By default, the Integration Service names the reject file after the target instance
name: target_name.bad.
You can also enter a reject file session parameter to represent the reject file or
the reject file and directory. Name all reject file parameters $BadFileName. For
more information about session parameters, see “Parameter Files” on page 403.

Set File Properties Definition of flat file properties. When you use an external loader, you must
define the flat file properties by clicking the Set File Properties link.
For Oracle external loaders, the target flat file can be fixed-width or delimited.
For Sybase IQ external loaders, the target flat file can be fixed-width or delimited.
For Teradata external loaders, the target flat file must be fixed-width.
For IBM DB2 external loaders, the target flat file must be delimited.
For more information, see “Configuring Fixed-Width Properties” on page 181 and
“Configuring Delimited Properties” on page 182.



Note: Do not select Merge Partitioned Files or enter a merge file name. You cannot merge partitioned output
files when you use an external loader.

Selecting an External Loader Connection


After you configure file properties, you can select the external loader connection. To select the external loader
connection, choose the connection type and the connection object. You configure connection options in the
Connections settings on the Mapping tab.
If the session contains multiple partitions, and you choose a loader that can load from multiple output files, you
can select a different connection for each partition, but each connection must be of the same type. For example,
you can select different Teradata TPump external loader connections for each partition, but you cannot select a
Teradata TPump connection for one partition and an Oracle connection for another partition.
If the session contains multiple partitions, and you choose a loader that can load from only one output file, use
round-robin partitioning to route data to a single target file. You can choose a loader connection for each partition, but
the Integration Service uses the connection for the first partition.
For more information about running external loader sessions with multiple partitions, see “Partitioning
Sessions with External Loaders” on page 421.

To select an external loader connection:

1. On the Mapping tab, select the target instance in the Navigator.


2. Select the Loader connection type.
3. Click the Open button in the Value field.
4. Select a connection object or variable:
♦ Use object. Select a loader connection object. Click the Override button to override connection
attributes. For more information about overriding connection attributes in the Workflow Manager, see
“Overriding Connection Attributes in the Session” on page 25. The attributes you can override vary
according to loader type.
♦ Use connection variable. Use the $LoaderConnectionName session parameter, and define the parameter
in the parameter file, as shown in the sample entry after this procedure. Override connection attributes in the parameter file. For information about
overriding connection attributes in the parameter file, see “Overriding Connection Attributes in the
Parameter File” on page 26. For more information about parameter files, see “Parameter Files” on
page 403.
5. Click OK.
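
For example, a parameter file entry that assigns a loader connection object to a session might look like the
following. The folder, workflow, session, and connection names are placeholders:
   [ProdFolder.WF:wf_load_orders.ST:s_load_orders]
   $LoaderConnectionName=TeradataTPump_Loader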

Troubleshooting
I am trying to set up a session to load data to an external loader, but I cannot select an external loader connection in
the session properties.
Verify that the mapping contains a relational target. When you create the session, select a file writer in the
Writers settings of the Mapping tab in the session properties. Then open the Connections settings and select an
external loader connection.

I am trying to run a session that uses TPump, but the session fails. The session log displays an error saying that the
Teradata output file name is too long.
The Integration Service uses the Teradata output file name to generate names for the TPump error and log files
and the log table name. To generate these names, the Integration Service adds a prefix of several characters to
the output file name. It adds three characters for sessions with one partition and five characters for sessions with
multiple partitions.

Teradata allows log table names of up to 30 characters. Because the Integration Service adds a prefix, if you are
running a session with a single partition, specify a target output file name with a maximum of 27 characters,
including the file extension. If you are running a session with multiple partitions, specify a target output file
name with a maximum of 25 characters, including the file extension.

I tried to load data to Teradata using TPump, but the session failed. I corrected the error, but the session still fails.
Occasionally, Teradata does not drop the log table when you rerun the session. Check the Teradata database,
and manually drop the log table if it exists. Then rerun the session.



CHAPTER 30

Using FTP
This chapter includes the following topics:
♦ Overview, 443
♦ Integration Service Behavior, 444
♦ Configuring FTP in a Session, 445

Overview
You can configure a session to use File Transfer Protocol (FTP) to read from flat file or XML sources or write to
flat file or XML targets. The Integration Service can use FTP to access any machine it can connect to, including
mainframes. With both source and target files, use FTP to transfer the files directly or stage them in a local
directory. Access source files directly or use a file list to access indirect source files in a session.
To use FTP file sources and targets in a session, complete the following tasks:
1. Create an FTP connection object in the Workflow Manager and configure the connection attributes. For
more information about creating FTP connections, see “FTP Connections” on page 34.
2. Configure the session to use the FTP connection object in the session properties. For more information,
see “Configuring FTP in a Session” on page 445.
Configure an FTP connection to use SSH File Transfer Protocol (SFTP) if you are connecting to an SFTP
server. SFTP enables file transfer over a secure data stream. The Integration Service creates an SSH2 transport
layer that enables a secure connection and access to the files on an SFTP server.

Rules and Guidelines


Use the following guidelines when using FTP with flat file or XML sources and targets:
♦ Specify the source or target output directory in the session properties. If you do not specify a directory, the
Integration Service stages the file in the directory where the Integration Service runs on UNIX or in the
Windows system directory.
♦ You cannot run sessions concurrently if the sessions use the same FTP source file or target file located on a
mainframe.
♦ If you abort a workflow containing a session that stages an FTP source or target from a mainframe, you may
need to wait for the connection to timeout before you can run the workflow again.
♦ To run a session using an FTP connection for an SFTP server that requires public key authentication, the
public key and private key files must be accessible on nodes where the session will run.

Integration Service Behavior
The behavior of the Integration Service using FTP depends on the way you configure the FTP connection and
the session. The Integration Service can use FTP to access source and target files in the following ways:
♦ Source files. Stage source files on the machine hosting the Integration Service or access the source files
directly from the FTP host. Use a single source file or a file list that contains indirect source files for a single
source instance.
♦ Target files. Stage target files on the machine hosting the Integration Service or write to the target files on
the FTP host.
Select staging options for the session when you select the FTP connection object in the session properties. You
can also stage files by creating a pre- or post-session shell command to copy the files to or from the FTP host.
You generally get better performance when you access source files directly with FTP. However, you may want to
stage FTP files to keep a local archive.
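
For example, a post-session success command on a UNIX Integration Service machine might archive a staged target
file with a standard copy command such as the following. The directories and file name are placeholders:
   cp /pmroot/TgtFiles/orders.out /archive/orders_$(date +%Y%m%d).out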

Using FTP with Source Files


Use FTP in a session that reads flat file or XML file sources. You can stage the source files for a session on the
machine hosting the Integration Service. Use a single source file or a file list for each source instance.
When you stage source files, the Integration Service copies the source file from the FTP host to the machine
hosting the Integration Service. The Integration Service uses the local file as the source file for the session. If the
local source file exists, the Integration Service replaces the file.
The following table describes the behavior of the Integration Service using FTP with source files:

Source Type Is Staged Integration Service Behavior

Direct Yes Integration Service copies the file from the FTP host to the machine
hosting the Integration Service after the session begins.

Direct No Integration Service uses FTP to access the source file directly.

Indirect Yes Integration Service reads the file list and copies the file list and the
source files to the machine hosting the Integration Service after the
session begins.

Indirect No Integration Service copies the file list to the machine hosting the
Integration Service after the session begins. The Integration Service
uses FTP to access the source files directly.

Using FTP with Target Files


Use FTP in a session that writes to flat file or XML file targets. You can stage the target files on the machine
hosting the Integration Service before copying them to the FTP host.
When you stage target files, the Integration Service creates a target file locally and transfers it to the FTP host
after the session completes. If you do not stage the target file, the Integration Service writes directly to the target
file on the FTP host. If the target file exists, the Integration Service truncates the file.
If you have the Partitioning option, use FTP for multiple target partition instances. You can write to multiple
target files or a merge file on the Integration Service or the FTP host.

RELATED TOPICS:
♦ “Partitioning File Targets” on page 268



Configuring FTP in a Session
Before you can configure a session to use FTP, you must create an FTP connection object in the Workflow
Manager. The Integration Service uses the FTP connection attributes to connect to the FTP server.
After you create an FTP connection in the Workflow Manager, you can configure a session to use FTP. To use a
secure connection, select an FTP connection object configured for SFTP. Use any session with flat file or XML
sources or targets.
To configure the session, complete the following tasks for each source and target that requires an FTP
connection:
♦ Select an FTP connection.
♦ Configure source file properties.
♦ Configure target file properties.
To stage the source or target file on the Integration Service machine, edit the FTP connection in the session
properties to configure the directory and file name for the staged file.

Configuring SFTP in a Session


To run a session using an FTP connection for an SFTP server that requires public key authentication, the public
key and private key files must be accessible on nodes where the session will run.
If the Integration Service is configured to run on primary and backup nodes, make the key files accessible on
each node configured to run the Integration Service process.
If the Integration Service is configured to run on a grid, make the key files accessible on each node configured
to run on the grid. If you cannot put the files on each node in the grid, create a resource in the domain and
assign it to each node where you put the files. When you create a session, configure it to use the resource.
For example, create a custom resource called SFTP. When you create a session, you can require the session to use
the SFTP resource. The Load Balancer will only dispatch the session to nodes where the key files are accessible.

Selecting an FTP Connection


To configure a session to use FTP, select the connection type and the connection object. Select an FTP
connection object for each source and target that will use the FTP connection. To use SFTP, select an FTP
connection object that is configured for SFTP. You configure connection options in the Connections settings on
the Mapping tab.

To select an FTP connection for a source or target instance:

1. On the Mapping tab, select the source or target instance in the Transformation view.
2. Select the FTP connection type.
3. Click the Open button in the Value field.
4. Select a connection object or variable:
♦ Use object. Select an FTP connection object. Click the Override button to override connection
attributes.
♦ Use connection variable. Use the $FTPConnectionName session parameter, and define the parameter in
the parameter file. Override connection attributes in the parameter file.



You can override the following attributes:

Attribute Description

Remote Filename The remote file name for the source or target. If you use an indirect source file,
enter the indirect source file name.
You must use 7-bit ASCII characters for the file name. The session fails if you use
a remote file name with Unicode characters.
If you enter a fully qualified name for the source file name, the Integration Service
ignores the path entered in the Default Remote Directory field. The session will fail
if you enclose the fully qualified file name in single or double quotation marks.
You can use a parameter or variable for the remote file name. Use any parameter
or variable type that you can define in the parameter file. For example, you can use
a session parameter, $ParamMyRemoteFile, as the source or target remote file
name, and set $ParamMyRemoteFile to the file name in the parameter file.

Is Staged Stages the source or target file on the Integration Service. Default is not staged.

Is Transfer Mode ASCII Changes the transfer mode. When enabled, the Integration Service uses ASCII
transfer mode. You can use ASCII mode when transferring files on Windows machines to ensure that the end
of line character is translated properly in text files. When disabled, the Integration Service uses binary transfer
mode. Use binary transfer mode when transferring files on UNIX machines. Default is disabled.

5. Click OK.

RELATED TOPICS:
♦ “Overriding Connection Attributes in the Session” on page 25

♦ “Overriding Connection Attributes in the Parameter File” on page 26

Configuring Source File Properties


If you access source files with FTP, configure the source file properties after you choose the FTP connection for
the source instance. The source file properties determine the source file type and the staging location. You can
configure source file properties in the Properties settings on the Mapping tab.
If you stage the source file, select the source file name, directory, and file type.
If you do not stage the source file, specify the source file type. The Integration Service uses the remote file name
and directory from the FTP connection object and ignores the source file name and directory.
The following table describes the source file properties on the Mapping tab:

Attribute Description

Source File Type Indicates whether the source file contains the source data or a list of files with the
same file properties. Choose Direct if the source file contains the source data.
Choose Indirect if the source file contains a list of files.

Source File Directory Name and path of the local source file directory used to stage the source data. By
default, the Integration Service uses the service process variable directory,
$PMSourceFileDir, for file sources. The Integration Service concatenates this field
with the Source file name field when it runs the session.
If you do not stage the source file, the Integration Service uses the file name and
directory from the FTP connection object.
The Integration Service ignores this field if you enter a fully qualified file name in the
Source file name field.

Source File Name Name of the local source file used to stage the source data. You can enter the file
name or the file name and path. If you enter a fully qualified file name, the Integration
Service ignores the Source file directory field.
If you do not stage the source file, the Integration Service uses the remote file name
and default directory from the FTP connection object.



Configuring Target File Properties
If you write to target files with FTP, specify the target file properties after you specify the FTP connection for
the target instance. The target file properties determine the reject file and directory and staging location.
Specify target file properties in the Properties settings on the Mapping tab.
If you stage the target file, configure the target file name and directory and the reject file name and directory. If
you do not stage the target file, configure the reject file and directory. The Integration Service uses the remote
file name and directory from the FTP connection object.
If you have the Partitioning option, you can also select merge file properties.
The following table describes the target file attributes on the Mapping tab:

Attribute Description

Output File Directory Name and path of the local target file directory used to stage the target data. By
default, the Integration Service uses the service process variable directory,
$PMTargetFileDir. The Integration Service concatenates this field with the Output file
name field when it runs the session.
If you do not stage the target file, the Integration Service uses the file name and
directory from the FTP connection object.
The Integration Service ignores this field if you enter a fully qualified file name in the
Output file name field.

Output File Name Name of the local target file used to stage the target data. You can enter the file name,
or the file name and path. If you enter a fully qualified file name, the Integration
Service ignores the Output file directory field.
If you do not stage the target file, the Integration Service uses the remote file name
and default directory from the FTP connection object.

Partitioning FTP File Targets


When you choose an FTP connection type for the partitioned targets in a session, you configure FTP settings
for the target partitions. You can write the target data to a merge file or to individual target files for each partition.
Use the following rules and guidelines when you configure FTP settings for target partitions:
♦ You must use an FTP connection for each target partition.
♦ You can choose to stage the files when selecting the connection object for the target partition. You must stage
the files to use sequential merge.
♦ If the FTP connections for the target partitions have any settings other than a remote file name, the
Integration Service does not create a merge file.
The following table describes the actions of the Integration Service with partitioned FTP file targets:

Merge Type Integration Service Behavior

No Merge If you stage the files, the Integration Service creates one target file for each partition. At
the end of the session, the Integration Service transfers the target files to the remote
location.
If you do not stage the files, the Integration Service generates a target file for each
partition at the remote location.

Sequential Merge Enable the Is Staged option in the connection object. The Integration Service creates one
output file for each partition. At the end of the session, the Integration Service merges
the individual output files into a single merge file, deletes the individual output files, and
transfers the merge file to the remote location.


File List If you stage the files, the Integration Service creates the following files:
- Output file for each partition
- File list that contains the names and paths of the local files
- File list that contains the names and paths of the remote files
At the end of the session, the Integration Service transfers the files to the remote
location. If the individual target files are in the Merge File Directory, the file list contains
relative paths. Otherwise, the file list contains absolute paths.
If you do not stage the files, the Integration Service writes the data for each partition at
the remote location and creates a remote file list that contains a list of the individual
target files.
Use the file list as a source file in another mapping.

Concurrent Merge If you stage the files, the Integration Service concurrently writes the data for all target
partitions to a local merge file. At the end of the session, the Integration Service
transfers the merge file to the remote location. The Integration Service does not write to
any intermediate output files.
If you do not stage the files, the Integration Service concurrently writes the target data
for all partitions to a merge file at the remote location.



CHAPTER 31

Using Incremental Aggregation


This chapter includes the following topics:
♦ Overview, 449
♦ Integration Service Processing for Incremental Aggregation, 450
♦ Reinitializing the Aggregate Files, 450
♦ Moving or Deleting the Aggregate Files, 451
♦ Partitioning Guidelines with Incremental Aggregation, 451
♦ Preparing for Incremental Aggregation, 452

Overview
When using incremental aggregation, you apply captured changes in the source to aggregate calculations in a
session. If the source changes incrementally and you can capture changes, you can configure the session to
process those changes. This allows the Integration Service to update the target incrementally, rather than
forcing it to process the entire source and recalculate the same data each time you run the session.
For example, you might have a session using a source that receives new data every day. You can capture those
incremental changes because you have added a filter condition to the mapping that removes pre-existing data
from the flow of data. You then enable incremental aggregation.
When the session runs with incremental aggregation enabled for the first time on March 1, you use the entire
source. This allows the Integration Service to read and store the necessary aggregate data. On March 2, when
you run the session again, you filter out all the records except those time-stamped March 2. The Integration
Service then processes the new data and updates the target accordingly.
Consider using incremental aggregation in the following circumstances:
♦ You can capture new source data. Use incremental aggregation when you can capture new source data each
time you run the session. Use a Stored Procedure or Filter transformation to process new data.
♦ Incremental changes do not significantly change the target. Use incremental aggregation when the changes
do not significantly change the target. If processing the incrementally changed source alters more than half
the existing target, the session may not benefit from using incremental aggregation. In this case, drop the
table and recreate the target with complete source data.
Note: Do not use incremental aggregation if the mapping contains percentile or median functions. The
Integration Service uses system memory to process these functions in addition to the cache memory you
configure in the session properties. As a result, the Integration Service does not store incremental aggregation
values for percentile and median functions in disk caches.

Integration Service Processing for Incremental Aggregation
The first time you run an incremental aggregation session, the Integration Service processes the entire source.
At the end of the session, the Integration Service stores aggregate data from that session run in two files, the
index file and the data file. The Integration Service creates the files in the cache directory specified in the
Aggregator transformation properties.
Each subsequent time you run the session with incremental aggregation, you use the incremental source changes
in the session. For each input record, the Integration Service checks historical information in the index file for a
corresponding group. If it finds a corresponding group, the Integration Service performs the aggregate
operation incrementally, using the aggregate data for that group, and saves the incremental change. If it does
not find a corresponding group, the Integration Service creates a new group and saves the record data.
When writing to the target, the Integration Service applies the changes to the existing target. It saves modified
aggregate data in the index and data files to be used as historical data the next time you run the session.
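The following Python sketch illustrates the general lookup-and-update logic described above. It is a simplified
illustration only, not PowerCenter code: the group key, the aggregate function (a running sum), and the
dictionary standing in for the index and data files are assumptions chosen for the example.
# Simplified illustration of incremental aggregation.
# 'cache' stands in for the index and data files that persist between runs.
def apply_incremental_changes(cache, new_rows, group_key, value_key):
    for row in new_rows:
        group = row[group_key]
        if group in cache:
            # Corresponding group found: update the stored aggregate incrementally.
            cache[group] += row[value_key]
        else:
            # No corresponding group: create a new group from the record data.
            cache[group] = row[value_key]
    return cache

# Example: the March 1 run seeds the cache; the March 2 run adds only new rows.
cache = {}
march_1 = [{"store": "S1", "amount": 100}, {"store": "S2", "amount": 50}]
march_2 = [{"store": "S1", "amount": 25}, {"store": "S3", "amount": 10}]
apply_incremental_changes(cache, march_1, "store", "amount")
apply_incremental_changes(cache, march_2, "store", "amount")
print(cache)   # {'S1': 125, 'S2': 50, 'S3': 10}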
If the source changes significantly and you want the Integration Service to continue saving aggregate data for
future incremental changes, configure the Integration Service to overwrite existing aggregate data with new
aggregate data.
Each subsequent time you run a session with incremental aggregation, the Integration Service creates a backup
of the incremental aggregation files. The cache directory for the Aggregator transformation must contain
enough disk space for two sets of the files.
When you partition a session that uses incremental aggregation, the Integration Service creates one set of cache
files for each partition.
The Integration Service creates new aggregate data, instead of using historical data, when you perform one of
the following tasks:
♦ Save a new version of the mapping.
♦ Configure the session to reinitialize the aggregate cache.
♦ Move the aggregate files without correcting the configured path or directory for the files in the session
properties.
♦ Change the configured path or directory for the aggregate files without moving the files to the new location.
♦ Delete cache files.
♦ Decrease the number of partitions.
When the Integration Service rebuilds incremental aggregation files, the data in the previous files is lost.
Note: To protect the incremental aggregation files from file corruption or disk failure, periodically back up the
files.

Reinitializing the Aggregate Files


If the source tables change significantly, you might want the Integration Service to create new aggregate data,
instead of using historical data. To have the Integration Service create new aggregate data, configure the session
to reinitialize the aggregate cache.
For example, you can reinitialize the aggregate cache if the source for a session changes incrementally every day
and completely changes once a month. When you receive the new source data for the month, you might
configure the session to reinitialize the aggregate cache, truncate the existing target, and use the new source
table during the session.



After you run a session that reinitializes the aggregate cache, edit the session properties to disable the
Reinitialize Aggregate Cache option. If you do not clear Reinitialize Aggregate Cache, the Integration Service
overwrites the aggregate cache each time you run the session.
Note: When you move from Windows to UNIX, you must reinitialize the cache. Therefore, you cannot change
from a Latin1 code page to an MSLatin1 code page, even though these code pages are compatible.

Moving or Deleting the Aggregate Files


After you run an incremental aggregation session, avoid moving or modifying the index and data files that store
historical aggregate information.
If you move the files into a different directory, and you want the Integration Service to use the aggregate files,
you must also change the path to those files in the session properties. Likewise, if you change the path to the
files but do not move the files, the Integration Service rebuilds the files the next time you run the session.
If you change certain session or Integration Service properties, the Integration Service cannot use the
incremental aggregation files, and it fails the session. To avoid session failure, delete existing incremental
aggregation files when you perform any of the following tasks:
♦ Change the Integration Service data movement mode from ASCII to Unicode or from Unicode to ASCII.
♦ Change the Integration Service code page to an incompatible code page.
♦ Change the session sort order when the Integration Service runs in Unicode mode.
♦ Change the Enable High Precision session option.

Finding Index and Data Files


By default, the Integration Service stores the index and data files in the directory entered in the process variable,
$PMCacheDir, in the Workflow Manager. The Integration Service names the index file PMAGG*.idx*. The
Integration Service names the data file PMAGG*.dat*.
When you run the session, the Integration Service writes the file names in the session log. To locate the files,
look in the previous session log for the SM_7034 and SM_7035 messages that indicate the cache file name and
location. The following messages show sample entries in the session log:
MAPPING> SM_7034 Aggregate Information: Index file is
[C:\Informatica\PowerCenter8.0\server\infa_shared\Cache\PMAGG8_4_2.idx2]
MAPPING> SM_7035 Aggregate Information: Data file is
[C:\Informatica\PowerCenter8.0\server\infa_shared\Cache\PMAGG8_4_2.dat2]
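If you want to locate the cache files programmatically, a short script can scan the session log for these message
codes. The following Python sketch is illustrative only; the log file path is a placeholder, and the script simply
prints any line containing the SM_7034 or SM_7035 codes shown above.
# Print the aggregate cache file messages from a session log.
# The log path below is a placeholder; substitute the actual session log file.
log_path = "s_incremental_agg.log"

with open(log_path, "r", errors="replace") as log:
    for line in log:
        if "SM_7034" in line or "SM_7035" in line:
            print(line.rstrip())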

RELATED TOPICS:
♦ “Cache Files” on page 457

Partitioning Guidelines with Incremental Aggregation


When you use incremental aggregation in a session with multiple partitions, the Integration Service creates one
set of cache files for each partition.
Use the following guidelines when you change the number of partitions or the cache directory:
♦ Change the cache directory for a partition. If you change the directory for a partition and you want the
Integration Service to reuse the cache files, you must move the cache files for the partition associated with
the changed directory.



− If you change the directory for the first partition, and you do not move the cache files, the Integration
Service rebuilds the cache files for all partitions.
− If you change the directory for partitions 2-n, and you do not move the cache files, the Integration Service
rebuilds the cache files that it cannot locate.
♦ Decrease the number of partitions. If you delete a partition and you want the Integration Service to reuse
the cache files, you must move the cache files for the deleted partition to the directory configured for the
first partition. If you do not move the files to the directory of the first partition, the Integration Service
rebuilds the cache files that it cannot locate.
Note: If you increase the number of partitions, the Integration Service realigns the index and data cache files
the next time you run a session. It does not need to rebuild the files.
♦ Move cache files. If you move cache files for a partition and you want the Integration Service to reuse the
files, you must also change the partition directory. If you do not change the directory, the Integration Service
rebuilds the files the next time you run a session.
♦ Delete cache files. If you delete cache files, the Integration Service rebuilds them the next time you run a
session.
If you change the number of partitions and the cache directory, you may need to move cache files for both. For
example, if you change the cache directory for the first partition and you decrease the number of partitions, you
need to move the cache files for the deleted partition and the cache files for the partition associated with the
changed directory.

Preparing for Incremental Aggregation


When you use incremental aggregation, you need to configure both mapping and session properties:
♦ Implement mapping logic or filter to remove pre-existing data.
♦ Configure the session for incremental aggregation and verify that the file directory has enough disk space for
the aggregate files.

Configuring the Mapping


Before enabling incremental aggregation, you must capture changes in source data. You can use a Filter or
Stored Procedure transformation in the mapping to remove pre-existing source data during a session.
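Conceptually, the filter keeps only the rows that arrived since the previous session run. The following Python
sketch shows that logic outside of PowerCenter; the column name and last-run timestamp are assumptions, and
in a mapping you would express the equivalent condition in a Filter transformation or capture new rows with a
Stored Procedure transformation.
from datetime import datetime

def keep_new_rows(rows, last_run):
    # Keep only rows time-stamped after the previous session run,
    # so pre-existing data never reaches the Aggregator.
    return [row for row in rows if row["load_date"] > last_run]

rows = [
    {"load_date": datetime(2008, 3, 1), "amount": 100},
    {"load_date": datetime(2008, 3, 2), "amount": 25},
]
print(keep_new_rows(rows, last_run=datetime(2008, 3, 1, 23, 59)))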

Configuring the Session


Use the following guidelines when you configure the session for incremental aggregation:
♦ Verify the location where you want to store the aggregate files. The index and data files grow in proportion
to the source data. Be sure the cache directory has enough disk space to store historical data for the session.
When you run multiple sessions with incremental aggregation, decide where you want the files stored. Then,
enter the appropriate directory for the process variable, $PMCacheDir, in the Workflow Manager. You can
enter session-specific directories for the index and data files. However, by using the process variable for all
sessions using incremental aggregation, you can easily change the cache directory when necessary by
changing $PMCacheDir.
Changing the cache directory without moving the files causes the Integration Service to reinitialize the
aggregate cache and gather new aggregate data.
In a grid, Integration Services rebuild incremental aggregation files they cannot find. When an Integration
Service rebuilds incremental aggregation files, it loses aggregate history.
♦ Verify the incremental aggregation settings in the session properties. You can configure the session for
incremental aggregation in the Performance settings on the Properties tab.



You can also configure the session to reinitialize the aggregate cache. If you choose to reinitialize the cache,
the Workflow Manager displays a warning indicating the Integration Service overwrites the existing cache
and a reminder to clear this option after running the session.
Note: You cannot use incremental aggregation when the mapping includes an Aggregator transformation with
Transaction transformation scope. The Workflow Manager marks the session invalid.



CHAPTER 32

Session Caches
This chapter includes the following topics:
♦ Overview, 455
♦ Cache Memory, 456
♦ Cache Files, 457
♦ Configuring the Cache Size, 459
♦ Cache Partitioning, 461
♦ Aggregator Caches, 462
♦ Joiner Caches, 464
♦ Lookup Caches, 467
♦ Rank Caches, 468
♦ Sorter Caches, 469
♦ XML Target Caches, 470
♦ Optimizing the Cache Size, 471

Overview
The Integration Service allocates cache memory for XML targets and Aggregator, Joiner, Lookup, Rank, and
Sorter transformations in a mapping. The Integration Service creates index and data caches for the XML targets
and Aggregator, Joiner, Lookup, and Rank transformations. The Integration Service stores key values in the
index cache and output values in the data cache. The Integration Service creates one cache for the Sorter
transformation to store sort keys and the data to be sorted.
You configure memory parameters for the caches in the session properties. When you first configure the cache
size, you can calculate the amount of memory required to process the transformation or you can configure the
Integration Service to automatically configure the memory requirements at run time.
After you run a session, you can tune the cache sizes for the transformations in the session. You can analyze the
transformation statistics to determine the cache sizes required for optimal session performance, and then update
the configured cache sizes.
If the Integration Service requires more memory than what you configure, it stores overflow values in cache
files. When the session completes, the Integration Service releases cache memory, and in most circumstances, it
deletes the cache files.

If the session contains multiple partitions, the Integration Service creates one memory cache for each partition.
In particular situations, the Integration Service uses cache partitioning, creating a separate cache for each
partition.
The following table describes the type of information that the Integration Service stores in each cache:

Mapping Object Cache Type Description

Aggregator - Index - Stores group values as configured in the group by ports.


- Data - Stores calculations based on the group by ports.

Joiner - Index - Stores all master rows in the join condition that have unique keys.
- Data - Stores master source rows.

Lookup - Index - Stores lookup condition information.


- Data - Stores lookup data that is not stored in the index cache.

Rank - Index - Stores group values as configured in the group by ports.


- Data - Stores ranking information based on the group by ports.

Sorter - Sorter - Stores sort keys and data.

XML Target - Index - Stores primary and foreign key information in separate caches.
- Data - Stores XML row data while it generates the XML target.

Cache Memory
The Integration Service creates each memory cache based on the configured cache size. When you create a
session, you can configure the cache sizes for each transformation instance in the session properties.
The Integration Service might increase the configured cache size for one of the following reasons:
♦ The configured cache size is less than the minimum cache size required to process the operation. The
Integration Service requires a minimum amount of memory to initialize each session. If the configured cache
size is less than the minimum required cache size, then the Integration Service increases the configured cache
size to meet the minimum requirement. If the Integration Service cannot allocate the minimum required
memory, the session fails.
♦ The configured cache size is not a multiple of the cache page size. The Integration Service stores cached
data in cache pages. The cached pages must fit evenly into the cache. Thus, if you configure 1 MB
(1,048,576 bytes) for the cache size and the cache page size is 10,000 bytes, then the Integration Service
increases the configured cache size to 1,050,000 bytes to make it a multiple of the 10,000-byte page size.
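The adjustments described above amount to simple rounding rules. The following Python sketch shows the
arithmetic; the 1,048,576-byte configured size and 10,000-byte page size come from the example, and the
minimum cache size passed in is a placeholder, not a documented constant.
import math

def adjusted_cache_size(configured, page_size, minimum):
    # Raise the configured size to the minimum requirement, then round up to a
    # whole number of cache pages so the pages fit evenly into the cache.
    size = max(configured, minimum)
    return math.ceil(size / page_size) * page_size

# 1 MB rounded up to a multiple of the 10,000-byte page size.
print(adjusted_cache_size(1_048_576, 10_000, minimum=0))   # 1050000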
When the Integration Service increases the configured cache size, it continues to run the session and writes a
message similar to the following message in the session log:
MAPPING> TE_7212 Increasing [Index Cache] size for transformation <transformation name>
from <configured index cache size> to <new index cache size>.

Review the session log to verify that enough memory is allocated for the minimum requirements.
For optimal performance, set the cache size to the total memory required to process the transformation. If there
is not enough cache memory to process the transformation, the Integration Service processes some of the
transformation in memory and pages information to disk to process the rest.
Use the following information to understand how the Integration Service handles memory caches differently on
32-bit and 64-bit machines:
♦ An Integration Service process running on a 32-bit machine cannot run a session if the total size of all the
configured session caches is more than 2 GB. If you run the session on a grid, the total cache size of all
session threads running on a single node must not exceed 2 GB.
♦ If a grid has 32-bit and 64-bit Integration Service processes and a session exceeds 2 GB of memory, you must
configure the session to run on an Integration Service on a 64-bit machine.



RELATED TOPICS:
♦ “Running Workflows on a Grid” on page 366

Cache Files
When you run a session, the Integration Service creates at least one cache file for each transformation. If the
Integration Service cannot process a transformation in memory, it writes the overflow values to the cache files.
The following table describes the types of cache files that the Integration Service creates for different mapping
objects:

Mapping Object Cache File

Aggregator, Joiner, Lookup, The Integration Service creates the following types of cache files:
and Rank transformations - One header file for each index cache and data cache
- One data file for each index cache and data cache

Sorter transformation The Integration Service creates one sorter cache file.

XML target The Integration Service creates the following types of cache files:
- One data cache file for each XML target group
- One primary key index cache file for each XML target group
- One foreign key index cache file for each XML target group

The Integration Service creates cache files based on the Integration Service code page.
When you run a session, the Integration Service writes a message in the session log indicating the cache file
name and the transformation name. When a session completes, the Integration Service releases cache memory
and usually deletes the cache files. You may find index and data cache files in the cache directory under the
following circumstances:
♦ The session performs incremental aggregation.
♦ You configure the Lookup transformation to use a persistent cache.
♦ The session does not complete successfully. The next time you run the session, the Integration Service
deletes the existing cache files and creates new ones.
Note: Since writing to cache files can slow session performance, configure the cache sizes to process the
transformation in memory.

Naming Convention for Cache Files


The Integration Service uses different naming conventions for index, data, and sorter cache files.
The following table describes the naming convention for each type of cache file:

Cache Files Naming Convention

Data and sorter [<Name Prefix> | <prefix> <session ID>_<transformation ID>]_[partition index]_[OS][BIT].<suffix> [overflow index]

Index <prefix> <session ID>_<transformation ID>_<group ID>_<key type>.<suffix> <overflow>



The following table describes the components of the cache file names:

File Name Component Description

Name Prefix Cache file name prefix configured in the Lookup transformation. For Lookup
transformation cache file.

Prefix Describes the type of transformation:


- Aggregator transformation is PMAGG.
- Joiner transformation is PMJNR.
- Lookup transformation is PMLKUP.
- Rank transformation is PMAGG.
- Sorter transformation is PMSORT.
- XML target is PMXML.

Session ID Session instance ID number.

Transformation ID Transformation instance ID number.

Group ID ID for each group in a hierarchical XML target. The Integration Service creates one index
cache for each group. For XML target cache file.

Key Type Type of key. Can be foreign key or primary key. For XML target cache file.

Partition Index If the session contains more than one partition, this identifies the partition number. The
partition index is zero-based, so the first partition has no partition index. Partition index 2
indicates a cache file created in the third partition.

OS Identifies the operating system of the machine running the Integration Service process:
- W is Windows.
- H is HP-UX.
- S is Solaris.
- A is AIX.
- L is Linux.
- M is Mainframe.
For Lookup transformation cache file.

BIT Identifies the bit platform of the machine running the Integration Service process: 32-bit
or 64-bit. For Lookup transformation cache file.

Suffix Identifies the type of cache file:


- Index cache file is .idx0 for the header file and .idxn for the data files.
- Data cache file is .dat0 for the header file and .datn for the data files.
- Sorter cache file is .PMSORT().

Overflow Index If a cache file handles more than 2 GB of data, the Integration Service creates more
cache files. When creating these files, the Integration Service appends an overflow index
to the file name, such as PMAGG*.idx2 and PMAGG*.idx3. The number of cache files is
limited by the amount of disk space available in the cache directory.

For example, the name of the data file for the index cache is PMLKUP748_2_5S32.idx1. PMLKUP identifies
the transformation type as Lookup, 748 is the session ID, 2 is the transformation ID, 5 is the partition index, S
(Solaris) is the operating system, and 32 is the bit platform.
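A regular expression can make the parts of a Lookup cache file name such as PMLKUP748_2_5S32.idx1 easier
to pick apart. The following Python sketch is derived only from the example and the components described
above; it is not an official specification and may not match every variation, such as a configured name prefix or
a file without a partition index.
import re

# Sketch of the Lookup cache file naming pattern described above:
# <prefix><session ID>_<transformation ID>_<partition index><OS><bit>.<suffix><overflow index>
pattern = re.compile(
    r"^(?P<prefix>PMLKUP)"
    r"(?P<session_id>\d+)_(?P<transformation_id>\d+)_(?P<partition_index>\d+)"
    r"(?P<os>[WHSALM])(?P<bit>32|64)"
    r"\.(?P<suffix>idx|dat)(?P<overflow_index>\d*)$"
)

match = pattern.match("PMLKUP748_2_5S32.idx1")
if match:
    print(match.groupdict())
# {'prefix': 'PMLKUP', 'session_id': '748', 'transformation_id': '2',
#  'partition_index': '5', 'os': 'S', 'bit': '32', 'suffix': 'idx',
#  'overflow_index': '1'}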

Cache File Directory


The Integration Service creates the cache files by default in the $PMCacheDir directory. If the Integration
Service process does not find the directory, it fails the session and writes a message to the session log indicating
that it could not create or open the cache file.
The Integration Service may create multiple cache files. The number of cache files is limited by the amount of
disk space available in the cache directory.



If you run the Integration Service on a grid and only some Integration Service nodes have fast access to the
shared cache file directory, configure each session with a large cache to run on the nodes with fast access to the
directory. To configure a session to run on a node with fast access to the directory, complete the following steps:
1. Create a PowerCenter resource.
2. Make the resource available to the nodes with fast access to the directory.
3. Assign the resource to the session.
If all Integration Service processes in a grid have slow access to the cache files, set up a separate, local cache file
directory for each Integration Service process. An Integration Service process may have faster access to the cache
files if it runs on the same machine that contains the cache directory.

Configuring the Cache Size


Configure the amount of memory for a cache in the session properties. The cache size specified in the session
properties overrides the value set in the transformation properties.
The amount of memory you configure depends on how much memory cache and disk cache you want to use. If
you configure the cache size and it is not enough to process the transformation in memory, the Integration
Service processes some of the transformation in memory and pages information to cache files to process the rest
of the transformation. For optimal session performance, configure the cache size so that the Integration Service
can process all data in memory.
If the session is reusable, all instances of the session use the cache size configured in the reusable session
properties. You cannot override the cache size in the session instance.
Use one of the following methods to configure a cache size:
♦ Cache calculator. Use the calculator to estimate the total amount of memory required to process the
transformation. For more information, see “Calculating the Cache Size” on page 459.
♦ Auto cache memory. Use auto memory to specify a maximum limit on the cache size that is allocated for
processing the transformation. Use this method if the machine on which the Integration Service process runs
has limited cache memory. For more information, see “Using Auto Memory Size” on page 460.
♦ Numeric value. Configure a specific value for the cache size. Configure a specific value when you want to
tune the cache size. For more information, see “Configuring a Numeric Cache Size” on page 460.
You configure the memory requirements differently when the Integration Service uses cache partitioning. If the
Integration Service uses cache partitioning, it allocates the configured cache size for each partition. To configure
the memory requirements for a transformation with cache partitioning, calculate the total requirements for the
transformation and divide by the number of partitions.
The cache size requirements for a transformation may change when the inputs to the transformation change.
Monitor the cache sizes in the session logs on a regular basis to help you tune the cache size.

RELATED TOPICS:
♦ “Configuring the Cache Size for Cache Partitioning” on page 462

♦ “Optimizing the Cache Size” on page 471

Calculating the Cache Size


Use the cache calculator to estimate the total amount of memory required to process the transformation. You
must provide inputs to calculate the cache size. The inputs depend on the type of transformation. For example,
to calculate the cache size for an Aggregator transformation, you supply the number of groups.
You can select one of the following modes in the cache calculator:



♦ Auto. Choose auto mode if you want the Integration Service to determine the cache size at run time based
on the maximum memory configured on the Config Object tab.
♦ Calculate. Select to calculate the total requirements for a transformation based on inputs. The cache
calculator requires different inputs for each transformation. You must select the applicable cache type to
apply the calculated cache size. For example, to apply the calculated cache size for the data cache and not the
index cache, select only the Data Cache Size option.
The cache calculator estimates the cache size required for optimal session performance based on your input.
After you configure the cache size and run the session, you can review the transformation statistics in the session
log to tune the configured cache size.
Note: You cannot use the cache calculator to estimate the cache size for an XML target.

RELATED TOPICS:
♦ “Optimizing the Cache Size” on page 471

Using Auto Memory Size


Use auto cache memory to specify a maximum limit on the cache size that is allocated for processing the
transformation. If you use auto cache memory, you configure the Integration Service to determine the cache size
for a transformation at run time.
The Integration Service allocates memory cache based on the maximum memory size specified in the auto
memory attributes in the session properties. The Integration Service distributes the maximum cache size
specified among all transformations in the session.
You might use auto memory cache if the machine on which the Integration Service process runs has limited
cache memory. For example, you use the cache calculator to determine that the Aggregator transformation
requires 1 GB of cache memory to process the transformation, but the machine has only 800 MB of cache
memory available.
When you configure both a numeric value and a percentage of total memory for auto cache memory, the
Integration Service compares the two values and uses the lesser of the two as the maximum memory limit. In
this example, if you configure a maximum of 800 MB and 5% of total memory, the Integration Service
allocates up to 800 MB as long as 800 MB is less than 5% of the total memory on the machine.
To configure auto cache memory, you can use the cache calculator or you can enter ‘Auto’ directly into the
session properties. By default, transformations use auto cache memory.
If a session has multiple transformations that require caching, you can configure some transformations with
auto memory cache and other transformations with numeric cache sizes. The Integration Service allocates the
maximum memory specified for auto caching in addition to the configured numeric cache sizes. For example, a
session has three transformations. You assign auto caching to two transformations and specify a maximum
memory cache size of 800 MB. You specify 500 MB as the cache size for the third transformation. The
Integration Service allocates a total of 1,300 MB of memory.
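The two rules above reduce to simple arithmetic: the effective auto memory limit is the lesser of the numeric
value and the percentage of total memory, and the session total is that limit plus any numeric cache sizes. The
following Python sketch illustrates the calculation with the figures from the examples; the 16 GB of total
system memory is an assumed value for illustration.
def auto_memory_limit(numeric_limit_mb, percent_limit, total_memory_mb):
    # The Integration Service uses the lesser of the numeric value and the
    # percentage of total memory as the maximum auto cache memory limit.
    return min(numeric_limit_mb, percent_limit * total_memory_mb)

total_memory_mb = 16 * 1024          # assumed machine memory for the example
auto_limit = auto_memory_limit(800, 0.05, total_memory_mb)

# Auto cache memory is allocated in addition to numeric cache sizes:
# two auto-cached transformations share the 800 MB limit, and a third
# transformation has a 500 MB numeric cache size.
session_total_mb = auto_limit + 500
print(auto_limit, session_total_mb)   # 800 1300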
If the Integration Service uses cache partitioning, the Integration Service distributes the maximum cache size
specified for the auto cache memory across all transformations in the session and divides the cache memory for
each transformation among all of its partitions.

RELATED TOPICS:
♦ “Configuring Automatic Memory Settings” on page 124

Configuring a Numeric Cache Size


You can configure a specific value for the cache size. You configure a specific value when you tune a cache size.
The first time you configure the cache size, you can use the cache calculator or auto cache memory. After you
configure the cache size and run the session, you can analyze the transformation statistics in the session log to
tune the cache size. The session log shows the cache size required to process the transformation in memory
without paging to disk. Use the cache size specified in the session log for optimal session performance.



Steps to Configure the Cache Size
You can configure the cache size for a transformation in the session properties. When you configure the cache
size, you specify the total requirements for the transformation, unless the Integration Service uses cache
partitioning.
You configure the cache size differently if the Integration Service uses cache partitioning. To calculate the cache
size when the Integration Service uses cache partitioning, calculate the total requirements for the
transformation, and divide by the number of partitions.

To configure the cache size in the session:

1. In the Workflow Manager, open the session.


2. Click the Mapping tab.
3. Select the mapping object in the left pane.
The right pane of the Mapping tab shows the object properties where you can configure the cache size.
4. Use one of the following methods to set the cache size:
Enter a value for the cache size, click OK, and then skip to step 8. If you enter a value, all values are in
bytes by default. However, you can enter a value and specify one of the following units: KB, MB, or GB. If
you enter the units, do not enter a space between the value and unit. For example, enter 350000KB,
200MB, or 1GB. A short sketch of this notation appears after these steps.
or
Enter ‘Auto’ for the cache size, click OK, and then skip to step 8.
or
Click the Open button to open the cache calculator.
5. Select a mode.
Select the Auto mode to limit the amount of cache allocated to the transformation. Skip to step 8.
or
Select the Calculate mode to calculate the total memory requirement for the transformation.
6. Provide the input based on the transformation type, and click Calculate.
Note: If the input value is too large and you cannot enter the value in the cache calculator, use auto memory
cache.
The cache calculator calculates the cache sizes in kilobytes.
7. If the transformation has a data cache and index cache, select Data Cache Size, Index Cache Size, or both.
8. Click OK to apply the calculated values to the cache sizes you selected in step 7.
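The following Python sketch is only an illustration of the value-and-unit notation accepted in step 4, with no
space between the value and the unit; it converts such an entry to bytes so you can compare entries like
350000KB, 200MB, and 1GB. The 1,024-based multipliers are an assumption for the sketch.
import re

UNITS = {"": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}

def cache_size_to_bytes(entry):
    # Accepts a plain byte count or a value followed immediately by KB, MB, or GB.
    match = re.fullmatch(r"(\d+)(KB|MB|GB)?", entry.strip())
    if not match:
        raise ValueError("Unrecognized cache size entry: " + entry)
    value, unit = match.group(1), match.group(2) or ""
    return int(value) * UNITS[unit]

for entry in ("350000KB", "200MB", "1GB"):
    print(entry, "=", cache_size_to_bytes(entry), "bytes")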

Cache Partitioning
When you create a session with multiple partitions, the Integration Service may use cache partitioning for the
Aggregator, Joiner, Lookup, Rank, and Sorter transformations. When the Integration Service partitions a cache,
it creates a separate cache for each partition and allocates the configured cache size to each partition. The
Integration Service stores different data in each cache, where each cache contains only the rows needed by that
partition. As a result, the Integration Service requires a portion of total cache memory for each partition.
When the Integration Service uses cache partitioning, it accesses the cache in parallel for each partition. If it
does not use cache partitioning, it accesses the cache serially for each partition.



The following table describes the situations when the Integration Service uses cache partitioning for each
applicable transformation:

Transformation Description

Aggregator Transformation You create multiple partitions in a session with an Aggregator transformation.
You do not have to set a partition point at the Aggregator transformation. For
more information about Aggregator transformation caches, see “Aggregator
Caches” on page 462.

Joiner Transformation You create a partition point at the Joiner transformation. For more information
about caches for the Joiner transformation, see “Joiner Caches” on page 464.

Lookup Transformation You create a hash auto-keys partition point at the Lookup transformation. For
more information about Lookup transformation caches, see “Lookup Caches” on
page 467.

Rank Transformation You create multiple partitions in a session with a Rank transformation. You do
not have to set a partition point at the Rank transformation. For more information
about Rank transformation caches, see “Rank Caches” on page 468.

Sorter Transformation You create multiple partitions in a session with a Sorter transformation. You do
not have to set a partition point at the Sorter transformation. For more
information about Sorter transformation caches, see “Sorter Caches” on
page 469.

Configuring the Cache Size for Cache Partitioning


You configure the memory requirements differently when the Integration Service uses cache partitioning. If the
Integration Service uses cache partitioning, it allocates the configured cache size for each partition. To configure
the memory requirements for a transformation with cache partitioning, calculate the total requirements for the
transformation and divide by the number of partitions.
For example, you create four partitions in a session with an Aggregator transformation. You determine that an
Aggregator transformation requires 400 MB of memory for the data cache. Configure 100 MB for the data
cache size for the Aggregator transformation. When you run the session, the Integration Service allocates 100
MB for each partition, using a total of 400 MB for the Aggregator transformation.
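The division described above can be written out as a one-line calculation. The following Python sketch repeats
the Aggregator example; the 400 MB total requirement and four partitions come from the example above.
def per_partition_cache_size(total_requirement_mb, partitions):
    # With cache partitioning, configure the total requirement divided by the
    # number of partitions; each partition receives the configured size.
    return total_requirement_mb / partitions

configured_mb = per_partition_cache_size(400, 4)
print(configured_mb, "MB per partition,", configured_mb * 4, "MB in total")   # 100.0 ... 400.0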
Use the cache calculator to calculate the total requirements for the transformation. If you use dynamic
partitioning, you can determine the number of partitions based on the dynamic partitioning method. If you use
dynamic partitioning based on the nodes in a grid, the Integration Service creates one partition for each node. If
you use dynamic partitioning based on the source partitioning, use the number of partitions in the source
database.

Aggregator Caches
The Integration Service uses cache memory to process Aggregator transformations with unsorted input. When
you run the session, the Integration Service stores data in memory until it completes the aggregate calculations.
The Integration Service creates the following caches for the Aggregator transformation:
♦ Index cache. Stores group values as configured in the group by ports.
♦ Data cache. Stores calculations based on the group by ports.
By default, the Integration Service creates one memory cache and one disk cache for both the data and index in
the transformation.
When you create multiple partitions in a session with an Aggregator transformation, the Integration Service
uses cache partitioning. It creates one disk cache for all partitions and a separate memory cache for each
partition.



Incremental Aggregation
The first time you run an incremental aggregation session, the Integration Service processes the source. At the
end of the session, the Integration Service stores the aggregated data in two cache files, the index and data cache
files. The Integration Service saves the cache files in the cache file directory. The next time you run the session,
the Integration Service aggregates the new rows with the cached aggregated values in the cache files.
When you run a session with an incremental Aggregator transformation, the Integration Service creates a
backup of the Aggregator cache files in $PMCacheDir at the beginning of a session run. The Integration Service
promotes the backup cache to the initial cache at the beginning of a session recovery run. The Integration
Service cannot restore the backup cache file if the session aborts.
When you create multiple partitions in a session that uses incremental aggregation, the Integration Service
creates one set of cache files for each partition.

Configuring the Cache Sizes for an Aggregator Transformation


You configure the cache sizes for an Aggregator transformation with unsorted ports.
You do not need to configure cache memory for Aggregator transformations that use sorted ports. The
Integration Service uses system memory to process an Aggregator transformation with sorted ports.
The following table describes the input you provide to calculate the Aggregator cache sizes:

Option Name Description

Number of Groups Number of groups. The Aggregator transformation aggregates data by group. Calculate the
number of groups using the group by ports. For example, if you group by Store ID and Item ID, you have 5
stores and 25 items, and each store contains all 25 items, then calculate the number of groups as:
5 * 25 = 125 groups

Data Movement Mode The data movement mode of the Integration Service. The cache requirement varies based
on the data movement mode. Each ASCII character uses one byte. Each Unicode character uses two bytes.

Enter the input and then click Calculate to calculate the data and index cache sizes. The calculated values
appear in the Data Cache Size and Index Cache Size fields.
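A quick way to estimate the Number of Groups input is to multiply the distinct value counts of the group by
ports, as in the store and item example in the table above. The following Python sketch shows that arithmetic;
it is an upper-bound estimate that is exact only when every combination of values actually occurs in the data.
from math import prod

def estimate_groups(distinct_counts):
    # Multiply the number of distinct values in each group by port.
    return prod(distinct_counts)

# 5 stores x 25 items, with every store carrying all 25 items.
print(estimate_groups([5, 25]))   # 125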

RELATED TOPICS:
♦ “Configuring the Cache Size” on page 459

Troubleshooting
Use the information in this section to help troubleshoot caching for an Aggregator transformation.

The following warning appears when I use the cache calculator to calculate the cache size for an Aggregator
transformation:
CMN_2019 Warning: The estimated data cache size assumes the number of aggregate functions
equals the number of connected output-only ports. If there are more aggregate functions,
increase the cache size to cache all data in memory.

You can use one or more aggregate functions in an Aggregator transformation. The cache calculator estimates
the cache size when the output is based on one aggregate function. If you use multiple aggregate functions to
determine a value for one output port, then you must increase the cache size.
Review the transformation statistics in the session log and tune the cache size for the Aggregator transformation
in the session.



The following memory allocation error appears in the session log when I run a session with an aggregator cache size
greater than 4 GB and the Integration Service process runs on a Hewlett Packard 64-bit machine with a PA RISC
processor:
FATAL 8/17/2006 5:12:21 PM node01_havoc *********** FATAL ERROR : Failed to allocate
memory (out of virtual memory). ***********
FATAL 8/17/2006 5:12:21 PM node01_havoc *********** FATAL ERROR : Aborting the DTM
process due to memory allocation failure. ***********

By default, a 64-bit HP-UX machine with a PA RISC processor allocates up to 4 GB of memory for each
process. If a session requires more than 4 GB of memory, increase the maximum data memory limit for the
machine using the maxdsiz_64bit operating system variable. For more information about maxdsiz_64bit, see
the following URL:
http://docs.hp.com/en/B3921-90010/maxdsiz.5.html

Joiner Caches
The Integration Service uses cache memory to process Joiner transformations. When you run a session, the
Integration Service reads rows from the master and detail sources concurrently and builds index and data caches
based on the master rows. The Integration Service performs the join based on the detail source data and the
cached master data.
The Integration Service stores a different number of rows in the caches based on the type of Joiner
transformation.
The following table describes the information that the Integration Service stores in the caches for different types
of Joiner transformations:

Unsorted Input
- Index cache: Stores all master rows in the join condition with unique index keys.
- Data cache: Stores all master rows.

Sorted Input with Different Sources
- Index cache: Stores 100 master rows in the join condition with unique index keys.
- Data cache: Stores master rows that correspond to the rows stored in the index cache. If the master data
  contains multiple rows with the same key, the Integration Service stores more than 100 rows in the data cache.

Sorted Input with the Same Source
- Index cache: Stores all master or detail rows in the join condition with unique keys. Stores detail rows if the
  Integration Service processes the detail pipeline faster than the master pipeline. Otherwise, stores master rows.
  The number of rows it stores depends on the processing rates of the master and detail pipelines. If one pipeline
  processes its rows faster than the other, the Integration Service caches all rows that have already been processed
  and keeps them cached until the other pipeline finishes processing its rows.
- Data cache: Stores data for the rows stored in the index cache. If the index cache stores keys for the master
  pipeline, the data cache stores the data for the master pipeline. If the index cache stores keys for the detail
  pipeline, the data cache stores data for the detail pipeline.

If the data is sorted, the Integration Service creates one disk cache for all partitions and a separate memory
cache for each partition. It releases each row from the cache after it joins the data in the row.
If the data is not sorted and there is no partition point at the Joiner transformation, the Integration Service creates
one disk cache and a separate memory cache for each partition. If the data is not sorted and there is a partition point
at the Joiner transformation, the Integration Service creates a separate disk cache and memory cache for each



partition. When the data is not sorted, the Integration Service keeps all master data in the cache until it joins all
data.
When you create multiple partitions in a session, you can use 1:n partitioning or n:n partitioning. The
Integration Service processes the Joiner transformation differently when you use 1:n partitioning and when you
use n:n partitioning.

1:n Partitioning
You can use 1:n partitioning with Joiner transformations with sorted input. When you use 1:n partitioning, you
create one partition for the master pipeline and more than one partition in the detail pipeline. When the
Integration Service processes the join, it compares the rows in a detail partition against the rows in the master
source. When processing master and detail data for outer joins, the Integration Service outputs unmatched
master rows after it processes all detail partitions.

n:n Partitioning
You can use n:n partitioning with Joiner transformations with sorted or unsorted input. When you use n:n
partitioning for a Joiner transformation, you create n partitions in the master and detail pipelines. When the
Integration Service processes the join, it compares the rows in a detail partition against the rows in the
corresponding master partition, ignoring rows in other master partitions. When processing master and detail
data for outer joins, the Integration Service outputs unmatched master rows after it processes the partition for
each detail cache.
Tip: If the master source has a large number of rows, use n:n partitioning for better session performance.

To use n:n partitioning, you must create multiple partitions in the session and create a partition point at the
Joiner transformation. You create the partition point at the Joiner transformation to create multiple partitions
for both the master and detail source of the Joiner transformation.
If you create a partition point at the Joiner transformation, the Integration Service uses cache partitioning. It
creates one memory cache for each partition. The memory cache for each partition contains only the rows
needed by that partition. As a result, the Integration Service requires a portion of total cache memory for each
partition.

Configuring the Cache Sizes for a Joiner Transformation


You can configure the index and data cache sizes for a Joiner transformation in the session properties.
When you use 1:n partitioning, the Integration Service replicates the memory cache for each partition. Each
partition requires as much memory as the total requirements for the transformation. When you configure the
cache size for the Joiner transformation with 1:n partitioning, set the cache size to the total requirements for the
transformation.
When you use n:n partitioning, each partition requires a portion of the total memory required to process the
transformation. When you configure the cache size for the Joiner transformation with n:n partitioning,
calculate the total requirements for the transformation, and then divide it by the number of partitions.
You can use the cache calculator to determine the cache size required to process the transformation. For
example, you use the cache calculator to determine that the Joiner transformation requires 2,000,000 bytes of
memory for the index cache and 4,000,000 bytes of memory for the data cache. You create four partitions for
the pipeline. If you use 1:n partitioning, configure 2,000,000 bytes for the index cache and 4,000,000 bytes for
the data cache. If you use n:n partitioning, configure 500,000 bytes for the index cache and 1,000,000 bytes for
the data cache.
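
For reference, the same arithmetic can be expressed as a short script. The following Python sketch is illustrative
only and is not part of PowerCenter; it derives the per-partition cache settings from the cache calculator totals,
assuming that 1:n partitioning replicates the cache and n:n partitioning divides it evenly:

# Illustrative sketch: derive the cache sizes to enter in the session
# properties from the cache calculator totals.
def joiner_cache_settings(index_total, data_total, partitions, scheme="n:n"):
    if scheme == "1:n":
        # Each partition replicates the full cache, so configure the totals.
        return index_total, data_total
    if scheme == "n:n":
        # Each partition holds a portion of the cache, so divide the totals.
        return index_total // partitions, data_total // partitions
    raise ValueError("scheme must be '1:n' or 'n:n'")

# Example from this section: 2,000,000-byte index cache, 4,000,000-byte data
# cache, and four partitions.
print(joiner_cache_settings(2_000_000, 4_000_000, 4, "1:n"))  # (2000000, 4000000)
print(joiner_cache_settings(2_000_000, 4_000_000, 4, "n:n"))  # (500000, 1000000)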

The following table describes the input you provide to calculate the Joiner cache sizes:

Number of Master Rows
  Number of rows in the master source. Applies to a Joiner transformation with unsorted input. The number of
  master rows does not affect the cache size for a sorted Joiner transformation.
  Note: If rows in the master source share the same key values, the cache calculator overestimates the index
  cache size.

Data Movement Mode
  The data movement mode of the Integration Service. The cache requirement varies based on the data movement
  mode. ASCII characters use one byte. Unicode characters use two bytes.

Enter the input and then click Calculate to calculate the data and index cache sizes. The calculated values
appear in the Data Cache Size and Index Cache Size fields.

RELATED TOPICS:
♦ “Configuring the Cache Size” on page 459

Troubleshooting
Use the information in this section to help troubleshoot caching for a Joiner transformation.

The following warning appears when I use the cache calculator to calculate the cache size for a Joiner transformation
with sorted input:
CMN_2020 Warning: If the master and detail pipelines of a sorted Joiner transformation
are from the same source, the Integration Service cannot determine how fast it will
process the rows in each pipeline. As a result, the cache size estimate may be
inaccurate.

The master and detail pipelines process rows concurrently. If you join data from the same source, the pipelines
may process the rows at different rates. If one pipeline processes its rows faster than the other, the Integration
Service caches all rows that have already been processed and keeps them cached until the other pipeline finishes
processing its rows. The number of rows cached depends on the difference in processing rates between the two
pipelines.
The cache size must be large enough to store all cached rows to achieve optimal session performance. If the
cache size is not large enough, increase it.
Note: This message applies if you join data from the same source even though it also appears when you join data
from different sources.

The following warning appears when I use the cache calculator to calculate the cache size for a Joiner transformation
with sorted input:
CMN_2021 Warning: Increase the data cache size if the sorted Joiner transformation
processes master rows that share the same key. To determine the new cache size, divide
the existing cache size by 2.5 and multiply the result by the average number of master
rows per key.

When you calculate the cache size for the Joiner transformation with sorted input, the cache calculator bases the
estimated cache requirements on an average of 2.5 master rows for each unique key. If the average number of
master rows for each unique key is greater than 2.5, increase the cache size accordingly. For example, if the
average number of master rows for each unique key is 5 (twice 2.5), then double the cache size
calculated by the cache calculator.
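
You can apply this adjustment with a quick calculation before you edit the session properties. The sketch below is
a hypothetical helper, not an Informatica utility; it implements the formula from the message (divide the calculated
size by 2.5, then multiply by the average number of master rows per key):

# Hypothetical helper for the CMN_2021 adjustment: scale the calculated data
# cache size by (average master rows per key / 2.5).
def adjusted_data_cache(calculated_size, avg_master_rows_per_key):
    return int(calculated_size / 2.5 * avg_master_rows_per_key)

# If the cache calculator returns 1,000,000 bytes and the master source
# averages 5 rows per key, configure roughly double the calculated size.
print(adjusted_data_cache(1_000_000, 5))    # 2000000
print(adjusted_data_cache(1_000_000, 2.5))  # 1000000 (no change at the default average)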

Lookup Caches
If you enable caching in a Lookup transformation, the Integration Service builds a cache in memory to store
lookup data. When the Integration Service builds a lookup cache in memory, it processes the first row of data in
the transformation and queries the cache for each row that enters the transformation. If you do not enable
caching, the Integration Service queries the lookup source for each input row.
The result of the Lookup query and processing is the same, whether or not you cache the lookup source.
However, using a lookup cache can increase session performance. You can optimize performance by caching the
lookup source when the source is large.
If the lookup does not change between sessions, you can configure the transformation to use a persistent lookup
cache. When you run the session, the Integration Service rebuilds the persistent cache if any cache file is missing
or invalid.
The Integration Service creates the following caches for the Lookup transformation:
♦ Data cache. For a connected Lookup transformation, stores data for the connected output ports, not
including ports used in the lookup condition. For an unconnected Lookup transformation, stores data from
the return port.
♦ Index cache. Stores data for the columns used in the lookup condition.
The Integration Service creates disk and memory caches based on the lookup caching and partitioning
information.
The following table describes the caches that the Integration Service creates based on the cache and partitioning
information:

Static cache, no hash auto-keys partition point
- Disk cache: One disk cache for all partitions.
- Memory cache: One memory cache for each partition.

Dynamic cache, no hash auto-keys partition point
- Disk cache: One disk cache for all partitions.
- Memory cache: One memory cache for all partitions.

Static or dynamic cache, hash auto-keys partition point
- Disk cache: One disk cache for each partition.
- Memory cache: One memory cache for each partition.

When you create multiple partitions in a session with a Lookup transformation and create a hash auto-keys
partition point at the Lookup transformation, the Integration Service uses cache partitioning.
When the Integration Service uses cache partitioning, it creates caches for the Lookup transformation when the
first row of any partition reaches the Lookup transformation. If you configure the Lookup transformation for
concurrent caches, the Integration Service builds all caches for the partitions concurrently.

Sharing Caches
The Integration Service handles shared lookup caches differently depending on whether the cache is static or
dynamic:
♦ Static cache. If two Lookup transformations share a static cache, the Integration Service does not allocate
additional memory for shared transformations in the same pipeline stage. For shared transformations in
different pipeline stages, the Integration Service does allocate additional memory.
Static Lookup transformations that use the same data or a subset of data to create a disk cache can share the
disk cache. However, the lookup keys may be different, so the transformations must have separate memory
caches.
♦ Dynamic cache. When Lookup transformations share a dynamic cache, the Integration Service updates the
memory cache and disk cache. To keep the caches synchronized, the Integration Service must share the disk
cache and the corresponding memory cache between the transformations.

Configuring the Cache Sizes for a Lookup Transformation
You can configure the cache sizes for the Lookup transformation in the session properties.
The following table describes the input you provide to calculate the Lookup cache sizes:

Number of Rows with Unique Lookup Keys
  Number of rows in the lookup source with unique lookup keys.

Data Movement Mode
  The data movement mode of the Integration Service. The cache requirement varies based on the data movement
  mode. ASCII characters use one byte. Unicode characters use two bytes.

Enter the input and then click Calculate to calculate the data and index cache sizes. The calculated values
appear in the Data Cache Size and Index Cache Size fields.

RELATED TOPICS:
♦ “Configuring the Cache Size” on page 459

Rank Caches
The Integration Service uses cache memory to process Rank transformations. It stores data in rank memory
until it completes the rankings.
When the Integration Service runs a session with a Rank transformation, it compares an input row with rows in
the data cache. If the input row out-ranks a stored row, the Integration Service replaces the stored row with the
input row.
For example, you configure a Rank transformation to find the top three sales. The Integration Service reads the
following input data:
SALES
10,000
12,210
5,000
2,455
6,324

The Integration Service caches the first three rows (10,000, 12,210, and 5,000). When the Integration Service
reads the next row (2,455), it compares it to the cache values. Since the row is lower in rank than the cached
rows, it discards the row with 2,455. The next row (6,324), however, is higher in rank than one of the cached
rows. Therefore, the Integration Service replaces the cached row with the higher-ranked input row.
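
The replace-on-out-rank behavior is easy to model outside the product. The following Python sketch is a rough
analogy, not the caching code the Integration Service runs; it keeps only the top three values in memory while
reading the input:

import heapq

# Rough analogy of top-N ranking with replacement: keep only the N highest
# values seen so far and discard lower-ranked rows as they arrive.
def top_n(values, n=3):
    cache = []
    for value in values:
        if len(cache) < n:
            heapq.heappush(cache, value)      # cache the first N rows
        elif value > cache[0]:
            heapq.heapreplace(cache, value)   # replace the lowest cached row
    return sorted(cache, reverse=True)

sales = [10_000, 12_210, 5_000, 2_455, 6_324]
print(top_n(sales))  # [12210, 10000, 6324]
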
If the Rank transformation is configured to rank across multiple groups, the Integration Service ranks
incrementally for each group it finds.
The Integration Service creates the following caches for the Rank transformation:
♦ Data cache. Stores ranking information based on the group by ports.
♦ Index cache. Stores group values as configured in the group by ports.
By default, the Integration Service creates one memory cache and disk cache for all partitions.
If you create multiple partitions for the session, the Integration Service uses cache partitioning. It creates one
disk cache for the Rank transformation and one memory cache for each partition, and routes data from one
partition to another based on group key values of the transformation.

Configuring the Cache Sizes for a Rank Transformation
You can configure the cache sizes for the Rank transformation in the session properties.
The following table describes the input you provide to calculate the Rank cache sizes:

Number of Groups
  Number of groups. The Rank transformation ranks data by group. Determine the number of groups using the group
  by ports. For example, if you group by Store ID and Item ID, have 5 stores and 25 items, and each store has all
  25 items, then calculate the number of groups as 5 * 25 = 125 groups.

Number of Ranks
  Number of items in the ranking. For example, if you want to rank the top 10 sales, you have 10 ranks. The cache
  calculator populates this value based on the value set in the Rank transformation.

Data Movement Mode
  The data movement mode of the Integration Service. The cache requirement varies based on the data movement
  mode. ASCII characters use one byte. Unicode characters use two bytes.

Enter the input and then click Calculate to calculate the data and index cache sizes. The calculated values
appear in the Data Cache Size and Index Cache Size fields.
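
If you prefer to estimate the Number of Groups input from sample data instead of multiplying the distinct counts by
hand, a short script can count the distinct group-by combinations. The sketch below is illustrative only; the column
names and sample rows are assumptions, not PowerCenter objects:

# Illustrative estimate of the Number of Groups input: count the distinct
# combinations of the group by ports in a sample of the source data.
rows = [
    {"store_id": s, "item_id": i}   # sample rows; real data would come from the source
    for s in range(1, 6) for i in range(1, 26)
]
group_by_ports = ("store_id", "item_id")
groups = {tuple(row[port] for port in group_by_ports) for row in rows}
print(len(groups))  # 125 groups, matching 5 stores * 25 items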

RELATED TOPICS:
♦ “Configuring the Cache Size” on page 459

Troubleshooting
Use the information in this section to help troubleshoot caching for a Rank transformation.

The following memory allocation error appears in the session log when I run a session with a rank cache size greater
than 4 GB and the Integration Service process runs on a Hewlett Packard 64-bit machine with a PA RISC processor:
FATAL 8/17/2006 5:12:21 PM node01_havoc *********** FATAL ERROR : Failed to allocate
memory (out of virtual memory). ***********
FATAL 8/17/2006 5:12:21 PM node01_havoc *********** FATAL ERROR : Aborting the DTM
process due to memory allocation failure. ***********

By default, a 64-bit HP-UX machine with a PA RISC processor allocates up to 4 GB of memory for each
process. If a session requires more than 4 GB of memory, increase the maximum data memory limit for the
machine using the maxdsiz_64bit operating system variable. For more information about maxdsiz_64bit, see
the following URL:
http://docs.hp.com/en/B3921-90010/maxdsiz.5.html

Sorter Caches
The Integration Service uses cache memory to process Sorter transformations. The Integration Service passes all
incoming data into the Sorter transformation before it performs the sort operation.
The Integration Service creates a sorter cache to store sort keys and data while the Integration Service sorts the
data. By default, the Integration Service creates one memory cache and disk cache for all partitions.
If you create multiple partitions in the session, the Integration Service uses cache partitioning. It creates one
disk cache for the Sorter transformation and one memory cache for each partition. The Integration Service
creates a separate cache for each partition and sorts each partition separately.
If you do not configure the cache size to sort all of the data in memory, a warning appears in the session log,
stating that the Integration Service made multiple passes on the source data. The Integration Service makes
multiple passes on the data when it has to page information to disk to complete the sort. The message specifies
the number of bytes required for a single pass, which is when the Integration Service reads the data once and
performs the sort in memory without paging to disk. To increase session performance, configure the cache size
so that the Integration Service makes one pass on the data.

Configuring the Cache Size for a Sorter Transformation


You can configure the sorter cache for a Sorter transformation in the session properties.
The following table describes the input you provide to calculate the Sorter cache size:

Number of Rows
  Number of rows.

Data Movement Mode
  The data movement mode of the Integration Service. The cache requirement varies based on the data movement
  mode. ASCII characters use one byte. Unicode characters use two bytes.

Enter the input and then click Calculate to calculate the sorter cache size. The calculated value appears in the
Sorter Cache Size field.

RELATED TOPICS:
♦ “Configuring the Cache Size” on page 459

XML Target Caches


The Integration Service uses cache memory to create an XML target. The Integration Service stores the data and
XML hierarchies in cache memory while it generates the XML target.
The Integration Service creates the following types of caches for an XML target:
♦ Data cache. Stores XML row data while it generates an XML target document. Stores one data cache for all
groups.
♦ Index caches. Stores primary keys or foreign keys. Creates a primary key index cache and a foreign key index
cache for each group.

Configuring the Cache Size for an XML Target


You configure the cache size for an XML target in the target or session properties. By default, cache size is set to
“auto.” The Integration Service determines the required amount of cache memory at run-time.
You can also configure the cache size and specify the amount of cache memory in bytes. Complete the following
steps to calculate the cache size:
1. Estimate the number of rows in each group.
2. Use the following formula to calculate the cache size for each group:
Group cache size = Data cache size + Primary key index cache size + Foreign key index cache
size

3. Use the following formula to calculate the total cache size:


Total cache size = Σ(Cache size of all groups)

The following table shows how to calculate the size of the data cache and trees for a group:

Data Cache = (Number of rows in a group) X (Row size of the group)

Primary Key Tree Size = (Number of rows in a group) X (Primary key index cache size)

Foreign Key Tree Size = Σ ((Number of rows in parent group) X (Foreign key index cache size))

Note: You cannot use the cache calculator to configure the cache size for an XML target.
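
If you script the estimate instead of calculating it by hand, the formulas above translate directly. The following
sketch is an assumption-laden example, not an Informatica utility; the group figures are placeholders you replace
with your own estimates:

# Sketch of the XML target cache estimate described above. All inputs are
# estimates you supply; the values below are placeholders.
def group_cache_size(rows, row_size, pk_index_size, parent_rows_and_fk_sizes):
    data_cache = rows * row_size
    primary_key_tree = rows * pk_index_size
    foreign_key_tree = sum(parent_rows * fk_size
                           for parent_rows, fk_size in parent_rows_and_fk_sizes)
    return data_cache + primary_key_tree + foreign_key_tree

groups = [
    # (rows, row size, primary key index size, [(parent group rows, foreign key index size), ...])
    (10_000, 200, 16, []),
    (50_000, 120, 16, [(10_000, 16)]),
]
print(sum(group_cache_size(*g) for g in groups))  # total cache size in bytes for this estimate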

Optimizing the Cache Size


For optimal session performance, configure the cache size so that the Integration Service processes the
transformation in memory without paging to disk. Session performance decreases when the Integration Service
pages to disk.
When you use the cache calculator to calculate the cache size, the cache calculator estimates the cache size
required for optimal session performance based on your input. You can tune the estimate by using the cache size
specified in the session log. After you run the session, review the transformation statistics in the session log to
get the cache size.
For example, you run an Aggregator transformation called AGGTRANS. The session log contains the following
text:
MAPPING> TT_11031 Transformation [AGGTRANS]:
MAPPING> TT_11114 [AGGTRANS]: Input Group Index = [0], Input Row Count [110264]
MAPPING> TT_11034 [SQ_V_PETL]: Input - 110264
MAPPING> TT_11115 [AGGTRANS]: Output Group Index = [0]
MAPPING> TT_11037 [FILTRANS]: Output - 1098,Dropped - 0
MAPPING> CMN_1791 The index cache size that would hold [1098] aggregate groups of input
rows for [AGGTRANS], in memory, is [286720] bytes
MAPPING> CMN_1790 The data cache size that would hold [1098] aggregate groups of input
rows for [AGGTRANS], in memory, is [1774368] bytes

The log shows that the index cache requires 286,720 bytes and the data cache requires 1,774,368 bytes to
process the transformation in memory without paging to disk.
The cache size may vary depending on changes to the session or source data. Review the session logs after
subsequent session runs to monitor changes to the cache size.
You must set the tracing level to Verbose Initialization in the session properties to enable the Integration Service
to write the transformation statistics to the session log.
Note: The session log does not contain transformation statistics for a Sorter, a Joiner transformation with sorted
input, an Aggregator transformation with sorted input, or an XML target.
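
If you review many session logs, you can extract these values programmatically instead of searching each log by
hand. The following sketch is an unofficial helper; the message format is taken from the excerpt above and may
differ in other releases, and the log file path is an example only:

import re

# Unofficial helper: scan a session log for the CMN_1790/CMN_1791 messages
# shown above and report the data and index cache sizes per transformation.
PATTERN = re.compile(
    r"CMN_179[01] The (?P<kind>data|index) cache size that would hold "
    r"\[\d+\] aggregate groups of input\s+rows for \[(?P<name>[^\]]+)\], "
    r"in memory, is \[(?P<bytes>\d+)\] bytes"
)

def cache_sizes(log_text):
    sizes = {}
    for match in PATTERN.finditer(log_text):
        sizes.setdefault(match.group("name"), {})[match.group("kind")] = int(match.group("bytes"))
    return sizes

with open("session.log") as log_file:   # example path
    print(cache_sizes(log_file.read()))
# For the excerpt above: {'AGGTRANS': {'index': 286720, 'data': 1774368}}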

APPENDIX A

Session Properties Reference


This appendix contains a listing of settings in the session properties. These settings are grouped by the
following tabs:
♦ General Tab, 474
♦ Properties Tab, 475
♦ Mapping Tab (Transformations View), 479
♦ Mapping Tab (Partitions View), 496
♦ Components Tab, 501
♦ Metadata Extensions Tab, 506

General Tab
By default, the General tab appears when you edit a session task.
Figure A-1 shows the General tab:

Figure A-1. General Tab

On the General tab, you can rename the session task and enter a description for the session task.
Table A-1 describes settings on the General tab:

Table A-1. General Tab

General Tab Options Description

Rename You can enter a new name for the session task with the Rename button.

Description You can enter a description for the session task in the Description field.

Mapping name Name of the mapping associated with the session task.

Resources You can associate an object with an available resource.

Fail Parent if This Task Fails* Fails the parent worklet or workflow if this task fails.

Fail Parent if This Task Does Not Run* Fails the parent worklet or workflow if this task does not run.

Disable This Task* Disables the task.

Treat the Input Links as AND or OR* Runs the task when all or one of the input link conditions evaluate to True.
*Appears only in the Workflow Designer.

Properties Tab
On the Properties tab, you can configure the following settings:
♦ General Options. General Options settings allow you to configure session log file name, session log file
directory, parameter file name and other general session settings. For more information, see “General
Options Settings” on page 475.
♦ Performance. The Performance settings allow you to increase memory size, collect performance details, and
set configuration parameters. For more information, see “Performance Settings” on page 478.

General Options Settings


You can configure General Options settings on the Properties tab. You can enter session log file name, session
log file directory, and other general session settings.
Figure A-2 shows the General Options settings on the Properties tab:

Figure A-2. Properties Tab - General Options Settings

Table A-2 describes the General Options settings on the Properties tab:

Table A-2. Properties Tab - General Options Settings

General Options
Description
Settings

Write Backward Compatible Session Log File Select to write session log to a file.

Session Log File By default, the Integration Service uses the session name for the log file name:
Name s_mapping name.log. For a debug session, it uses DebugSession_mapping name.log.
Enter a file name, a file name and directory, or use the $PMSessionLogFile session
parameter. The Integration Service appends information in this field to that entered in
the Session Log File Directory field. For example, if you have “C:\session_logs\” in the
Session Log File Directory File field and then enter “logname.txt” in the Session Log
File field, the Integration Service writes the logname.txt to the C:\session_logs\
directory.
You can also use the $PMSessionLogFile session parameter to represent the name of
the session log or the name and location of the session log.

Session Log File Designates a location for the session log file. By default, the Integration Service writes
Directory the log file in the service process variable directory, $PMSessionLogDir.
If you enter a full directory and file name in the Session Log File Name field, clear this
field.

Parameter File Designates the name and directory for the parameter file. Use the parameter file to
Name define session parameters and override values of mapping parameters and variables.
You can enter a workflow or worklet variable as the session parameter file name if you
configure a workflow to run concurrently, and you want to use different parameter files
for the sessions in each workflow run instance.

Enable Test Load You can configure the Integration Service to perform a test load.
With a test load, the Integration Service reads and transforms data without writing to
targets. The Integration Service generates all session files and performs all pre- and
post-session functions, as if running the full session.
The Integration Service writes data to relational targets, but rolls back the data when
the session completes. For all other target types, such as flat file and SAP BW, the
Integration Service does not write data to the targets.
Enter the number of source rows you want to test in the Number of Rows to Test field.
You cannot perform a test load on sessions that use XML sources.
Note: You can perform a test load when you configure a session for normal mode. If you
configure the session for bulk mode, the session fails.

Number of Rows to Enter the number of source rows you want the Integration Service to test load.
Test The Integration Service reads the number you configure for the test load. You cannot
perform a test load when you run a session against a mapping that contains XML
sources.

$Source Enter the database connection you want the Integration Service to use for the $Source
Connection Value connection variable. You can select a relational or application connection object, or you
can use the $DBConnectionName or $AppConnectionName session parameter if you
want to define the connection value in a parameter file.
Use the $Source variable in Lookup and Stored Procedure transformations to specify
the database location for the lookup table or stored procedure. If you use $Source in a
Lookup or Stored Procedure transformation, you can specify the database location in
the $Source Connection Value field to ensure the Integration Service uses the correct
database connection to run the session. If you use $Source in a mapping, but do not
specify a database connection in one of these fields, the Integration Service determines
which database connection to use when it runs the session. If it cannot determine the
database connection, it fails the session.

$Target Enter the database connection you want the Integration Service to use for the $Target
Connection Value connection variable. You can select a relational or application connection object, or you
can use the $DBConnectionName or $AppConnectionName session parameter if you
want to define the connection value in a parameter file.
Use the $Target variable in Lookup and Stored Procedure transformations to specify the
database location for the lookup table or stored procedure. If you use $Target in a
Lookup or Stored Procedure transformation, you can specify the database location in
the $Target Connection Value field to ensure the Integration Service uses the correct
database connection to run the session. If you use $Target in a mapping, but do not
specify a database connection in one of these fields, the Integration Service determines
which database connection to use when it runs the session. If it cannot determine the
database connection, it fails the session.

Treat Source Rows Indicates how the Integration Service treats all source rows. If the mapping for the
As session contains an Update Strategy transformation or a Custom transformation
configured to set the update strategy, the default option is Data Driven.
When you select Data Driven and you load to either a Microsoft SQL Server or Oracle
database, you must use a normal load. If you bulk load, the Integration Service fails the
session.

Commit Type Determines if the Integration Service uses a source- or target-based, or user-defined
commit. You can choose source- or target-based commit if the mapping has no
Transaction Control transformation or only ineffective Transaction Control
transformations. By default, the Integration Service performs a target-based commit.
A user-defined commit is enabled by default if the mapping has effective Transaction
Control transformations.

Commit Interval In conjunction with the selected commit interval type, indicates the number of rows. By
default, the Integration Service uses a commit interval of 10,000 rows.
This option is not available for user-defined commit.

Commit On End of By default, this option is enabled and the Integration Service performs a commit at the
File end of the file. Clear this option if you want to roll back open transactions.
This option is enabled by default for a target-based commit. You cannot disable it.

Rollback The Integration Service rolls back the transaction at the next commit point when it
Transactions on encounters a non-fatal writer error.
Errors

Recovery Strategy When a session fails, the Integration Service suspends the workflow if the workflow is
configured to suspend on task error. If the session recovery strategy is resume from the
last checkpoint or restart, you can recover the workflow. The Integration Service
recovers the session and continues the workflow if the session succeeds. If the session
fails, the workflow becomes suspended again.
You can also recover the session or recover the workflow from the session when you
configure the session to resume from last checkpoint or restart.
When you configure a Session task, you can choose one of the following recovery
strategies:
- Resume from the last checkpoint. The Integration Service saves the session state of
operation and maintains target recovery tables.
- Restart. The Integration Service runs the session again when it recovers the workflow.
- Fail session and continue the workflow. The Integration Service cannot recover the
session, but it continues the workflow. This is the default session recovery strategy.

Java Classpath If you enter a Java Classpath in this field, the Java Classpath is added to the beginning
of the system classpath when the Integration Service runs the session. Use this option
if you use third-party Java packages, built-in Java packages, or custom Java packages
in a Java transformation.
*Tip: When you bulk load to Microsoft SQL Server or Oracle targets, define a large commit interval. Microsoft SQL Server and
Oracle start a new bulk load transaction after each commit. Increasing the commit interval reduces the number of bulk load
transactions and increases performance.

Performance Settings
You can configure performance settings on the Properties tab. In Performance settings, you can increase
memory size, collect performance details, and set configuration parameters.
Figure A-3 shows the Performance settings on the Properties tab:

Figure A-3. Properties Tab - Performance Settings

Table A-3 describes the Performance settings on the Properties tab:

Table A-3. Properties Tab - Performance Settings

Performance
Description
Settings

DTM Buffer Size Amount of memory allocated to the session from the DTM process.
By default, the Integration Service determines the DTM buffer size at run time. The
Workflow Manager allocates a minimum of 12 MB for DTM buffer memory.
You can specify auto or a numeric value. If you enter 2000, the Integration Service
interprets the number as 2000 bytes. Append KB, MB, or GB to the value to specify
other units. For example, you can specify 512MB.
Increase the DTM buffer size in the following circumstances:
- A session contains large amounts of character data and you configure it to run in
Unicode mode. Increase the DTM buffer size to 24MB.
- A session contains n partitions. Increase the DTM buffer size to at least n times the
value for the session with one partition.
- A source contains a large binary object with a precision larger than the allocated DTM
buffer size. Increase the DTM buffer size so that the session does not fail.

Collect Collects performance details when the session runs. Use the Workflow Monitor to view
Performance Data performance details while the session runs.

Write Performance Writes performance details for the session to the PowerCenter repository. Write
Data to Repository performance details to the repository to view performance details for previous session
runs. Use the Workflow Monitor to view performance details for previous session runs.

Incremental Select Incremental Aggregation option if you want the Integration Service to perform
Aggregation incremental aggregation.

Reinitialize Select Reinitialize Aggregate Cache option if the session is an incremental aggregation
Aggregate Cache session and you want to overwrite existing aggregate files.
After a single session run, to return to a normal incremental aggregation session run,
you must clear this option.

Enable High When selected, the Integration Service processes the Decimal datatype to a precision
Precision of 28. If a session does not use the Decimal datatype, leave this setting clear.

Session Retry On Select this option if you want the Integration Service to retry target writes on deadlock.
Deadlock You can only use Session Retry on Deadlock for sessions configured for normal load.
This option is disabled for bulk mode. You can configure the Integration Service to set
the number of deadlock retries and the deadlock sleep time period.

Pushdown Use pushdown optimization to push transformation logic to the source or target
Optimization database. The Integration Service analyzes the transformation logic, mapping, and
session configuration to determine the transformation logic it can push to the database.
At run time, the Integration Service executes any SQL statement generated against the
source or target tables, and it processes any transformation logic that it cannot push to
the database.
Select one of the following values:
- None. The Integration Service does not push any transformation logic to the database.
- To Source. The Integration Service pushes as much transformation logic as possible
to the source database.
- To Target. The Integration Service pushes as much transformation logic as possible to
the target database.
- Full. The Integration Service pushes as much transformation logic as possible to both
the source database and target database.
- $$PushdownConfig. The $$PushdownConfig mapping parameter allows you to run the
same session with different pushdown optimization configurations at different times.
Default is None.

Allow Temporary Select to allow the Integration Service to create temporary views in the database when
View for Pushdown it pushes the session to the database. The Integration Service must create a view in the
database if the session contains an SQL override, a filtered lookup, or an unconnected
lookup.

Allow Temporary Select to allow the Integration Service to create temporary sequence objects in the
Sequence for database. The Integration Service must create a sequence object in the database if the
Pushdown session contains a Sequence Generator transformation.

Session Sort Order Specify a sort order for the session. The session properties display all sort orders
associated with the Integration Service code page. When the Integration Service runs in
Unicode mode, it sorts character data in the session using the selected sort order.
When the Integration Service runs in ASCII mode, it ignores this setting and uses a
binary sort order to sort character data.

Mapping Tab (Transformations View)


In the Transformations view of the Mapping tab, you can configure settings for connections, sources, targets,
and transformations.
You can configure the following nodes:
♦ Connections
♦ Sources
♦ Targets
♦ Transformations

Connections Node
The Connections node displays the source, target, lookup, stored procedure, FTP, external loader, and queue
connections. You can choose connection types and connection values. You can also edit connection object
values.
Figure A-4 shows the Connections settings on the Mapping tab:

Figure A-4. Mapping Tab - Connections Settings

Table A-4 describes the Connections settings on the Mapping tab:

Table A-4. Mapping Tab - Connections Settings

Connections
Description
Node Settings

Type Enter the connection type for relational and non-relational sources and targets based
on the value you choose in the Type column. You can choose a connection object
based on the connection type:
- Application
- Queue
- External Loader
- FTP
- None

Partitions Displays the partitions if the session is partitioned.

Value Enter a source and target connection based on the value you choose in the Type
column.

Sources Node
The Sources node lists the sources used in the session and displays their settings. If you want to view and
configure the settings of a specific source, select the source from the list.
You can configure the following settings:
♦ Readers. The Readers settings displays the reader the Integration Service uses with each source instance. For
more information, see “Readers Settings” on page 481.

♦ Connections. The Connections settings lets you configure connections for the sources. For more
information, see “Connections Settings” on page 482.
♦ Properties. The Properties settings lets you configure the source properties. For more information, see
“Properties Settings” on page 483.

Readers Settings
You can view the reader the Integration Service uses with each source instance. The Workflow Manager specifies
the necessary reader for each source instance. For relational sources, the reader is Relational Reader, and for file
sources it is File Reader.
Figure A-5 shows the Readers settings on the Mapping tab (Sources node):

Figure A-5. Mapping Tab - Sources Node - Readers Settings

Connections Settings
You can configure the connections the Integration Service uses with each source instance.
Figure A-6 shows the Connections settings on the Mapping tab (Sources node):

Figure A-6. Mapping Tab - Sources Node - Connections Settings

Table A-5 describes the Connections settings on the Mapping tab (Sources node):

Table A-5. Mapping Tab - Sources Node - Connections Settings

Connections
Description
Settings

Type Enter the connection type for relational and non-relational sources. Specifies Relational for
relational sources.

Value Enter a source connection based on the value you choose in the Type column.

Properties Settings
Click the Properties settings to define source property information. The Workflow Manager displays properties
for both relational and file sources.
Figure A-7 shows the Properties settings on the Mapping tab (Sources node):

Figure A-7. Mapping Tab - Sources Node - Properties Settings

Table A-6 describes Properties settings on the Mapping tab for relational sources:

Table A-6. Mapping Tab - Sources Node - Properties Settings (Relational Sources)

Relational
Description
Source Options

Owner Name Specifies the table owner name.

Source Table Specifies the source table name. Enter a table name, or enter a parameter or variable to
Name define the source table name in a parameter file. You can use mapping parameters,
mapping variables, session parameters, workflow variables, or worklet variables.
If you override the source table name here, and you override the source table name
using an SQL query, the Integration Service uses the source table name defined in the
SQL query.

User Defined Join Specifies the condition used to join data from multiple sources represented in the same
Source Qualifier transformation.

Tracing Level Specifies the amount of detail included in the session log when you run a session
containing this transformation. You can view the value of this attribute when you click
Show all properties.

Select Distinct Selects unique rows.

Pre SQL Pre-session SQL commands to run against the source database before the Integration
Service reads the source.

Post SQL Post-session SQL commands to run against the source database after the Integration
Service writes to the target.

Sql Query Defines a custom query that replaces the default query the Integration Service uses to
read data from sources represented in this Source Qualifier. A custom query overrides
entries for a custom join or a source filter.

Source Filter Specifies the filter condition the Integration Service applies when querying records.

Table A-7 describes the Properties settings on the Mapping tab for file sources:

Table A-7. Mapping Tab - Sources Node - Properties Settings (File Sources)

File Source
Description
Options

Source File Enter the directory name in this field. By default, the Integration Service looks in the
Directory service process variable directory, $PMSourceFileDir, for file sources.
If you specify both the directory and file name in the Source Filename field, clear this
field. The Integration Service concatenates this field with the Source Filename field
when it runs the session.
You can also use the $InputFileName session parameter to specify the file directory.

Source Filename Enter the file name, or file name and path. Optionally use the $InputFileName session
parameter for the file name.
The Integration Service concatenates this field with the Source File Directory field when
it runs the session. For example, if you have “C:\data\” in the Source File Directory field,
then enter “filename.dat” in the Source Filename field. When the Integration Service
begins the session, it looks for “C:\data\filename.dat”.
By default, the Workflow Manager enters the file name configured in the source
definition.

Source Filetype You can configure multiple file sources using a file list.
Indicates whether the source file contains the source data, or a list of files with the
same file properties. Select Direct if the source file contains the source data. Select
Indirect if the source file contains a list of files.
When you select Indirect, the Integration Service finds the file list then reads each listed
file when it executes the session.

Set File Properties You can configure the file properties.

Datetime Format* Displays the datetime format for datetime fields.

Thousand Separator* Displays the thousand separator for numeric fields.

Decimal Separator* Displays the decimal separator for numeric fields.


*You can view the value of this attribute when you click Show all properties. This attribute is read-only. For more information,
see the PowerCenter Designer Guide.

Setting File Properties for Sources


Configure flat file properties by clicking the Set File Properties link in the Sources node. You can define
properties for both fixed-width and delimited flat file sources.
You can configure flat file properties for non-reusable sessions in the Workflow Designer and for reusable
sessions in the Task Developer.

Figure A-8 shows the Flat Files dialog box that appears when you click Set File Properties:

Figure A-8. Flat Files Dialog Box for Sources

Select the file type (fixed-width or delimited) you want to configure and click Advanced.

Configuring Fixed-Width Properties for Sources


To edit the fixed-width properties, select Fixed Width in the Flat Files dialog box and click the Advanced
button. The Fixed Width Properties dialog box appears.
Note: Edit these settings only if you need to override those configured in the source definition.

Figure A-9 shows the Fixed Width Properties dialog box for flat file sources:

Figure A-9. Fixed Width Properties

Table A-8 describes the options you define in the Fixed Width Properties dialog box for sources:

Table A-8. Fixed-Width Properties for File Sources

Fixed-Width
Description
Properties Options

Null Character: Indicates the character representing a null value in the file. This can be any valid
Text/Binary character in the file code page, or any binary value from 0 to 255.

Repeat Null If selected, the Integration Service reads repeat NULL characters in a single field as a
Character single NULL value. If you do not select this option, the Integration Service reads a
single null character at the beginning of a field as a null field. Important: For multibyte
code pages, specify a single-byte null character if you use repeating non-binary null
characters. This ensures that repeating null characters fit into the column.

Code Page Select the code page of the fixed-width file. The default setting is the client code
page.

Number of Initial Integration Service skips the specified number of rows before reading the file. Use
Rows to Skip this to skip header rows. One row may contain multiple records. If you select the Line
Sequential File Format option, the Integration Service ignores this option.
You can enter any integer from zero to 2147483647.

Number of Bytes to Integration Service skips the specified number of bytes between records. For
Skip Between example, you have an ASCII file on Windows with one record on each line, and a
Records carriage return and line feed appear at the end of each line. If you want the
Integration Service to skip these two single-byte characters, enter 2.
If you have an ASCII file on UNIX with one record for each line, ending in a carriage
return, skip the single character by entering 1.

Strip Trailing Blanks If selected, the Integration Service strips trailing blank spaces from records before
passing them to the Source Qualifier transformation.

Line Sequential File Select this option if the file uses a carriage return at the end of each record,
Format shortening the final column.

Configuring Delimited File Properties for Sources


To edit the delimited properties, select Delimited in the Flat Files dialog box and click the Advanced button.
The Delimited File Properties dialog box appears.
Note: Edit these settings only if you need to override those configured in the source definition.

Figure A-10 shows the Delimited File Properties dialog box for flat file sources:

Figure A-10. Delimited Properties for File Sources

Table A-9 describes the options you can define in the Delimited File Properties dialog box for flat file sources:

Table A-9. Delimited Properties for File Sources

Delimited File
Description
Properties Options

Delimiters Character used to separate columns of data in the source file. Delimiters can be
either printable or single-byte unprintable characters, and must be different from the
escape character and the quote character (if selected). To enter a single-byte
unprintable character, click the Browse button to the right of this field. In the
Delimiters dialog box, select an unprintable character from the Insert Delimiter list
and click Add. You cannot select unprintable multibyte characters as delimiters. The
delimiter must be in the same code page as the flat file code page.

Optional Quotes Select None, Single, or Double. If you select a quote character, the Integration
Service ignores delimiter characters within the quote characters. Therefore, the
Integration Service uses quote characters to escape the delimiter.
For example, a source file uses a comma as a delimiter and contains the following
row: 342-3849, ‘Smith, Jenna’, ‘Rockville, MD’, 6.
If you select the optional single quote character, the Integration Service ignores the
commas within the quotes and reads the row as four fields.
If you do not select the optional single quote, the Integration Service reads six
separate fields.
When the Integration Service reads two optional quote characters within a quoted
string, it treats them as one quote character. For example, the Integration Service
reads the following quoted string as I’m going tomorrow:
2353, ‘I’’m going tomorrow.’, MD
Additionally, if you select an optional quote character, the Integration Service only
reads a string as a quoted string if the quote character is the first character of the
field.
Note: You can improve session performance if the source file does not contain quotes
or escape characters.

Code Page Select the code page of the delimited file. The default setting is the client code page.

Row Delimiter Specify a line break character. Select from the list or enter a character. Preface an
octal code with a backslash (\). To use a single character, enter the character.
The Integration Service uses only the first character when the entry is not preceded
by a backslash. The character must be a single-byte character, and no other
character in the code page can contain that byte. Default is line-feed, \012 LF (\n).

Escape Character Character immediately preceding a delimiter character embedded in an unquoted
string, or immediately preceding the quote character in a quoted string. When you
specify an escape character, the Integration Service reads the delimiter character as
a regular character (called escaping the delimiter or quote character).
Note: You can improve session performance for mappings containing Sequence
Generator transformations if the source file does not contain quotes or escape
characters.

Remove Escape This option is selected by default. Clear this option to include the escape character in
Character From Data the output string.

Treat Consecutive By default, the Integration Service reads pairs of delimiters as a null value. If
Delimiters as One selected, the Integration Service reads any number of consecutive delimiter
characters as one.
For example, a source file uses a comma as the delimiter character and contains the
following record: 56, , , Jane Doe. By default, the Integration Service reads that
record as four columns separated by three delimiters: 56, NULL, NULL, Jane Doe. If
you select this option, the Integration Service reads the record as two columns
separated by one delimiter: 56, Jane Doe.

Number of Initial Integration Service skips the specified number of rows before reading the file. Use
Rows to Skip this to skip title or header rows in the file.

Targets Node
The Targets node lists the targets used in the session and displays their settings. If you want to view and
settings of a specific target, select the target from the list.
You can configure the following settings:
♦ Writers. The Writers settings displays the writer the Integration Service uses with each target instance. For
more information, see “Writers Settings” on page 488.
♦ Connections. The Connections settings lets you configure connections for the targets. For more
information, see “Connections Settings” on page 489.
♦ Properties. The Properties settings lets you configure the target properties. For more information, see
“Properties Settings” on page 490.

Writers Settings
You can view and configure the writer the Integration Service uses with each target instance. The Workflow
Manager specifies the necessary writer for each target instance. For relational targets the writer is Relational
Writer and for file targets it is File Writer.
Figure A-11 shows the Writers settings on the Mapping tab (Targets node):

Figure A-11. Mapping Tab - Targets Node - Writers Settings

Table A-10 describes the Writers settings on the Mapping tab (Targets node):

Table A-10. Mapping Tab - Targets Node - Writers Settings

Writers
Description
Setting

Writers For relational targets, choose Relational Writer or File Writer. When the target in the
mapping is a flat file, an XML file, an SAP BW target, or MQ target, the Workflow Manager
specifies the necessary writer in the session properties.
When you choose File Writer for a relational target, use an external loader to load data to
this target.
When you override a relational target to use the file writer, the Workflow Manager changes
the properties for that target instance on the Properties settings. It also changes the
connection options you can define on the Connections settings.
After you override a relational target to use a file writer, define the file properties for the
target. Click Set File Properties and choose the target to define.

Connections Settings
You can enter connection types and specific target database connections on the Targets node of the Mappings
tab.
Figure A-12 shows the Connections settings on the Mapping tab (Targets node):

Figure A-12. Mapping Tab - Targets Node - Connections Settings

Table A-11 describes the Connections settings on the Mapping tab (Targets node):

Table A-11. Mapping Tab - Targets Node - Connections Settings

Connections
Description
Settings

Type Enter the connection type for relational and non-relational sources and targets based
on the value you choose in the Type column. You can choose a connection object
based on the connection type:
- Application
- Queue
- External Loader
- FTP
- None

Partitions Displays the partitions if the session is partitioned.

Value Enter a target connection based on the value you choose in the Type column.

Properties Settings
Click the Properties settings to define target property information. The Workflow Manager displays different
properties for the different target types: relational, flat file, and XML.

Properties Settings for Relational Targets


You can configure the writer and object instance attributes for a relational target.
Figure A-13 shows the Properties settings on the Mapping tab for relational targets:

Figure A-13. Mapping Tab - Targets Node - Properties Settings (Relational)

Table A-12 describes the Properties settings on the Mapping tab for relational targets:

Table A-12. Mapping Tab - Targets Node - Properties Settings (Relational)

Target Property Description

Target Load Type You can choose Normal or Bulk.
If you select Normal, the Integration Service loads targets normally.
If you select Normal and the Microsoft SQL Server target name includes spaces,
configure the following connection environment SQL in the connection object:
SET QUOTED_IDENTIFIER ON
You can select Bulk when you load to IBM DB2, Sybase, Oracle, or Microsoft SQL
Server. If you select Bulk for an IBM DB2, Sybase, Oracle, or Microsoft SQL Server
target, the Integration Service invokes the bulk API with default settings, bypassing
database logging.
If you select Bulk for other database types, the Integration Service reverts to a
normal load.
Loading in bulk mode can improve session performance, but limits the ability to
recover because no database logging occurs.

Insert If selected, the Integration Service inserts all rows flagged for insert.
By default, this option is selected.

Update (as Update) If selected, the Integration Service updates all rows flagged for update.
By default, this option is selected.

Update (as Insert) If selected, the Integration Service inserts all rows flagged for update.
By default, this option is not selected.

Update (else Insert) If selected, the Integration Service updates rows flagged for update if they exist in
the target, then inserts any remaining rows marked for insert.

Delete If selected, the Integration Service deletes all rows flagged for delete.

Truncate Table If selected, the Integration Service truncates the target before loading.

Reject File Directory Enter the directory name in this field. By default, the Integration Service writes all
reject files to the service process variable directory, $PMBadFileDir.
If you specify both the directory and file name in the Reject Filename field, clear this
field. The Integration Service concatenates this field with the Reject Filename field
when it runs the session.
You can also use the $BadFileName session parameter to specify the file directory.

Reject Filename Enter the file name, or file name and path. By default, the Integration Service names
the reject file after the target instance name: target_name.bad. Optionally use the
$BadFileName session parameter for the file name.
The Integration Service concatenates this field with the Reject File Directory field
when it runs the session. For example, if you have “C:\reject_file\” in the Reject File
Directory field, and enter “filename.bad” in the Reject Filename field, the Integration
Service writes rejected rows to C:\reject_file\filename.bad.

Rejected Truncated/Overflowed Rows* Instructs the Integration Service to write the truncated and overflowed
rows to the reject file.

Update Override* Override the default UPDATE statement.

Table Name Prefix Specify the owner of the target tables.

Pre SQL You can enter pre-session SQL commands for a target instance in a mapping to
execute commands against the target database before the Integration Service reads
the source.

Post SQL Enter post-session SQL commands to execute commands against the target
database after the Integration Service writes to the target.

Target Table Name Specifies the target table name. Enter a table name, or enter a parameter or variable
to define the target table name in a parameter file. You can use mapping parameters,
mapping variables, session parameters, workflow variables, or worklet variables.
*You can view the value of this attribute when you click Show all properties. This attribute is read-only. For more information,
see the PowerCenter Designer Guide.
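For example, the following Pre SQL and Post SQL commands are a sketch only: the STG_ORDERS and LOAD_AUDIT table names are hypothetical, and the SYSDATE function assumes an Oracle target database.
Pre SQL:  TRUNCATE TABLE STG_ORDERS
Post SQL: UPDATE LOAD_AUDIT SET LAST_LOAD = SYSDATE WHERE TABLE_NAME = 'STG_ORDERS'
With these commands, the Integration Service clears the staging table before it reads the source and stamps the audit table after it writes to the target.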



Properties Settings for Flat File Targets
Figure A-14 shows the Properties settings on the Mapping tab for file targets:

Figure A-14. Mapping Tab - Targets Node - File Properties Settings

Table A-13 describes the Properties settings on the Mapping tab for file targets:

Table A-13. Mapping Tab - Targets Node - File Properties Settings

Target Property Description

Merge Partitioned When selected, the Integration Service merges the partitioned target files into one file
Files when the session completes, and then deletes the individual output files. If the
Integration Service fails to create the merged file, it does not delete the individual
output files.
You cannot merge files if the session uses FTP, an external loader, or a message
queue.

Merge File Enter the directory name in this field. By default, the Integration Service writes the
Directory merged file in the service process variable directory, $PMTargetFileDir.
If you enter a full directory and file name in the Merge File Name field, clear this field.

Merge File Name Name of the merge file. Default is target_name.out. This property is required if you
select Merge Partitioned Files.

Output File Enter the directory name in this field. By default, the Integration Service writes output
Directory files in the service process variable directory, $PMTargetFileDir.
If you specify both the directory and file name in the Output Filename field, clear this
field. The Integration Service concatenates this field with the Output Filename field
when it runs the session.
You can also use the $OutputFileName session parameter to specify the file directory.


Output Filename Enter the file name, or file name and path. By default, the Workflow Manager names the
target file based on the target definition used in the mapping: target_name.out.
If the target definition contains a slash character, the Workflow Manager replaces the
slash character with an underscore.
When you use an external loader to load to an Oracle database, you must specify a file
extension. If you do not specify a file extension, the Oracle loader cannot find the flat
file and the Integration Service fails the session.
Optionally, use the $OutputFileName session parameter for the file name. See the
example following this table.
The Integration Service concatenates this field with the Output File Directory field when
it runs the session.
Note: If you specify an absolute path file name when using FTP, the Integration Service
ignores the Default Remote Directory specified in the FTP connection. When you
specify an absolute path file name, do not use single or double quotes.

Reject File Enter the directory name in this field. By default, the Integration Service writes all reject
Directory files to the service process variable directory, $PMBadFileDir.
If you specify both the directory and file name in the Reject Filename field, clear this
field. The Integration Service concatenates this field with the Reject Filename field
when it runs the session.
You can also use the $BadFileName session parameter to specify the file directory.

Reject Filename Enter the file name, or file name and path. By default, the Integration Service names the
reject file after the target instance name: target_name.bad. Optionally use the
$BadFileName session parameter for the file name.
The Integration Service concatenates this field with the Reject File Directory field when
it runs the session. For example, if you have “C:\reject_file\” in the Reject File Directory
field, and enter “filename.bad” in the Reject Filename field, the Integration Service
writes rejected rows to C:\reject_file\filename.bad.

Set File Properties You can configure the file properties.

Datetime Format* Displays the datetime format selected for datetime fields.

Thousand Separator* Displays the thousand separator for numeric fields.

Decimal Separator* Displays the decimal separator for numeric fields.


*You can view the value of this attribute when you click Show all properties. This attribute is read-only. For more information,
see the PowerCenter Designer Guide.
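The directory and file name fields in this table accept session parameters. The following parameter file excerpt is a sketch only; the folder, workflow, session, and parameter names are hypothetical:
[MyFolder.WF:wf_sales.ST:s_load_customers]
$OutputFileCust=customers_out.dat
$BadFileCust=customers_rej.bad
With these entries, you can enter $OutputFileCust in the Output Filename field and $BadFileCust in the Reject Filename field, and the Integration Service resolves the file names from the parameter file at run time.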

Setting File Properties for Targets


Click the Set File Properties button on the Mapping tab to configure flat file properties. You can define flat file
properties for both fixed-width and delimited flat file targets.
You can configure flat file properties for non-reusable sessions in the Workflow Designer and for reusable
sessions in the Task Developer.
Figure A-15 shows the Flat Files dialog box that appears when you click Set File Properties:

Figure A-15. Flat Files Dialog Box for Targets

Select the file type (fixed-width or delimited) you want to configure and click Advanced.



Configuring Fixed-Width Properties for Targets
To edit the fixed-width properties, select Fixed Width in the Flat Files dialog box and click the Advanced
button. The Fixed Width Properties dialog box appears.
Figure A-16 shows the Fixed-Width Properties dialog box for flat file targets:

Figure A-16. Fixed-Width Properties for File Targets

Table A-14 describes the options you define in the Fixed Width Properties dialog box:

Table A-14. Fixed-Width Properties for File Targets

Fixed-Width Properties Options Description

Null Character Enter the character you want the Integration Service to use to represent null
values. You can enter any valid character in the file code page.

Repeat Null Character Select this option to indicate a null value by repeating the null character to fill the
field. If you do not select this option, the Integration Service enters a single null
character at the beginning of the field to represent a null value.

Code Page Select the code page of the fixed-width file. The default setting is the client code
page.

Configuring Delimited Properties for Targets


To edit the delimited properties, select Delimited in the Flat Files dialog box and click the Advanced button.
The Delimited File Properties dialog box appears.
Figure A-17 shows the Delimited File Properties dialog box for flat file targets:

Figure A-17. Delimited Properties for File Targets



Table A-15 describes the options you can define in the Delimited File Properties dialog box for flat file targets:

Table A-15. Delimited Properties for File Targets

Edit Delimiter Options Description

Delimiters Character used to separate columns of data. Delimiters can be either printable or
single-byte unprintable characters, and must be different from the escape character and
the quote character (if selected). To enter a single-byte unprintable character, click the
Browse button to the right of this field. In the Delimiters dialog box, select an unprintable
character from the Insert Delimiter list and click Add. You cannot select unprintable
multibyte characters as delimiters.

Optional Quotes Select No Quotes, Single Quote, or Double Quotes. If you select a quote character, the
Integration Service does not treat delimiter characters within the quote characters as a
delimiter. For example, suppose an output file uses a comma as a delimiter and the
Integration Service receives the following row: 342-3849, ‘Smith, Jenna’, ‘Rockville,
MD’, 6.
If you select the optional single quote character, the Integration Service ignores the
commas within the quotes and writes the row as four fields.
If you do not select the optional single quote, the Integration Service writes six separate
fields.

Code Page Select the code page of the delimited file. The default setting is the client code page.

Transformations Node
On the Transformations node, you can override properties that you configure in transformation and target
instances in a mapping. The attributes you can configure depend on the type of transformation you select.
Figure A-18 shows the Transformations node on the Mapping tab:

Figure A-18. Mapping Tab - Transformations Node



Mapping Tab (Partitions View)
In the Partitions view of the Mapping tab, you can configure partitions. You can configure partitions for non-
reusable sessions in the Workflow Designer and for reusable sessions in the Task Developer.
The following nodes are available in the Partitions view:
♦ Partition Properties. For more information, see “Partition Properties Node” on page 496.
♦ KeyRange. For more information, see “KeyRange Node” on page 497.
♦ HashKeys. For more information, see “HashKeys Node” on page 497.
♦ Partition Points. For more information, see “Partition Points Node” on page 497.
♦ Non-Partition Points. For more information, see “Non-Partition Points Node” on page 500.

Partition Properties Node


You can configure partitions with the Partition Properties node.
Figure A-19 shows the Mapping tab - Partitions Properties node:

Figure A-19. Mapping Tab - Partitions Properties Node



KeyRange Node
In the KeyRange node, you can configure the partition range for key-range partitioning. Select Edit Keys to edit
the partition key. For more information, see “Edit Partition Key” on page 500.
Figure A-20 shows the KeyRange node on the Mapping tab:

Figure A-20. Mapping Tab - KeyRange Node

HashKeys Node
In the HashKeys node, you can configure hash key partitioning. Select Edit Keys to edit the partition key. For
more information, see “Edit Partition Key” on page 500.

Partition Points Node


The Partition Points node displays the mapping with the transformation icons. The Partition Points node lists
the partition points in the tree. Select a partition point to configure its attributes.
In the Partition Points node, you can configure the following options for each pipeline in a mapping:
♦ Add and delete partition points.
♦ Specify the partition type at each partition point.
♦ Add and delete partitions.
♦ Enter a description for each partition.
♦ Add keys and key ranges for certain partition types.



Figure A-21 shows the Mapping tab - Partition Points node:

Figure A-21. Mapping Tab - Partition Points Node

Table A-16 describes the Partition Points node:

Table A-16. Mapping Tab - Partition Points Node

Partition Points Node Description

Add Partition Point Click to add a new partition point to the Transformation list.

Delete Partition Click to delete the current partition point. You cannot delete certain partition points.
Point

Edit Partition Point Click to edit the current partition point.

Edit Keys Click to add, remove, or edit the key for key range or hash user keys partitioning. This
button is not available for auto-hash, round-robin, or pass-through partitioning.



Edit Partition Point
The Edit Partition Point dialog box lets you add and delete partitions and select the partition type.
Figure A-22 shows the Edit Partition Points dialog box:

Figure A-22. Edit Partition Point Dialog Box

Table A-17 describes the options in the Edit Partition Point dialog box:

Table A-17. Edit Partition Point Dialog Box Options

Edit Partition Point Options Description

Add button Click to add a partition. You can add up to 64 partitions.

Delete button Click to delete the selected partition.

Name Partition number.

Description Enter a description for the current partition.

Select Partition Type Select a partition type from the list.



Edit Partition Key
When you specify key range or hash user keys partitioning at any partition point, you must specify one or more
ports as the partition key. Click Edit Key to display the Edit Partition Key dialog box.
Figure A-23 shows the Edit Partition Key dialog box:

Figure A-23. Edit Partition Key Dialog Box

You can specify one or more ports as the partition key. To rearrange the order of the ports that make up the
key, select a port in the Selected Ports list and click the up or down arrow.

RELATED TOPICS:
♦ “Database Partitioning Partition Type” on page 284

♦ “Key Range Partition Type” on page 288

Non-Partition Points Node


The Non-Partition Points node displays the mapping objects in iconized view and lists the non-partition points
in the tree. You can select a non-partition point and add partitions if you want.



Components Tab
In the Components tab, you can configure pre-session shell commands, post-session commands, email messages
to send if the session succeeds or fails, and variable assignments.
Figure A-24 shows the Components Tab:

Figure A-24. Components Tab

Table A-18 describes the Components tab options:

Table A-18. Components Tab

Components Tab Option Description

Task Tasks you can perform in the Components tab. You can configure pre- or post-session
shell commands, success or failure email messages, and variable assignments in the
Components tab.

Type Select None if you do not want to configure commands and emails in the Components
tab.
For pre- and post-session commands, select Reusable to call an existing reusable
Command task as the pre- or post-session shell command. Select Non-Reusable to
create pre- or post-session shell commands for this session task.
For success or failure emails, select Reusable to call an existing Email task as the
success or failure email. Select Non-Reusable to create email messages for this
session task.

Value Use to configure commands, emails, or variable assignments.



Table A-19 describes the tasks available in the Components tab:

Table A-19. Components Tab Tasks

Components Tab Tasks Description

Pre-Session Shell commands that the Integration Service performs at the beginning of a session.
Command

Post-Session Shell commands that the Integration Service performs after the session completes
Success Command successfully.

Post-Session Shell commands that the Integration Service performs after the session if the session
Failure Command fails.

On Success Email Integration Service sends On Success email message if the session completes
successfully.

On Failure Email Integration Service sends On Failure email message if the session fails.

Pre-session variable assignment Assign values to mapping parameters, mapping variables, and session parameters
before a session runs. This field is editable for non-reusable sessions. See the example
following this table.

Post-session on Assign values to parent workflow and worklet variables after a session completes
success variable successfully. This field is editable for non-reusable sessions.
assignment

Post-session on Assign values to parent workflow and worklet variables after a session fails. This field is
failure variable editable for non-reusable sessions.
assignment
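For example, you might pass a workflow variable to a mapping parameter before the session runs and capture a mapping variable back into a workflow variable after the session succeeds. The following assignments are a sketch only; the variable names are hypothetical, and you enter each assignment in the corresponding Value field dialog box:
Pre-session variable assignment:             $$MappingStartDate = $$wf_LastRunDate
Post-session on success variable assignment: $$wf_LastRunDate = $$MappingMaxOrderDate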

Reusable Pre- or Post-Session Commands


Select Reusable in the Type field if you want to select an existing Command task as the pre- or post-session shell
command. The Command Object Browser appears when you click the Open button in the Value field.
Figure A-25 shows the Task Browser:

Figure A-25. Task Browser

Click the Override button to override the Fail Task if Any Command Fails option in the Command task. For
more information about the Fail Task if Any Command Fails option, see Table A-21 on page 504.



Non-Reusable Pre- or Post-Session Commands
Select Non-Reusable in the Type field if you want to create pre- or post-session commands for the session. Non-
reusable pre- or post-session commands do not appear as Command tasks in the folder.
Click the Open button in the Value field in the Components tab to edit pre- or post-session shell commands.
The Edit Pre-Session Command or Edit Post-Session Command dialog box appears.
Figure A-26 shows the Edit Pre-Session Command dialog box:

Figure A-26. Edit Pre-Session Command Dialog Box

Table A-20 describes General tab for editing pre- or post-session shell commands:

Table A-20. Pre- or Post-Session Commands - General Tab

General Tab Options for Pre- or Post-Session Commands Description

Name Enter a name for the pre- or post-session shell command.

Make Reusable Select Make Reusable to create a reusable Command task from the pre- or post-
session shell commands.
Clear the Make Reusable option if you do not want the Workflow Manager to create a
reusable Command task from the shell commands.

Description Enter a description for the pre- or post-session shell command.



Table A-21 describes the Properties tab for editing pre- or post-session commands:

Table A-21. Pre- or Post-Session Commands - Properties Tab

Properties Tab Options for Pre- or Post-Session Commands Description

Name Name of the pre-session shell command.

Fail Task if Any If you select this option, the Integration Service stops running the rest of the commands
Command Fails in the Command task when one command fails. If you do not select this option, the
Integration Service runs all the commands in the Command task and treats the task as
completed, even if a command fails.

Table A-22 describes the Commands tab for editing pre- or post-session commands:

Table A-22. Pre- or Post-Session Commands - Commands Tab

Commands Tab Options for Pre- or Post-Session Commands Description

Name Name of the pre- or post-session shell command.

Command Shell command that you want the Integration Service to perform. Enter one command
on each line.
You can use parameters and variables in shell commands. Use any parameter or
variable type that you can define in the parameter file.
If a command contains spaces, enclose the command in quotes. For example, to call
c:\program files\myprog.exe, you must enter “c:\program files\myprog.exe”, including
the quotes. See the example following this table.
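For example, the following pre-session shell commands are a sketch only. They assume a UNIX Integration Service, hypothetical file names, and the $PMSourceFileDir and $PMTargetFileDir service process variables, which the Integration Service resolves at run time:
cp $PMSourceFileDir/orders.dat $PMSourceFileDir/archive/orders.dat
rm -f $PMTargetFileDir/orders_out.dat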

Reusable Email
Select Reusable in the Type field for the On-Success or On-Failure email if you want to select an existing Email
task as the On-Success or On-Failure email. The Email Object Browser appears when you click the right side of
the Values field.
Figure A-27 shows Email Object Browser:

Figure A-27. Email Object Browser

Select an Email task to use as On-Success or On-Failure email. Click the Override button to override properties
of the email. For more information about email properties, see Table A-24 on page 506.



Non-Reusable Email
Select Non-Reusable in the Type field to create a non-reusable email for the session. Non-Reusable emails do
not appear as Email tasks in the Task folder. Click the right side of the Values field to edit the properties for the
non-reusable On-Success or On-Failure emails. For more information about email properties, see Table A-24
on page 506.

Email Properties
You configure email properties for On-Success or On-Failure Emails when you override an existing Email task
or when you create a non-reusable email for the session.
Table A-23 describes general settings for editing On-Success or On-Failure emails:

Table A-23. On-Success or On-Failure Emails - General Tab

Email Settings Description

Name Enter a name for the email you want to configure.

Description Enter a description for the email you want to configure.

Figure A-28 shows the properties for On-Success or On-Failure emails:

Figure A-28. On-Success or On-Failure Email - Properties Tab



Table A-24 describes the email properties for On-Success or On-Failure emails:

Table A-24. On-Success or On-Failure Emails - Properties Tab

Email Properties Description

Email User Name Required to send On-Success or On-Failure session email. Enter the email address of
the person you want the Integration Service to email after the session completes. The
email address must be entered in 7-bit ASCII.
For success email, you can enter $PMSuccessEmailUser to send email to the user
configured for the service variable.
For failure email, you can enter $PMFailureEmailUser to send email to the user
configured for the service variable.

Email Subject Enter the text you want to appear in the subject header.

Email Text Enter the text of the email. You can use email variables in the text to convey meaningful
information, such as the session name and session status. See the example following
this table.
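For example, the following email text is a sketch that assumes the standard post-session email variables, such as %s for the session name, %e for the session status, %l for the total rows loaded, and %r for the total rows rejected. Verify the list of email variables for your release before you use them:
Session %s completed with status %e.
Rows loaded: %l
Rows rejected: %r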

Metadata Extensions Tab


The Metadata Extensions tab appears in the session property sheet after the Partitions tab.
Figure A-29 shows the Metadata Extensions tab:

Figure A-29. Metadata Extensions Tab

You can create and promote metadata extensions with the Metadata Extensions tab.



Table A-25 describes the configuration options for the Metadata Extensions tab:

Table A-25. Metadata Extensions Tab

Metadata Extensions Tab Options Description

Extension Name Name of the metadata extension. Metadata extension names must be unique in a
domain.

Datatype Datatype: numeric (integer), string, boolean, or XML.

Value Value of the metadata extension.


For a numeric metadata extension, the value must be an integer.
For a boolean metadata extension, choose true or false.
For a string or XML metadata extension, click the button in the Value field to enter a
value of more than one line. The Workflow Manager does not validate XML syntax.

Precision Maximum length for string or XML metadata extensions.

Reusable Select to make the metadata extension apply to all objects of this type (reusable). Clear
to make the metadata extension apply to this object only (non-reusable).

Description Description of the metadata extension.



APPENDIX B

Workflow Properties Reference


This appendix contains a listing of settings in the workflow properties. These settings are grouped by the
following tabs:
♦ General Tab, 510
♦ Properties Tab, 511
♦ Scheduler Tab, 513
♦ Variables Tab, 516
♦ Events Tab, 517
♦ Metadata Extensions Tab, 518

General Tab
You can change the workflow name and enter a comment for the workflow on the General tab. By default, the
General tab appears when you open the workflow properties.
Figure B-1 shows the General tab of the workflow properties:

Figure B-1. Workflow Properties - General Tab

The figure calls out where you select an Integration Service to run the workflow, select a suspension email,
configure workflow instances, and enable the Integration Service to run multiple instances of the workflow at
the same time.

Table B-1 describes the settings on the General tab:

Table B-1. Workflow Properties - General Tab

General Tab Options Description

Name Name of the workflow.

Comments Comment that describes the workflow.

Integration Service Integration Service that runs the workflow by default. You can also assign an
Integration Service when you run the workflow.

Suspension Email Email message that the Integration Service sends when a task fails and the
Integration Service suspends the workflow.

Disabled Disables the workflow from the schedule. The Integration Service stops running the
workflow until you clear the Disabled option.

Suspend on Error The Integration Service suspends the workflow when a task in the workflow fails.

Web Services Creates a service workflow. Click Config Service to configure service information.


Configure Concurrent Enables the Integration Service to run more than one instance of the workflow at a
Execution time. You can run multiple instances of the same workflow name, or you can
configure a different name and parameter file for each instance.
Click Configure Concurrent Execution to configure instance names.

Service Level Determines the order in which the Load Balancer dispatches tasks from the dispatch
queue when multiple tasks are waiting to be dispatched. Default is “Default.”
You create service levels in the Administration Console.

Properties Tab
Configure parameter file name and workflow log options on the Properties tab.
Figure B-2 shows the Properties tab:

Figure B-2. Workflow Properties - Properties Tab

Table B-2 describes the settings on the Properties tab:

Table B-2. Workflow Properties - Properties Tab

Properties Tab Options Description

Parameter File Name Designates the name and directory of the parameter file. Use the parameter file to
define workflow variables. See the example following this table.

Write Backward Compatible Workflow Log File Select to write the workflow log to a file.


Workflow Log File Enter a file name, or a file name and directory. Required.
Name The Integration Service appends information in this field to that entered in the Workflow
Log File Directory field. For example, if you have “C:\workflow_logs\” in the Workflow
Log File Directory field, then enter “logname.txt” in the Workflow Log File Name field,
the Integration Service writes logname.txt to the C:\workflow_logs\ directory.

Workflow Log File Designates a location for the workflow log file. By default, the Integration Service writes
Directory the log file in the service variable directory, $PMWorkflowLogDir.
If you enter a full directory and file name in the Workflow Log File Name field, clear this
field.

Save Workflow Log If you select Save Workflow Log by Timestamp, the Integration Service saves all
By workflow logs, appending a timestamp to each log.
If you select Save Workflow Log by Runs, the Integration Service saves a designated
number of workflow logs. Configure the number of workflow logs in the Save Workflow
Log for These Runs option.
You can also use the $PMWorkflowLogCount service variable to save the configured
number of workflow logs for the Integration Service.

Save Workflow Log Number of historical workflow logs you want the Integration Service to save.
For These Runs The Integration Service saves the number of historical logs you specify, plus the most
recent workflow log. Therefore, if you specify 5 runs, the Integration Service saves the
most recent workflow log, plus historical logs 0–4, for a total of 6 logs.
You can specify up to 2,147,483,647 historical logs. If you specify 0 logs, the
Integration Service saves only the most recent workflow log.

Enable HA Enable workflow recovery. Not available for web service workflows.
Recovery

Automatically Recover terminated Session or Command tasks without user intervention. You must
recover terminated have high availability and the workflow must still be running. Not available for web
tasks service workflows.

Maximum When you automatically recover terminated tasks you can choose the number of times
automatic recovery the Integration Service attempts to recover the task. Default is 5.
attempts
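The following parameter file excerpt is a sketch only; the folder, workflow, and variable names are hypothetical. The section heading identifies the workflow, and each line under it assigns a value to a user-defined workflow variable declared on the Variables tab:
[MyFolder.WF:wf_daily_load]
$$LastRunDate=06/01/2008
$$RunMode=INCREMENTAL
Enter the name and directory of this file in the Parameter File Name field so that the Integration Service reads it when the workflow runs.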



Scheduler Tab
The Scheduler tab lets you schedule a workflow to run continuously or at a given interval, or configure it so
that you must start it manually.
Figure B-3 shows the Scheduler tab:

Figure B-3. Workflow Properties - Scheduler Tab



You can configure the following types of scheduler settings:


♦ Non-Reusable. Create a non-reusable scheduler for the workflow.
♦ Reusable. Choose a reusable scheduler for the workflow.
Table B-3 describes the settings on the Scheduler Tab:

Table B-3. Workflow Properties - Scheduler Tab

Scheduler Tab Options Description

Non-Reusable/Reusable Indicates the scheduler type.


If you select Non Reusable, the scheduler can only be used by the current
workflow.
If you select Reusable, choose a reusable scheduler. You can create reusable
schedulers by selecting Schedulers.

Scheduler Select a set of scheduler settings for the workflow.

Description Enter a description for the scheduler.

Summary Read-only summary of the selected scheduler settings.



Edit Scheduler Settings
Click the Edit Scheduler Settings button to configure the scheduler. The Edit Scheduler dialog box appears.
Figure B-4 shows the Edit Scheduler dialog box:

Figure B-4. Workflow Properties - Scheduler Tab - Edit Scheduler Dialog Box

Table B-4 describes the settings on the Edit Scheduler dialog box:

Table B-4. Workflow Properties - Scheduler Tab - Edit Scheduler Dialog Box

Scheduler Options Description

Run Options: Run On Integration Service Initialization/Run On Demand/Run Continuously
Indicates the workflow schedule type.
If you select Run On Integration Service Initialization, the Integration Service runs the
workflow as soon as the Integration Service is initialized.
If you select Run On Demand, the Integration Service only runs the workflow when you
start the workflow.
If you select Run Continuously, the Integration Service starts the next run of the
workflow as soon as it finishes the first run.

Schedule Options: Run Once/Run Every/Customized Repeat
Required if you select Run On Integration Service Initialization in Run Options.
Also required if you do not choose any setting in Run Options.
If you select Run Once, the Integration Service runs the workflow once, as scheduled in
the scheduler.
If you select Run Every, the Integration Service runs the workflow at regular intervals,
as configured.
If you select Customized Repeat, the Integration Service runs the workflow on the
dates and times specified in the Repeat dialog box.

Edit Required if you select Customized Repeat in Schedule Options. Opens the
Repeat dialog box, allowing you to schedule specific dates and times for the
workflow to run. The selected scheduler appears at the bottom of the page.

Start Date Required if you select Run On Integration Service Initialization in Run
Options.
Also required if you do not choose any setting in Run Options.
Indicates the date on which the Integration Service begins scheduling the
workflow.


Start Time Required if you select Run On Integration Service Initialization in Run
Options.
Also required if you do not choose any setting in Run Options.
Indicates the time at which the Integration Service begins scheduling the
workflow.

End Options: End On/End After/Forever
Required if the workflow schedule is Run Every or Customized Repeat.
If you select End On, the Integration Service stops scheduling the workflow on the
selected date.
If you select End After, the Integration Service stops scheduling the workflow after the
set number of workflow runs.
If you select Forever, the Integration Service schedules the workflow as long as the
workflow does not fail.

Customizing Repeat Option


You can schedule the workflow to run once, run at an interval, or customize the repeat option. Click the Edit
button on the Edit Scheduler dialog box to configure Customized Repeat options.
Figure B-5 shows the Customized Repeat dialog box:

Figure B-5. Workflow Properties - Customized Repeat Dialog Box

Table B-5 describes options in the Customized Repeat dialog box:

Table B-5. Workflow Properties - Repeat Dialog Box Options

Repeat Option Description

Repeat Every Enter the numeric interval you want to schedule the workflow, then select Days, Weeks,
or Months, as appropriate.
If you select Days, select the appropriate Daily Frequency settings.
If you select Weeks, select the appropriate Weekly and Daily Frequency settings.
If you select Months, select the appropriate Monthly and Daily Frequency settings.

Weekly Required to enter a weekly schedule. Select the day or days of the week on which you
want to schedule the workflow.


Monthly Required to enter a monthly schedule.


If you select Run On Day, select the dates on which you want the workflow scheduled
on a monthly basis. The Integration Service schedules the workflow on the selected
dates. If you select a numeric date exceeding the number of days within a given month,
the Integration Service schedules the workflow for the last day of the month, including
leap years. For example, if you schedule the workflow to run on the 31st of every
month, the Integration Service schedules the workflow on the 30th of the following
months: April, June, September, and November.
If you select Run On The, select the week(s) of the month, then day of the week on
which you want the workflow to run. For example, if you select Second and Last, then
select Wednesday, the Integration Service schedules the workflow on the second and
last Wednesday of every month.

Daily Enter the number of times you would like the Integration Service to run the workflow on
any day the session is scheduled.
If you select Run Once, the Integration Service schedules the workflow once on the
selected day, at the time entered on the Start Time setting on the Time tab.
If you select Run Every, enter Hours and Minutes to define the interval at which the
Integration Service runs the workflow. The Integration Service then schedules the
workflow at regular intervals on the selected day. The Integration Service uses the Start
Time setting for the first scheduled workflow of the day.

Variables Tab
Before using workflow variables, you must declare them on the Variables tab.
Figure B-6 shows the settings on the Variables tab:

Figure B-6. Workflow Properties - Variables Tab



Table B-6 describes the settings on the Variables tab:

Table B-6. Workflow Properties - Variables Tab

Variable Options Description

Name Name of the workflow variable.

Datatype Datatype of the workflow variable.

Persistent Indicates whether the Integration Service maintains the value of the variable from the
previous workflow run.

Is Null Indicates whether the workflow variable is null.

Default Default value of the workflow variable.

Description Optional details about the workflow variable.
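After you declare workflow variables, you can reference them in link conditions, Decision tasks, and Assignment tasks. The following link condition is a sketch only; the session and variable names are hypothetical, and it assumes the predefined $taskname.Status variable and the SUCCEEDED constant:
$s_load_orders.Status = SUCCEEDED AND $$RunMode = 'INCREMENTAL'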

Events Tab
Before using the Event-Raise task, declare a user-defined event on the Events tab.
Figure B-7 shows the Events tab:

Figure B-7. Workflow Properties - Events Tab

Table B-7 describes the settings on the Events tab:

Table B-7. Workflow Properties - Events Tab

Events Tab Options Description

Events Name of the event you declare.

Description Details to describe the event.



Metadata Extensions Tab
Extend the metadata stored in the repository by associating information with individual repository objects.
Create metadata extensions for repository objects by editing the object and then adding the metadata extension
to the Metadata Extension tab.
Figure B-8 shows the Metadata Extensions tab:

Figure B-8. Workflow Properties - Metadata Extensions Tab

You can create and promote metadata extensions with the Metadata Extensions tab.
Table B-8 describes the configuration options for the Metadata Extensions tab:

Table B-8. Workflow Properties - Metadata Extensions Tab

Metadata Extensions Tab Options Description

Extension Name Name of the metadata extension. Metadata extension names must be unique in a
domain.

Datatype Datatype: numeric (integer), string, boolean, or XML.

Value For a numeric metadata extension, the value must be an integer.


For a boolean metadata extension, choose true or false.
For a string or XML metadata extension, click the Edit button on the right side of the
Value field to enter a value of more than one line. The Workflow Manager does not
validate XML syntax.

Precision Maximum length for string or XML metadata extensions.

Reusable Select to make the metadata extension apply to all objects of this type (reusable). Clear
to make the metadata extension apply to this object only (non-reusable).

UnOverride This column appears only if the value of one of the metadata extensions was changed.
To restore the default value, click Revert.

Description Description of the metadata extension.



INDEX

A AND links
input type 98
ABORT function $AppConnection
See also PowerCenter Transformation Language Reference using 133
session failure 131 Append if Exists
aborted flat file target property 179, 269
status 346 application connections
aborting CPI-C 47
Control tasks 103 JMS 42
Integration Service handling 82 JNDI 42
sessions 83 parameter types 408
status 346 password, parameter types 410
tasks 82 PeopleSoft 44
tasks in Workflow Monitor 344 RFC/BAPI 50
workflows 82 Salesforce 45
Absolute Time SAP ALE IDoc Reader 49
specifying 109 SAP ALE IDoc Writer 49
Timer task 109 SAP NetWeaver 46
active sources SAP NetWeaver BI 51
constraint-based loading 172 SAP R/3 47
definition 178 session parameter 133
generating commits 209 Siebel 53
row error logging 178 TIB/Adapter SDK 57
source-based commit 208, 209 TIB/Rendezvous 57
transaction generators 178 TIBCO 54
XML targets 178 user name, parameter types 410
adding Web Services 57
tasks to workflows 66 webMethods 59
advanced settings arrange
session properties 142 workflows vertically 4
aggregate caches workspace objects 9
reinitializing 450, 479 assigning
aggregate files Integration Services 69
deleting 451 Assignment tasks
moving 451 creating 100
reinitializing 450 definition 100
Aggregator cache description 95
description 462 using Expression Editor 72
overview 462 variables in 85, 407
Aggregator transformation attributes
See also PowerCenter Transformation Guide partition-level 252
adding to concurrent workflows 332 automatic memory settings
cache partitioning 462 configuring 124
caches 462 automatic task recovery
configure caches 463 configuring 229
inputs for cache calculator 463
pushdown optimization 316
sorted ports 463
using partition points 260

B Sorter transformation 462, 469
transformations 461
Backward Compatible Session Log cache size
configuring 381 configuring 459
Backward Compatible Workflow Log optimizing 471
configuring 380 session memory requirements, configuring 125
$BadFile caches
naming convention 133 Aggregator transformation 462
using 133 auto memory 460
Base URL cache calculator 459, 461
parameter and variable types 410 configuring 461
Based on Number of CPUs configuring concurrent lookup caches for sessions 142
setting 251 configuring for Aggregator transformation 463
Based on Number of Partitions configuring for Joiner transformation 465, 470
setting 251 configuring for Lookup transformation 468
bigint configuring for Rank transformation 469
high precision handling 139, 140 configuring for XML target 470
block size configuring lookup in sessions 142
FastExport attribute 160 configuring maximum memory limits 125
buffer block size configuring maximum numeric memory limit for sessions 142
configuring 124 data caches on a grid 368
configuring for sessions 142 for non-reusable sessions 459
buffer memory for reusable sessions 459
allocating 123 for sorted-input Aggregator transformations 463
buffer blocks 123 for transformations 456
configuring 123 index caches on a grid 368
bulk loading Joiner transformation 464
commit interval 175 Lookup transformation 467
data driven session 174 memory 456
DB2 guidelines 175 methods to configure 459
Oracle guidelines 175 numeric value 460
relational targets 174 optimizing 471
session properties 174, 491 overriding 459
test load 169 overview 455
using user-defined commit 212 partitioning 252
persistent lookup 467
Rank transformation 468
C resetting with real-time sessions 215
cache calculator session cache files 455
Aggregator transformation inputs 463 Sorter transformation 469
description 459 specifying maximum memory by percentage 142
Joiner transformation inputs 466 XML target 470
Lookup transformation inputs 468 certified messages
Rank transformation inputs 469 configuring TIB/Rendezvous application connections 55
Sorter transformation inputs 470 changed source data
using 461 real-time data 192
cache directories checking in
optimal, choosing 459 versioned objects 10
sharing 459 checking out
variable types 406 versioned objects 10
cache files checkpoint
locating 451 session recovery 229
naming convention 457 session state of operation 222, 229
cache partitioning COBOL sources
Aggregator transformation 462 error handling 156
configuring cache size 461 numeric data handling 157
described 252 code page compatibility
incremental aggregation 463 See also PowerCenter Administrator Guide
Joiner transformation 462, 465 multiple file sources 158
Lookup transformation 277, 462, 467 targets 164
performance 252 code pages
Rank transformation 462, 468 connection objects 20
database connections 20, 164

delimited source 155 workflows 13
delimited target 182, 495 worklets 13
external loader files 419 Components tab
fixed-width source 153 properties 501
fixed-width target 182, 494 concurrent connections
relaxed validation 20 in partitioned pipelines 268
cold start concurrent merge
real-time sessions 203 file targets 270
tasks and workflows in Workflow Monitor 344 concurrent read partitioning
color themes session properties 265
selecting 5 concurrent workflows
colors adding instance names 329
setting 5 configuring to run with same name 327
workspace 5 configuring unique instances 326
command creating workflow instances with pmcmd 330
file targets 180 rules and guidelines 332
generating file list 153 running web service workflows 327
generating source data 152 scheduling 73, 332
partitioned sources 263 Start Workflow Advanced option 329
partitioned targets 269 Start Workflow option 330
processing target data 180 starting and stopping 329
Command property starting from command line 330
configuring flat file sources 152 steps to configure 329
configuring flat file targets 180 stopping from command line 330
configuring partitioned targets 269 transformation restrictions 332
partitioning file sources 265 using different session parameter files 413
Command tasks using parameters 328
assigning resources 372 viewing in Workflow Monitor 331
creating 101 viewing logs 331
definition 100 concurrent worklets
description 95 description 332
executing commands 102 Config Object tab
Fail Task if Any Command Fails 102 overview 141
making reusable 129 session properties 141
monitoring details in the Workflow Monitor 358 configuration objects
multiple UNIX commands 102 See session configuration objects
promoting to reusable 102 Configure Concurrent Execution
using parameters and variables 128 configuring workflow instances 329
using variables in 101 connect string
variable types 407 examples 29
Command Type syntax 29
configuring flat file sources 152 connection environment SQL
partitioning file sources 265 configuring 29, 30
comments parameter and variable types 409
adding in Expression Editor 72 connection objects
commit interval See also PowerCenter Repository Guide
bulk loading 175 assigning permissions 27
configuring 218 code pages 20
description 207 configuring in sessions 22
source- and target-based 207 definition 20
commit source deleting 22
source-based commit 209 overriding connection attributes 24
commit type owner 27
configuring 196, 477 Connection Retry Period (property)
real-time sessions 196 database connections 32
committing data WebSphere MQ 61
target connection groups 208 connection settings
transaction control 212 applying to all session instances 122
comparing objects targets 490
See also PowerCenter Designer Guide connection variables
See also PowerCenter Repository Guide defining for Lookup transformations 23
sessions 13 defining for Stored Procedure transformations 23
tasks 13 specifying $Source and $Target 23

ConnectionParam.prm file workflows 65
using 26 CUME function
connections partitioning restrictions 280
changing Teradata FastExport connections 160 custom properties
configuring for sessions 22 overriding Integration Service properties for sessions 142
copy as 32, 33 Custom transformation
copying a relational database connection 32 partitioning guidelines 280
creating Teradata FastExport connections 159 pipeline partitioning 270
external loader 36 threads 271
FTP 34 customization
multiple targets 187 of toolbars 8
overriding connection attributes 24 of windows 8
overriding for Lookup transformations 23 workspace colors 5
overriding for Stored Procedure transformations 23 customized repeat
parameter file template 26 daily 76
relational database 31, 40 editing 75
replacing a relational database connection 33 monthly 76
sources 149 options 75
targets 166 repeat every 76
connectivity weekly 76
connect string examples 29
constraint-based loading
active sources 172 D
configuring 172
data
configuring for sessions 142
capturing incremental source changes 449, 452
enabling 174
Data Analyzer
key relationships 172
running from Workflow Manager 83
target connection groups 173
data cache
Update Strategy transformations 173
for incremental aggregation 451
control file override
naming convention 457
description 161
data driven
loading Teradata 429
bulk loading 174
setting Teradata FastExport statements 161
data encryption
steps to override Teradata FastExport 162
FastExport attribute 160
Control tasks
data files
definition 103
creating directory 452
description 95
finding 451
options 103
data flow
stopping or aborting the workflow 82
See pipelines
copying
data movement mode
repository objects 12
affecting incremental aggregation 451
counters
Data Profiling domains
overview 362
domain value, variable types 410
CPI-C application connections
database connections
See also connections
configuring 31, 40
configuring 47
configuring for PowerChannel 40
creating
connection retry period 32
Assignment tasks 100
copying a relational database connection 32
Command tasks 101
domain name 32, 41
data files directory 452
packet size 32, 41
Decision tasks 105
parameter 135
Email tasks 241
parameter types 408
error log tables 394
password, parameter types 409
external loader connections 36
replacing a relational database connection 33
file list for partitioned sources 264
resilience 30
FTP sessions 445
session parameter 133
index directory 452
use trusted connection 32, 41
metadata extensions 15
user name parameter types 409
reserved words file 176
using IBM DB2 client authentication 29
reusable scheduler 74
using Oracle OS Authentication 29
sessions 120
database partitioning
tasks 96
description 250, 281
workflow variables 92

Integration Service handling for sources 285 using Expression Editor 72
multiple sources 285 variable types 407
one source 284 variables in 85
performance 284, 286 Default Remote Directory
rules and guidelines for Integration Service 285 for FTP connections 35
rules and guidelines for sources 286 deleting
rules and guidelines for targets 286 connection objects 22
targets 286 workflows 66
database sequences delimited flat files
dropping during recovery 307 code page, sources 155, 487
dropping orphaned sequences 307 code page, targets 182
pushdown optimization 307 consecutive delimiters 487
troubleshooting 307 escape character, sources 155, 487
database views numeric data handling 157
creating with pushdown optimization 305 quote character, sources 154, 487
dropping during recovery 307 quote character, targets 182
dropping orphaned views 307 row settings 155
pushdown optimization 307 session properties, sources 154
troubleshooting 307 session properties, targets 182
databases sources 487
configuring a connection 31 delimited sources
connection requirements 31, 36, 40 number of rows to skip 487
connections 28 delimited targets
environment SQL 29 session properties 495
selecting code pages 20 delimiter
datatypes session properties, sources 154
See also PowerCenter Designer Guide session properties, targets 182
Decimal 183 directories
Double 183 for historical aggregate data 452
Float 183 shared caches 459
Integer 183 workspace file 4
Money 184 disabled
numeric 184 status 346
padding bytes for fixed-width targets 183 disabling
Real 184 tasks 99
date time workflows 76
format 3 displaying
dates Expression Editor 73
configuring 3 Integration Services in Workflow Monitor 338
formats 3 domain name
DB2 database connections 32, 41
See also IBM DB2 dropping
bulk loading guidelines 175 indexes 172
commit interval 175 DTM (Data Transformation Manager)
$DBConnection buffer size 124
naming convention 133 DTM Buffer Pool Size
using 133 session property 478
deadlock retries 124 DTM buffer size errors
See also PowerCenter Administrator Guide deadlock retry 124
configuring 171 DTM buffer size requirement
PM_RECOVERY table 171 configuring 124
session 479 durable subscription name
target connection groups 177 variable types for JMS 409
decimal $DynamicPartitionCount
high precision handling 139 description 133
decimal arithmetic dynamic partitioning
See high precision based on number of CPUs 251
Decision tasks based on number of nodes in grid 251
creating 105 based on number of partitions 251
decision condition variable 103 description 250
definition 103 disabled 251
description 96 number of partitions, parameter types 408
example 104 performance 250

rules and guidelines 251 end of file
session option 145 transaction control 212
using source partitions 251 end options
using with partition types 252 end after 75
end on 75
forever 75
E endpoint URL
in Web Service application connections 57
edit null characters
parameter and variable types 406, 407, 410
session properties 494
web services, parameter and variable types 406
Edit Partition Point
enhanced security
dialog box options 256
enabling 7
editing
environment SQL
delimited file properties for sources 486
configuring 29
delimited file properties for targets 494
guidelines for entering 30
metadata extensions 16
parameter and variable types 409
null characters 494
See also connection environment SQL
scheduling 76
See also transaction environment SQL
sessions 120
error handling
workflows 66
COBOL sources 156
effective dates
configuring 127
PeopleSoft, parameter and variable types 406
error log files 399
email
fixed-width file 156
See also Email tasks
options 401
attaching files 242, 246
overview 131
configuring a user on Windows 239, 245
PMError_MSG table schema 396
configuring the Integration Service on UNIX 238
PMError_ROWDATA table schema 394
configuring the Integration Service on Windows 239
PMError_Session table schema 397
distribution lists 240
pre- and post-session SQL 127
format tags 242
pushdown optimization 303
logon network security on Windows 240
transaction control 212
MIME format 239
error handling settings
multiple recipients 240
session properties 144
on failure 242
error log files
on success 242
directory, parameter and variable types 408
overview 237
name, parameter and variable types 408
post-session 242
overview 399
post-session, parameter and variable types 409
table name prefix length restriction 416
rmail 238
error log tables
service variables 245
creating 394
session properties 504
overview 394
specifying a Microsoft Outlook profile 240
error logs
suspending workflows 244
options 401
suspension, variable types 409
overview 393
text message 240
session errors 131
tips 245
error messages
user name 240
external loader 420
using other mail programs 246
error threshold
using service variables 245
pipeline partitioning 131
variables 242
stop on errors 131
workflows 241
variable types 408
worklets 241
errors
Email tasks
fatal 131
See also email
pre-session shell command 129
creating 241
stopping session on 144
description 96
threshold 131
overview 240
validating in Expression Editor 72
suspension email 81
Event-Raise tasks
variable types 407
configuring 107
enabling
declaring user-defined event 107
enhanced security 7
definition 105
past events in Event-Wait task 109
description 96
in worklets 113

events session command 504
in worklets 113 fail task recovery strategy
predefined events 105 description 228, 229
user-defined events 105 failed
Event-Wait tasks status 346
definition 105 failing workflows
description 96 failing parent workflows 99, 103
file watch name, variable types 407 using Control task 103
for predefined events 108 fatal errors
for user-defined events 108 session failure 131
waiting for past events 109 file list
working with 107 See also PowerCenter Designer Guide
ExportSessionLogLibName creating 157
passing log events to an external library 377 creating for partitioned sources 264
Expression Editor generating with command 153
adding comments 72 merging target files 270
displaying 73 using for source file 157
syntax colors 73 file mode
using 72 SAP R/3 application connections 47
validating 77 file properties
validating expressions using 72 setting for sources 484
Expression transformation setting for targets 493
See also PowerCenter Transformation Guide file sources
pushdown optimization 317 code page, parameter and variable types 408
expressions directories, parameter and variable types 408
parameter and variable types 406 input file commands, parameter and variable types 408
pushdown optimization 298 Integration Service handling 155, 157
validating 72 names, parameter and variable types 408
external loader numeric data handling 157
behavior 420 partitioning 263
code page 419 session properties 151
configuring as a resource 420 file targets
connections 36 code page, parameter and variable types 408
DB2 421 partitioning 268
error messages 420 session properties 178
Integration Service support 419 file-based ledger
loading multibyte data 426, 427, 428 TIB/Rendezvous application connections, configuring 55
on Windows systems 420 filter conditions
Oracle 425 adding 290
overview 419 in partitioned pipelines 262
processing subseconds 420 parameter and variable types 407
session properties 22 WebSphere MQ, parameter and variable types 409
setting up Workflow Manager 439 Filter transformation
Sybase IQ 427 See also PowerCenter Transformation Guide
Teradata 429 pushdown optimization 317
using with partitioned pipeline 268 filtering
external loader connections deleted tasks in Workflow Monitor 338
parameter types 408 Integration Services in Workflow Monitor 338
password, parameter types 410 tasks in Gantt Chart view 338
session parameter 133 tasks in Task View 349
user name, parameter types 410 Find in Workspace tool
External Procedure transformation overview 8
See also PowerCenter Transformation Guide Find Next tool
initialization properties, variable types 406 overview 8
partitioning guidelines 280 fixed-width files
Extract Date code page 485
PeopleSoft, parameter and variable types 406 code page, sources 153
code page, targets 182
error handling 156
F multibyte character handling 156
null characters, sources 153, 485
Fail Task if Any Command Fails
null characters, targets 182
in Command Tasks 102
numeric data handling 157

padded bytes in fixed-width targets 183 orthogonal links 5
source session properties 153 resetting 5
target session properties 181 schedule 3
writing to 183, 184 solid lines for links 5
fixed-width sources Timer task 3
session properties 485 fractional seconds precision
fixed-width targets Teradata FastExport attribute 161
session properties 494 FTP
flat file definitions accessing source files 445
escape character, sources 155 accessing target files 445
Integration Service handling, targets 183 connecting to file targets 268
quote character, sources 154 connection names 34
quote character, targets 182 connection properties 34
session properties, sources 152 connections for ABAP integration 48
session properties, targets 179 creating a session 445
flat file logging defining connections 34
error log file directory, configuring 401 defining default remote directory 35
error log file name, configuring 401 defining host names 35
error log type, configuring 401 mainframe guidelines 35
flat files overview 443
code page, sources 153 partitioning targets 447
code page, targets 182 remote directory, parameter and variable types 410
configuring recovery 231 remote file name, parameter and variable types 408
creating footer 179 resilience 35
creating headers 179 retry period 35
delimiter, sources 154 session properties 480, 490
delimiter, targets 182 SFTP 443
Footer Command property 179, 269 Use SFTP 35
generating source data 152 FTP connections
generating with command 152 creating 34
Header Command property 179, 269 parameter types 408
Header Options property 179, 269 password, parameter types 410
multibyte data 185 session parameter 133
null characters, sources 153 user name parameter types 410
null characters, targets 182 $FTPConnection
numeric data handling 157 using 133
output file session parameter 133 full pushdown optimization
precision, targets 184, 185 description 295
preserving input row order 267 full recovery
processing with command 180 description 229
shift-sensitive target 185 functions
source file session parameter 133 available in databases 299
writing targets by transaction 185 pushdown optimization 299
flush latency Session Log Interface 388
configuring 195
description 195
Flush Session Recovery Data (property) G
Integration Service 199
Gantt Chart
fonts
configuring 341
format options 5
filtering 338
setting 5
listing tasks and workflows 347
footer
navigating 347
creating in file targets 179, 269
opening and closing folders 339
parameter and variable types 409
organizing 347
Footer Command
overview 335
flat file targets 179, 269
searching 348
format
time increments 348
date time 3
time window, configuring 341
format options
using 347
color themes 5
general options
colors 5
arranging workflow vertically 4
date and time 3
configuring 4
fonts 5

in-place editing 4 high availability
launching Workflow Monitor 5 WebSphere MQ, configuring 61
open editor 4 high precision
panning windows 4 Bigint datatype 139
reload task or workflow 4 Decimal datatype 139
repository notifications 5 enabling 479
session properties 474 handling 139
show background in partition editor and DBMS based optimization 5 history names
in Workflow Monitor 346
show expression on a link 5 host names
show full name of task 4 for FTP connections 35
General tab in session properties HTTP transformation
in Workflow Manager 474 pipeline partitioning 270
generating threads 271
commits with source-based commit 209
globalization
See also PowerCenter Administrator Guide I
database connections 164
IBM DB2
overview 164
connect string example 29
targets 164
connection with client authentication 29
grid
database partitioning 281, 284, 286
cache requirements 368
IBM DB2 EE
configuring resources 370
attributes 423
configuring session properties 370
connecting with client authentication 36, 422
configuring workflow properties 370
external loader connections 36
distributing sessions 366, 369
external loading 421
distributing workflows 366, 369
IBM DB2 EEE
enabling sessions to run 146
attributes 424
Integration Service behavior 369
connecting with client authentication 36, 422
Integration Service property settings 369
external loader connections 36
overview 365
external loading 421
pipeline partitioning 367
icons
recovering sessions 369
Workflow Monitor 337
recovering workflows 369
worklet validation 116
requirements 370
idle time
running sessions 366
configuring 195
specifying maximum memory limits 126
incremental aggregation
cache partitioning 463
changing session sort order 451
H configuring 478
hash auto-key partitioning configuring the session 452
description 250 deleting files 451
overview 287 Integration Service data movement mode 451
hash partitioning moving files 451
adding hash keys 288 overview 449
description 281 partitioning data 451
hash user keys preparing to enable 452
description 250 processing 450
hash user keys partitioning reinitializing cache 450
overview 287 incremental changes
performance 287 capturing 452
header incremental recovery
creating in file targets 179, 269 description 229
parameter and variable types 409 index caches
Header Command for incremental aggregation 451
flat file targets 179, 269 naming convention 457
Header Options indexes
flat file targets 179, 269 creating directory 452
heterogeneous sources dropping for target tables 172
defined 147 finding 451
heterogeneous targets recreating for target tables 172
overview 187

indicator files
predefined events 107
J
INFA_AbnormalSessionTermination Java Classpath
Session Log Interface 391 session property 477
INFA_EndSessionLog Java transformation
Session Log Interface 390 pipeline partitioning 270
INFA_InitSessionLog session level classpath 477
Session Log Interface 389 threads 271
INFA_OutputSessionLogFatalMsg 42
Session Log Interface 390 JMS application connections
INFA_OutputSessionLogMsg configuring 42
Session Log Interface 389 JMS Connection Factory Name (property) 42
Informix JMS Destination (property) 42
connect string syntax 29 JMS Destination Type (property) 42
row-level locking 268 JMS Password (property) 43
in-place editing JMS Recovery Destination (property) 43
enabling 4 JMS User Name (property) 43
input link type properties 42
selecting for task 98 JMS Connection Factory Name (property)
Input Type JMS application connections 42
file source partitioning property 265 JMS Destination
flat file source property 152 parameter and variable types 410
$InputFile JMS Destination (property)
naming convention 133 JMS application connections 42
using 133 JMS Destination Type (property)
Integration Service JMS application connections 42
assigning a grid 369 JMS Password (property)
assigning workflows 69 JMS application connections 43
behavior on a grid 369 JMS Recovery Destination (property)
calling functions in the Session Log Interface 387 JMS application connections 43
commit interval overview 207 JMS User Name (property)
connecting in Workflow Monitor 337 JMS application connections 43
external loader support 419 JNDI application connections
filtering in Workflow Monitor 338 See also connections
grid overview 365 configuring 42
handling file targets 183 joiner cache
monitoring details in the Workflow Monitor 354 description 464
online and offline mode 337 Joiner transformation
pinging in Workflow Monitor 337 See also PowerCenter Transformation Guide
removing from the Navigator 3 cache partitioning 462, 465
running sessions on a grid 366 caches 464
selecting 68 configure caches 465, 470
tracing levels 384 inputs for cache calculator 466
truncating target tables 170 joining sorted flat files 273
using FTP 34 joining sorted relational data 275
using SFTP 34 partitioning 465
version in session log 383 partitioning guidelines 280
Integration Service code page pushdown optimization 318
See also Integration Service
affecting incremental aggregation 451
Integration Service handling K
file targets 183 Keep absolute input row order
fixed-width targets 184, 185 session properties 267
multibyte data to file targets 185 Keep relative input row order
shift-sensitive data, targets 185 session properties 267
Integration Service Monitor key range partitioning
system resource usage 355 adding 288
is staged adding key ranges 290
FastExport session attribute 161 adding partition key 289
description 250, 281
Partitions View 255
performance 289

pushdown optimization 311 configuring in sessions 142
keyboard shortcuts file name prefix, parameter and variable types 406
Workflow Manager 17 lookup databases
keys database connection session parameter 133
constraint-based loading 172 lookup files
lookup file session parameter 133
lookup source files
L using parameters 133
Lookup SQL Override option
latency
parameter and variable types 406
description 192
Lookup transformation
launching
See also Transformation Guide
Workflow Monitor 5, 337
See also PowerCenter Transformation Guide
Ledger File (property)
adding to concurrent workflows 332
TIB/Rendezvous application connections, configuring 55
cache partitioning 277, 462, 467
line sequential buffer length
caches 467
configuring for sessions 142
configure caches 468
sources 155
connection information, parameter and variable types 407
links
inputs for cache calculator 468
AND 98
pushdown optimization 318
condition 70
resilience 30
example link condition 71
source file, parameter and variable types 408
linking tasks concurrently 70
$LookupFile
linking tasks sequentially 71
using 133
loops 70
lookups
OR 98
persistent cache 467
orthogonal 5
loops
show expression on a link 5
invalid workflows 70
solid lines 5
specifying condition 71
using Expression Editor 72
variable types 407
M
variables in 85 mainframe
working with 70 FTP guidelines 35
List Tasks mapping parameters
in Workflow Monitor 347 See also PowerCenter Designer Guide
Load Balancer in parameter files 404
assigning priorities to tasks 371 in session properties 136
assigning resources to tasks 372 overriding 136
workflow settings 371 passing values between sessions 137
$LoaderConnection $$PushdownConfig 308
using 133 mapping variables
log codes See also PowerCenter Designer Guide
See PowerCenter Troubleshooting Guide available in databases 299
log files in parameter files 404
See also session log files in partitioned pipelines 253
See also workflow log files passing values between sessions 137
archiving 379, 380 pushdown optimization 299
real-time sessions 377 mappings
session log 476 configuring pushdown optimization 294
using a shared library 377 session failure from partitioning 254
writing 378 max sessions
log options FastExport attribute 160
session properties 143 Maximum Days
logging Workflow Monitor 340
pushdown optimization 303 maximum memory limit
logtable name configuring for caches 125
FastExport attribute 160 configuring for session caches 142
lookup cache percentage of memory for session caches 142
description 467 session on a grid 126
Maximum Workflow Runs
lookup caches Workflow Monitor 340
configuring concurrent for sessions 142

memory editing 16
caches 456 overview 15
configuring automatic settings 124 session properties 506
memory requirements Microsoft Access
DTM buffer size 124 pipeline partitioning 268
session cache size 125 Microsoft Outlook
Merge Command configuring an email user 239, 245
description 269 configuring the Integration Service 239
flat file targets 179 Microsoft SQL Server
parameter and variable types 408 commit interval 175
Merge File Directory connect string syntax 29
description 269 MIME format
flat file target property 179 email 239
parameter and variable types 408 monitoring
Merge File Name See also Workflow Monitor
description 269 command tasks 358
flat file target property 179 failed sessions 359
parameter and variable types 408 folder details 355
Merge Type Integration Service details 354
description 269 Repository Service details 354
flat file target property 179 session details 359
merging target files targets 360
commands 269 tasks details 357
concurrent merge 270 worklet details 357
file list 270 MOVINGAVG function
FTP 268 See also PowerCenter Transformation Language Reference
FTP file targets 447 partitioning restrictions 280
local connection 268, 269 MOVINGSUM function
sequential merge 270 See also PowerCenter Transformation Language Reference
session properties 269, 492 partitioning restrictions 280
message count MSMQ queue connections
configuring 195 See also connections
message processing configuring 43
real-time sessions 198, 200, 201 multibyte data
recovery files 198 character handling 156
recovery queues 201 Oracle external loader 426
recovery tables 200 Sybase IQ external loader 427
recovery topics 201 writing to files 185
rules and guidelines 203 multiple group transformations
message queue partitioning 249
processing real-time data 192 multiple input group transformations
using with partitioned pipeline 268 creating partition points 260
Message Queue queue connections multiple sessions
configuring for WebSphere MQ 60 validating 130
message recovery
description 197
enabling 197 N
prerequisites 197
naming conventions
real-time sessions 197, 199, 200, 201
session parameters 133
recovery files 197, 199
native connect string
recovery queues 197, 201
See connect string
recovery tables 197, 200
navigating
recovery topics 197, 201
workspace 7
rules and guidelines 204
non-persistent variables
session recovery data flush 199
definition 91
messages
non-reusable sessions
processing real-time data 192
caches 459
messages and message queues
non-reusable tasks
real-time data 192
inherited changes 98
metadata extensions
promoting to reusable 97
configuring 518
normal loading
creating 15
session properties 491
deleting 17

Normal tracing levels fixed-width flat file target 426
definition 384 multibyte data 426
Normalizer transformation partitioned target files 426
using partition points 260 processing subseconds 426
notifications reject file 425
See repository notifications Output File Directory property
null characters FTP targets 447
editing 494 parameter and variable types 409
file targets 182 partitioning target files 269
fixed-width targets 186 Output File Name property
Integration Service handling 156 flat file targets 180
session properties, targets 181 FTP targets 447
targets 494 parameter and variable types 409
number of CPUs partitioning target files 269
setting for dynamic partitioning 251 output files
number of nodes in grid session properties 492
setting with dynamic partitioning 251 targets 180
number of partitions Output is Deterministic
overview 249 transformation property 231
performance 249 Output is Repeatable
session parameter 133 transformation property 231
setting for dynamic partitioning 251 Output Type property
numeric values flat file targets 179
reading from sources 157 partitioning file targets 269
$OutputFile
naming convention 133
O using 133
overriding
objects
Teradata loader control file 429
viewing older versions 11
tracing levels in sessions 144
older versions of objects
owner
viewing 11
connection object 27
open transaction
owner name
definition 215
truncating target tables 171
operating system profile
override 79
operators
available in databases 298
P
pushdown optimization 298 packet size
Optimize throughput database connections 32, 41
session properties 267 page setup
optimizing configuring 7
data flow 364 parameter files
options (Workflow Manager) comments, adding 411
format 4, 5 configuring concurrent workflow instances 328
general 4 datetime formats 416
miscellaneous 4 defining properties in 405
solid lines for links 5 description 403
OR links example of use 415
input type 98 guidelines for creating 410, 415
Oracle headings 411
bulk loading guidelines 175 input fields that accept parameters and variables 406
commit intervals 175 location, configuring 412
connect string syntax 29 name, configuring 412
connection with OS Authentication 29 null values, entering 412
database partitioning 281, 284 overriding connection attributes 26
Oracle external loader overview 403
attributes 426 parameter and variable types in 404
connecting with OS Authentication 36, 426 precedence of 414
data precision 426 sample parameter file 412
delimited flat file target 426 scope of parameters and variables in 410
external loader connections 36 sections 410
external loader support 419, 425 session parameter file name, variable types 408, 413

specifying which to use 404 Sybase IQ 268
structure of 410 transformations 280
template file 26 unconnected transformations 261
tips for creating 417 XML targets 280
troubleshooting 416 partition-level attributes
using variables to specify 413 description 252
using with pmcmd 414 partitions
using with sessions 413 adding 256
using with workflows 413 deleting 256
parameters description 249
database connection 135 entering description 256
defining in parameter files 405 merging for pushdown optimization 311
input fields that accept parameters 406 merging target data 269
overview of types 404 properties 256
session 132 scaling 250
partition count session properties 269
session parameter 133 pass-through partition type
partition groups description 250
description 367 overview 282
stages 367 performance 291
partition keys processing 291
adding 288, 289 pushdown optimization 311
adding key ranges 290 PeopleSoft application connections
rows with null values 290 configuring 44
rules and guidelines 290 performance
partition names cache settings 459
setting 256 commit interval 209
partition points data, collecting 478
adding and deleting 259 data, writing to repository 478
adding, steps 257 performance counters
Custom transformation 270, 271 overview 362
editing 255 performance detail files
HTTP transformation 270, 271 understanding counters 362
Java transformation 270, 271 performance details
Joiner transformation 272 in performance details file 361
Lookup transformation 277 in Workflow Monitor 361
overview 248 viewing 361
partition types performance details files
changing 256 viewing 361
default 283 performance settings
description 281 session properties 478
key range 288 performance tuning
overview 250 See Performance Tuning Guide
pass-through 291 permissions
performance 282 connection object 27
round-robin 292 connection objects 27
setting 283 database 27
using with partition points 283 editing sessions 120
partitioning persistent variables
incremental aggregation 451 definition 91
Joiner transformation 465 in worklets 114
performance 292 pinging
pipeline lookup source table 278 Integration Service in Workflow Monitor 337
using FTP with multiple targets 444 pipeline lookup
partitioning options partitioning the source table 278
configuring dynamic 145 pipeline partitioning
configuring number 145 adding hash keys 288
session properties 145 adding key ranges 290
partitioning restrictions cache 252
Informix 268 concurrent connections 268
number of partitions 254 configuring a session 255
numerical functions 280 configuring for sorted data 272
relational targets 268 configuring pushdown optimization 310

configuring to optimize join performance 272 PM_REC_STATE table
Custom transformation 270 creating manually 224
database compatibility 268 description 223
description 247, 259, 281 real-time sessions 199
dynamic partitioning 250 PM_RECOVERY table
editing partition points 255 creating manually 224
error threshold 131 deadlock retries 171
example of use 282 deadlock retry 223
external loaders 268, 421 description 223
file lists 264 format 223
file sources 263 PM_TGT_RUN_ID
file targets 268 creating manually 224
filter conditions 262 description 223
FTP file targets 447 format 224
guidelines 263 PMError_MSG table
hash auto-keys partitioning 287 schema 396
hash user keys partitioning 287 PMError_ROWDATA table
HTTP transformation 270 schema 394
Java transformation 270 PMError_Session table
Joiner transformation 272 schema 397
key range 288 $PMFailureEmailUser
loading to Informix 268 definition 245
mapping variables 253 tips 246
merging target files 268, 269, 492 PmNullPasswd
message queues 268 reserved word 29
multiple group transformations 249 PmNullUser
numerical functions restrictions 280 IBM DB2 client authentication 29
object validation 254 Oracle OS Authentication 29
on a grid 367 reserved word 29
partition keys 288, 289 $PMSessionLogFile
partitioning indirect files 264 using 133
pass-through partitioning type 291 $PMSuccessEmailUser
performance 287, 289, 292 definition 245
pipeline stage 247 tips 246
recovery 131 $PMWorkflowLogDir
reject file 188 definition 379
relational targets 267 $PMWorkflowRunId
round-robin partitioning 292 concurrent workflows 328
rules 254 $PMWorkflowRunInstanceName
Sequence Generator transformation 279 concurrent workflows 328
session properties 496 post-session command
sorted flat files 273 session properties 501
sorted relational data 275 shell command properties 504
Sorter transformation 276, 279 post-session email
SQL queries 262 See also email
threads and partitions 249 overview 242
Transaction Control transformation 283 parameter and variable types 409
valid partition types 283 session options 505
pipeline stage session properties 501
description 247 post-session shell command
pipelines configuring non-reusable 128
See also source pipelines configuring reusable 129
active sources 178 creating reusable Command task 129
data flow monitoring 364 parameter and variable types 407
description 247, 259, 281 using 127
$PMStorageDir post-session SQL commands
session state of operations 222 entering 127
workflow state of operations 222 post-session variable assignment
$PMWorkflowCount datatype conversion 138
archiving log files 381 performing after failure 137
$PMWorkflowLogDir performing on success 137
archiving workflow logs 381 post-worklet variable assignment
performing 115

PowerCenter Repository Reports configuring partitioning 310
viewing in Workflow Manager 83 configuring sessions 309
PowerCenter resources creating database views 305
See resources database sequences 307
PowerChannel database views 307
configuring a database connection 40 error handling 303
PowerChannel database connections Expression transformation 317
configuring 40 expressions 298
PowerExchange Client for PowerCenter Filter transformation 317
real-time data 192 full pushdown optimization 295
Pre 85 Timestamp Compatibility option functions 299
setting 143 Joiner transformation 318
pre- and post-session SQL key range partitioning, using 311
commands, parameter and variable types 406, 407, 408 loading to targets 312
entering 127 logging 303
guidelines 127 Lookup transformation
precision mapping variables 299
flat files 185 mappings 294
writing to file targets 184 merging partitions 311
predefined events native database drivers 295
waiting for 108 ODBC drivers 295
predefined variables operators 298
in Decision tasks 103 overview 293
preparing to run parameter types 408
status 346 pass-through partition type 311
pre-session shell command performance issues 295
configuring non-reusable 128 $$PushdownConfig parameter, using 308
configuring reusable 129 recovery 303
creating reusable Command task 129 Router transformation 320
errors 129 rules and guidelines 312
session properties 501 Sequence Generator transformation 320
using 127 sessions 294
pre-session SQL commands Sorter transformation 321
entering 127 source database partitioning 286
pre-session variable assignment Source Qualifier transformation 321
datatype conversion 138 source-side optimization 294
performing 137 SQL generated 294, 295
pre-worklet variable assignment SQL versus ANSI SQL 295
performing 115 targets 322
printing target-side optimization 294
page setup 7 temporary sequences 305
priorities temporary views 305
assigning to tasks 371 transformations 315
Private Key File Name Union transformation 323
SFTP 35 Update Strategy transformation 324
Private Key File Password Pushdown Optimization Viewer
SFTP 35 viewing pushdown groups 312
privileges
See also PowerCenter Administrator Guide
Properties tab in session properties Q
in Workflow Manager 475
queue connections
Public Key File Name
MSMQ 43
SFTP 35
parameter types 408
$$PushdownConfig
session parameter 133
description 308
testing WebSphere MQ 61
pushdown group
WebSphere MQ 60
viewing 312
$QueueConnection
pushdown groups
using 133
description 312
quoted identifiers
Pushdown Optimization Viewer, using 312
reserved words 176
pushdown optimization
adding transformations to mappings 312
Aggregator transformation 316

R recovering workflows
recovering instances by run ID 327
rank cache recovering workflows by instance name 326
description 468 recovery
Rank transformation See also message recovery
See also PowerCenter Transformation Guide completing unrecoverable sessions 236
cache partitioning 462, 468 dropping database sequences 307
caches 468 dropping database views 307
configure caches 469 flat files 231
inputs for cache calculator 469 full recovery 229
using partition points 260 incremental 229
reader overview 221
selecting for Teradata FastExport 160 pipeline partitioning 131
reader time limit PM_RECOVERY table format 223
configuring 195 PM_TGT_RUN_ID table format 224
real-time data pushdown optimization 303
changed source data 192 real-time sessions 197
messages and message queues 192 recovering a task 234
overview 192 recovering a workflow from a task 235
supported products 206 recovering by instance name 326
web service messages 192 recovering workflows by run ID 327
real-time Flush Latency resume from last checkpoint 228, 229
configuring 195 rules and guidelines 235
real-time processing SDK sources 231
description 191 session state of operations 222
sample mapping 204 sessions on a grid 369
real-time products strategies 228
overview 206 target recovery tables 223
real-time sessions validating the session for 230
aborting 202 workflow state of operations 222
cold start 203 workflows on a grid 369
commit type, configuring 196 recovery cache folder
configuring 194 variable types for JMS 409
description 192 variable types for TIBCO 409
flush latency, configuring 195 variable types for webMethods 409
idle time, configuring 195 variable types for WebSphere MQ 409
log files 377 recovery files
message count, configuring 195 message processing 198
message processing 198, 200, 201 message recovery 197, 199
message recovery 199, 200, 201 recovery queues
overview 191 message processing 201
PM_REC_STATE table 199 message recovery 197, 201
reader time limit, configuring 195 recovery strategy
recovering 203 fail task and continue workflow 228, 229
resilience 204 restart task 228, 229
restarting 202 resume from last checkpoint 228, 229
resuming 202 recovery tables
rules and guidelines 203 description 223
sample mapping 204 manually creating from scripts 224
session logs 377 message processing 200
stopping 202 message recovery 197, 200
supported products 206 recovery topics
terminating conditions, configuring 195 message processing 201
transformation scope 215 message recovery 197, 201
transformations 203 recreating
truncating target tables 171 indexes 172
recoverable tasks reinitializing
description 227 aggregate cache 450
recovering reject file
sessions containing Incremental Aggregator 222 changing names 188
sessions from checkpoint 229 column indicators 189
with repeatable data in sessions 230 locating 188
Oracle external loader 425

parameter and variable types 409 reserved words
pipeline partitioning 188 generating SQL with 176
reading 188 resword.txt 176
row indicators 189 reserved words file
session parameter 133 creating 176
session properties 168, 180, 491, 493 resilience
transaction control 212 database connection 30
viewing 188 FTP 35
reject file directory real-time sessions 204
parameter and variable types 409 WebSphere MQ, configuring 61
target file properties 269 resources
Reject File Name assigning external loader 420
description 269 assigning to tasks 372
flat file target property 180 restart task recovery strategy
relational connections description 228, 229
See database connections restarting tasks
relational database connections in Workflow Monitor 343
See database connections restarting tasks and workflows without recovery
relational database logging in Workflow Monitor 344
error log type, configuring 401 resume from last checkpoint
relational databases recovery strategy 228, 229
copying a relational database connection 32 resume recovery strategy
replacing a relational database connection 33 using recovery target tables 223
relational sources using repeatable data 230
session properties 149 retry period
relational targets FTP 35
partitioning 267 reusable sessions
partitioning restrictions 268 caches 459
session properties 166, 491 reusable tasks
Relative time inherited changes 98
specifying 109 reverting changes 98
Timer task 109 reverting changes
reload task or workflow tasks 98
configuring 4 RFC/BAPI application connections
removing See also connections
Integration Service 3 configuring 50
renaming rmail
repository objects 10 See also email
repeat options configuring 238
customizing 75 rolling back data
repeatable data transaction control 212
recovering workflows 230 round-robin partitioning
with sources 230 description 250, 281, 292
with transformations 231 Router transformation
repositories See also PowerCenter Transformation Guide
adding 9 pushdown optimization 320
connecting in Workflow Monitor 337 row error logging
entering descriptions 10 active sources 178
repository folder row indicators
monitoring details in the Workflow Monitor 355 reject file 189
repository notifications rows to skip
receiving 5 delimited files 487
repository objects run options
comparing 13 run continuously 75
configuring 9 run on demand 75
rename 10 service initialization 75
Repository Service running
monitoring details in the Workflow Monitor 354 status 346
notification in Workflow Monitor 340 workflows 79
notifications 5 runtime location
Request Old (property) variable types 406
TIB/Rendezvous application connections, configuring 56 runtime partitioning
setting in session properties 251

S in parameter files 404
service variables
Salesforce application connections email 245
See also connections in parameter files 404
accessing Sandbox 46 session
configuring 45 state of operations 222
SAP ALE IDoc Reader application connections session command settings
See also connections session properties 502
configuring 49 session configuration objects
SAP ALE IDoc Writer application connections creating 146
See also connections session properties 141
configuring 49 understanding 141
SAP NetWeaver application connections using in a session 146
configuring 46 session errors
SAP NetWeaver BI application connections handling 131
configuring 51 session events
SAP R/3 passing to an external library 377
ABAP integration 48 $PMSessionLogCount
ALE integration 49 archiving session logs 382
SAP R/3 application connections session log count
See also connections variable types 408
configuring 47 $PMSessionLogDir
stream and file mode sessions 47 archiving session logs 381
stream mode sessions 47 session log files
scheduled archiving 379
status 346 time stamp 380
scheduling Session Log Interface
concurrent workflows 73 description 387
configuring 75 functions 388
creating reusable scheduler 74 guidelines 388
disabling workflows 76 implementing 387
editing 76 INFA_AbnormalSessionTermination 391
end options 75 INFA_EndSessionLog 390
error message 74 INFA_InitSessionLog 389
run every 75 INFA_OutputSessionLogFatalMsg 390
run once 75 INFA_OutputSessionLogMsg 389
run options 75 Integration Service calls 387
schedule options 75 session logs
start date 75 changing locations 381
start time 75 changing name 381
workflows 73, 513 directory, variable types 407
scheduling workflows enabling and disabling 381
concurrent workflows 332 external loader error messages 420
script files file name, parameter types 407
parameter and variable types 407 generating using UTF-8 379
SDK sources Integration Service version and build 383
recovering 231 location 379, 476
searching naming 379
versioned objects in the Workflow Manager 11 passing to external library 387
Workflow Manager 8 real-time sessions 377
Workflow Monitor 348 sample 383
Sequence Generator transformation saving 143
See also PowerCenter Transformation Guide session parameter 133
adding to concurrent workflows 333 tracing levels 384
partitioning 279 viewing in Workflow Monitor 345
partitioning guidelines 261, 280 workflow recovery 235
pushdown optimization 320 session on grid
sequential merge description 366
file targets 270 partitioning for Sequence Generator transformations 279
service levels session on grid settings
assigning to tasks 371 session properties 145
service process variables session parameter file name
in Command tasks 128 variable types 408, 413

session parameters target connection settings 480, 490
application connection parameter 133 target connections 166
built-in 132 target load options 174, 491
database connection parameter 133 target-based commit 218
external loader connection parameter 133 targets 165
file name, variable types 408, 413 Transformation node 495
FTP connection parameter 133 transformations 495
in parameter files 404 session recovery data flush
naming conventions 133 message recovery 199
number of partitions 133 session retry on deadlock
overview 132 See also PowerCenter Administrator Guide
passing values between sessions 137 overview 171
queue connection parameter 133 session statistics
reject file parameter 133 viewing in the Workflow Monitor 357
session log parameter 133 sessions
setting as a resource 136 See also session properties
source file parameter 133 aborting 83, 130
target file parameter 133 apply attributes to all instances 121
user-defined 132 assigning resources 372
session properties assigning variables pre- and post-session 137
advanced settings 142 assigning variables pre- and post-session, procedure 139
buffer sizes 142 configuring for multiple source files 158
Components tab 501 configuring for pushdown optimization 309
Config Object tab overview 141 configuring to optimize join performance 272
constraint-based loading 142, 174 creating 120
delimited files, sources 154 datatype conversion when passing variables 138
delimited files, targets 182 definition 119
edit delimiter 486, 494 description 96
edit null character 494 distributing over grids 366, 369
email 242, 504 editing 120
error handling settings 144 email 237
external loader 22 external loading 419, 439
FastExport sources 161 failure 131, 254
fixed-width files, sources 153 full pushdown optimization 295
fixed-width files, targets 181 metadata extensions in 15
FTP files 480, 490 monitoring counters 362
general settings 474 monitoring details 359
General tab 474 multiple source files 157
log option settings 143 overriding connection attributes 24
lookup caches 142 overriding source table name 151, 483
Metadata Extensions tab 506 overriding target table name 176, 491
null character, targets 181 overview 119
on failure email 242 parameters 132
on success email 242 passing information between 137
output files, flat file 492 passing information between, example 137
partition attributes 256 properties reference 473
partitioning options settings 145 pushdown optimization 294
Partitions View 496 recovering on a grid 369
performance settings 478 running on a grid 366
post-session email 242 source-side pushdown optimization 294
post-session shell command 504 stopping 83, 130
Properties tab 475 target-side pushdown optimization 294
reject file, flat file 180, 493 task progress details 357
reject file, relational 168, 491 test load 169, 181
relational sources 149 truncating target tables 170
relational targets 166 using FTP 445
session command settings 502 using SFTP 445
session on grid settings 145 validating 129, 130
session retry on deadlock 171 viewing details in the Workflow Monitor 359
sort order 451 viewing failure information in the Workflow Monitor 359
source connections 149 viewing performance details 361
sources 148 viewing statistics in the Workflow Monitor 357
table name prefix 176

Set Control Value source commands
PeopleSoft, parameter and variable types 406 generating file list 152
Set File Properties generating source data 152
description 152, 180 $Source connection value
SetID setting 23, 476
PeopleSoft, parameter and variable types 406 parameter and variable types 408
SFTP source data
See also FTP capturing changes for aggregation 449
authentication methods 35 source databases
configuring connection 35 database connection session parameter 133
creating a session 445 Source File Directory
defining Private Key File Name 35 description 446
defining Private Key File Password 35 Source File Name
defining Public Key File Name 35 description 152, 265, 446
description 443 Source File Type
key file location 445 description 152, 265, 446
running a session on a grid 445 source files
shared library accessing through FTP 443, 445
implementing the Session Log Interface 388 configuring for multiple files 157, 158
passing log events to an external library 377 delimited properties 487
shell commands fixed-width properties 485
executing in Command tasks 102 session parameter 133
make reusable 129 session properties 152, 265, 484
parameter and variable types 407 using parameters 133
post-session 127 wildcard characters 153
post-session properties 504 source location
pre-session 127 session properties 152, 265, 484
using Command tasks 100 source pipelines
using parameters and variables 101, 128 description 247, 259, 281
shortcuts Source Qualifier transformation
keyboard 17 See also PowerCenter Transformation Guide
Siebel application connections pushdown optimization 321
See also connections pushdown optimization, SQL override 305
configuring 53 using partition points 260
sleep source tables
FastExport attribute 160 overriding table name 151, 483
sort order parameter and variable types 406
See also session properties source-based commit
affecting incremental aggregation 451 active sources 209
preserving for input rows 267 configuring 196
sorted flat files description 208
partitioning for optimized join performance 273 real-time sessions 196
sorted ports sources
caching requirements 463 code page 155
sorted relational data code page, flat file 153
partitioning for optimized join performance 275 commands 152, 263
sorter cache connections 149
description 469 delimiters 154
naming convention 457 dynamic files names 153
Sorter transformation escape character 487
See also PowerCenter Transformation Guide generating file list 153
cache partitioning 462, 469 generating with command 152
caches 469 line sequential buffer length 155
inputs for cache calculator 470 monitoring details in the Workflow Monitor 360
partitioning 279 multiple sources in a session 157
partitioning for optimized join performance 276 null characters 153, 156, 485
pushdown optimization 321 overriding source table name 151, 483
work directory, variable types 407 overriding SQL query, session 150
$Source partitioning 263
how Integration Service determines value 23 preserving input row sort order 267
multiple sources 23 quote character 487
session properties 476 reading concurrently 265
resilience 30

session properties 148, 264 stop on errors
specifying code page 485, 487 session property 144
wildcard characters 153 stopped
source-side pushdown optimization status 346
description 294 stopping
SQL in Workflow Monitor 344
configuring environment SQL 29 Integration Service handling 82
generated for pushdown optimization 294, 295 sessions 83
guidelines for entering environment SQL 30 status 346
overriding query at session level 150 tasks 82
queries in partitioned pipelines 262 using Control tasks 103
SQL override workflows 82
pushdown optimization 305 Stored Procedure transformation
SQL query call text, parameter and variable types 407
overriding at session level 150 connection information, parameter and variable types 407
parameter and variable types 407 stream mode
staging files SAP R/3 application connections 47
SAP file name and directory, variable types 409 subject
start date and time default subject for TIBCO messages 56
scheduling 75 subseconds
Start tasks external loading 420
definition 63 Oracle external loader 426
Start Workflow Advanced trimming for pre-8.5 compatibility 143
starting concurrent workflows 329 succeeded
starting status 346
selecting a service 68 Suspend on Error
start from task 80 description 510
starting a part of a workflow 80 suspended
starting tasks 80 status 81, 346
starting workflows using Workflow Manager 79 suspending
Workflow Monitor 337 behavior 81
workflows 79 email 81, 244
state of operations status 81, 346
checkpoints 222, 229 workflows 81
session recovery 222 suspension email
workflow recovery 222 variable types 409
statistics Sybase ASE
for Workflow Monitor 339 commit interval 175
viewing 339 connect string example 29
status Sybase IQ
aborted 346 partitioning restrictions 268
aborting 346 Sybase IQ external loader
disabled 346 attributes 428
failed 346 connections 36
in Workflow Monitor 346 data precision 428
preparing to run 346 delimited flat file targets 427
running 346 fixed-width flat file targets 428
scheduled 346 multibyte data 427
stopped 346 optional quotes 427
stopping 346 overview 427
succeeded 346 support 419
suspended 81, 346 system resource usage
suspending 81, 346 Integration Service Monitor 355
tasks 346
terminated 346
terminating 346 T
unknown status 347
table name prefix
unscheduled 347
relational error logs, length restriction 416
waiting 347
relational error logs, parameter and variable types 408
workflows 346
target owner 176
stop on
target, parameter and variable types 409
error threshold 131
pre- and post-session SQL errors 127

table names commands 180
overriding source table name 151, 483 connection settings 490
overriding target table name 176, 491 connections 166
table owner name database connections 164
parameter and variable types 408 deleting partition points 260
session properties 151 delimiters 182
targets 176 file writer 165
$Target globalization features 164
how Integration Service determines value 23 heterogeneous 187
multiple targets 23 load, session properties 174, 491
session properties 477 merging output files 268, 269
target commands monitoring details in the Workflow Monitor 360
processing target data 180 multiple connections 187
targets 269 multiple types 187
using with partitions 269 null characters 182
target connection groups output files 180
committing data 208 overriding target table name 176, 491
constraint-based loading 173 partitioning 267, 268
description 177 processing with command 180
Transaction Control transformation 217 pushdown optimization 322
target connection settings relational settings 491
session properties 480, 490 relational writer 165
$Target connection value resilience 30
setting 23, 477 session properties 165, 166
parameter and variable types 408 specifying null character 494
target databases truncating tables 170
database connection session parameter 133 truncating tables, real-time sessions 171
target files using pushdown optimization 312
appending 269 writers 165
delimited 495 target-side pushdown optimization
fixed-width 494 description 294
session parameter 133 Task Developer
target load order creating tasks 96
constraint-based loading 173 displaying and hiding tool name 4
target owner Task view
table name prefix 176 configuring 341
target properties displaying 349
bulk mode 167 filtering 349
test load 167 hiding 341
update strategy 167 opening and closing folders 339
using with source properties 169 overview 335
target recovery tables using 349
description 223 tasks
manually creating 224 aborted 346
target tables aborting 82, 346
overriding table name 176, 491 adding in workflows 66
parameter and variable types 406 arranging 9
truncating 170 assigning resources 372
truncating, real-time sessions 171 Assignment tasks 100
target update automatic recovery 229
parameter and variable types 406 cold start 344
target-based commit Command tasks 100
configuring 196 configuring 97
real-time sessions 196 Control task 103
WriterWaitTimeOut 208 copying 12
target-based commit interval creating 96
description 208 creating in Task Developer 96
targets creating in Workflow Designer 97
See also target connection groups Decision tasks 103
accessing through FTP 443, 445 disabled 346
code page 182, 494, 495 disabling 99
code page compatibility 164 email 240
code page, flat file 182 Event-Raise tasks 105

Event-Wait tasks 105 Teradata Warehouse Builder
failed 346 attributes 437
failing parent workflow 99 operators 437
in worklets 113 terminated
inherited changes 98 status 346
instances 98 terminating
list of 95 status 346
Load Balancer settings 371 terminating conditions
monitoring details 357 configuring 195
non-reusable 66 Terse tracing levels
overview 95 See also PowerCenter Designer Guide
preparing to run 346 defined 384
promoting to reusable 97 test load
recovery strategies 228 bulk loading 169
restarting in Workflow Monitor 343 enabling 476
restarting without recovery in Workflow Monitor 344 file targets 181
reusable 66 number of rows to test 476
reverting changes 98 relational targets 169
running 346 threads
show full name 4 Custom transformation 271
starting 80 HTTP transformation 271
status 346 Java transformation 271
stopped 346 partitions 249
stopping 82, 346 TIB/Adapter SDK application connections
stopping and aborting in Workflow Monitor 344 See also connections
succeeded 346 configuring 57
Timer tasks 109 properties 56
using Tasks toolbar 66 TIB/Rendezvous application connections
validating 77 See also connections
Tasks toolbar configuring 57
creating tasks 97 properties 55
TDPID TIB/Repository
description 160 TIB/Adapter SDK repository URL, variable types 406
temporary file TIBCO application connections
Teradata FastExport attribute 161 configuring 54
tenacity time
FastExport attribute 160 configuring 3
Teradata formats 3
connect string example 29 time increments
Teradata external loader Workflow Monitor 348
code page 429 time stamps
connections 36 session log files 380
control file content override, parameter and variable types 409 session logs 382
date format 429 workflow log files 380
FastLoad attributes 435 workflow logs 381
MultiLoad attributes 431 Workflow Monitor 335
overriding the control file 429 time window
support 419 configuring 341
Teradata Warehouse Builder attributes 437 Timer tasks
TPump attributes 433 absolute time 109
Teradata FastExport definition 109
changing the source connection 160 description 96
connection attributes 160 relative time 109
creating a connection 159 subseconds in variables 109
description 159 variables in 85, 407
overriding the control file 161 tool names
rules and guidelines 162 displaying and hiding 4
selecting the reader 160 toolbars
session attributes description 161 adding tasks 66
staging data 161 creating tasks 97
steps for using 159 using 8
TDPID attribute 160 Workflow Manager 8
temporary file, variable types 409 Workflow Monitor 342

tracing levels using with target properties 169
See also PowerCenter Designer Guide Treat Source Rows As property
Normal 384 overview 150
overriding in the session 144 trees
session 384 PeopleSoft, parameter and variable types 406
setting 384 truncating
Terse 384 Table Name Prefix 171
Verbose Data 384 target tables 170
Verbose Initialization 384
transaction
defined 215 U
transaction boundary
unconnected transformations
dropping 215
partitioning restrictions 261
transaction control 215
Union transformation
transaction control
See also PowerCenter Transformation Guide
bulk loading 212
pushdown optimization 323
end of file 212
UNIX systems
Integration Service handling 212
email 238
open transaction 215
external loader behavior 420
overview 215
unknown status
points 215
status 347
real-time sessions 215
unscheduled
reject file 212
status 347
rules and guidelines 217
unscheduling
transformation error 212
workflows 74
transformation scope 215
update strategy
user-defined commit 212
target properties 167
Transaction Control transformation
Update Strategy transformation
partitioning guidelines 283
See also PowerCenter Transformation Guide
target connection groups 217
constraint-based loading 173
transaction control unit
pushdown optimization 324
See also target connection group
using with target and source properties 169
description 217
updating
transaction environment SQL
incrementally 452
configuring 29, 30
URL
parameter and variable types 409
adding through business documentation links 72
transaction generator
user-defined commit
active sources 178
See also transaction control
effective and ineffective 178
bulk loading 212
transaction control points 215
user-defined events
transformation expressions
declaring 107
parameter and variable types 406
example 106
transformation scope
waiting for 108
description 215
user-defined joins
real-time processing 215
parameter and variable types 407
transformations 216
transformations
caches 456
configuring pushdown optimization 315
V
partitioning restrictions 280 validating
producing repeatable data 231 expressions 72, 77
real-time sessions 203 multiple sessions 130
recovering sessions with Incremental Aggregator 222 session for recovery 230
session properties 495 tasks 77
Transformations node workflows 77, 78
properties 495 worklets 116
Transformations view variable values
session properties 479 calculating across partitions 253
Treat Error as Interruption variables
See Suspend on Error $PMWorkflowRunId 328
effect on worklets 111 $PMWorkflowRunInstanceName 328
Treat Source Rows As defining in parameter files 405
bulk loading 174 email 242

in Command tasks 101 Windows Start Menu
input fields that accept variables 406 accessing Workflow Monitor 337
overview of types 404 Windows systems
workflow 85 email 239
Verbose Data tracing level external loader behavior 420
See also PowerCenter Designer Guide logon network security 240
configuring session log 384 workflow
Verbose Initialization tracing level state of operations 222
See also PowerCenter Designer Guide Workflow Composite Report
configuring session log 384 viewing 83
versioned objects Workflow Designer
See also PowerCenter Repository Guide creating tasks 97
Allow Delete without Checkout option 6 displaying and hiding tool name 4
checking in 10 workflow instance
checking out 10 adding workflow instances 329
comparing versions 11 creating dynamically 330
searching for in the Workflow Manager 11 starting and stopping 329
viewing 11 starting from command line 330
viewing multiple versions 11 using $PMWorkflowRunInstanceName variable 328
viewing viewing in Workflow Monitor 331
older versions of objects 11 workflow log files
reject file 188 archiving 379
views configuring 380
See also database views time stamp 380
viewing concurrent workflows 331
workflow logs
W changing locations 381
changing name 380
waiting
enabling and disabling 380
status 347
file name and directory, variable types 409
web links
locating 379
adding to expressions 72
naming 379
web service messages
viewing in Workflow Monitor 345
real-time data 192
workflow log count, variable types 409
Web Services application connections
Workflow Manager
See also connections
adding repositories 9
configuring 57
arrange 9
endpoint URL 57
checking out and in versioned objects 10
Web Services Hub
configuring for multiple source files 158
running concurrent workflows 327
connections overview 19
webMethods application connections
copying 12
See also connections
CPI-C connection 47
configuring 59
customizing options 4
WebSphere MQ queue connections
database connections 31, 40
See also connections
date and time formats 3
configuring 60
display options 4
testing 61
entering object descriptions 10
wildcard characters
external loader connections 36
configuring source files 153
FTP connections 34
windows
general options 4
customizing 8
JMS connections 42
displaying and closing 8
JNDI connections 42
docking and undocking 8
messages to Workflow Monitor 340
fonts 5
MSMQ queue connections 43
Navigator 2
overview 1
Output 2
PeopleSoft connections 44
overview 2
printing the workspace 7
panning 4
relational database connections 28
reloading 4
RFC/BAPI connections 50
Workflow Manager 2
running sessions on a grid 365
Workflow Monitor 335
running workflows on a grid 365
workspace 2
Salesforce connections 45
SAP ALE IDoc Reader connections 49

SAP ALE IDoc Writer connections 49 stopping or aborting tasks and workflows 344
SAP NetWeaver BI connections 51 switching views 336
SAP NetWeaver connections 46 Task view 335
SAP R/3 connections 47 task view options 341
searching 8 time 335
searching for versioned objects 11 time increments 348
SFTP connections 34 toolbars 342
Siebel connections 53 viewing command task details 358
TIB/Adapter SDK connections 57 viewing concurrent workflows 331
TIB/Rendezvous connections 57 viewing folder details 355
TIBCO connections 54 viewing history names 346
toolbars 8 viewing Integration Service details 354
tools 2 viewing performance details 361
validating sessions 130 viewing repository details 354
versioned objects 10 viewing session details 359
viewing reports 83 viewing session failure information 359
Web Services connections 57 viewing session logs 345
webMethods connections 59 viewing session statistics 357
WebSphere MQ connections 60 viewing source details 360
windows 2, 8 viewing target details 360
zooming the workspace 9 viewing task progress details 357
Workflow Monitor viewing workflow details 356
See also monitoring viewing workflow logs 345
advanced options 341 viewing worklet details 357
closing folders 339 workflow and task status 346
cold start tasks or workflows 344 workflow properties
configuring 340 configuring 510
connecting to Integration Service 337 Events tab 517
connecting to repositories 337 General tab 510
customizing columns 341 Metadata Extension tab 518
deleted Integration Services 337 Properties tab 511
deleted tasks 338 Schedule tab 513
disconnecting from an Integration Service 337 service levels 371
displaying services 338 suspension email 244
filtering deleted tasks 338 Variables tab 516
filtering services 338 workflow run ID
filtering tasks in Task View 338, 349 description 327
Gantt Chart view 335 viewing in the workflow log 332
Gantt chart view options 341 workflow tasks
general options 340 reusable and non-reusable 98
hiding columns 341 workflow variables
hiding services 338 built-in variables 86
icon 337 creating 92
launching 337 datatypes 86, 92
launching automatically 5 datetime formats 92
listing tasks and workflows 347 default values 88, 91
Maximum Days 340 in parameter files 404
Maximum Workflow Runs 340 keywords 86
monitor modes 337 naming convention 92
navigating the Time window 347 non-persistent variables 91
notification from Repository Service 340 passing values to and from sessions 137
opening folders 339 passing values to and from worklets 114
overview 335 persistent variables 91
performing tasks 342 predefined 86
pinging the Integration Service 337 start and current values 91
receive messages from Workflow Manager 340 user-defined 90
resilience to Integration Service 337 using 85
restarting tasks or workflows without recovery 344 using in expressions 88
restarting tasks, workflows, and worklets 343 workflows
searching 348 aborted 346
Start Menu 337 aborting 82, 346
starting 337 adding tasks 66
statistics 339 assigning Integration Service 69

branches 64 Workflow Monitor maximum days 340
cold start workflows 344 Worklet Designer
configuring concurrent with same name 327 displaying and hiding tool name 4
configuring instance names 329 worklet variables
configuring unique instances 326 in parameter files 404
copying 12 passing values between worklets 114
creating 65 passing values to and from sessions 137
definition 63 worklets
deleting 66 adding tasks 113
developing 64, 65 adding to concurrent workflows 332
disabled 346 assigning variables pre- and post-worklet 114
disabling 76 assigning variables pre- and post-worklet, procedure 115
dispatching tasks 371 configuring properties 112
distributing over grids 366, 369 create non-reusable worklets 112
editing 66 create reusable worklets 112
email 241 declaring events 113
events 63 developing 111
fail parent workflow 99 email 241
failed 346 fail parent worklet 99
guidelines 64 metadata extensions in 15
links 63 monitoring details in the Workflow Monitor 357
metadata extensions in 15 overriding variable value 114
monitor 64 overview 111
override Integration Service 79 parameters tab 114
override operating system profile 79 passing information between 114
overview 63 passing information between, example 115
parameter file 91 persistent variable example 114
preparing to run 346 persistent variables 114
properties reference 509 restarting in Workflow Monitor 343
recovering on a grid 369 status 346
restarting in Workflow Monitor 343 suspended 346
restarting without recovery in Workflow Monitor 344 suspending 111, 346
run type 357 validating 116
running 79, 346 variables 114
running on a grid 366 waiting 347
scheduled 346 workspace
scheduling 73 colors 5
scheduling concurrent instances 73 colors, setting 5
scheduling concurrent workflows 332 file directory 4
selecting a service 64 fonts, setting 5
service levels 371 navigating 7
starting 79 printing 7
starting concurrent workflows with pmcmd 330 zooming 9
starting with advanced options 79 writers
status 81, 346 session properties 488
stopped 346 WriterWaitTimeOut
stopping 82, 346 target-based commit 208
stopping and aborting in Workflow Monitor 344 writing
succeeded 346 multibyte data to files 185
suspended 346 to fixed-width files 183, 184
suspending 81, 346
suspension email 244
terminated 346 X
terminating 346
XML sources
unknown status 347
numeric data handling 157
unscheduled 347
XML target
unscheduling 74
caches 470
using tasks 95
configure caches 470
validating 77
XML target cache
variables 85
description 470
viewing details in the Workflow Monitor 356
variable types 406
viewing reports 83
waiting 347

XML targets
active sources 178
partitioning restrictions 280
target-based commit 208

Z
zooming
Workflow Manager 9
