By PenchalaRaju.Yanamala
Transformation type:
Active/Passive
Connected
When Data Transformation Engine runs a service, it either writes the output data
to a file or returns the output data to the Integration Service. When Data Transformation Engine
returns output to the Integration Service, it returns XML data. You can configure
the Unstructured Data transformation to return the XML in an output port, or you
can configure output groups to return row data.
Configuring the Unstructured Data Option
1. Install PowerCenter.
2. Install Data Transformation. For information about installing Data
   Transformation, see the Data Transformation Administrator Guide.
3. Configure the Data Transformation repository folder.
<Data_Transformation_install_dir>\ServiceDB
If Data Transformation Studio can access the remote file system, you can
change the Data Transformation repository to a remote location and deploy
services directly from Data Transformation Studio to the system that runs the
Integration Service. For more information about deploying services to remote
machines, see the Data Transformation Studio User Guide.
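As an illustration only, the following Python sketch checks whether a named service appears under a repository folder before a mapping refers to it. It assumes that each deployed service is stored as a subfolder of the repository directory. The function name service_is_deployed and the sample service names are hypothetical and are not part of PowerCenter or Data Transformation.

```python
from pathlib import Path
import tempfile


def service_is_deployed(repository_dir: str, service_name: str) -> bool:
    """Return True if a folder named service_name exists under repository_dir.

    Assumption: each deployed service is a subfolder of the repository.
    """
    return (Path(repository_dir) / service_name).is_dir()


# A throwaway directory stands in for the repository folder.
with tempfile.TemporaryDirectory() as repo:
    (Path(repo) / "OrderParser").mkdir()
    found = service_is_deployed(repo, "OrderParser")
    missing = service_is_deployed(repo, "InvoiceParser")
```

A check of this kind is only a convenience. The repository location and its layout should be confirmed against the installed product.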
When you create a project in Data Transformation Studio, you choose a Data
Transformation service type to define the project. Data Transformation has the
following types of services that transform data:
For more information about creating projects with Data Transformation, see
Getting Started with Data Transformation.
Properties Tab
Configure the Unstructured Data transformation general properties on the
Properties tab.
The following table describes properties on the Properties tab that you can
configure:
Property                Description
Tracing Level           The amount of detail included in the session log when you run a
                        session containing this transformation. Default is Normal.
IsPartitionable         The transformation can run in more than one partition. Select one
                        of the following options:
                        - No. The transformation cannot be partitioned.
                        - Locally. The transformation can be partitioned, but the
                          Integration Service must run all partitions in the pipeline on
                          the same node.
                        - Across Grid. The transformation can be partitioned, and the
                          Integration Service can distribute each partition to different
                          nodes.
                        Default is Across Grid.
Output is Repeatable    Indicates whether the order of the output data is consistent
                        between session runs. Select one of the following options:
                        - Never. The order of the output data is inconsistent between
                          session runs.
                        - Based On Input Order. The output order is consistent between
                          session runs when the input data order is consistent between
                          session runs.
                        - Always. The order of the output data is consistent between
                          session runs even if the order of the input data is inconsistent
                          between session runs.
                        Default is Never for active transformations. Default is Based On
                        Input Order for passive transformations.
Output is Deterministic Indicates whether the transformation generates consistent output
                        data between session runs. Enable this property to perform
                        recovery on sessions that use this transformation.
The following table describes the attributes on the UDT settings tab:
Attribute               Description
InputType               Type of input data that the Unstructured Data transformation
                        passes to Data Transformation Engine. Choose one of the
                        following input types:
                        - Buffer. The Unstructured Data transformation receives source
                          data in the InputBuffer port and passes data from the port to
                          Data Transformation Engine.
                        - File. The Unstructured Data transformation receives a source
                          file path in the InputBuffer port and passes the source file
                          path to Data Transformation Engine. Data Transformation Engine
                          opens the source file.
OutputType              Type of output data that the Unstructured Data transformation or
                        Data Transformation Engine returns. Choose one of the following
                        output types:
                        - Buffer. The Unstructured Data transformation returns XML data
                          through the OutputBuffer port unless you configure a relational
                          hierarchy of output ports. If you configure a relational
                          hierarchy of ports, the Unstructured Data transformation does
                          not write to the OutputBuffer port.
                        - File. Data Transformation Engine writes the output to a file. It
                          does not return the data to the Unstructured Data transformation
                          unless you configure a relational hierarchy of ports in the
                          Unstructured Data transformation.
                        - Splitting. The Unstructured Data transformation splits a large
                          XML output file into smaller files that can fit in the
                          OutputBuffer port. You must pass the split XML files to the XML
                          Parser transformation.
ServiceName             Name of the Data Transformation service to run. The service must
                        be present in the local Data Transformation repository.
Streamer Chunk Size     Buffer size of the data that the Unstructured Data transformation
                        passes to Data Transformation Engine when the Data
                        Transformation service runs a streamer. Valid values are
                        1-1,000,000 KB. Default is 256 KB.
Dynamic Service Name    Run a different Data Transformation service for each input row.
                        When Dynamic Service Name is enabled, the Unstructured Data
                        transformation receives the service name in the Service Name
                        input port.
                        When Dynamic Service Name is disabled, the Unstructured Data
                        transformation runs the same service for each input row. The
                        Service Name attribute in the UDT Settings must contain a service
                        name. Default is disabled.
Status Tracing Level    Set the level of status messages from the Data Transformation
                        service.
                        - Description Only. Return a status code and a short description
                          to indicate if the Data Transformation service was successful
                          or if it failed.
                        - Full Status. Return a status code and a status message from the
                          Data Transformation service in XML.
                        - None. Do not return status from the Data Transformation service.
                        Default is None.
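The allowed values in the table above can be summarized as a small validation sketch. The Python below is illustrative only: the UdtSettings class and its field names are hypothetical and do not correspond to any PowerCenter API. Only the value lists, the chunk size range, and the defaults for chunk size, Dynamic Service Name, and Status Tracing Level are taken from the table.

```python
from dataclasses import dataclass
from typing import List

INPUT_TYPES = {"Buffer", "File"}
OUTPUT_TYPES = {"Buffer", "File", "Splitting"}
STATUS_TRACING_LEVELS = {"None", "Description Only", "Full Status"}


@dataclass
class UdtSettings:
    """Hypothetical container for the UDT settings attributes."""
    input_type: str
    output_type: str
    service_name: str = ""
    streamer_chunk_size_kb: int = 256      # default from the table
    dynamic_service_name: bool = False     # default is disabled
    status_tracing_level: str = "None"     # default from the table

    def problems(self) -> List[str]:
        """Return a message for each attribute value outside the table."""
        found = []
        if self.input_type not in INPUT_TYPES:
            found.append(f"Unknown InputType: {self.input_type}")
        if self.output_type not in OUTPUT_TYPES:
            found.append(f"Unknown OutputType: {self.output_type}")
        if not 1 <= self.streamer_chunk_size_kb <= 1_000_000:
            found.append("Streamer Chunk Size must be 1-1,000,000 KB")
        if self.status_tracing_level not in STATUS_TRACING_LEVELS:
            found.append(
                f"Unknown Status Tracing Level: {self.status_tracing_level}"
            )
        return found


ok = UdtSettings("File", "Buffer", service_name="OrderParser")
bad = UdtSettings("File", "Stream", streamer_chunk_size_kb=0)
```

In this sketch, ok.problems() returns an empty list, and bad.problems() reports the unknown output type and the out-of-range chunk size.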
You can view status messages from the Data Transformation service. Set the
status tracing level to Description Only or Full Status. The Designer creates the
UDT_Status_Code and UDT_Status_Message output ports in the
Unstructured Data transformation.
When you choose Description Only, Data Transformation Engine returns a status
code and one of the following status messages:
Status Code Status Message
1 Success
2 Warning
3 Failure
4 Error
5 Fatal Error
When you choose Full Status, Data Transformation Engine returns a status code
and the error message from the Data Transformation service. The message is in
XML format.
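A downstream step that reads the UDT_Status_Code port might translate the codes from the table above as in the following sketch. The function name describe_status and the fallback text for unlisted codes are hypothetical. Only the five code and message pairs come from the table.

```python
# Status codes and messages as listed for the Description Only level.
STATUS_MESSAGES = {
    1: "Success",
    2: "Warning",
    3: "Failure",
    4: "Error",
    5: "Fatal Error",
}


def describe_status(code: int) -> str:
    """Return the documented message for a status code.

    The fallback text for unlisted codes is an assumption of this sketch.
    """
    return STATUS_MESSAGES.get(code, "Unknown status code")


messages = [describe_status(code) for code in (1, 3, 9)]
```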
Table 27-2 describes other Unstructured Data transformation ports that the
Designer creates when you configure the transformation:
The input type determines the type of data that the Integration Service passes to
Data Transformation Engine. The input type determines whether the input is data
or a source file path.
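The difference between the two input types can be pictured with a short Python sketch. It is a simplified model, not the engine's implementation: the function resolve_input is hypothetical, and it treats the InputBuffer value as text for brevity.

```python
import os
import tempfile


def resolve_input(input_type: str, input_buffer: str) -> str:
    """Return the source data that the service would process.

    Buffer: the port value is the data itself.
    File: the port value is a path, and the file is opened to get the data.
    """
    if input_type == "Buffer":
        return input_buffer
    if input_type == "File":
        with open(input_buffer, "r", encoding="utf-8") as handle:
            return handle.read()
    raise ValueError(f"Unsupported input type: {input_type}")


# Write a small sample file so that the File case can be demonstrated.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("order 1001")
    sample_path = tmp.name

from_buffer = resolve_input("Buffer", "order 1001")
from_file = resolve_input("File", sample_path)
os.remove(sample_path)
```

Both calls yield the same source data. Only the content of the InputBuffer port differs.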
If you do not define output groups and ports, the Unstructured Data
transformation returns data based on the output type.
Adding Ports
A Data Transformation service might require multiple input files, file names, and
parameters. It can return multiple output files. When you create an Unstructured
Data transformation, the Designer creates one InputBuffer port and one
OutputBuffer port. If you need to pass additional files or file names between the
Unstructured Data transformation and Data Transformation Engine, add the input
or output ports. You can add ports manually or from the Data Transformation
service.
The following table describes the ports you can create on the UDT Ports tab:
Note: You must configure a service name to populate ports from a service.
To run a different Data Transformation service for each source row, enable the
Dynamic Service Name attribute. Pass the service name with each source row.
The Designer creates the ServiceName input port when you enable dynamic
service names.
When you enable dynamic service names, you cannot create ports from a Data
Transformation service.
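The effect of the Dynamic Service Name attribute on each source row can be sketched as follows. The function service_for_row, the row layout, and the service names are hypothetical and serve only to illustrate the rule described above.

```python
from typing import Dict, Optional


def service_for_row(
    row: Dict[str, str],
    dynamic_service_name: bool,
    configured_service: Optional[str],
) -> str:
    """Pick the service to run for one source row."""
    if dynamic_service_name:
        # The service name arrives with the row in the ServiceName port.
        name = row.get("ServiceName")
        if not name:
            raise ValueError("Row does not carry a service name")
        return name
    # Otherwise every row uses the service configured in the settings.
    if not configured_service:
        raise ValueError("No service name is configured")
    return configured_service


rows = [
    {"InputBuffer": "orders_a.doc", "ServiceName": "OrderParser"},
    {"InputBuffer": "invoice_b.pdf", "ServiceName": "InvoiceParser"},
]
dynamic_names = [service_for_row(r, True, None) for r in rows]
static_names = [service_for_row(r, False, "OrderParser") for r in rows]
```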
Relational Hierarchies
To pass row data to relational tables or other targets, configure output ports on
the Relational Hierarchy tab. You can define groups of ports and define a
relational structure for the groups.
When you configure output groups, the output groups represent the relational
tables or the targets that you want to pass the output data to. Data
Transformation Engine returns rows to the group ports instead of writing an XML
file to the OutputBuffer port. The transformation writes rows based on the output
type.
Create a hierarchy of groups in the left pane of the Relational Hierarchy tab. All
groups are under the root group called PC_XSD_ROOT. You cannot delete the
root. Each group can contain ports and other groups. The group structure
represents the relationship between target tables. When you define a group
within a group, you define a parent-child relationship between the groups. The
Designer defines a primary key-foreign key relationship between the groups with
a generated key.
Select a group to display the ports for the group. You can add or delete ports in
the group. When you add a port, the Designer creates a default port
configuration. Change the port name, datatype, and precision. If the port must
contain data, select Not Null. Otherwise, the output data is optional.
When you view the Unstructured Data transformation in the workspace, each
port in a transformation group has a prefix that contains the group name.
When you delete a group, you delete the ports in the group and the child groups.
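The parent-child groups, the generated keys, and the group-name prefix on ports can be pictured with the following Python sketch. It flattens nested order data into rows for a parent group and a child group. The group names, port names, and key names are hypothetical, and the Designer may name generated keys differently.

```python
from itertools import count
from typing import Dict, List, Tuple

Row = Dict[str, object]


def flatten_orders(orders: List[dict]) -> Tuple[List[Row], List[Row]]:
    """Split nested orders into parent rows and child rows.

    Each parent row receives a generated primary key. Each child row
    carries that key as a foreign key.
    """
    header_keys = count(1)
    detail_keys = count(1)
    header_rows: List[Row] = []
    detail_rows: List[Row] = []
    for order in orders:
        header_key = next(header_keys)
        header_rows.append({
            "OrderHeader_PK": header_key,
            "OrderHeader_OrderNumber": order["number"],
        })
        for item in order["items"]:
            detail_rows.append({
                "OrderDetail_PK": next(detail_keys),
                "OrderDetail_FK_OrderHeader": header_key,
                "OrderDetail_Item": item,
            })
    return header_rows, detail_rows


headers, details = flatten_orders([
    {"number": "A1", "items": ["pen", "ink"]},
    {"number": "B2", "items": ["pad"]},
])
```

Each list of rows corresponds to one output group, and so to one target table.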
To export the group hierarchy from the Relational Hierarchy tab, click Export to
XML Schema. Choose a name and a location for the .xsd file. Choose a location
that you can access when you import the schema with Data Transformation
Studio.
The Designer creates an XML schema file with the following namespace:
"www.informatica.com/UDT/XSD/<mappingName_<Transformation_Name>>"
The schema file contains the following comment:
<!-- ===== This file has been generated by Informatica PowerCenter ===== -->
If you modify the schema, the Data Transformation Engine might return data that
is not the same format as the output ports in the Unstructured Data
transformation.
The XML elements in the schema represent the output ports in the hierarchy.
Columns that can contain null values have the XML attributes minOccurs=0 and
maxOccurs=1.
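The mapping from a port to a schema element can be sketched with the Python standard library. This is not the Designer's export code, and the exported schema may contain more detail. The function port_element and the sample port names are hypothetical. In XML Schema, minOccurs and maxOccurs both default to 1, so the sketch leaves them off for Not Null ports.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def port_element(name: str, xsd_type: str, not_null: bool) -> ET.Element:
    """Build an xs:element for one output port."""
    element = ET.Element(f"{{{XS}}}element", {"name": name, "type": xsd_type})
    if not not_null:
        # Nullable column: the element may be absent from the output.
        element.set("minOccurs", "0")
        element.set("maxOccurs", "1")
    return element


optional = port_element("Comment", "xs:string", not_null=False)
required = port_element("OrderNumber", "xs:string", not_null=True)
optional_xml = ET.tostring(optional, encoding="unicode")
```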
Mappings
The Data Transformation Serializer component can generate any output from
XML. It can generate HTML or binary files such as Microsoft Word or Microsoft
Excel. When the output is binary data, Data Transformation Engine writes the
output to a file instead of passing it back to the Unstructured Data transformation.
You can extract order information from a Microsoft Word document and write the
order information to an order header table and an order detail table. Configure an
Unstructured Data transformation to call a Data Transformation parser service
and pass the name of each Word document to parse. The Data Transformation
Engine opens the Word document, parses it, and returns the rows to the
Unstructured Data transformation. The Unstructured Data transformation passes
the order header and order details to the relational targets.
Source Qualifier transformation. Passes each Microsoft Word file name to the
Unstructured Data transformation. The source file name contains the complete
path to the file that contains order information.
Unstructured Data transformation. The input type is file. The output type is
buffer. The transformation contains an order header output group and an order
detail output group. The groups have a primary key-foreign key relationship.
The Unstructured Data transformation receives the source file name in the
InputBuffer port. It passes the name to Data Transformation Engine. Data
Transformation Engine runs a parser service to extract the order header and
order detail rows from the Word document. Data Transformation Engine returns
the data to the Unstructured Data transformation. The Unstructured Data
transformation passes data from the order header group and order detail group
to the relational targets.
Relational targets. Receive the rows from the Unstructured Data
transformation.
You can extract employee names and addresses from an XML file and create a
Microsoft Excel sheet with the list of names.
The Data Transformation Parser and Mapper components can transform data
from any format and generate XML data. When the XML data is large, you can
split the XML into segments and pass the segments to an XML Parser
transformation. The XML Parser transformation receives the segments and
processes the XML data as one document.
When you configure the Unstructured Data transformation to split XML output,
the Unstructured Data transformation returns XML based on the OutputBuffer
port size. If the XML file size is greater than the output port precision, the
Integration Service divides the XML into files equal to or less than the port size.
The XML Parser transformation parses the XML and passes the rows to
relational tables or other targets.
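The splitting behavior can be approximated in a few lines of Python. The sketch below cuts an XML string into segments no longer than a given port size and shows that the segments rejoin into the original document. It counts characters rather than bytes and ignores element boundaries, which is a simplification. The function name split_xml is hypothetical.

```python
from typing import List


def split_xml(xml_text: str, port_size: int) -> List[str]:
    """Cut xml_text into consecutive segments of at most port_size."""
    if port_size < 1:
        raise ValueError("port_size must be at least 1")
    return [
        xml_text[start:start + port_size]
        for start in range(0, len(xml_text), port_size)
    ]


document = "<orders><order id='1'/><order id='2'/></orders>"
segments = split_xml(document, 16)

# The receiver processes the segments, in order, as one document.
reassembled = "".join(segments)
```

A single segment is usually not well-formed XML on its own, which is why the receiving transformation must treat the segments as one document.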
For example, you can extract the order header and detail information from
Microsoft Word documents with a Data Transformation parser service.
Source Qualifier transformation. Passes the Word document file name to the
Unstructured Data transformation. The source file name contains the complete
path to the file that contains order information.
Unstructured Data transformation. The input type is file. The output type is
splitting. The Unstructured Data transformation receives the source file name in
the InputBuffer port. It passes the file name to Data Transformation Engine.
Data Transformation Engine opens the source file, parses it, and returns XML
data to the Unstructured Data transformation.
The Unstructured Data transformation receives the XML data, splits the XML file
into smaller files, and passes the segments to an XML Parser transformation.
The Unstructured Data transformation returns data in segments smaller than the
OutputBuffer port size. When the transformation returns XML data in multiple
segments, it generates the same pass-through data for each row. The
Unstructured Data transformation returns data in pass-through ports whether a
row is successful or not.
XML Parser transformation. The Enable Input Streaming session property
is enabled. The XML Parser transformation receives the XML data in the
DataInput port. The input data is split into segments. The XML Parser
transformation parses the XML data into order header and detail rows. It passes
order header and detail rows to relational targets. It returns the pass-through
data to a Filter transformation.
Filter transformation. Removes the duplicate pass-through data before
passing it to the relational targets.
Relational targets. Receive data from each group in the XML Parser
transformation and the Filter transformation.
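Because every XML segment carries the same pass-through data, the target would receive repeated values without the Filter transformation. The following Python sketch shows one simple way to picture that filtering step. It does not describe how a Filter transformation is configured, and the function unique_passthrough, the port name SourceFile, and the file names are hypothetical.

```python
from typing import Dict, List


def unique_passthrough(rows: List[Dict[str, str]], port: str) -> List[str]:
    """Keep the first occurrence of each pass-through value, in order."""
    seen = set()
    unique = []
    for row in rows:
        value = row[port]
        if value not in seen:
            seen.add(value)
            unique.append(value)
    return unique


# One row per XML segment. Segments of one document repeat its file name.
segment_rows = [
    {"Segment": "1", "SourceFile": "orders_a.doc"},
    {"Segment": "2", "SourceFile": "orders_a.doc"},
    {"Segment": "1", "SourceFile": "orders_b.doc"},
]
files = unique_passthrough(segment_rows, "SourceFile")
```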
Use the following rules and guidelines when you create an unstructured data
mapping:
- When you configure hierarchical groups of output ports, the Integration Service
  writes to the groups of ports instead of writing to the OutputBuffer port. The
  Integration Service writes to the groups of ports regardless of the output type
  you define for the transformation.
- If an Unstructured Data transformation has the File output type, and you have
  not defined group output ports, you must link the OutputBuffer port to a
  downstream transformation. Otherwise, the mapping is invalid. The
  OutputBuffer port contains the output file name when the Data Transformation
  service writes the output file.
- Enable Dynamic Service Name to pass a service name to the Unstructured
  Data transformation in the Service Name input port. When you enable Dynamic
  Service Name, the Designer creates the Service Name input port.
- You must configure a service name with the Unstructured Data transformation
  or enable the Dynamic Service Name option. Otherwise, the mapping is invalid.
- Link XML output from the Unstructured Data transformation to an XML Parser
  transformation.
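The two guidelines that make a mapping invalid can be expressed as a small check. The Python below is a sketch of those rules only and does not reproduce the validation that the Designer performs. The function mapping_errors and its parameter names are hypothetical.

```python
from typing import List


def mapping_errors(
    output_type: str,
    has_output_groups: bool,
    output_buffer_linked: bool,
    service_name: str,
    dynamic_service_name: bool,
) -> List[str]:
    """Return a message for each guideline that the mapping breaks."""
    errors = []
    # File output without output groups requires a linked OutputBuffer port.
    if (
        output_type == "File"
        and not has_output_groups
        and not output_buffer_linked
    ):
        errors.append("Link the OutputBuffer port to a downstream transformation.")
    # The transformation needs a fixed service name or dynamic names.
    if not service_name and not dynamic_service_name:
        errors.append("Configure a service name or enable Dynamic Service Name.")
    return errors


invalid = mapping_errors("File", False, False, "", False)
valid = mapping_errors("File", True, False, "OrderParser", False)
```

Here invalid contains both messages, and valid is empty because the output groups receive the rows and a service name is configured.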