How to Integrate the MDM Hub with Informatica Platform for Staging

© 1993-2015 Informatica Corporation. No part of this document may be reproduced or transmitted in any form, by
any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica Corporation. All
other company and product names may be trade names or trademarks of their respective owners and/or copyrighted
materials of such owners.
Abstract
You can integrate the MDM Hub with the Informatica platform to perform the staging and cleansing functions of the
MDM Hub. This article explains how to integrate the MDM Hub with the Informatica platform and perform staging.

Supported Versions
• MDM Multidomain Edition 10.x
• Informatica 9.6.1 HotFix x

Table of Contents
MDM Hub Integration with Informatica Platform Overview
MDM Hub Integration with Informatica Platform
    Informatica Platform Staging Components
    Model Repository Objects
    Source System Properties
    Staging Table Properties
    Data Source Connection Properties
Informatica Platform Staging Process
Complete Integration Prerequisites
Prepare the MDM Hub for Staging
    Step 1. Configure Source Systems
    Step 2. Add Staging Tables
Prepare the Developer Tool for Synchronization
    Step 1. Create a Project
Synchronize the Model Repository with the Hub Store
    Model Repository Service Connection Parameters
    Step 1. Configure the Model Repository Service Connection
    Step 2. Enable Staging
    Step 3. Synchronize with the Model Repository
Complete the Staging Setup in the Developer Tool
    Step 1. Review the Generated Objects
    Step 2. Create the Connection to the Source System
    Step 3. Add the Connection to the Connection Explorer View
    Step 4. Create a Physical Data Object for the Source Connection
    Step 5. Create a Connection for the Target
    Step 6. Add the Connection to the Connection Explorer View
    Step 7. Add the Connection to the Physical Data Objects
    Step 8. Add Transformations to Mapplets
Configure and Run the Mappings
    Step 1. Configure the Mappings
    Step 2. Run the Mappings
Staging Table Management
    Disable Staging for a Single Staging Table
    Disable Informatica Platform Staging for All Staging Tables
    Enable Informatica Platform Staging for All Staging Tables
    Synchronize the Changes for all the Staging Tables with the Model Repository
Additional Documentation

MDM Hub Integration with Informatica Platform Overview


You integrate the MDM Hub with the Informatica platform when you want to perform Informatica platform staging.
When you integrate the MDM Hub with the Informatica platform, the Data Integration Service loads source data directly
into the MDM Hub staging tables.

To integrate the MDM Hub with the Informatica platform, install the MDM Hub and the Informatica platform
components, such as the Informatica application services and Informatica Developer (the Developer tool). After you
install the services, create the Data Integration Service and the Model Repository Service. Use the Hub Console to
integrate the MDM Hub with the Informatica platform.

Synchronize changes to the staging table with the Model repository. The synchronization process creates the data
objects that the Data Integration Service uses for staging.

After the synchronization, you create mappings that load data from the data source to the staging tables. When you run
the mappings, the Data Integration Service loads source data directly into the MDM Hub staging tables. Data does not
go through the landing tables.

If you want to perform cleanse operations during the stage process, configure transformations in the mapping.

Note: Use Informatica platform staging as the preferred method of staging. When you perform Informatica platform
staging, you cannot set up delta detection, hard delete detection, or audit trails.

MDM Hub Integration with Informatica Platform
To perform Informatica platform staging, integrate the MDM Hub with the Informatica platform. The integration requires
the setup of the MDM Hub and Informatica platform components and connections to the data sources.

Informatica Platform Staging Components


Integrate the MDM Hub with the Informatica platform for Informatica platform staging.

The following image shows the integration components:

Informatica platform staging includes the following components:

Hub Console
A client application to access the MDM Hub features. Use the Hub Console to configure the Hub Store to
connect to and synchronize with the Model repository. Also, you use the Hub Console to enable the MDM
Hub staging tables for Informatica platform staging.

Hub Server
A J2EE application that processes data within the Hub Store and integrates the MDM Hub with the
Informatica platform. The Hub Server is the run-time component that manages core and common services for
the MDM Hub.

Hub Store
Stores and consolidates business data for the MDM Hub. You enable staging tables in a specific Operational
Reference Store in the Hub Store for Informatica platform staging. When you synchronize the staging tables
with the Model repository, the Model Repository Service updates the Model repository with the Hub Store
metadata. During the stage process, data from source systems moves to a staging table that is associated
with a base object in the Hub Store.

Developer tool
Application client that you can use to edit the data objects and mapplets that the synchronization process
creates. Use the Developer tool to create the mapping that you must run for the stage process. Objects that
you can view in the Developer tool are stored in the Model repository and are run by the Data Integration
Service. To perform staging, run mappings in the Developer tool.

Data Integration Service
An application service in the Informatica domain that loads data into the staging tables through the mappings
that you create in the Developer tool. The Data Integration Service processes the requests it receives from
the Developer tool, runs the mappings, and loads data into the MDM Hub staging tables.

Model Repository Service
An application service that manages the Model repository. The Data Integration Service depends on the
Model Repository Service. When you access a Model repository object from the Developer tool or the Data
Integration Service, a request is sent to the Model Repository Service. The Model Repository Service
fetches, inserts, and updates the metadata in the Model repository database tables.

Model repository
A repository that stores metadata in a relational database. It stores metadata from the MDM Hub. When you
synchronize the MDM Hub staging tables with the Model repository, the metadata from the Hub Store is
synchronized with the Model repository. The synchronization process creates objects based on the staging
tables.

Model Repository Objects


When you synchronize the Model repository with the MDM Hub, the synchronization process creates a folder with a name
based on the name of the Operational Reference Store, and creates data objects and mapplets within that folder. To view
the objects, use the Developer tool.

Note: All the object names are based on the staging table name.

The synchronization process creates the following objects:

Physical data object
A physical representation of data of the staging tables. You can create, edit, and delete physical data objects.
When you configure the Informatica platform staging process, the synchronization process creates physical
data objects with the name C_<staging table name>. One of the physical data objects that the
synchronization process creates is a customized data object that is reusable with one or more relational
resources.

Logical data object model
A model that describes the structure and flow of data into the staging table. The synchronization process
creates one logical data object model for each staging table in the MDM Hub. The logical data object model
has the name C_<staging table name>_Model. The model contains logical data objects and a mapplet. You
can edit and delete logical data objects and the mapplet in a logical data object model.

Logical data object
An object in a logical data object model that describes an MDM Hub staging table. When you synchronize the
Model repository with the MDM Hub metadata, the synchronization creates one logical data object with the
name C_<staging table name>_LDO. The logical data object contains a logical data object read mapping and
a logical data object write mapping.

Logical data object read mapping
Contains physical data objects as input and a logical data object as output. In a mapping, the Data
Integration Service reads data from the mapping source and makes the data available to view in the logical
data object read mapping.

Logical data object write mapping
Contains a logical data object as input. The logical data object write mapping writes to the target staging
table in the Hub Store. In a mapping, the Data Integration Service processes the data through a mapplet
before the mapping writes to the MDM Hub staging table.

Mapplet
A reusable object that contains an input transformation and an output transformation. The input
transformation can connect to an upstream transformation in a mapping to fetch source data. The output
transformation can connect to a downstream transformation in the mapping to transfer the transformed data
to the target staging tables. When you synchronize the Model repository with the MDM Hub metadata, the
synchronization creates mapplets with the name C_<staging table name>_Mapplet. You can edit and delete
mapplets.
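
The naming pattern described for the generated objects can be sketched in a few lines of Python. This is only an illustration of the conventions stated in this section. The function name generated_object_names, the dictionary keys, and the sample staging table name are hypothetical, and the Developer tool remains the place to confirm the names that a synchronization actually produced.

```python
def generated_object_names(staging_table: str) -> dict:
    """Derive the object names that this article associates with a staging table.

    The helper name and dictionary keys are illustrative, not part of any
    Informatica API.
    """
    base = f"C_{staging_table}"
    return {
        "physical_data_object": base,
        "logical_data_object_model": f"{base}_Model",
        "logical_data_object": f"{base}_LDO",
        "mapplet": f"{base}_Mapplet",
    }


# "STG_CUSTOMER" is a made-up staging table name used only for the example.
for kind, name in generated_object_names("STG_CUSTOMER").items():
    print(f"{kind}: {name}")
```

A lookup like this can help when you search the Operational Reference Store folder for the objects that belong to one staging table.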

You need to create the following objects to perform Informatica platform staging:

Physical data object
A physical representation of source data. You can create, edit, and delete physical data objects. You need to
create physical data objects to connect to the source data.

Mapping
A set of inputs and outputs that represent the data flow between sources and target staging tables. The inputs
and outputs can be linked by transformation objects that define the rules for data transformation. The Data Integration
Service uses the instructions that you configure in the mapping to read data from the source, transform the
data, and write data to staging tables.

Source System Properties


Define each source system that must contribute data to the MDM Hub. A source system is external to the MDM Hub.
You define source systems for the MDM Hub in the Systems and Trust tool of the Model workbench.

The following table describes the properties of a source system definition in the MDM Hub:

Property Description

Name
Unique, descriptive name for the source system.

Primary Key
A unique identifier for the source system that the MDM Hub adds as a prefix to the primary key value for the
source system. The value is read only.

State Management override system
Specifies whether to override the record state of all other source systems that contribute to the MDM Hub.
Enable the property to override the record state of all other source systems. Disable the property if you do
not want to override the record state of all other source systems that contribute to the MDM Hub. Default is
disabled.

Description
Optional. Description for the source system.

Staging Table Properties
You can create and manage staging tables through the Hub Console. You configure some staging table properties
when you create a staging table.

The following table describes the staging table properties that you configure when you create a staging table:

Property Description

Display Name
Name of the staging table as it appears in the Hub Console.

Physical Name
Name of the staging table in the database. The MDM Hub suggests a physical name for the staging table based
on the display name that you enter.

System
Source system of the staging table data.

Preserve Source System Keys
Specifies whether the MDM Hub must use key values from the source system or use the key values that the MDM
Hub generates. Enable to use key values from the source system. Disable to use the key values that the MDM Hub
generates. Default is disabled.
Note: During the stage process, if multiple records contain the same PKEY_SRC_OBJECT, the surviving record is
the one with the most recent LAST_UPDATE_DATE. The other records are sent to the reject table.

Highest Reserved Key
Number by which the key must increase after the first load. The property appears if you enable the Preserve
Source System Keys check box.

Data Tablespace
Name of the data tablespace for the staging table.

Index Tablespace
Name of the index tablespace for the staging table.

Description
Description of the staging table.

Cell Update
Enables the MDM Hub to update the cell in the target table if the value in the incoming record from the
staging table is the same.

Columns
Columns in the staging table.
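
The duplicate-handling rule in the note under Preserve Source System Keys can be illustrated with a short Python sketch. It is not the MDM Hub implementation. The function name split_survivors, the NAME column, and the sample rows are hypothetical, and the tie-breaking behavior (the first record seen wins when dates are equal) is an assumption, because this article does not describe ties.

```python
from datetime import datetime


def split_survivors(records):
    """Keep the most recent record per PKEY_SRC_OBJECT and reject the rest.

    Illustrative only. On equal LAST_UPDATE_DATE values the first record
    seen is kept, which is an assumption and not documented behavior.
    """
    survivors = {}
    rejects = []
    for rec in records:
        key = rec["PKEY_SRC_OBJECT"]
        current = survivors.get(key)
        if current is None:
            survivors[key] = rec
        elif rec["LAST_UPDATE_DATE"] > current["LAST_UPDATE_DATE"]:
            # The newer record replaces the survivor; the older one is rejected.
            rejects.append(current)
            survivors[key] = rec
        else:
            rejects.append(rec)
    return list(survivors.values()), rejects


# Made-up sample rows; NAME is a hypothetical staging table column.
rows = [
    {"PKEY_SRC_OBJECT": "1001", "LAST_UPDATE_DATE": datetime(2015, 1, 5), "NAME": "old"},
    {"PKEY_SRC_OBJECT": "1001", "LAST_UPDATE_DATE": datetime(2015, 3, 9), "NAME": "new"},
    {"PKEY_SRC_OBJECT": "1002", "LAST_UPDATE_DATE": datetime(2015, 2, 1), "NAME": "only"},
]
kept, rejected = split_survivors(rows)
print([r["NAME"] for r in kept])
print([r["NAME"] for r in rejected])
```

Running a check like this against source extracts before staging can show how many records are likely to end up in the reject table.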

Data Source Connection Properties
You can create and manage connections to data sources through Informatica clients. Create connections to import
data from source systems. Create and manage connections to Oracle, IBM DB2, and Microsoft SQL Server by
specifying the appropriate connection properties.

IBM DB2 Connection Properties


Use an IBM DB2 connection to access IBM DB2. An IBM DB2 connection is a relational database connection. You can
create and manage an IBM DB2 connection in the Developer tool.

The following table describes IBM DB2 connection properties:

Property Description

Database Type
The database type.

Name
Name of the connection. The name is not case sensitive and must be unique within the domain. The name cannot
exceed 128 characters, contain spaces, or contain the following special characters:
~ ` ! $ % ^ & * ( ) - + = { [ } ] | \ : ; " ' < , > . ? /

ID
String that the Data Integration Service uses to identify the connection. The ID is not case sensitive. It must
be 255 characters or less and must be unique in the domain. You cannot change this property after you create
the connection. Default value is the connection name.

Description
The description of the connection. The description cannot exceed 765 characters.

User Name
The database user name.

Password
The password for the database user name.

Pass-through security enabled
Enables pass-through security for the connection. When you enable pass-through security for a connection, the
domain uses the client user name and password to log into the corresponding database, instead of the
credentials defined in the connection object.

Connection String for data access
The IBM DB2 connection URL used to access metadata from the database:
dbname
Where dbname is the alias configured in the IBM DB2 client.

Metadata Access Properties: Connection String
The connection string used to access the metadata. Use the following connection URL:
jdbc:informatica:db2://<host name>:<port>;DatabaseName=<database name>


AdvancedJDBCSecurityOptions
Database parameters for metadata access to a secure database. Informatica treats the value of the
AdvancedJDBCSecurityOptions field as sensitive data and stores the parameter string encrypted.
To connect to a secure database, include the following parameters:
- EncryptionMethod. Required. Indicates whether data is encrypted when transmitted over the network. This
  parameter must be set to SSL.
- ValidateServerCertificate. Optional. Indicates whether Informatica validates the certificate that is sent by
  the database server.
  If this parameter is set to True, Informatica validates the certificate that is sent by the database server.
  If you specify the HostNameInCertificate parameter, Informatica also validates the host name in the
  certificate.
  If this parameter is set to false, Informatica does not validate the certificate that is sent by the database
  server. Informatica ignores any truststore information that you specify.
- HostNameInCertificate. Optional. Host name of the machine that hosts the secure database. If you specify a
  host name, Informatica validates the host name included in the connection string against the host name in
  the SSL certificate.
- TrustStore. Required. Path and file name of the truststore file that contains the SSL certificate for the
  database.
- TrustStorePassword. Required. Password for the truststore file for the secure database.
Note: Informatica appends the secure JDBC parameters to the connection string. If you include the secure JDBC
parameters directly to the connection string, do not enter any parameters in the AdvancedJDBCSecurityOptions
field.

Data Access Properties: Connection String
The connection string used to access data from the database. For IBM DB2 this is <database name>.

Code Page
The code page used to read from a source database or to write to a target database or file.

Environment SQL
SQL commands to set the database environment when you connect to the database. The Data Integration Service
runs the connection environment SQL each time it connects to the database.

Transaction SQL
SQL commands to set the database environment when you connect to the database. The Data Integration Service
runs the transaction environment SQL at the beginning of each transaction.

Retry Period
This property is reserved for future use.

Tablespace
The tablespace name of the database.

SQL Identifier Character
The type of character used to identify special characters and reserved SQL keywords, such as WHERE. The Data
Integration Service places the selected character around special characters and reserved SQL keywords. The
Data Integration Service also uses this character for the Support Mixed-case Identifiers property.
Select the character based on the database in the connection.


Support Mixed-case Identifiers
When enabled, the Data Integration Service places identifier characters around table, view, schema, synonym,
and column names when generating and executing SQL against these objects in the connection. Use if the objects
have mixed-case or lowercase names. By default, this option is not selected.

ODBC Provider
ODBC. The type of database to which ODBC connects.
Select one of the following database options:
- Other
- Sybase
- Microsoft_SQL_Server
Default is Other.
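
The limits stated for the Name, ID, and Description properties can be checked before you create a connection. The following Python sketch encodes only the rules listed in the table above. The function name connection_problems and its parameters are hypothetical, and the sketch does not call any Informatica API. The Developer tool performs its own validation.

```python
# Special characters that the table above lists as not allowed in a connection name.
FORBIDDEN_NAME_CHARS = set("~`!$%^&*()-+={[}]|\\:;\"'<,>.?/")


def connection_problems(name, conn_id=None, description="", existing_names=()):
    """Return a list of rule violations for a proposed connection (empty if none).

    Hypothetical helper that mirrors the documented limits only.
    """
    problems = []
    if len(name) > 128:
        problems.append("name exceeds 128 characters")
    if " " in name:
        problems.append("name contains spaces")
    bad = sorted(set(name) & FORBIDDEN_NAME_CHARS)
    if bad:
        problems.append("name contains special characters: " + " ".join(bad))
    # Names are not case sensitive, so compare in a single case.
    if name.lower() in {n.lower() for n in existing_names}:
        problems.append("name is not unique in the domain")
    # The ID defaults to the connection name when it is not supplied.
    effective_id = name if conn_id is None else conn_id
    if len(effective_id) > 255:
        problems.append("ID exceeds 255 characters")
    if len(description) > 765:
        problems.append("description exceeds 765 characters")
    return problems


print(connection_problems("DB2 Source-1"))
print(connection_problems("DB2_SRC", existing_names=["db2_src"]))
```

The same name, ID, and description limits appear in the Microsoft SQL Server and Oracle connection property tables, so one check of this kind can cover all three connection types.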

Microsoft SQL Server Connection Properties


Use a Microsoft SQL Server connection to access Microsoft SQL Server. A Microsoft SQL Server connection is a
connection to a Microsoft SQL Server relational database. You can create and manage a Microsoft SQL Server
connection in the Developer tool.

The following table describes Microsoft SQL Server connection properties:

Property Description

Database Type
The database type.

Name
Name of the connection. The name is not case sensitive and must be unique within the domain. The name cannot
exceed 128 characters, contain spaces, or contain the following special characters:
~ ` ! $ % ^ & * ( ) - + = { [ } ] | \ : ; " ' < , > . ? /

ID
String that the Data Integration Service uses to identify the connection. The ID is not case sensitive. It must
be 255 characters or less and must be unique in the domain. You cannot change this property after you create
the connection. Default value is the connection name.

Description
The description of the connection. The description cannot exceed 765 characters.

Use trusted connection
Enables the application service to use Windows authentication to access the database. The user name that
starts the application service must be a valid Windows user with access to the database. By default, this
option is cleared.

User Name
The database user name.

Password
The password for the database user name.

Pass-through security enabled
Enables pass-through security for the connection. When you enable pass-through security for a connection, the
domain uses the client user name and password to log into the corresponding database, instead of the
credentials defined in the connection object.

Metadata Access Properties: Connection String
The connection string used to access the metadata. Use the following connection URL:
jdbc:informatica:sqlserver://<host name>:<port>;DatabaseName=<database name>


AdvancedJDBCSecurityOptions
Database parameters for metadata access to a secure database. Informatica treats the value of the
AdvancedJDBCSecurityOptions field as sensitive data and stores the parameter string encrypted.
To connect to a secure database, include the following parameters:
- EncryptionMethod. Required. Indicates whether data is encrypted when transmitted over the network. This
  parameter must be set to SSL.
- ValidateServerCertificate. Optional. Indicates whether Informatica validates the certificate that is sent by
  the database server.
  If this parameter is set to True, Informatica validates the certificate that is sent by the database server.
  If you specify the HostNameInCertificate parameter, Informatica also validates the host name in the
  certificate.
  If this parameter is set to false, Informatica does not validate the certificate that is sent by the database
  server. Informatica ignores any truststore information that you specify.
- HostNameInCertificate. Optional. Host name of the machine that hosts the secure database. If you specify a
  host name, Informatica validates the host name included in the connection string against the host name in
  the SSL certificate.
- TrustStore. Required. Path and file name of the truststore file that contains the SSL certificate for the
  database.
- TrustStorePassword. Required. Password for the truststore file for the secure database.
Not applicable for ODBC.
Note: Informatica appends the secure JDBC parameters to the connection string. If you include the secure JDBC
parameters directly to the connection string, do not enter any parameters in the AdvancedJDBCSecurityOptions
field.

Data Access Properties: Connection String
The connection string used to access data from the database. Use the following connection string:
<server name>@<database name>
If the database does not use the default port, use the following connection strings:
<server name>:<port>@<dbname>
<servername>/<instancename>:<port>@<dbname>

Code Page
The code page used to read from a source database or to write to a target database or file.

Domain Name
The name of the domain.

Packet Size
The packet size used to transmit data. Used to optimize the native drivers for Microsoft SQL Server.

Owner Name
The name of the owner of the schema.

Schema Name
The name of the schema in the database. You must specify the schema name for the Profiling Warehouse if the
schema name is different from the database user name. You must specify the schema name for the data object
cache database if the schema name is different from the database user name and you manage the cache with an
external tool.

Environment SQL
SQL commands to set the database environment when you connect to the database. The Data Integration Service
runs the connection environment SQL each time it connects to the database.


Transaction SQL
SQL commands to set the database environment when you connect to the database. The Data Integration Service
runs the transaction environment SQL at the beginning of each transaction.

Retry Period
This property is reserved for future use.

SQL Identifier Character
The type of character used to identify special characters and reserved SQL keywords, such as WHERE. The Data
Integration Service places the selected character around special characters and reserved SQL keywords. The
Data Integration Service also uses this character for the Support Mixed-case Identifiers property.
Select the character based on the database in the connection.

Support Mixed-case Identifiers
When enabled, the Data Integration Service places identifier characters around table, view, schema, synonym,
and column names when generating and executing SQL against these objects in the connection. Use if the objects
have mixed-case or lowercase names. By default, this option is not selected.

ODBC Provider
ODBC. The type of database to which ODBC connects.
Select one of the following database options:
- Other
- Sybase
- Microsoft_SQL_Server
Default is Other.
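
The three data access connection string forms listed for Microsoft SQL Server differ only in whether a port and an instance name are present. The following Python sketch assembles them. The function name sqlserver_data_access_string and the sample host, instance, and database values are hypothetical. Because the table above lists the instance form only together with a port, the sketch conservatively rejects an instance name without a port.

```python
def sqlserver_data_access_string(server, database, port=None, instance=None):
    """Build a data access connection string in one of the three listed forms.

    Hypothetical helper; it only formats text and does not connect to anything.
    """
    if instance is not None and port is None:
        # The article shows <servername>/<instancename> only with a port.
        raise ValueError("an instance name requires a port in the documented forms")
    host = server if instance is None else f"{server}/{instance}"
    if port is not None:
        host = f"{host}:{port}"
    return f"{host}@{database}"


# Made-up values for illustration.
print(sqlserver_data_access_string("dbhost", "ORS_DB"))
print(sqlserver_data_access_string("dbhost", "ORS_DB", port=1450))
print(sqlserver_data_access_string("dbhost", "ORS_DB", port=1450, instance="INST1"))
```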

Oracle Connection Properties


Use an Oracle connection to connect to an Oracle database. The Oracle connection is a relational connection type.
You can create and manage an Oracle connection in the Developer tool.

The following table describes Oracle connection properties:

Property Description

Database Type
The database type.

Name
Name of the connection. The name is not case sensitive and must be unique within the domain. The name cannot
exceed 128 characters, contain spaces, or contain the following special characters:
~ ` ! $ % ^ & * ( ) - + = { [ } ] | \ : ; " ' < , > . ? /

ID
String that the Data Integration Service uses to identify the connection. The ID is not case sensitive. It must
be 255 characters or less and must be unique in the domain. You cannot change this property after you create
the connection. Default value is the connection name.

Description
The description of the connection. The description cannot exceed 765 characters.

User Name
The database user name.

Password
The password for the database user name.


Pass-through security enabled
Enables pass-through security for the connection. When you enable pass-through security for a connection, the
domain uses the client user name and password to log into the corresponding database, instead of the
credentials defined in the connection object.

Metadata Access Properties: Connection String
The connection string used to access the metadata. Use the following connection URL:
jdbc:informatica:oracle://<host_name>:<port>;SID=<database name>

AdvancedJDBCSecurityOptions
Database parameters for metadata access to a secure database. Informatica treats the value of the
AdvancedJDBCSecurityOptions field as sensitive data and stores the parameter string encrypted.
To connect to a secure database, include the following parameters:
- EncryptionMethod. Required. Indicates whether data is encrypted when transmitted over the network. This
  parameter must be set to SSL.
- ValidateServerCertificate. Optional. Indicates whether Informatica validates the certificate that is sent by
  the database server.
  If this parameter is set to True, Informatica validates the certificate that is sent by the database server.
  If you specify the HostNameInCertificate parameter, Informatica also validates the host name in the
  certificate.
  If this parameter is set to false, Informatica does not validate the certificate that is sent by the database
  server. Informatica ignores any truststore information that you specify.
- HostNameInCertificate. Optional. Host name of the machine that hosts the secure database. If you specify a
  host name, Informatica validates the host name included in the connection string against the host name in
  the SSL certificate.
- TrustStore. Required. Path and file name of the truststore file that contains the SSL certificate for the
  database.
- TrustStorePassword. Required. Password for the truststore file for the secure database.
Note: Informatica appends the secure JDBC parameters to the connection string. If you include the secure JDBC
parameters directly to the connection string, do not enter any parameters in the AdvancedJDBCSecurityOptions
field.

Data Access Properties: Connection String
The connection string used to access data from the database. Use the following connection string:
<database name>.world

Code Page
The code page used to read from a source database or to write to a target database or file.

Environment SQL
SQL commands to set the database environment when you connect to the database. The Data Integration Service
runs the connection environment SQL each time it connects to the database.

Transaction SQL
SQL commands to set the database environment when you connect to the database. The Data Integration Service
runs the transaction environment SQL at the beginning of each transaction.

Retry Period
This property is reserved for future use.

Enable Parallel Mode
Enables parallel processing when loading data into a table in bulk mode. By default, this option is cleared.


SQL Identifier Character
The type of character used to identify special characters and reserved SQL keywords, such as WHERE. The Data
Integration Service places the selected character around special characters and reserved SQL keywords. The
Data Integration Service also uses this character for the Support Mixed-case Identifiers property.
Select the character based on the database in the connection.

Support Mixed-case Identifiers
When enabled, the Data Integration Service places identifier characters around table, view, schema, synonym,
and column names when generating and executing SQL against these objects in the connection. Use if the objects
have mixed-case or lowercase names. By default, this option is not selected.
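
The metadata access URLs for IBM DB2, Microsoft SQL Server, and Oracle follow the patterns shown in the connection property tables above, and the AdvancedJDBCSecurityOptions value is built from the listed parameters. The following Python sketch assembles both as plain strings. The function names, host names, ports, and file paths are hypothetical. The semicolon separator between security parameters and the lowercase true and false values are assumptions, because this article does not state the exact syntax of the parameter string.

```python
# URL patterns copied from the connection property tables in this article.
URL_TEMPLATES = {
    "db2": "jdbc:informatica:db2://{host}:{port};DatabaseName={database}",
    "sqlserver": "jdbc:informatica:sqlserver://{host}:{port};DatabaseName={database}",
    "oracle": "jdbc:informatica:oracle://{host}:{port};SID={database}",
}


def metadata_access_url(db_type, host, port, database):
    """Fill in the documented URL pattern for the given database type."""
    return URL_TEMPLATES[db_type].format(host=host, port=port, database=database)


def security_options(trust_store, trust_store_password,
                     validate_certificate=None, host_name_in_certificate=None):
    """Assemble an AdvancedJDBCSecurityOptions value from the listed parameters.

    Assumption: parameters are joined with semicolons as name=value pairs.
    """
    params = [("EncryptionMethod", "SSL")]  # required, must be SSL
    if validate_certificate is not None:
        params.append(("ValidateServerCertificate",
                       "true" if validate_certificate else "false"))
    if host_name_in_certificate:
        params.append(("HostNameInCertificate", host_name_in_certificate))
    params.append(("TrustStore", trust_store))
    params.append(("TrustStorePassword", trust_store_password))
    return ";".join(f"{key}={value}" for key, value in params)


# Made-up host, port, SID, and truststore values.
print(metadata_access_url("oracle", "dbhost.example.com", 1521, "orcl"))
print(security_options("/opt/certs/truststore.jks", "changeit",
                       validate_certificate=True,
                       host_name_in_certificate="dbhost.example.com"))
```

As the note in each table states, enter the secure parameters either in the AdvancedJDBCSecurityOptions field or directly in the connection string, not in both places.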

Informatica Platform Staging Process


When you integrate the MDM Hub with the Informatica platform, the Data Integration Service can load data into the
MDM Hub staging tables. The connection parameters that you specify enable the MDM Hub to interact with the Data
Integration Service and the Model Repository Service.

Perform the following tasks to configure and run Informatica platform staging:

1. Complete integration prerequisites.


2. Prepare the MDM Hub for staging.
3. Prepare the Developer tool for synchronization.
4. Synchronize the Model repository with the Hub Store.
5. Complete the staging setup in the Developer tool.
6. Configure and run the mappings.

Complete Integration Prerequisites


Before you perform Informatica platform staging, ensure that all components are installed and configured.

Perform the following installation and configuration tasks:

1. Install and configure the MDM Hub.


2. Install Informatica services to create a domain.
3. Create and configure the following application services:
• Model Repository Service
• Data Integration Service
4. Install Informatica Developer (the Developer tool), and configure it to connect to the Model repository.

Prepare the MDM Hub for Staging


After you install and configure the MDM Hub, prepare the MDM Hub for staging. Configure source systems and add
staging tables in an Operational Reference Store by using the Hub Console.

Note: If you create a staging table column with the INT data type in the MDM Hub, it appears as a DECIMAL data type
in the Developer tool.

Step 1. Configure Source Systems
To manage data input from multiple source systems, the MDM Hub requires a unique internal name for each source
system. To define source systems for the MDM Hub, use the Systems and Trust tool in the Model workbench.

1. In the Hub Console, start the Systems and Trust tool.


2. Acquire a write lock.
3. Right-click in the Systems and Trust pane.
The Add System option appears.
4. Click Add System.
The New System dialog box appears.
5. Specify a unique name and a description to identify the source system, and click OK.
The System Identity properties table appears in the right-most pane.
6. Edit the system identity properties if required, and click Save.

Step 2. Add Staging Tables


Add the staging tables into which you want to load data.

1. In the Hub Console, start the Schema tool.


2. Acquire a write lock.
3. In the navigation tree, expand the node of the base object that you want to add a staging table to.
4. Right-click the Staging Tables node.
The Add Staging Table option appears.
The following image shows the Add Staging Table option to add a staging table to the Customer base
object:

5. Click Add Staging Table.

The Add Staging to Base Object dialog box appears.
6. Specify the staging table properties.
The following image shows the Add Staging to Base Object Customer dialog box with the Display Name
field set to S_CRM_CUST:

7. From the list of the columns in the base object, select the columns that the source system provides.
You can click Select all columns to select all the base object columns.
8. Specify column properties.

9. Click OK.
The Schema tool creates the staging table in the Operational Reference Store along with any support tables,
and then adds the staging table to the navigation tree.
The following image shows the S_CRM_CUST staging table in the navigation tree:

Prepare the Developer Tool for Synchronization


After you install and configure the Informatica services and the Developer tool, prepare the Developer tool for
synchronization. Create a project in the Developer tool where you want to access the staging objects.

Step 1. Create a Project


To store objects that the synchronization process creates, use the Developer tool to create a project in the Model
repository.

1. Select a Model Repository Service in the Object Explorer view.


2. Click File > New > Project.
3. Enter a name for the project.

The following image shows the New Project dialog box with the project name Staging in the Name field:

4. Click Next.
The Project Permissions page of the New Project dialog box appears.
5. Optionally, select a user or group, and assign permissions.

6. Click Finish.
The project appears under the Model Repository Service in the Object Explorer view.
The following image shows the Object Explorer view with the Model Repository Service named
TSVR28X64D2_MRS that contains a project named Staging:

Synchronize the Model Repository with the Hub Store


After you prepare the Developer tool for synchronization, configure the connection to the Model Repository Service,
enable staging, and synchronize the Model repository with the Hub Store.

Model Repository Service Connection Parameters


To synchronize the Model repository with the MDM Hub, the MDM Hub must be able to connect to the Model
Repository Service. Configure the connection parameters in the Enterprise Manager tool of the Hub Console.

To connect to the Model Repository Service, configure the following connection parameters:

domainHost

Name of the machine in the domain that hosts the master gateway node.

Note: Do not use localhost. The host name must explicitly identify the machine.

domainPort
Port number that the MDM Hub uses to communicate with the services in the domain. Default is 6005.

username
User name to access the Model repository.

password
Password to access the Model repository.

securityDomain
Name of the security domain. The value must be Native.

repositoryName
Name of the Model repository.

projectName
Name of the project in the Model repository.

The following sample connection parameter is configured in the Model Repository Service URL field in the Enterprise
Manager tool:
domainHost=TSVR28X64D2,domainPort=6005,username=Administrator,password=Administrator,securityDomain=Native,repositoryName=TSVR28X64D2_MRS,projectName=Staging

In the sample, the Informatica platform is installed on a machine with the host name TSVR28X64D2. The port that the
MDM Hub uses to communicate with the services in the domain is 6005. The user name and password are both
Administrator, which the user uses to access the TSVR28X64D2_MRS Model repository. The name of the project in
the Model repository is Staging and the security domain is Native.
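
The parameter string can be assembled and checked before you paste it into the Model Repository Service URL field. The following Python sketch is a convenience helper, not part of the MDM Hub or the Informatica platform. The function name `build_mrs_url` is hypothetical. The parameter names, the default port, and the constraints on domainHost and securityDomain come from the descriptions above.

```python
# Parameter order follows the sample connection string in this article.
REQUIRED_KEYS = (
    "domainHost", "domainPort", "username", "password",
    "securityDomain", "repositoryName", "projectName",
)


def build_mrs_url(params):
    """Return the comma-separated key=value string, or raise ValueError."""
    # Defaults taken from the parameter descriptions: port 6005, security domain Native.
    merged = {"domainPort": "6005", "securityDomain": "Native", **params}
    missing = [key for key in REQUIRED_KEYS if not merged.get(key)]
    if missing:
        raise ValueError("missing parameters: " + ", ".join(missing))
    if str(merged["domainHost"]).strip().lower() == "localhost":
        raise ValueError("domainHost must explicitly identify the machine")
    if merged["securityDomain"] != "Native":
        raise ValueError("securityDomain must be Native")
    # Strip stray spaces so that no value starts or ends with whitespace.
    return ",".join(f"{key}={str(merged[key]).strip()}" for key in REQUIRED_KEYS)


if __name__ == "__main__":
    print(build_mrs_url({
        "domainHost": "TSVR28X64D2",
        "username": "Administrator",
        "password": "Administrator",
        "repositoryName": "TSVR28X64D2_MRS",
        "projectName": "Staging",
    }))
```

Stripping whitespace from each value guards against a stray space after an equals sign, which is easy to introduce when the string is typed by hand.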

Step 1. Configure the Model Repository Service Connection


Configure the connection between the MDM Hub and the Model Repository Service. You need the connection to
synchronize data between the MDM Hub and the Model repository.

1. In the Hub Console, start the Enterprise Manager tool.


2. Acquire a write lock.
3. On the ORS databases tab, select an Operational Reference Store.
4. Click the Properties tab.
The Operational Reference Store database properties appear.
5. Enter the Model Repository Service connection parameters in the Model Repository Service URL field, and
click Save.
A message appears indicating that the parameters were saved.

The following image shows the Model Repository Service URL field in the Enterprise Manager tool:

6. Click OK.
The MDM Hub connects to the Model Repository Service.
7. Restart the application server and the Hub Console.

Step 2. Enable Staging


Use the Hub Console to enable Informatica platform staging for an MDM Hub staging table.

1. Start the Schema tool.


2. Acquire a write lock.
3. Click Select database.
The Change database dialog box appears.
4. Select the Operational Reference Store in which you want to stage data, and click Connect.
5. In the navigation tree, click the staging table of a base object that you need to use for staging.
The Staging Table Properties page appears.

The following image shows the Staging Table Properties page where you can enable Informatica platform
staging:

6. On the Properties tab, enable Informatica platform staging, and click Save.
The staging table is enabled for Informatica platform staging.

Step 3. Synchronize with the Model Repository


Use the Hub Console to synchronize the MDM Hub metadata with the Model repository.

1. Start the Schema tool.


2. Acquire a write lock.
3. Click the staging table of a base object that you need to use for staging.
The Staging Table Properties page appears.

4. On the Properties tab, enable Synchronize with Model Repository Service, and click Save.
The changes that you make to the staging table through the Hub Console appear in the Model repository.
The synchronization creates physical and logical data objects and a mapplet in the Model repository.
The following image shows the Staging Table Properties page for the S_CRM_CUST staging table where
you can enable synchronization with the Model repository:

Complete the Staging Setup in the Developer Tool


After you synchronize the Model repository with the Hub Store, complete the configuration of the objects in the
Developer tool. You must create a connection to the source system from which you want to move data into the MDM
Hub staging tables.

If you want to perform cleanse operations, add transformations to the mapplets that the synchronization process
creates. The Data Integration Service manages the records that are rejected during the staging process.

Step 1. Review the Generated Objects


Use the Developer tool to review the data objects that the synchronization process creates.

1. Start the Developer tool.


2. In the Object Explorer view, ensure that you are connected to the Model repository.
3. Right-click the Model Repository Service, and click Refresh.
The Model repository objects appear in the Object Explorer view.
4. Expand the project that you created for staging.
You can view the physical and logical data objects and mapplets for staging.
5. Expand Physical Data Objects, Logical Data Object Models, and Mapplets in the folder with the Operational
Reference Store name and the base object name.

The following image shows the Model repository objects in the Object Explorer view:

In the Object Explorer view, the Staging project contains a folder, MDM_SMPL, with the name of the
Operational Reference Store. The MDM_SMPL folder contains another folder, C_CUSTOMER, which is the
name of the base object. The C_CUSTOMER folder contains the physical data objects, logical data objects,
and a mapplet.
6. Right-click a physical data object, and click Open.
The physical data object opens in the editor.

The following image shows the C_S_CRM_CUST customized data object that is open in the editor:

7. Right-click the logical data object, click Open, and expand Attributes and Mappings.

The following image shows the attributes of the C_S_CRM_CUST_LDO logical data object, the
C_S_CRM_CUST_LDO_Read logical data object read mapping and the C_S_CRM_CUST_LDO_Write
logical data object write mapping links:

8. Click the logical data object read mapping link.


The logical data object read mapping opens.

The following image shows the C_S_CRM_CUST_LDO_Read logical data object read mapping:

9. Click the logical data object write mapping link.


The logical data object write mapping opens.
The following image shows the C_S_CRM_CUST_LDO_Write logical data object write mapping:

10. Right-click the mapplet.
The following image shows the C_S_CRM_CUST_Mapplet:

The C_S_CRM_CUST_Mapplet mapplet contains the C_S_CRM_CUST_Mapplet_In Input transformation and
the C_S_CRM_CUST_Mapplet_Out Output transformation.
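
When you review the generated objects for many staging tables, a checklist of expected names can help. The following Python sketch derives the names from the physical data object name. The suffix pattern is inferred from the sample names in this article (C_S_CRM_CUST, C_S_CRM_CUST_LDO, and so on) and is an assumption, not a documented naming rule. The function name is hypothetical.

```python
def expected_staging_objects(physical_name):
    """List the object names that the sample in this article shows for one staging table."""
    logical = f"{physical_name}_LDO"
    mapplet = f"{physical_name}_Mapplet"
    return {
        "physical_data_object": physical_name,
        "logical_data_object": logical,
        "read_mapping": f"{logical}_Read",
        "write_mapping": f"{logical}_Write",
        "mapplet": mapplet,
        "mapplet_input": f"{mapplet}_In",
        "mapplet_output": f"{mapplet}_Out",
    }


if __name__ == "__main__":
    for role, name in expected_staging_objects("C_S_CRM_CUST").items():
        print(f"{role}: {name}")
```

Compare the printed names with the objects in the Object Explorer view. If your repository uses different names, trust the repository.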

Step 2. Create the Connection to the Source System
To specify the connection to the source system, use the Developer tool. Create the connection to import data from the
source system.

1. Click Window > Preferences.


The Preferences dialog box appears.
The following image shows the Preferences dialog box:

2. Select Informatica > Connections.


The Connections pane appears.
3. Expand Databases in the Available Connections list.

The following image shows the Preferences dialog box with Databases expanded in the Available
Connections list in the Connections pane:

4. Select a connection type in the Available Connections list, and click Add.
The New Database Connection dialog box appears with the database connection type value populated in
the Type field.
5. In the Name field, enter a database connection name.
6. Click Browse.
The Choose Domain dialog box appears.
7. Select the domain in which you want to store the connection, and click OK.
8. Click Next.
The Connection Details page of the New Database Connection dialog box appears.
9. Specify the connection details for the database, and click Test Connection.
The Test Connection dialog box appears.
10. If the connection is successful, click OK, and then click Finish.
The connection appears in the Connections pane.

The following image shows the connection, SFAConn, in the Connections pane of the Preferences dialog
box.

11. Click OK.


The Preferences dialog box closes.

Step 3. Add the Connection to the Connection Explorer View


After you create a connection to the data source, add the connection to the Connection Explorer view.

1. To open the Connection Explorer view, click Window > Show View > Connection Explorer.
2. Click the Select Connection button.
The Select Connection dialog box appears.
3. From the Available Connections section, select a connection.

The following image shows the connection, SFAConn, selected in the Available Connections section of the
Select Connection dialog box:

4. Click the right arrow.


The selected connection moves to the Selected Connections section of the Select Connection dialog box.
The following image shows the connection, SFAConn, in the Selected Connections section of the Select
Connection dialog box:

5. Click OK.
The connection appears in the Connection Explorer view.
The following image shows the connection, SFAConn, in the Connection Explorer view:

Step 4. Create a Physical Data Object for the Source Connection


After you create and add the connection to the Connection Explorer view, create a physical data object for the
source table that you want to read data from.

1. Right-click the database connection in the Connection Explorer view.


A connection menu appears.
2. Click Connect.
The connection is activated in the Connection Explorer view.

The following image shows the Connection Explorer view with an active Oracle database connection,
SFAConn:

3. Expand the database to view tables, and right-click the table you want to connect to.
A connection menu appears.
4. Click Add to project.
The Add to project dialog box appears.
5. Select the Create a data object for each resource option.
The New Relational Data Object dialog box appears.
The following image shows the New Relational Data Object dialog box:

6. Select the Create data object from existing resource option.
7. In the Name field, enter a name for the source data object and click Finish.
The data object appears with the connection in the Object Explorer view and opens in the editor.
The following image shows the Developer tool with the CUSTOMER_DATA object that has the SFAConn
connection in the Object Explorer view and open in the editor:

Step 5. Create a Connection for the Target
To specify the connection to the target staging table, use the Developer tool. Create the connection to transfer the data
output to the staging table.

1. Click Window > Preferences.


The Preferences dialog box appears.
The following image shows the Preferences dialog box:

2. Select Informatica > Connections.


The Connections pane appears.
3. Expand Databases in the Available Connections list.

The following image shows the Preferences dialog box with Databases expanded in the Available
Connections list in the Connections pane:

4. Select a connection type in the Available Connections list, and click Add.
The New Database Connection dialog box appears with the database connection type value populated in
the Type field.
5. In the Name field, enter a database connection name.
6. Click Browse.
The Choose Domain dialog box appears.
7. Select the domain in which you want to store the connection, and click OK.
8. Click Next.
The Connection Details page of the New Database Connection dialog box appears.
9. Specify the connection details for the database, and click Test Connection.
The Test Connection dialog box appears.
10. If the connection is successful, click OK, and then click Finish.
The connection appears in the Connections pane.

The following image shows the OraConn connection in the Connections pane of the Preferences dialog
box:

11. Click OK.

Step 6. Add the Connection to the Connection Explorer View


After you create a connection for the target staging table, add the connection to the Connection Explorer view.

1. To open the Connection Explorer view, click Window > Show View > Connection Explorer.
2. Click the Select Connection button.
The Select Connection dialog box appears.
3. From the Available Connections section, select a connection.

The following image shows the OraConn connection selected in the Available Connections section of the
Select Connection dialog box:

4. Click the right arrow.


The selected connection moves to the Selected Connections section of the Select Connection dialog box.
The following image shows the OraConn connection in the Selected Connections section of the Select
Connection dialog box:

5. Click OK.
The connection appears in the Connection Explorer view.
The following image shows the OraConn connection in the Connection Explorer view:

Step 7. Add the Connection to the Physical Data Objects


After you create and add the connection to the Connection Explorer view, add the connection to the physical data
objects that are generated during the synchronization process.

1. Right-click the database connection in the Connection Explorer view.


A connection menu appears.
2. Click Connect.
The connection is activated in the Connection Explorer view.

The following image shows the Connection Explorer view with an active OraConn database connection:

3. Right-click the data object, and then click Open.


The data object opens in the editor.
The following image shows the C_S_CRM_CUST data object open in the editor:

4. Click the Advanced tab.

The Advanced properties page of the data object opens.
The following image shows the Advanced properties page of the data object:

5. Browse to choose the connection to the staging table, and click the Save button.
The data objects appear with the connection that you specify.
The following image shows the Developer tool with the C_S_CRM_CUST customized data object and the
C_S_CRM_CUST relational data object that have the OraConn connection in the Object Explorer view:

Step 8. Add Transformations to Mapplets
To perform data cleansing tasks, you can add transformations to mapplets.

1. In the Object Explorer view, right-click the mapplet, and click Open.
The mapplet opens in the mapplet editor.
The following image shows the mapplet, C_S_CRM_CUST_Mapplet:

The C_S_CRM_CUST_Mapplet mapplet contains the C_S_CRM_CUST_Mapplet_In input transformation and
the C_S_CRM_CUST_Mapplet_Out output transformation.
2. Right-click the mapplet editor, and click Add Transformation.
The Add Transformation dialog box appears.

The following image shows the Add Transformation dialog box:

3. Select the transformation that you want, and click OK.


An empty transformation appears in the mapplet editor.
4. Select the transformation in the editor and configure the transformation.
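
Conceptually, a mapplet passes each record from its input transformation through the transformations that you add, and then to its output transformation. The following Python sketch models that flow outside the Developer tool. It is an analogy only, not Informatica code. The cleanse rules and the column names FIRST_NAME and CITY are hypothetical.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]
Transformation = Callable[[Record], Record]


def trim_whitespace(record: Record) -> Record:
    """Cleanse rule: remove leading and trailing spaces from every value."""
    return {column: value.strip() for column, value in record.items()}


def uppercase_column(column: str) -> Transformation:
    """Build a cleanse rule that uppercases one column and leaves the rest unchanged."""
    def apply(record: Record) -> Record:
        return {**record, column: record[column].upper()}
    return apply


def run_mapplet(records: Iterable[Record],
                transformations: List[Transformation]) -> List[Record]:
    """Pass each input record through the transformations in order."""
    output = []
    for record in records:
        for transform in transformations:
            record = transform(record)
        output.append(record)
    return output


if __name__ == "__main__":
    rows = [{"FIRST_NAME": "  ann ", "CITY": " austin"}]
    print(run_mapplet(rows, [trim_whitespace, uppercase_column("CITY")]))
```

The order of the transformations matters in the sketch, just as the order of linked transformations matters in a mapplet.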

Configure and Run the Mappings


You need to configure a mapping to transform data and load it to the staging tables. The mapping that you use for
staging must contain a physical data object as input, a logical data object as output, and a mapplet that transforms
data.

To perform staging, you need to run the mapping that you configure. The Data Integration Service runs the mapping
and writes the output to the target.

Step 1. Configure the Mappings


You need to create a mapping with source, target, and transformation objects. After you add the mapping objects, you
need to link the ports between mapping objects. Finally, validate the mappings.

1. Create a mapping to transform data and load it into the staging tables.
a. Select a project or folder in the Object Explorer view.
b. Click File > New > Mapping.

The following image shows the Mapping dialog box with the Name and Location fields:

c. Enter a mapping name.


d. Click Finish.
An empty mapping appears in the editor.
The following image shows the Mapping_C_S_CRM_CUST empty mapping:

2. Add objects to the mapping to determine the data flow between sources and targets.
a. Drag the physical data object that you created for the source to the editor and select Read to add the
data object as a source.

b. Drag the logical data object that represents the staging table to the editor and select Write to add the
data object as a target.
The following image shows the Mapping_C_S_CRM_CUST mapping with a physical data object and a
logical data object:

3. Link ports between mapping objects.


You can manually link ports or link ports automatically.

The following image shows the Mapping_C_S_CRM_CUST mapping with links between the physical data
object and the logical data object:

4. Validate a mapping to ensure that the Data Integration Service can read and process the entire mapping.
a. Click Edit > Validate.
Errors might appear in the Validation Log view.
b. Fix errors and validate the mapping again.
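
Before you link ports, it can help to check which source columns have a matching target column. The following Python sketch pairs ports by name, ignoring case, and reports the target ports that stay unlinked. Matching by name is an assumption made for the example, and the Developer tool's own automatic linking rules may differ. The port names are hypothetical.

```python
def link_ports_by_name(source_ports, target_ports):
    """Pair source and target ports whose names match, ignoring case."""
    targets_by_key = {port.upper(): port for port in target_ports}
    links = []
    for source in source_ports:
        target = targets_by_key.get(source.upper())
        if target is not None:
            links.append((source, target))
    linked_targets = {target for _, target in links}
    # Target ports without a link would receive no data from this source.
    unlinked = [port for port in target_ports if port not in linked_targets]
    return links, unlinked


if __name__ == "__main__":
    links, unlinked = link_ports_by_name(
        ["first_name", "LAST_NAME", "REGION"],
        ["FIRST_NAME", "LAST_NAME", "CITY"],
    )
    print("links:", links)
    print("unlinked target ports:", unlinked)
```

Reviewing the unlinked target ports before validation shows which staging table columns still need a manual link or a transformation.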

Step 2. Run the Mappings


Run a mapping to transform data and load it into staging tables.

If you have not selected a default Data Integration Service, the Developer tool prompts you to select one.

• Right-click an empty area in the mapping editor, and click Run Mapping.
The Data Integration Service runs the mapping and writes the output to the target.

Staging Table Management


When you configure Informatica platform staging, you can enable or disable staging for a single staging table or for
all staging tables in the MDM Hub. Before you perform staging, you need to synchronize the MDM Hub metadata with the
Model repository. You can enable or disable synchronization for a single staging table or for all staging tables.

Disable Staging for a Single Staging Table


You can use the Hub Console to disable Informatica platform staging for a single MDM Hub staging table.

1. Start the Schema tool.

2. Acquire a write lock.
3. In the navigation tree, click the staging table of a base object for which you need to disable Informatica
platform staging.
The Staging Table Properties page appears.
The following image shows the Staging Table Properties page for the S_CRM_CUST staging table where
you can disable Informatica platform staging:

4. On the Properties tab, disable Informatica platform staging, and click Save.
The staging table is disabled for Informatica platform staging.

Disable Informatica Platform Staging for All Staging Tables


When you disable the Informatica platform staging for all the MDM Hub staging tables, the staging tables are set for
staging through the MDM Hub. Use the Hub Console to disable the Informatica platform staging for all the MDM Hub
staging tables.

1. Start the Schema tool.


2. Acquire a write lock.
3. In the navigation tree, right-click Base Objects, and then click Disable Informatica platform staging for all
staging tables.
The MDM Hub disables the Informatica platform staging for all the MDM Hub staging tables.

The following image shows the Disable Informatica platform staging for all staging tables option that
appears in the Schema tool.

4. To verify that Informatica platform staging is disabled, click the staging tables for each base object and check
that the Informatica platform staging and Synchronize with Model Repository Service options are
disabled.

Enable Informatica Platform Staging for All Staging Tables


You can use the Hub Console to enable Informatica platform staging for all the MDM Hub staging tables.

1. Start the Schema tool.


2. Acquire a write lock.
3. In the navigation tree, right-click Base Objects, and then click Enable Informatica platform staging for all
staging tables.
The MDM Hub enables Informatica platform staging for all the MDM Hub staging tables.
To verify that Informatica platform staging is enabled, click the staging tables for each base object and check that
the Informatica platform staging option is enabled.
4. In the Developer tool, open the project that you created for Informatica platform staging.
You can see that a project with the Operational Reference Store name is created in the Model repository.
The project contains physical and logical data objects and mapplets for each staging table.

Synchronize the Changes for All the Staging Tables with the Model Repository
You can use the Hub Console to enable the synchronization of the changes to all the MDM Hub staging tables with the
Model repository. Before you synchronize the changes to all the MDM Hub staging tables with the Model repository,
configure all the MDM Hub staging tables for Informatica platform staging.

1. Start the Schema tool.


2. Acquire a write lock.

3. In the navigation tree, right-click Base Objects, and then click Synchronize all staging tables with MRS.
The All tables were synchronized message appears.
4. Click OK.
The changes that you make to the staging tables through the Hub Console appear in the Model repository.
5. Start the Developer tool and select the project that you created for staging.
You can view the physical and logical data objects and mappings for Informatica platform staging.

Additional Documentation
For information about topics related to Informatica platform staging, see the following documentation:

• Informatica Developer Tool Guide. Provides information about data objects and connections.
• Informatica Mapping Guide. Provides information about mappings and mapplets.
• Informatica Developer Transformation Guide. Provides information about transformations.

Author
Brintha Bennet
