2011-04-06
Copyright © 2011 SAP AG. All rights reserved. SAP, R/3, SAP NetWeaver, Duet, PartnerEdge, ByDesign, SAP
Business ByDesign, and other SAP products and services mentioned herein as well as their respective
logos are trademarks or registered trademarks of SAP AG in Germany and other countries. Business
Objects and the Business Objects logo, BusinessObjects, Crystal Reports, Crystal Decisions, Web
Intelligence, Xcelsius, and other Business Objects products and services mentioned herein as well
as their respective logos are trademarks or registered trademarks of Business Objects S.A. in the
United States and in other countries. Business Objects is an SAP company. All other product and
service names mentioned are the trademarks of their respective companies. Data contained in this
document serves informational purposes only. National product specifications may vary. These materials
are subject to change without notice. These materials are provided by SAP AG and its affiliated
companies ("SAP Group") for informational purposes only, without representation or warranty of any
kind, and SAP Group shall not be liable for errors or omissions with respect to the materials. The
only warranties for SAP Group products and services are those that are set forth in the express
warranty statements accompanying such products and services, if any. Nothing herein should be
construed as constituting an additional warranty.
Contents
Chapter 2 Architecture...........................................................................................................................11
2.1 Architecture overview............................................................................................................11
2.1.1 Servers and services..............................................................................................................12
2.2 Information Steward on Business Intelligence platform components.......................................19
2.3 Information workflows............................................................................................................22
2.3.1 Adding a table to a Data Insight project..................................................................................22
2.3.2 Profiling data .........................................................................................................................23
2.3.3 Scheduling and running a Metadata Management integrator source.......................................23
2.3.4 Creating a custom cleansing package with Cleansing Package Builder...................................24
Chapter 13 Supportability......................................................................................................................203
13.1 Information Steward logs.....................................................................................................203
13.1.1 Log levels............................................................................................................................203
Chapter 14 Appendix.............................................................................................................................209
14.1 Glossary..............................................................................................................................209
Index 215
Getting Started
With operational systems frequently changing, data quality control becomes critical when you publish
business reports. SAP BusinessObjects Information Steward provides data profiling and validation rule
features that you can use to determine and improve the quality and structure of your source data.
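To make the idea of a validation rule concrete, here is a toy sketch in Python. It is not Information Steward rule syntax; the rule, column name, and sample rows are invented for illustration, but the pattern (evaluate each row against a rule, collect failures, compute a pass rate) mirrors what Data Insight validation does.

```python
# Toy illustration of a validation rule (NOT Information Steward syntax):
# score rows against a rule and report the pass rate.
def passes(row):
    """Example rule: postal_code must be a 5-digit string."""
    code = row.get("postal_code", "")
    return isinstance(code, str) and len(code) == 5 and code.isdigit()

# Hypothetical sample records, as might come from a profiled source table.
rows = [{"postal_code": "69190"}, {"postal_code": "123"}, {"postal_code": None}]

failed = [r for r in rows if not passes(r)]
pass_rate = (len(rows) - len(failed)) / len(rows)
```

A real deployment would store the failed records and scores in the Information Steward repository rather than in local variables.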
Example:
You perform administrative tasks for SAP BusinessObjects Information Steward on the Central
Management Console (CMC) of SAP Business Intelligence Platform.
1. Access the CMC in one of the following ways:
• Select SAP Business Intelligence Platform Central Management Console from the program
group on the Windows Start menu.
Start > Programs > SAP BusinessObjects XI 4.0 > SAP BusinessObjects Enterprise > SAP
BusinessObjects Enterprise Central Management Console.
• Type the URL of the CMC directly into your browser. For example:
http://webserver:8080/BOE/CMC
Replace webserver with the name of the web server machine. If you changed this default virtual
directory on the web server, you need to type your URL accordingly. If necessary, change the
default port number to the number you provided when you installed Business Intelligence Platform.
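As a quick sanity check of the URL pattern above, the following sketch composes the CMC address from its parts. The defaults mirror the example `http://webserver:8080/BOE/CMC`; your host, port, and virtual directory may differ depending on your installation choices.

```python
# Minimal sketch (not an SAP API): composing the CMC URL from its parts,
# following the pattern http://webserver:8080/BOE/CMC described above.
def cmc_url(host, port=8080, scheme="http", path="/BOE/CMC"):
    """Build the CMC URL; host and port come from your BI platform install."""
    return f"{scheme}://{host}:{port}{path}"

print(cmc_url("webserver"))                       # http://webserver:8080/BOE/CMC
print(cmc_url("bi-prod", port=8443, scheme="https"))
```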
2. Log in to the Central Management Console (CMC) with a user name that belongs to one or more of
the following administration groups:
• Data Insight Administrator
• Metadata Management Administrator
• Administrator
For details, see "To log on to the CMC from your browser" in the BusinessObjects Enterprise
Administrator's Guide.
3. On the CMC Home page, access Information Steward in one of the following ways:
• Click the Information Steward link under the "Organize" area.
• Click the Information Steward tab on the left of your screen.
• Select the Information Steward option from the drop-down list at the top of the CMC Home
page.
Architecture
Information Steward uses SAP BusinessObjects Business Intelligence platform for managing user
security, scheduling integrator sources as tasks and utilities, managing sources, and providing on-demand services.
Information Steward uses Data Services for profiling, rule tasks, browsing metadata, and viewing data.
The following diagram shows the architectural components for SAP BusinessObjects Business
Intelligence platform, SAP BusinessObjects Data Services, and SAP BusinessObjects Information
Steward.
Note:
The diagram shows only the servers and services in the Business Intelligence platform and Data Services
that are relevant to Information Steward.
Related Topics
• Servers and services
• Data Services Job Server
• Web Application Server
SAP BusinessObjects Business Intelligence platform (BI platform) uses the terms server and service
to refer to the two types of software running on an Information platform services machine.
The term “server” is used to describe an operating system level process (on some systems, this is
referred to as a daemon) hosting one or more services. For example, the Enterprise Information
Management Adaptive Processing Server and Information Steward Job Server are servers. A server
runs under a specific operating system account and has its own PID.
A “service” is a server subsystem that performs a specific function. The service runs within the memory
space of its server under the process id of the parent container (server). For example, the Information
Steward Task Scheduling Service is a subsystem that runs within the Information Steward Job Server.
A “node” is a collection of BI platform servers running on the same host. One or more nodes can be on
a single host.
Information platform services can be installed on a single machine, spread across different machines
on an intranet, or separated over a wide area network (WAN).
The Job Server component of SAP BusinessObjects Data Services is required for the following reasons:
• The Data Services Job Server must already be installed on this computer because it provides the
following system management tools that are required during the first installation of Information
Steward:
• Repository Manager
The Repository Manager creates the required Data Insight objects in the Information Steward
repository. The Information Steward installer invokes the Repository Manager automatically when
creating the repository the first time the installer is run.
• Server Manager
The Server Manager creates the Information Steward job server group and job servers and
associates them to the Information Steward repository.
To add job servers to the Information Steward job server group, you must manually invoke the
Server Manager. For details, see “Adding a job server for Data Insight” in the Installation Guide.
• The Data Services Job Server provides the engine processes that perform the Data Insight profiling
and rule tasks. The engine processes use parallel execution and in-memory processing to deliver
high data throughput and scalability.
Note:
When you install Data Services, ensure that you select Job Server under the Server component
in the "Select Feature" window of the Data Services installer.
In addition, you need to choose MDS and VDS during the Data Services installation. These two options
are not checked by default.
Information Steward consists of the following modules:
• Data Insight
• Metadata Management
• Metapedia
• Cleansing Package Builder
Note:
The Information Steward web application must be installed on the same web application server as
the SAP BusinessObjects Business Intelligence platform web applications.
For specific version compatibility, refer to the Product Availability Matrix available at http://service.sap.com/PAM.
2.1.1.2.1 Administration
You use the Central Management Console (CMC) to perform SAP BusinessObjects Information Steward
administrative tasks such as the following:
• Define Data Insight connections and projects
• Configure and run metadata integrators
• Define source groups to subset the metadata when viewing relationships such as Same As, Impact,
and Lineage
• Administer user security for the modules of Information Steward: Data Insight, Metadata Management,
and Cleansing Package Builder
• Configure application settings that affect the behavior and performance of Data Insight profile and
rule tasks
• Schedule Data Insight profile and rule tasks
• Run or schedule Information Steward utilities
For more information, see the SAP BusinessObjects Information Steward Administrator Guide.
• Metadata Management can perform tasks such as:
• Impact analysis - Allows you to identify which objects will be affected if you change or remove
other connected objects.
• Lineage analysis - Allows you to trace back from a target object to the source object.
• Metapedia can perform tasks such as:
• Define Metapedia terms related to your business data and organize the terms into categories.
• Cleansing Package Builder can perform tasks such as:
• Define cleansing packages to parse and standardize data
• Publish a cleansing package and export it to SAP BusinessObjects Data Services where users
can import it to generate a base Data Cleanse transform that can be included in jobs to cleanse
your data
For more information, see the SAP BusinessObjects Information Steward User Guide.
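Impact and lineage analysis, described above, can be pictured as traversals over a source-to-target dependency graph. The sketch below is illustrative only (the object names and graph layout are made up, and this is not the Metadata Management implementation): impact walks downstream from a changed object, lineage walks upstream from a target.

```python
# Illustrative only: impact and lineage analysis as graph traversals over a
# hypothetical source -> target dependency graph.
from collections import deque

edges = {  # source object -> target objects it feeds
    "crm.customers": ["staging.customers"],
    "staging.customers": ["report.revenue", "report.churn"],
}

def impact(obj):
    """Objects affected downstream if `obj` changes (impact analysis)."""
    seen, queue = set(), deque([obj])
    while queue:
        for tgt in edges.get(queue.popleft(), []):
            if tgt not in seen:
                seen.add(tgt)
                queue.append(tgt)
    return seen

def lineage(obj):
    """Trace upstream from a target object back to its sources (lineage analysis)."""
    reverse = {}
    for src, tgts in edges.items():
        for t in tgts:
            reverse.setdefault(t, []).append(src)
    seen, queue = set(), deque([obj])
    while queue:
        for src in reverse.get(queue.popleft(), []):
            if src not in seen:
                seen.add(src)
                queue.append(src)
    return seen
```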
You can also obtain third-party metadata integrators for other data sources. For more information about
third-party metadata integrators, see http://www.metaintegration.net/Products/MIMB/SupportedTools.html.
Related Topics
• User Guide: section "BusinessObjects Enterprise objects"
• User Guide: "SAP NetWeaver Business Warehouse metadata"
• User Guide: section "Data Modeling metadata"
• User Guide: section "BusinessObjects Data Federator objects"
• User Guide: section "BusinessObjects Data Services objects"
• http://www.metaintegration.net/Products/MIMB/SupportedTools.html
• http://www.metaintegration.net/Products/MIMB/Documentation/
• User Guide: section "Relational Database metadata"
2.1.1.4 Services
The following entries describe each of the services that are pertinent to SAP BusinessObjects Information
Steward.

Information Steward Administrator Task Service
• Part of server that service runs on: Enterprise Information Management Adaptive Processing Server
• Service description: Performs tasks on Data Services such as testing rules, deleting objects (connection, datastore, workflow) from Data Services, and exporting rules to Data Services.
• Deployment comments: An Adaptive Processing Server of SAP BusinessObjects Business Intelligence platform must already be installed on this computer.

Information Steward Task Scheduling Service
• Part of server that service runs on: Information Steward Job Server
• Service description: Processes scheduled Data Insight profile and rule tasks in the Central Management Console (CMC).
• Deployment comments: An Adaptive Job Server of SAP BusinessObjects Business Intelligence platform must already be installed on this computer.

Information Steward Integrator Scheduling Service
• Part of server that service runs on: Information Steward Job Server
• Service description: Processes scheduled Metadata Management integrator sources in the Central Management Console (CMC).
• Deployment comments: An Adaptive Job Server of SAP BusinessObjects Business Intelligence platform must already be installed on this computer.
The following entries describe how SAP BusinessObjects Information Steward uses each pertinent SAP
BusinessObjects Business Intelligence platform (BI platform) component.

Central Management Server (CMS)
Maintains a database of information about your BI platform system. The data stored
by the CMS includes information about users and groups, security levels, schedule
information, BI platform content, and servers. For more information about the CMS,
see the SAP BusinessObjects Business Intelligence Platform Administrator's Guide.
The following objects in the Metadata Management module of Information Steward
are stored in the CMS:
• Integrator source configurations
• Source groups
• Utilities configurations
• Data Insight connections
• Projects
• Tasks
Note:
Because integrator source configurations and source group definitions are stored
in the CMS, you can use the Upgrade management tool to move them from one
version of the CMS to another. The schedules and rights information are considered
dependencies of these configurations. For details, see the Upgrade Guide and the
“Lifecycle Management” section in the Administrator Guide.

Adaptive Job Server
Information Steward uses the Adaptive Job Server for executing profiling
tasks and integrator tasks. The server may host the following services for Information Steward:
• Information Steward Task Scheduling Service
• Information Steward Integrator Scheduling Service

Adaptive Processing Server
The Enterprise Information Management (EIM) Adaptive Processing Server runs within the BI platform
Adaptive Processing Server and hosts the following services:
• Metadata Relationship Service
• Metadata Search Service
• Metadata Integrator Service
• Data Services Metadata Browsing Service
• Data Services View Data Service
• Information Steward Administrator Task Service
• Cleansing Package Builder Core Service
• Cleansing Package Builder Auto-analysis Service
• Cleansing Package Builder Publishing Service
When tasks are performed in SAP BusinessObjects Information Steward, such as adding a table to a
Data Insight project, running a Metadata Management integrator source, or creating a cleansing package,
information flows through SAP BusinessObjects Business Intelligence platform services, SAP BusinessObjects
Data Services, and SAP BusinessObjects Information Steward. The servers and services within each
of these software products communicate with each other to accomplish a task. For an overview of the
servers and services, see Architecture overview.
The following sections describe some of the process flows as they happen in SAP BusinessObjects
Business Intelligence platform, SAP BusinessObjects Data Services, and SAP BusinessObjects
Information Steward.
This workflow describes the process of adding a table to a Data Insight project.
1. The user selects Add > Tables on "Workspace Home" in the Data Insight tab to access the
"Browse Metadata" window.
2. The web application server passes the request to the Central Management Server (CMS) and returns
a list of connections that the user can view assuming the user has appropriate permissions to view
the connections.
3. If the user has the appropriate rights to view the selected connection, the CMS sends the request
to the Data Services Metadata Browsing Service.
4. The Data Services Metadata Browsing Service obtains the metadata from the connection and sends
the metadata to the web application server.
5. The web application server displays the metadata in the Data Insight "Browse Metadata" window.
6. When the user selects a table and clicks Add to Project, the web application stores the metadata
in the Information Steward repository.
This workflow describes the process of running a profile task in Data Insight. The process of running
validation rules is similar.
1. The user selects the name of a table or file on "Workspace Home" in the Data Insight tab and clicks
Profile.
2. The user chooses the tables in the "Workspace Home" of the Data Insight tab.
3. The user saves the task and schedules it to run.
4. The web application server passes the request to the Central Management Server (CMS).
5. The Information Steward web application determines from the CMS system if the user has the right
to run profile tasks on the connection that contains the table or file.
6. The administrator determines if the user has the right to create a profile task for the connection, and
has the right to schedule the task. If so, the task is scheduled in the CMS system.
7. When the scheduled time arrives, the CMS sends the task information to the Information Steward
Task Scheduling Service.
8. The Information Steward Task Scheduling Service sends the profile task to the Data Services Job
Server.
9. The Data Services Job Server partitions the profile task based on the performance application
settings.
10. The Data Services Job Server executes the profile task and stores the results in the Information
Steward repository.
11. The web application server displays the profile results in the Data Insight "Workspace Home" window.
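Step 9 above mentions that the Data Services Job Server partitions a profile task for parallel, in-memory execution. The following single-process sketch illustrates the idea only; it is not the Data Services engine, and the partitioning scheme, column statistics, and sample rows are invented. Each partition is profiled independently and the per-partition counts are then merged.

```python
# Conceptual sketch of column profiling (NOT the Data Services engine):
# partition the rows, profile each partition, then merge the counts.
def profile_partition(rows, column):
    """Per-partition statistics: row count, null count, distinct values."""
    nulls = sum(1 for r in rows if r.get(column) is None)
    values = {r[column] for r in rows if r.get(column) is not None}
    return {"rows": len(rows), "nulls": nulls, "distinct": values}

def merge(results):
    """Combine partition statistics into table-level profile results."""
    out = {"rows": 0, "nulls": 0, "distinct": set()}
    for part in results:
        out["rows"] += part["rows"]
        out["nulls"] += part["nulls"]
        out["distinct"] |= part["distinct"]
    return out

rows = [{"city": "Berlin"}, {"city": None}, {"city": "Paris"}, {"city": "Berlin"}]
parts = [rows[:2], rows[2:]]                    # two partitions
stats = merge(profile_partition(p, "city") for p in parts)
```

In the real product, each partition would run as a separate engine process and the merged results would be written to the Information Steward repository.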
This workflow describes the process of scheduling and running a Metadata Management integrator
source to collect metadata.
1. The user schedules an integrator source on the Central Management Console (CMC) and the request
is sent to the CMS system.
2. The CMS determines if the user has the appropriate rights to schedule the integrator source.
3. If the user has the appropriate rights to schedule the object, the CMS commits the scheduled
integrator request to the CMS system.
4. When the scheduled time arrives, the CMS finds a suitable Information Steward Job Server based
on the Job Server group associated with the integrator and passes the job.
If the process has a SAP BusinessObjects Enterprise 3.x source system, the process contacts the
registered remote job server and passes along the integrator process information.
5. The integrator process collects metadata and stores the metadata in the Information Steward
repository.
6. The integrator process also generates the Metadata Management search index files and loads them
to the Input File Repository Server.
7. After uploading the search index files, the integrator source notifies the Metadata Management
search service.
8. The Metadata Management search service downloads the generated index files and consolidates
them into a master index file.
9. The Information Steward Integrator Scheduling Service updates the CMS with the job status.
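Steps 6 through 8 describe per-source search index files being consolidated into one master index. The sketch below illustrates that merge step only; the inverted-index layout (term to set of object identifiers) and the sample entries are assumptions, not the actual Metadata Management index format.

```python
# Rough sketch of consolidating per-source search indexes into a master
# index, as in steps 6-8 above. The term -> object-id layout is an assumption.
def consolidate(per_source_indexes):
    """Merge term -> {object ids} maps collected from each integrator source."""
    master = {}
    for index in per_source_indexes:
        for term, ids in index.items():
            master.setdefault(term, set()).update(ids)
    return master

idx_a = {"customer": {"table.customers"}, "revenue": {"report.revenue"}}
idx_b = {"customer": {"view.customer_dim"}}
master = consolidate([idx_a, idx_b])
```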
This workflow describes the process of creating and publishing a custom cleansing package in Cleansing
Package Builder.
1. In Information Steward, the user clicks the Cleansing Package Builder tab.
2. The Cleansing Package Builder (CPB) application sends the user's login information to the CPB
Web Service.
3. The CPB Web Service sends the information to the Enterprise Information Management (EIM)
Adaptive Processing server.
The EIM Adaptive Processing Server runs on the Business Intelligence platform.
4. The EIM Adaptive Processing Server determines which rights the user has in CPB.
5. The information is sent back through the CPB Web Service to the CPB application.
The user sees the cleansing packages they have the rights to view.
6. In the "Cleansing Packages Tasks" screen, the user selects New Cleansing Package > Custom
Cleansing Package to start creating a cleansing package.
The user provides the necessary information and sample data to create the cleansing package.
7. The CPB application sends the information through the CPB Web Service to the CPB Core Service,
using the BusinessObjects Enterprise SDK mechanism.
The CPB Core Service handles the main functions of CPB. The CPB Core Service runs on the EIM
Adaptive Processing Server.
8. The CPB Core Service sends the response back through the CPB Web Service to the CPB
application.
The new cleansing package is created in CPB.
9. The application communicates with the CPB Auto-Analysis Service through the CPB Web Service.
The CPB Auto-Analysis Service analyzes the data to create suggestions of standard forms and
variations. The CPB Auto-Analysis Service runs on the EIM Adaptive Processing Server.
10. When the user has finished refining the cleansing package, the user clicks Publish on the "Cleansing
Packages Tasks" screen.
11. The CPB application communicates with the CPB Publishing Service through the CPB Web Service.
The CPB Publishing Service assists in the cleansing package's conversion to the reference data
format used by Data Services. The CPB Publishing Service runs on the Enterprise Information
Management (EIM) Adaptive Processing Server.
12. The published cleansing package information is sent to the Input File Repository, where it is stored
and can be accessed by Data Services.
The Input File Repository runs on the Business Intelligence platform.
13. Data Services communicates directly with the Business Intelligence platform to sync with the published
cleansing packages.
Securing SAP BusinessObjects Information Steward
SAP BusinessObjects Information Steward uses the security framework that SAP BusinessObjects
Business Intelligence platform (BI platform) provides.
The BI platform architecture addresses the many security concerns that affect today's businesses and
organizations. The current release supports features such as distributed security, single sign-on, resource
access security, granular object rights, and third-party authentication in order to protect against
unauthorized access.
For details about how BI platform addresses enterprise security concerns, see the “Securing Information
platform services” section of the SAP BusinessObjects Business Intelligence Platform Administrator's
Guide.
For details about user groups and granular object rights, see Information Steward pre-defined users
and groups.
The following topics detail how Information Steward uses the enterprise security features provided by
BI platform.
Related Topics
• Securing user data for Information Steward
• Storage of sensitive information
• Configuring the Remote Job Server for SSL
• Reverse proxy servers
SAP BusinessObjects Information Steward is a web-based application that uses enterprise security
provided by SAP BusinessObjects Business Intelligence platform. This section details the ways in which
Information Steward takes advantage of the following SAP BusinessObjects Enterprise security features:
• Secure access to user data
• Storage of sensitive information
• Secure connections
• Reverse proxy servers
Related Topics
• Securing user data for Information Steward
• Storage of sensitive information
• Configuring the Remote Job Server for SSL
• Reverse proxy servers
SAP BusinessObjects Information Steward has access to the following data which might contain sensitive
information:
• Source data in Data Insight connections on which users run profile and rule tasks
• Sample data from profiling results that Data Insight stores in the Information Steward repository
• Sample data that failed validation rules that Data Insight stores in the Information Steward repository
• All data that failed validation rules that a user chooses to store in a database accessed through a
Data Insight connection.
The Database Administrator (DBA) secures the data in these databases by managing user permissions
on them:
• Data Insight connections for profiling
• Information Steward repository
• Data Insight connections for all data that failed validation rules
In addition, the Data Insight Administrator or Administrator controls access to the data by using the
Central Management System (CMS) to manage the following rights on the Data Insight connections:
• View Data
• Profile/Rule permission
• View Sample Data
• Export Data
For more information, see User rights in Data Insight.
Information Steward uses SAP BusinessObjects Enterprise cryptography, which is designed to
protect sensitive data stored in the CMS repository. Sensitive data includes user credentials, data
source connectivity data, and any other info objects that store passwords. This data is encrypted to
ensure privacy, keep it free from corruption, and maintain access control. For more information, see
the "Overview of SAP BusinessObjects Enterprise data security" section of the SAP BusinessObjects
Enterprise Administrator's Guide.
Encryption of sensitive information, such as passwords, is done in the following Information Steward
areas:
• Information Steward repositories
• Metadata Integrator sources
• Data Insight connections
The SAP BusinessObjects Enterprise Metadata Integrator in SAP BusinessObjects Information Steward
4.0 can collect metadata from an SAP BusinessObjects Enterprise XI 3.x system by using the Remote
Job Server. When you install Information Steward, you install the Remote Job Server component on
the computer where the Enterprise XI 3.x system resides. Then you use the Information Steward Service
Configuration to configure the Remote Job Server. For information about installing the Remote Job
Server on the SAP BusinessObjects Enterprise XI 3.x system, see “Remote Job Server Installation” in
the Installation Guide.
If you are using the Secure Sockets Layer (SSL) protocol for all network communication between clients
and servers in your SAP BusinessObjects Enterprise XI 3.x and SAP BusinessObjects Business
Intelligence platform 4.0 deployments, you can use SSL for the network communication between the
Remote Job Server and the Metadata Integrator on Information Steward. In this environment:
• The server is the Remote Job Server on SAP BusinessObjects Enterprise XI 3.x. To enable SSL,
the server must have both keystore and truststore files defined.
• The client is the Metadata Integrator on Information Steward 4.0. To enable SSL, the client must
use the same truststore and password as the server.
To set up SSL between the Remote Job Server and the Metadata Integrator, you need to perform
the following tasks:
• Create keystore and truststore files for the Remote Job Server and copy the truststore file to the
SAP BusinessObjects Business Intelligence platform 4.0 system.
• Configure the location of SAP BusinessObjects XI 3.x SSL certificates and key file names (from
Server Intelligence Agent (SIA)).
Related Topics
• Creating the keystore and truststore files for the Remote Job Server
• Configuring the SSL protocol for the Remote Job Server
3.2.3.1 Creating the keystore and truststore files for the Remote Job Server
To set up SSL protocol for communication to the Remote Job Server, use the keytool command to:
• Create a certificate and store it in a keystore file on the computer where SAP BusinessObjects
Enterprise XI 3.x resides
• Create a trust certificate and store it in a truststore file on the computer where SAP BusinessObjects
Enterprise XI 3.x resides
• Copy the trust certificate into a truststore file on the computer where SAP BusinessObjects Business
Intelligence Platform 4.0 resides
1. On the computer where you installed SAP BusinessObjects Enterprise XI 3.x, generate a keystore
file and export it:
a. Open a cmd window and go to the directory where Metadata Management configuration files are
stored for Information Steward.
For example, type the following command:
cd C:\Program Files (x86)\SAP BusinessObjects\InformationSteward\MM\config
d. Import the certificate from the export file to create the truststore file.
For example, type the following command to import the certificate into a truststore file named
is.truststore.keystore:
%JAVA_HOME%\bin\keytool -import -v -trustcacerts
-file isClient.cer -keystore is.truststore.keystore
-keypass mypwkey -storepass mypwstore
This command stores the certificate in the truststore file in the InformationSteward\MM\Config
directory.
2. Copy the truststore file to the computer where SAP BusinessObjects Business Intelligence Platform
4.0 resides.
a. Ensure that the is.truststore.keystore file is in a directory that is accessible to both the
computer where SAP BusinessObjects Enterprise XI 3.x is installed and the computer where
SAP BusinessObjects Business Intelligence Platform 4.0 is installed.
b. Copy the is.truststore.keystore file to the directory where Metadata Management
configuration files are stored for Information Steward.
For example:
C:\Program Files (x86)\SAP BusinessObjects\InformationSteward\MM\config
3.2.3.2 Configuring the SSL protocol for the Remote Job Server
After you create a key and certificate on the Remote Job Server computer and store them in a secure
location, you need to provide the Information Steward Service Configuration on SAP BusinessObjects
XI 3.x with the secure location.
Note:
To run an integrator source with SSL enabled on the Remote Job Server, set the following run-time
JVM parameter. For more details, see Metadata collection using the Remote Job Server with SSL .
-Dbusinessobjects.migration=on
SAP BusinessObjects Information Steward can be deployed in an environment with one or more reverse
proxy servers. A reverse proxy server is typically deployed in front of the web application servers in
order to hide them behind a single IP address. This configuration routes all Internet traffic that is
addressed to private web application servers through the reverse proxy server, hiding private IP
addresses.
Because the reverse proxy server translates the public URLs to internal URLs, it must be configured
with the URLs of the Information Steward web applications that are deployed on the internal network.
For information about supported reverse proxy servers and how to configure them, see “Information
platform services and reverse proxy servers” and “Configuring reverse proxy servers for Information
platform” in the SAP information platform services Administrator's Guide.
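The core of the reverse proxy's job, translating public URLs to internal URLs, can be pictured with the following simplified sketch. It is illustrative only: the hostnames, paths, and prefix-routing table are invented, and a real reverse proxy (and its configuration syntax) does far more than this.

```python
# Simplified sketch of a reverse proxy's URL translation: map a public URL
# prefix onto an internal application server URL. Hostnames and paths are
# hypothetical, not actual Information Steward deployment values.
ROUTES = {
    "/BOE/InfoStewardApp": "http://internal-app1:8080/BOE/InfoStewardApp",
}

def rewrite(public_path):
    """Return the internal URL for a public request path, or None if unrouted."""
    for prefix, internal in ROUTES.items():
        if public_path.startswith(prefix):
            return internal + public_path[len(prefix):]
    return None  # no route configured for this path

url = rewrite("/BOE/InfoStewardApp/home")
```

This is why each Information Steward web application deployed on the internal network must be registered in the proxy's routing configuration: a path with no route simply cannot be reached from outside.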
Users and Groups Management
The Central Management System (CMS) manages security information, such as user accounts, group
memberships, and object rights that define user and group privileges. When a user attempts an action
on an Information Steward object, the CMS authorizes the action only after it verifies that the user's
account or group membership has sufficient privileges.
Information Steward provides pre-defined user groups that have specific rights on objects unique to
each module. These user groups enable you to grant rights to multiple users by adding the users to a
group instead of modifying the rights for each user account individually. You also have the ability to
create your own user groups.
Related Topics
• Information Steward pre-defined users and groups
• Managing users in Information Steward
• User rights in Data Insight
• User rights in Metadata Management
• User rights in Cleansing Package Builder
SAP BusinessObjects Information Steward provides pre-defined user groups for each module (Data
Insight, Metadata Management, Cleansing Package Builder) to facilitate managing security on the
objects within each module.
The following diagram provides an overview of the pre-defined Information Steward user groups and
their relation to the Administrator group in SAP BusinessObjects Business Intelligence platform.
• The Administrator group in Business Intelligence platform is for users that:
• Create users and custom groups for all Information Steward modules
• Have rights to perform all tasks within all Information Steward modules
• Grant users access to cleansing packages
• The Data Insight Administrator is for users that grant users and groups access to connections and
projects (by default, all pre-defined Data Insight groups are granted access to all connections and
projects). A Data Insight Administrator also has access to all Data Insight actions.
• The Metadata Management Administrator grants users access to integrator sources. The Metadata
Management Administrator also has access to all Metadata Management actions.
• Within each Information Steward module, additional user groups have specific rights for the objects
within that module. For example, the Data Insight Analyst group can create profile tasks and rules,
but only the Data Insight Rule Approver can approve rules.
Subsequent topics describe these pre-defined groups and the rights they have on the specific objects.
Related Topics
• Data Insight pre-defined user groups
• Type-specific rights for Data Insight objects
• Metadata Management pre-defined user groups
• Type-specific rights for Metadata Management objects
This section contains the steps to create users and add them to Information Steward groups.
Create user accounts and assign them to groups or assign rights to control their access to objects in
SAP BusinessObjects Information Steward.
To create users:
1. Log on to the Central Management Console (CMC) with a user name that belongs to the Administrator
group.
2. At the CMC home page, click Users and Groups.
3. Click Manage > New > New User.
4. To create a user:
a. Select your authentication type from the Authentication Type list.
Information Steward can use the following user authentication types:
• Enterprise (default)
• LDAP
• Windows Active Directory (AD)
• SAP ERP and Business Warehouse (BW)
b. Type the account name, full name, email, and description information.
Tip:
Use the description area to include extra information about the user or account.
c. Specify the password information and settings.
5. To create a user that will log on using a different authentication type, select the appropriate option
from the Authentication Type list, and type the account name.
6. Specify how to designate the user account according to options stipulated by your SAP
BusinessObjects Business Intelligence platform license agreement.
If your license agreement is based on user roles, select one of the following options:
• BI Viewer: access to Business Intelligence platform applications for all accounts under the BI
Viewer role is defined in the license agreement. Users are restricted to access application
workflows that are defined for the BI Viewer role. Access rights are generally limited to viewing
business intelligence documents. This role is typically suitable for users who consume content
through Business Intelligence platform applications.
• BI Analyst: access to SAP BusinessObjects Enterprise applications for all accounts under the
BI Analyst role is defined in the license agreement. Users can access all applications workflows
that are defined for the BI Analyst role. Access rights include viewing and modifying business
intelligence documents. This role is typically suitable for users who create and modify content
for Business Intelligence platform applications.
If your license agreement is not based on user roles, specify a connection type for the user account.
• Choose Concurrent User if this user belongs to a license agreement that states the number of
users allowed to be connected at one time.
• Choose Named User if this user belongs to a license agreement that associates a specific user
with a license. Named user licenses are useful for people who require access to BusinessObjects
Enterprise regardless of the number of other people who are currently connected.
Related Topics
• Adding users and user groups to Information Steward groups
• BusinessObjects Enterprise Administrator's Guide: Managing users and groups
Groups are collections of users who share the same rights to different objects. SAP BusinessObjects
Information Steward provides groups for Data Insight and Metadata Management, such as Data Insight
Analyst group and Metadata Management User group.
Related Topics
• Data Insight pre-defined user groups
• Group rights for connections
• Group rights for projects
• Group rights for tasks
• Metadata Management pre-defined user groups
The Data Insight module of SAP BusinessObjects Information Steward contains the following objects,
each with specific rights that control the actions users can perform on them.
• Connections, through which users view data sources and import tables and files to profile the data.
In addition to rights to the connection, a user must also be granted permission on the source data:
• For database connections, the Database Administrator must grant privileges on the tables to the
user.
• For file connections, the users that run the following services must have permissions on the directory
where the file resides:
• Information Steward Web Application Server (for example, Tomcat)
• Data Services service
• Server Intelligence Agent that runs EIMAdaptiveProcessingServer and ISJobServer
• Views, which can join tables and files from multiple connections.
• Projects, which contain profile tasks, rule tasks, and scorecards in specific business areas, such as
HR or Sales.
• Profile tasks, which collect profile attributes to help you determine the quality and structure of the data.
• Rule tasks, which validate the data according to your business and quality rules.
The following diagram shows the users and groups who are granted rights to access connections,
projects, and tasks.
Related Topics
• Data Insight pre-defined user groups
• Type-specific rights for Data Insight objects
• Customizing rights on Data Insight objects
• Managing users in Information Steward
SAP BusinessObjects Information Steward provides the following pre-defined Data Insight user groups
to facilitate the assignment of rights on connections, projects, and tasks. These groups enable you to
change the rights for multiple users in one place (a group) instead of modifying the rights for each user
account individually.
The following table describes the pre-defined user groups in ascending order of rights.
Data Insight User: Users that can only view the connections, projects, source data, profile results,
sample profile data, rules, sample data that failed rules, and scorecard results.

Data Insight Analyst: Users that have all the rights of a Data Insight User, plus the following rights:
• Add tables and files to a project
• Remove tables and files from a project
• Create, edit, and delete profile tasks and rule tasks
• Create, edit, and delete rules; bind rules
• Schedule profile tasks and rule tasks

Data Insight Rule Approver: Users that have all the rights of a Data Insight Analyst, plus the right to
approve and reject rules.

Data Insight Scorecard Manager: Users that have all the rights of a Data Insight Analyst, plus the right
to create and edit scorecards that consist of rules for specific business areas called Key Data Domains.

Data Insight Administrator: Users that have all the above rights on Data Insight objects, plus the
following rights:
• Configure, edit, delete, run, schedule, and view the history of Information Steward utilities
• Create, edit, and delete Data Insight connections and projects
• Configure Information Steward application settings
• Change the Information Steward repository user and password
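The cumulative nature of these groups can be sketched as nested rights sets. The group and right names below are illustrative only, not an Information Steward API:

```python
# Illustrative sketch: each pre-defined group adds rights on top of a lower
# group. These are plain Python sets, not a product API.
USER = {"view connections", "view profile results", "view rules"}
ANALYST = USER | {"create rule tasks", "schedule tasks"}
RULE_APPROVER = ANALYST | {"approve rules"}
SCORECARD_MANAGER = ANALYST | {"manage scorecards"}
ADMINISTRATOR = RULE_APPROVER | SCORECARD_MANAGER | {
    "create connections", "create projects",
}

# A Rule Approver inherits everything an Analyst can do.
print(ANALYST <= RULE_APPROVER)  # True
```

The subset check mirrors how the groups are described in the table: adding a user to a higher group implicitly grants all rights of the lower groups.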
Related Topics
• Group rights for Data Insight folders and objects in the CMC
• Group rights for connections
• User rights for views
• Group rights for projects
• Group rights for tasks
Rights are the base units for controlling user access to the objects, users, applications, servers, and
other features in SAP BusinessObjects Enterprise.
“Type-specific rights” are rights that affect specific object types only, such as Data Insight connections,
Data Insight projects, profile tasks, or rule tasks.
Rights are set on objects, such as a Data Insight connection or project, rather than on the "principals"
(the users and groups) who access them. By default, the pre-defined Data Insight user groups are
granted access to newly created connections and projects. If you want some users to access only
certain connections and projects, then do not add them to a pre-defined group, but add their user names
to the list of principals for each individual connection and project and assign the appropriate type-specific
rights. For example, to give a user access to a particular connection, you add the user to the list of
principals who have access to the connection.
For more information, see "How rights work in BusinessObjects Enterprise" in the BusinessObjects
Enterprise Administrator's Guide.
Related Topics
• Group rights for connections
• Group rights for projects
• Group rights for tasks
4.4.2.1 Group rights for Data Insight folders and objects in the CMC
Each pre-defined Data Insight user group provides specific rights on the folders and objects in the CMC,
as the following table shows.
Folder or object           Right                  Description                  Admin  SM   RA   Analyst  User
Connections folder         Add objects to folder  Create connection            Yes    No   No   No       No
Projects folder            View objects           View Projects folder         Yes    Yes  Yes  Yes      Yes
Projects folder            Add objects            Create project               Yes    No   No   No       No
Profile task or rule task  View objects           View task properties         Yes    Yes  Yes  Yes      Yes
Profile task or rule task  Schedule               Schedule task                Yes    Yes  Yes  Yes      No
Profile task or rule task  Reschedule             Change the schedule of task  Yes    No   No   No       No
Profile task or rule task  Delete objects         Delete task                  Yes    Yes  Yes  Yes      No
(Admin = Data Insight Administrator; SM = Data Insight Scorecard Manager; RA = Data Insight Rule Approver)
Related Topics
• User rights for Information Steward administrative tasks
SAP BusinessObjects Information Steward provides pre-defined Data Insight user groups that have
specific rights on connections, as the following table shows. You can add users to these groups to
control their rights on connections.
Right                    Description                          Admin  SM   RA   Analyst  User
Profile/Rule permission  Create profile tasks and rule tasks  Yes    Yes  Yes  Yes      No
(Admin = Data Insight Administrator; SM = Data Insight Scorecard Manager; RA = Data Insight Rule Approver)
Note:
• Rights to a Data Insight connection are granted to users or groups when the Data Insight Administrator
adds them to the "Principals" list for the connection. For more information, see Adding users and
user groups to Information Steward groups.
• The Administrator and the Data Insight Administrator can create, edit, and delete connections in the
CMC. See Group rights for Data Insight folders and objects in the CMC.
• For a database connection, the Database Administrator must grant the Data Insight user access to
the tables.
• For a file connection, the users that run the following services must have permissions on the directory
where the file resides:
• Information Steward Web Application Server (for example, Tomcat)
• Data Services service
• Server Intelligence Agent that runs EIMAdaptiveProcessingServer and ISJobServer
Related Topics
• User rights in Data Insight
• Data Insight pre-defined user groups
• Assigning users to specific Data Insight objects
Each pre-defined Data Insight user group provides specific rights on projects in Information Steward,
as the following table shows.
Right             Description                                      Admin  SM   RA   Analyst  User
View objects      View project, view profile results, view rule    Yes    Yes  Yes  Yes      Yes
                  results, view rules, and view scorecard results
Import            Import rules and import views                    Yes    Yes  Yes  Yes      No
Manage scorecard  Add, edit, and remove Key Data Domains           Yes    Yes  No   No       No
(Admin = Data Insight Administrator; SM = Data Insight Scorecard Manager; RA = Data Insight Rule Approver)
Note:
Only the Administrator and Data Insight Administrator can create, edit, and delete projects in the CMC.
See Group rights for Data Insight folders and objects in the CMC.
The rights each user has on a view are inherited from the rights the user has on the connections that
make up the view.
For example, suppose View1 consists of the following connections and tables:
• ConnectionA, Table1
• ConnectionB, Table2
If User1 has the Edit right on ConnectionA but not on ConnectionB, User1 cannot edit View1 because
the denied Edit right is inherited from ConnectionB.
Similarly, if User2 has the Edit right on both ConnectionA and ConnectionB, then User2 can edit View1.
This inheritance applies to all of the rights on views, as the following table shows.
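The inheritance rule in the example above can be sketched as a logical AND over the source connections. The function and data names are illustrative, not an Information Steward API:

```python
# A right on a view is granted only if the same right is granted on every
# connection the view draws from (illustrative sketch, not a product API).
def view_right_granted(right, connection_rights):
    """connection_rights maps connection name -> set of granted rights."""
    return all(right in rights for rights in connection_rights.values())

# User1: Edit on ConnectionA but not on ConnectionB -> cannot edit View1.
user1 = {"ConnectionA": {"Edit"}, "ConnectionB": set()}
# User2: Edit on both connections -> can edit View1.
user2 = {"ConnectionA": {"Edit"}, "ConnectionB": {"Edit"}}

print(view_right_granted("Edit", user1))  # False
print(view_right_granted("Edit", user2))  # True
```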
Right on view            Action                                         Inherited from
View objects             The view name and columns are visible in       View objects right on all source
                         the Workspace Home window                      connections
View Sample Data         View profile sample data and sample data       View Sample Data right on all source
                         that failed rules                              connections
Profile/Rule permission  Create profile tasks and rule tasks            Profile/Rule right on all source
                                                                        connections
Export Data              Export viewed data, profile sample data, and   Export Data right on all source
                         sample data that failed rules                  connections
Note:
The following actions require rights on the project:
• To add, edit, or remove views, a user must have the Edit objects right on the project.
• To copy a view, a user must have the Add objects right on the project.
Each pre-defined Data Insight user group provides specific rights on profile tasks and rule tasks in
Information Steward, as the following table shows.
Related Topics
• Group rights for Data Insight folders and objects in the CMC
To facilitate user management, assign users to a pre-defined Information Steward user group. By default,
Information Steward assigns all pre-defined user groups to all Data Insight connections and projects.
However, you might want to limit a user's access, for example to:
• Only view and profile data in a subset of connections
• Only create rules on a subset of projects
• Only create scorecards and approve rules on a subset of projects
• Restrict project access
• Restrict access to Data Insight
Related Topics
• Assigning users to specific Data Insight objects
• Denying user rights to specific Data Insight objects
• Data Insight pre-defined user groups
By default, the pre-defined Data Insight user groups are added to the access list of connections and
projects when you create them. However, you might want to deny one or a small subset of users access
to a specific Data Insight connection and project.
To allow a user or group to access all but one specific Data Insight connection and project:
1. Log on to the Central Management Console (CMC) with a user name that is a member of either the
Administrator group or the Data Insight Administrator group.
2. Add the user name to a pre-defined Data Insight group because the user would still have access to
most connections and projects. For details, see Adding users and user groups to Information Steward
groups.
3. At the CMC home page, click Information Steward.
4. Select the object type.
• To deny rights to a connection:
• Select the Connections node in the Tree panel.
• Select the connection name in the right panel.
• To deny rights to a project:
• Expand the Data Insight node, and expand the Projects node in the Tree panel.
• Select the project name in the Tree panel.
• To deny rights to a profile or rule task:
• Expand the Data Insight node, and expand the Projects node in the Tree panel.
• Select the project name in the Tree panel.
• Select the task name in the right panel.
For example, to deny the right to view and profile the data in this connection, click the Denied column
for the following "Specific Rights for Data Insight Connection":
• Profile/Rule permission
• View Data
• View Sample Data
12. To deny project rights:
a. Expand the "Application" node and select "Data Insight Project".
b. Click the Denied column for each right that you want to deny this user or group.
For example, to deny the right to create scorecards and approve rules in this project, click the Denied
column for the following "Specific Rights for Data Insight Project":
• Approve Rule
• Manage Rule
13. To deny profile or rule task rights:
a. Expand the "Application" node and select "Information Steward Profiler Task".
b. Click the Override General Global column and the Denied column for each right that you want
to deny this user or group.
For example, to deny the right to schedule a profile task or rule task, click the Override General
Global column and the Denied column for the following "General Rights for Data Insight Profiler
Task":
• Reschedule instances that the user owns
• Schedule document to run
• View document instances
14. Click OK and verify that the list under "Right Name" does not display the rights you denied.
15. Click the name of the principal you just added, click View Security, and verify that the list of rights that
you denied has the red icon in the "Status" column.
16. Click OK and close the "User Security" window.
Related Topics
• Data Insight pre-defined user groups
You might want to limit a user's or group's access to only specific Data Insight connections, projects,
and tasks. In this case, you would add the user or group to the list of principals for each specific Data
Insight object (instead of adding to a pre-defined Data Insight user group).
Note:
You must assign the user to both the connection and the project to be able to add tables or files, create
profile tasks, and create rules.
Related Topics
• Data Insight pre-defined user groups
Whenever a project is created, all pre-defined Data Insight user groups are automatically added to its
principal list. This feature facilitates user rights management when you want the same user or same
group of users to have rights on all projects.
You might want to limit rights of a subset of users to only one project. For example, you might want to
limit the Manage Scorecards right on the Human Resources project to only User A, and you want only
User B to have the Manage Scorecards right on the Finance project.
To restrict the Manage Scorecards right to a specific user for each project:
1. Create User A and User B. For details, see Creating users for Information Steward .
2. Add User A and User B to the Data Insight Analyst user group. For details, see Adding users and
user groups to Information Steward groups.
This Data Insight Analyst user group has all of the rights of the Data Insight Scorecard Manager
user group except the Manage Scorecard right, which you will grant to specific users within a project
in subsequent steps.
3. Create the Human Resources project and the Finance project. For details, see Creating a project.
4. To grant User A the Manage Scorecard right on the Human Resources project:
a. Log on to the Central Management Console (CMC) with a user name that is a member of either
the Administrator group or the Data Insight Administrator group.
b. At the CMC home page, click Information Steward, expand the Data Insight node, and expand
the Projects node in the Tree panel.
c. Select the Human Resources project and click Manage > Security > User Security.
Related Topics
• Creating users for Information Steward
• Adding users and user groups to Information Steward groups
• Denying user rights to specific Data Insight objects
The Metadata Management module of SAP BusinessObjects Information Steward contains the following
objects, each with object-specific rights that control the actions users can perform on them.
• Metadata Management application through which users can view relationships (such as Same As,
Impact, and Lineage) between integrator sources.
• Integrator Sources through which users collect metadata.
• Integrator Source Groups, which let users subset the metadata when viewing relationships.
• Metapedia through which users define terms related to their business data and organize the terms
into categories.
The CMS manages security information, such as user accounts, group memberships, and object rights
that define user and group privileges. When a user attempts an action on a Metadata Management
object, the CMS authorizes the action only after it verifies that the user's account or group membership
has sufficient privileges.
Related Topics
• Metadata Management pre-defined user groups
• Group rights for Metadata Management objects in Information Steward
• Group rights for Metadata Management folders and objects
• Assigning users to specific Metadata Management objects
• Managing users in Information Steward
SAP BusinessObjects Information Steward provides the following Metadata Management user groups
to enable you to change the rights for multiple users in one place (a group) instead of modifying the
rights for each user account individually.
Metadata Management User: Users that can only view metadata in the Metadata Management tab of
Information Steward.

Metadata Management Data Steward: Users that have all the rights of a Metadata Management User,
plus the following rights:
• Create and edit annotations
• Create custom attributes and edit values of custom attributes
• Define Metapedia categories and terms

Metadata Management Administrator: Users that have all the rights of a Metadata Management Data
Steward, plus the following rights:
• Create, edit, and delete Metadata Management integrator sources and source groups
• Run and schedule Metadata Integrators
• Configure, edit, delete, schedule, and view the history of Information Steward utilities
Related Topics
• Information Steward pre-defined users and groups
Type-specific rights affect only specific object types, such as integrator sources or Metapedia objects.
The following topics describe type-specific rights for each Information Steward object in the CMC and
SAP BusinessObjects Information Steward.
Related Topics
• Group rights for Metadata Management folders and objects
Each pre-defined Metadata Management user group provides specific rights on the folders and objects
in the Central Management Console (CMC), as the following table shows.
Folder or object          Right                      Description                              Admin  Data Steward  User
Integrator Source folder  Add objects to the folder  Configure new integrator sources         Yes    No            No
Integrator Sources        Add objects to the folder  Create integrator source configurations  Yes    No            No
Integrator Sources        Delete objects             Delete integrator sources, delete        Yes    No            No
                                                     integrator source instances, and
                                                     delete the integrator source schedule
                          Add objects to the folder  Create integrator source groups          Yes    No            No
(Admin = Metadata Management Administrator; Data Steward = Metadata Management Data Steward;
User = Metadata Management User)
Each pre-defined Metadata Management user group provides specific rights on objects in Information
Steward, as the following table shows.
Object         Rights                        Description                       Admin  Data Steward  User
Metapedia tab  Add objects to the folder,    • Create Category                 Yes    Yes           No
               Edit objects, Delete objects  • Create Term
                                             • Import to Excel
                                             • Edit Category
                                             • Edit Term (including Approval)
                                             • Add Terms to Categories
                                             • Relate Terms
                                             • Associate objects to a Term
                                             • Delete Related Terms
                                             • Delete Associated Objects
                                             • Delete associated Terms
                                             • Delete Category
                                             • Delete Term
(Admin = Metadata Management Administrator; Data Steward = Metadata Management Data Steward;
User = Metadata Management User)
By default, the pre-defined Metadata Management user groups are added to the access list of integrator
sources and source groups when you create them. You might want to allow only certain users to
configure integrator sources, define source groups, or define Metapedia categories and terms. In these
cases, you would add the user or group to the specific Metadata Management object's access list
(instead of adding to a pre-defined Metadata Management user group).
1. Log on to the Central Management Console (CMC) with a user name that is a member of either the
Administrator group or the Metadata Management Administrator group.
2. At the CMC home page, click Information Steward.
3. Expand the Metadata Management node in the Tree panel.
4. Select the object type.
• For an integrator source:
a. Select the Integrator Sources node in the Tree panel.
For example, to grant the right to schedule and view integrator instances, click the Override General
Global column and the Granted column for the following "General Rights for Metadata Management
Integrator configuration":
• Pause and resume document instances
• Schedule document to run
• View document instances
14. Click OK and verify that the list under "Right Name" displays the rights you just added.
15. Click OK and verify that the list of principals includes the name or names you just added.
16. Close the "User Security" window.
The Cleansing Package Builder module of SAP BusinessObjects Information Steward contains the
following objects to which you control access.
• Private cleansing packages: Private cleansing packages are viewed or edited by the user who owns
them and are listed under My Cleansing Packages. Private cleansing packages include those created
by using the New Cleansing Package Wizard or by importing a published cleansing package.
• Published cleansing packages: Published cleansing packages are cleansing packages included
with SAP BusinessObjects Information Steward or cleansing packages which a data steward created
and then published. Published cleansing packages are available to all users and can be used in an
SAP BusinessObjects Data Services Data Cleanse transform or imported and used as the basis for
a new cleansing package.
Related Topics
• Group rights for cleansing packages
• Managing users in Information Steward
The Administrator and the pre-defined Cleansing Package Builder User groups can perform specific
actions on cleansing packages, as the following table shows.
Application                Right   Description               Administrator  Cleansing Package Builder User
Cleansing Package Builder  Create  Create cleansing package  Yes            Yes
The following table describes the Information Steward actions in the "Applications" area of the CMC.
To perform any of these Information Steward actions, a user must belong to one of the following groups:
• Administrator
• Data Insight Administrator
• Metadata Management Administrator
Table 4-11: Actions for Information Steward in the CMC "Applications" area
Related Topics
• Job server group
• Utilities overview
• Configuration settings
• Viewing and editing repository information
• Group rights for Data Insight folders and objects in the CMC
• Group rights for Metadata Management folders and objects
You might want to view or edit the SAP BusinessObjects Information Steward repository connection
information for situations such as the following:
• View connection information, such as database type and server name.
Data Insight Administration
Each deployment of SAP BusinessObjects Information Steward supports multiple users in one or more
Data Insight projects to assess and monitor the quality of data from various sources.
• A project is a collaborative workspace for data stewards and data analysts to assess and monitor
the data quality of a specific domain and for a specific purpose (such as customer quality assessment,
sales system migration, and master data quality monitoring).
• A connection defines the parameters for Information Steward to access a data source. A data source
can be a relational database, application, or file.
Data Insight users in a project can perform tasks such as browsing tables and files in connections,
adding tables of interest to the project, profiling the data, executing rules to measure data quality, and
creating and monitoring data quality scorecards. For more details, see the "Data Insight" section in the
User Guide.
Before Data Insight users can perform the above tasks, administrators must perform the following tasks on
the Central Management Console (CMC) of SAP BusinessObjects Business Intelligence Platform:
• Define connections to data sources
• Create projects
• Add user names to pre-defined Data Insight user groups to access connections and projects
Tip:
Users of a project should be given access to the same set of connections so that they can collaborate
on the same set of data within a project.
• Optionally, define connections to save all data that failed rules for rule tasks. Each rule task can use
a different failed data connection.
After a Data Insight user creates tasks, the Data Insight Administrator can do the following tasks:
• Create schedules to run the profile task and rule task at regular intervals.
• Modify the default schedule to run utilities if the frequency of the profile and rule tasks warrants it.
• Modify application-level settings to change the default configuration for Data Insight.
Tip:
It is recommended that the Data Insight Administrator be someone who is familiar with the CMC.
Related Topics
• Data Insight Connections
Data Insight lets you define the following types of connections:
• For data profiling
You can configure connections to the following types of data sources to collect profile attributes that
can help you determine the data quality and structure:
• Databases such as Microsoft SQL Server, IBM DB2, Oracle, MySQL, Informix IDS, Sybase ASE,
and ODBC
• Applications such as SAP Business Suite and SAP NetWeaver Business Warehouse
• Text files
Note:
For a complete list of supported databases, applications, and their versions, see the Platform Availability
Matrix available at http://service.sap.com/PAM.
Related Topics
• Defining a Data Insight connection to a database
• Defining a Data Insight connection to an application
• Defining a Data Insight connection to a file
• Displaying and editing Data Insight connection parameters
You define a Data Insight connection to a database for any of the following purposes:
• It contains data that you want to profile, run validation rules on, and calculate quality scorecards for,
to determine the quality and structure of the data.
Option           Description
Connection Name  Name that you want to use for this Data Insight connection.
                 • Maximum length is 64 characters
                 • Can be multi-byte
                 • Case insensitive
                 • Can include underscores and spaces
                 • Cannot include other special characters: ?!@#$%^&*()-+={}[]:";'/\|.,`~
                 You cannot change the name after you save the connection.
7. In the Connection Type drop-down list, select the Database connection value.
8. In the Purpose of connection drop-down list, select one of the following options:
• For data profiling if you want to profile the data in this connection
• For failed data if you want to use this connection to store all the data that fail specific quality
validation rules.
9. In the Database Type drop-down list, select the database that contains the data that you want to
profile or that will store the data that fail validation rules.
10. Enter the relevant connection information for the database type.
11. If you want to verify that Information Steward can connect successfully before you save this profile
connection, click Test connection.
12. Click Save.
Note:
After you save the connection, you cannot change its name, connection type, purpose, or the connection parameters that uniquely identify the database.
The newly configured connection appears in the list on the right of the "Information Steward" page.
After you create a connection, you must authorize users to it so that they can perform tasks such as viewing the data, running profile tasks, and running validation rules on the data.
Related Topics
• User rights in Data Insight
• SAP In-Memory Database connection parameters
• IBM DB2 connection parameters
• Informix database connection parameters
• Microsoft SQL Server connection parameters
• MySQL database connection parameters
• Netezza connection parameters
• ODBC connection parameters
• Oracle database connection parameters
• Sybase ASE connection parameters
• Sybase IQ connection parameters
• Teradata connection parameters
Language (possible value: default): Language of the data in the database. Select the correct language for your database server.
Language (possible value: default): Language of the data in the database. Select the correct language for your database server.
Data Source Name (refer to the requirements of your database): Type the data source name defined in DB2 for connecting to your database.
Language (possible value: default): Language of the data in the database. Select the correct language for your database server.
Data Source Name (refer to the requirements of your database): Type the Data Source Name defined in ODBC.
Language (possible value: default): Language of the data in the database. Select the correct language for your database server.
To use Microsoft SQL Server as a profile source when SAP BusinessObjects Information Steward is
running on a UNIX platform, you must use an ODBC driver, such as the DataDirect ODBC driver.
For more information about how to obtain the driver, see the Platforms Availability Report (PAR) available
in the SAP BusinessObjects Support > Documentation > Supported Platforms section of the SAP
Service Marketplace: http://service.sap.com/bosap-support.
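On UNIX, such an ODBC data source is typically declared in an odbc.ini file. The entry below is only a sketch: the data source name, host, database, and driver path are placeholders that depend on your DataDirect release and installation.

```ini
[MSSQL_Profile_Source]
; Example only: the driver library name and path vary by
; DataDirect release and platform.
Driver=/opt/odbc/lib/sqlserver_driver.so
Description=SQL Server source for Data Insight profiling
HostName=mssql-host.example.com
PortNumber=1433
Database=CustomerDB
```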
Database Name (refer to the requirements of your database): Enter the name of the database to which the profiler connects.
Language (possible value: default): Language of the data in the database. Select the correct language for your database server.
Language (possible value: default): Language of the data in the database. Select the correct language for your database server.
Language (possible value: default): Language of the data in the database. Select the correct language for your database server.
Language (possible value: default): Language of the data in the database. Select the correct language for your database server.
Language (possible value: default): Language of the data in the database. Select the correct language for your database server.
Database Name (refer to the requirements of your database): Enter the name of the database to which the profiler connects.
Language (possible value: default): Language of the data in the database. Select the correct language for your database server.
User Name (specific to the database server and language): Enter the user name of the account through which the software accesses the database.
Language (possible value: default): Language of the data in the database. Select the correct language for your database server.
Language (possible value: default): Language of the data in the database. Select the correct language for your database server.
You must define a connection to any application that contains data that you want to profile to determine
the quality and structure of the data.
Option Description
Connection Name: Name that you want to use for this Data Insight source.
• Maximum length is 64 characters
• Can be multi-byte
• Case insensitive
• Can include underscores and spaces
• Cannot include other special characters: ?!@#$%^&*()-+={}[]:";'/\|.,`~
You cannot change the name after you save the connection.
6. In the Connection Type drop-down list, select the Application connection value.
7. In the Application Type drop-down list, select one of the following applications that contains the
data you want to profile:
• SAP NetWeaver Business Warehouse
• SAP Applications
For the specific components from which Information Steward can profile data, see the Product Availability Matrix available at http://service.sap.com/PAM.
8. Enter the relevant connection information for the application type.
9. If you want to verify that Information Steward can connect successfully before you save this profile
connection, click Test connection.
10. Click Save.
The newly configured connection appears in the list of Profile Connections on the right of the
"Information Steward" page.
After you create a connection, you must authorize users to it so that they can perform tasks such as viewing the data, running profile tasks, and running validation rules on the data.
Related Topics
• User rights in Data Insight
• SAP Applications connection parameters
• SAP NetWeaver Business Warehouse connection parameters
The SAP NetWeaver Business Warehouse connection has the same options as the SAP Applications connection type.
Related Topics
• SAP Applications connection parameters
To connect to SAP Applications, the Data Insight module of Information Steward uses the same
connections as Data Services. For more information, see the SAP BusinessObjects Data Services
Supplement for SAP.
Server Name (computer name, fully qualified domain name, or IP address): Name of the remote SAP application computer (host) to which the software connects.
Client Number (000-999): The three-digit SAP client number. Default is 800.
System Number (00-99): The two-digit SAP system number. Default is 00.
User Name (alphanumeric characters and underscores): Enter the name of the account through which the software accesses the SAP application server.
Application Language (E - English, G - German, F - French, J - Japanese): Select the login language from the drop-down list. You can enter a customized SAP language in this option. For example, you can type S for Spanish or I for Italian.
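As a sketch of the formats listed above, the following function (hypothetical, not part of any SAP or Information Steward API) checks the Client Number, System Number, and User Name fields:

```python
import re

# Hypothetical validation helper; the field names mirror the options
# in the table above, not an actual API.
def check_sap_connection(params: dict) -> list:
    """Return a list of problems found in the connection parameters."""
    problems = []
    if not re.fullmatch(r"[0-9]{3}", params.get("client_number", "")):
        problems.append("Client Number must be three digits (000-999)")
    if not re.fullmatch(r"[0-9]{2}", params.get("system_number", "")):
        problems.append("System Number must be two digits (00-99)")
    if not re.fullmatch(r"[A-Za-z0-9_]+", params.get("user_name", "")):
        problems.append("User Name: alphanumeric characters and underscores only")
    return problems
```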
FTP These options are visible if you selected the FTP data transfer method.
You must define a connection to a text file that contains data that you want to profile to determine the
quality and structure of the data.
Option Description
5. In the Connection Type drop-down list, select the File connection option.
6. Enter the path for the file in Directory Path.
7. Click Save.
The name of the newly configured connection appears in the list on the right pane of the "Information
Steward" page.
After you create a connection, you must authorize users to it so that they can perform tasks such as viewing the data, running profile tasks, and running validation rules on the data.
Related Topics
• User rights in Data Insight
Situations when you might want to view or change Data Insight connection parameters include:
• You need to view the connection parameters on a development or test system so that you can
recreate the Data Insight connection when you move to a production system.
• You have several source systems and you want to ensure that you are connecting to the appropriate source.
6. If you want to verify that Information Steward can connect successfully before you save this Data
Insight connection, click Test connection.
7. Click Save.
The edited description appears in the list of projects on the right pane of the "Information Steward"
page.
The following table shows the Data Insight objects that you can delete from the Central Management Console (CMC).
For information about dependencies when deleting connections and projects on the Central Management
Console (CMC), see “Deleting a connection” and “Deleting a project” in the Administrator Guide.
Note:
You cannot delete a connection if a table or file is being used in a project. You must remove the table
or file from all projects on Information Steward before you can delete the connection or project in the
CMC.
Connection (dependencies: referenced project, table metadata): You must delete each table from the Workspace of each project in Data Insight.
1. Log on to the Central Management Console (CMC) with a user name that belongs to the Data Insight
Administrator group or that has the Create right on Connections in Information Steward.
2. At the CMC home page, click Information Steward.
3. Click the Connections node in the Tree panel.
4. From the list in the right pane, select the name of the connection and click Manage > Delete.
5. To confirm that you want to delete this connection, click OK in the warning pop-up window.
6. If the following message appears, you must delete each table from the "Workspace" of each Data Insight project listed in the message.
The connection cannot be deleted because it is referenced by the following:
Project: projectname
A project is a collaborative workspace for data stewards and data analysts to assess and monitor the
data quality of a specific domain and for a specific purpose (such as customer quality assessment,
sales system migration, and master data quality monitoring).
Create a Data Insight project in the Central Management Console (CMC) to allow your users to define
the project's tasks to profile and validate data in SAP BusinessObjects Information Steward.
1. Log on to the CMC with a user name that belongs to the Data Insight Administrator group or that
has the Create right on Projects in Information Steward.
2. At the CMC home page, click Information Steward.
The "Information Steward" page opens with the Information Steward node selected in the Tree
panel.
3. Expand the Data Insight node in the Tree panel, and select Projects.
4. Click Manage > New > Data Insight Project.
5. On the "New Profiling Project" window, enter the following information.
Option Description
6. Click Save.
Note:
After you save the project, you cannot change its name.
The new project appears in the list of projects on the right pane of the "Information Steward" page.
After you create a project, you must grant users rights to perform actions such as creating profile and rule tasks, running these tasks, or creating scorecards.
Related Topics
• User rights in Data Insight
To edit a Data Insight project, you must have Edit rights on the project or be a member of the Data
Insight Administrator group.
Caution:
When you delete a project, you delete all of its contents, including unapproved rules, tasks, scorecards, profile results, sample data, views, and so forth.
The Data Insight module of SAP BusinessObjects Information Steward lets you create the following types of tasks, which you can run immediately or schedule in the CMC to run later or on a recurring basis:
• Profile tasks to generate profile attributes about data tables and files
• Rule tasks to execute rules bound to table columns to measure the data quality
The Data Insight Administrator can schedule a profile or rule task for a given project in the Central
Management Console (CMC). A Data Insight user must have already created the profile task or rule
task on Information Steward (see “Creating a profile task” or “Creating a rule task and setting an alert”
in the User Guide).
You would schedule a profile task or rule task for reasons including the following:
• Set up a recurring time to run a profile task or rule task.
• Specify a specific time to run the profile task or rule task.
a. Click Recurrence in the navigation tree in the left pane of the "Schedule" window.
b. Select the frequency in the Run object drop-down list.
c. Select the additional relevant values for the recurrence option.
For a list of the recurrence options and the additional values, see Recurrence options.
d. Optionally, set the Number of retries to a value other than the default 0 and change the Retry interval in seconds from the default value 1800.
9. If you want to trigger the execution of this utility when an event occurs, expand Events, and fill in the appropriate information. For more information about events, see the SAP BusinessObjects Business Intelligence Platform Administrator Guide.
10. If you want to change the default values for the runtime parameters for the task, click the Parameters node on the left. For parameter descriptions, see Common runtime parameters for Information Steward.
11. Click Schedule.
Related Topics
• Pausing and resuming a schedule
When you schedule an integrator source or an SAP BusinessObjects Information Steward utility, you
can choose the frequency to run it in the Recurrence option. The following table describes each
recurrence option and shows the additional relevant values that you must select for each recurrence
option.
Once: The utility runs once only. Select the values for Start Date/Time and End Date/Time.
Hourly: The utility runs every N hours and X minutes. Select the values for Hour(N), Minute(X), Start Date/Time, and End Date/Time.
Daily: The utility runs once every N days. Select the value for Days(N).
Weekly: The utility runs once every week on the selected days. Select the days of the week, the Start Date/Time, and the End Date/Time.
Monthly: The utility runs once every N months. Select the values for Month(N), Start Date/Time, and End Date/Time.
Nth Day of Month: The utility runs on the Nth day of each month. Select the values for Day(N), Start Date/Time, and End Date/Time.
1st Monday of Month: The utility runs on the first Monday of each month. Select the values for Start Date/Time and End Date/Time.
Last Day of Month: The utility runs on the last day of each month. Select the values for Start Date/Time and End Date/Time.
X Day of Nth Week of the Month: The utility runs on the X day of the Nth week of each month. Select the values for Week(N), Day(X), Start Date/Time, and End Date/Time.
Calendar: The utility runs on the days you specified as "run" days on a calendar you have created in the "Calendars" management area of the CMC.
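To make the calendar-based options concrete, here is a small sketch (my own, using only the Python standard library) of how the "1st Monday of Month" and "Last Day of Month" options resolve to run dates; the CMC scheduler performs this resolution internally:

```python
import calendar
from datetime import date

def first_monday(year: int, month: int) -> date:
    """Run date for the '1st Monday of Month' recurrence option."""
    first_weekday, _ = calendar.monthrange(year, month)  # Monday == 0
    return date(year, month, 1 + (7 - first_weekday) % 7)

def last_day(year: int, month: int) -> date:
    """Run date for the 'Last Day of Month' recurrence option."""
    _, days_in_month = calendar.monthrange(year, month)
    return date(year, month, days_in_month)

print(first_monday(2011, 4))  # 2011-04-04
print(last_day(2011, 4))      # 2011-04-30
```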
The Data Insight Administrator or someone in the Administrator group can configure the server to
provide email notifications in the Central Management Console (CMC). A Data Insight user must have
already created the profile task or rule task on Information Steward (see “Creating a profile task” or
“Creating a rule task and setting an alert” in the User Guide).
When a task is scheduled to run via the Central Management Console (CMC), you can be notified whether the task completed successfully or with errors. A profiling task is considered to be in error when profiling any of the tables fails, either because of an infrastructure error or because of invalid source information such as an invalid connection, table, or column.
A calculate score task fails only when a table in the task was unable to generate its score, either due to an infrastructure error or due to invalid source information such as an invalid connection, table, or column.
Option Description
Use Job Server defaults Select to use the settings already defined in the Job Server.
5. In the CMC Home window, click Servers. Select the ISJobServer. If more than one is available,
configure each one.
6. Choose Manage > Properties, and then click Destination in the navigation pane.
7. Select Email from the Destination drop-down list and then click Add.
8. To set up a notification server for completed processing, you must enter information into the Domain,
Host, and Port fields. All other fields are optional. The following table describes the fields on the
Destination page:
Option Description
Domain (required): Enter the fully qualified domain of the SMTP server.
Port (required): Enter the port that the SMTP server is listening on. (The standard SMTP port is 25.)
Authentication: Select Plain or Login if the job server must be authenticated using one of these methods in order to send email.
User Name: Provide the Job Server with a user name that has permission to send email and attachments through the SMTP server.
Password: Provide the Job Server with the password for the SMTP server.
From: Provide the return email address. Users can override this default when they schedule an object.
Add placeholder: You can add placeholder variables to the message body using the Add placeholder list. For example, you can add the report title, author, or the URL for the viewer in which you want the email recipient to view the report.
9. Click Save.
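Conceptually, the Job Server fills in any placeholder variables in the message body and hands the result to the SMTP server named in the settings above. The sketch below is illustrative only; the %TASK_NAME% token is a made-up placeholder, not the actual CMC placeholder syntax:

```python
from email.message import EmailMessage

def build_notification(from_addr: str, to_addr: str,
                       template: str, values: dict) -> EmailMessage:
    """Substitute placeholder tokens into the body and build the email."""
    body = template
    for token, value in values.items():
        body = body.replace(token, value)
    msg = EmailMessage()
    msg["From"] = from_addr          # the Destination page's From field
    msg["To"] = to_addr
    msg["Subject"] = "Task completion notification"
    msg.set_content(body)
    return msg

msg = build_notification("jobserver@example.com", "steward@example.com",
                         "Task %TASK_NAME% finished.",
                         {"%TASK_NAME%": "daily_profile"})
```

Sending such a message would then use the configured Domain/Host and Port, with authentication if Plain or Login is selected.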
Related Topics
• Rule threshold notification
The Data Insight Administrator or someone in the Administrator group can configure the server to
provide email notifications in the Central Management Console (CMC). A Data Insight user must have
already created the profile task or rule task on Information Steward (see “Creating a profile task” or
“Creating a rule task and setting an alert” in the User Guide).
When a calculate score task is created, you have the option to provide an email address for notification when the rule score falls below the low threshold setting. You must configure the notification server before processing the task so that the server has the correct information to send in the alert; you then receive an email when the task completes. To configure the notification server to receive processing information:
1. In the CMC Home window, click Servers.
2. Select Server List and then highlight the AdaptiveProcessingServer. If you have more than one
available, you must configure all of them.
3. Choose Manage > Properties and then select Destination in the navigation pane.
4. Select Email from the Destination drop-down list and then click Add.
5. To set up a notification server for rules, you must enter information into the Domain, Host, Port and
From fields. All other fields are optional. The following table describes the fields on the Destination
page:
Option Description
Domain (required): Enter the fully qualified domain of the SMTP server.
Port (required): Enter the port that the SMTP server is listening on. (The standard SMTP port is 25.)
Authentication: Select Plain or Login if the job server must be authenticated using one of these methods in order to send email.
User Name: Provide the Job Server with a user name that has permission to send email and attachments through the SMTP server.
Password: Provide the Job Server with the password for the SMTP server.
From (required): Provide the return email address. Users can override this default when they schedule an object.
To: Not used in this scenario. The email address specified when creating the task is used.
Add placeholder: You can add placeholder variables to the message body using the Add placeholder list. For example, you can add the report title, author, or the URL for the viewer in which you want the email recipient to view the report.
6. Click Save.
Related Topics
• Configuring for task completion notification
Pending: Task is scheduled to run one time. When it actually runs, the status changes to "Running."
9. To close the "Data Insight Task History" page, click the X in the upper right corner.
Related Topics
• Scheduling a task
• Information Steward logs
• Log levels
• Changing log levels
You pause a recurring schedule for a task when you do not want to run it at its regularly scheduled time
until you resume the schedule.
To pause a schedule:
1. Log on to the CMC with a user name that belongs to the Data Insight Administrator group or that
has the Create right on Projects in Information Steward.
2. At the CMC home page, click Information Steward.
The "Information Steward" page opens with the Information Steward node selected in the Tree
panel.
3. In the Tree panel, expand the Data Insight node, and then expand the Projects node.
4. Select the name of your project in the Tree panel.
5. Select the name of the task from the list on the right panel.
6. Click Action > History in the top menu tool bar.
A list of instances for the task appears in the right pane.
7. Select the task instance that has the value "Recurring" in the "Schedule Status" column and click Action > Pause in the top menu tool bar.
8. When you are ready to resume the recurring schedule, select the task instance that has the value "Paused" in the "Schedule Status" column and click Action > Resume in the top menu tool bar.
When you schedule a profile task, rule task, or integrator source, you can change the default values of
the runtime parameters in the Parameters option when you schedule the instance. The following table
describes the runtime parameters that are applicable to all metadata integrators, profile tasks, and rule
tasks. For information about runtime parameters that apply to only specific metadata integrators, see
the topics in Related Topics below.
File Log Level: A Data Insight task or Metadata Integrator creates this log in the BusinessObjects installation directory and copies it to the File Repository Server. You can download this log file after the Data Insight task or Metadata Integrator run completes. The default logging level for this log is Configuration. Usually you can keep the default logging level. However, if you need to debug your task or integrator run, you can change the level to log tracing information. For instructions on changing log levels, see "Changing log levels".
JVM Arguments: The Information Steward Job Server creates a Java process to perform the profile task, rule task, or metadata collection. Use the JVM Arguments parameter to configure runtime parameters for the Java process. For example, if there are many parsed values per row in the data being used by Cleansing Package Builder, you might want to provide more memory than the default value for the -Xmx argument.
Additional Arguments: Optional runtime parameters for the metadata integrator source or Data Insight task. For more information, see "mm_admin_runtime_parm_boe_integ.dita#icc14.0.0_topic_45BF540E11CB43F78E2AB9FE71384A8E".
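For example, a JVM Arguments value of -Xms512m -Xmx4096m raises the Java heap ceiling to 4 GB (the 4096m figure is an example, not a recommendation). A tiny sketch of how the -Xmx setting reads:

```python
import re

def max_heap_mb(jvm_arguments: str) -> int:
    """Return the -Xmx heap limit in MB, or 0 if the JVM default applies."""
    match = re.search(r"-Xmx(\d+)([mMgG])", jvm_arguments)
    if not match:
        return 0
    size, unit = int(match.group(1)), match.group(2).lower()
    return size * 1024 if unit == "g" else size
```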
Related Topics
• JVM runtime parameters
The "Information Steward Settings" window in the Central Management Console allows you to configure data profiling and rule settings in Data Insight. These settings control the behavior and performance of data profiling tasks and rule tasks. Some settings provide default values for parameters that you can override when you define a profiling task or rule task in Data Insight.
Related Topics
• Configuring profiling tasks and rule tasks
• Profiling task settings and rule task settings
The following parameters in the "Information Steward Settings" window control the behavior of profiling
tasks, rule tasks, and their performance.
Note:
• The parameters Max input size, Max sample data size, and Optimization period affect the amount
of data to process. The more data to process, the more resources are required for efficient processing.
For performance considerations, see Input data settings.
• The parameters Max sample data size, Number of distinct values, Number of patterns, Number
of words, and Results retention period affect the size of the Information Steward repository. If
you increase these values, the repository size also increases and you might need to free space
more often. For more information, see Settings to control repository size.
Note:
If you specify Sub-table, ensure that:
• You set up a pageable cache directory on a shared directory that is accessible to all job servers in the job server group.
• The network is efficient between the job servers in the group.
• The requirements for a Data Services job server group are met.
For more information, see Distribution level sub-table.
Related Topics
• Configuring profiling tasks and rule tasks
1. Log on to the Central Management Console (CMC) with a user name that belongs to the Data Insight
Administrator group or Administrator group.
2. Select Applications from the navigation list at the top of the CMC Home page.
3. In the Application Name list, select the Information Steward application.
4. Click Actions > Configure Application to display the "Information Steward Settings" window.
5. Keep or change the parameter values listed in the "Information Steward Settings" window.
6. Click Save to apply the changed settings. Changes that affect the user interface will only be visible
to users once they log out and log back in.
To reset the settings to their default values, click Reset. To cancel the changes made to the settings, click Cancel.
Related Topics
• Profiling task settings and rule task settings
Metadata Management Administration
The Metadata Management module of SAP BusinessObjects Information Steward collects metadata
about objects from different source systems and stores the metadata in a repository. Source systems
include Business Intelligence (SAP BusinessObjects Enterprise and SAP NetWeaver Business
Warehouse), Data Modeling, Data Integration (SAP BusinessObjects Data Services and SAP
BusinessObjects Data Federation), and Relational Database systems.
When you access Information Steward as a user that belongs to the Metadata Management Administrator
group, you can perform the following tasks in the CMC:
• Configure integrator sources from which to collect metadata (see Configuring sources for Metadata
Integrators)
• Run Integrators to collect metadata (see Running a Metadata Integrator)
• View the status of Metadata Integrator runs (see Viewing integrator run progress and history)
• Organize Metadata Integrator Sources into groups for relationship analysis (see Grouping Metadata Integrator sources)
• Manage user security of Metadata Integrator sources, source groups, and Metapedia (see Type-specific rights for Metadata Management objects)
• Compute and store end-to-end impact and lineage information for Reporting (see Computing and
storing lineage information for reporting)
• Manage the Metadata Management search indexes (see Recreating search indexes on Metadata
Management)
SAP BusinessObjects Metadata Integrators collect metadata from repository sources that you configure,
and they populate the SAP BusinessObjects Information Steward repository with the collected metadata.
When you install Information Steward, you can select the following Metadata Integrators:
• SAP BusinessObjects Enterprise Metadata Integrator
• SAP NetWeaver Business Warehouse Metadata Integrator
• Common Warehouse Metamodel (CWM) Metadata Integrator
• SAP BusinessObjects Data Federator Metadata Integrator
This section describes how to configure the SAP BusinessObjects Enterprise Metadata Integrator for the SAP BusinessObjects Enterprise repository, which is managed by the SAP BusinessObjects Central Management Server (CMS). This Integrator collects metadata for Universes, Crystal Reports, Web Intelligence documents, and Desktop Intelligence documents.
Note:
Ensure that you selected the BusinessObjects Enterprise Metadata Integrator when you installed SAP BusinessObjects Information Steward.
To configure the BusinessObjects Enterprise Integrator, you must belong to the Metadata Management Administrator group or have the Add Objects right on the Integrator Sources folder.
1. Log on to the Central Management Console (CMC) and access the Information Steward area.
2. Click Manage > New > Integrator Source in the top menu tool bar.
3. In the Integrator Type drop-down list, select BusinessObjects Enterprise.
4. On the "New Integrator Source" page, enter the following information.
Option Description
Name: Name that you want to use for this metadata integrator source. The maximum length of an integrator source name is 128 characters.
Option Description
Password: The password to connect to the CMS server to register and run the Metadata Integrator.
Authentication Method: The process that the CMS uses to verify the identity of a user who attempts to access the system. The default value is Enterprise. See the Business Intelligence Platform Administrator's Guide for available modes.
5. If you want to verify that Information Steward can connect successfully before you save this source,
click Test connection.
6. Click Save.
The newly configured source appears in the list of Integrator Sources on the right of the Information
Steward page.
Related Topics
• Running a Metadata Integrator immediately
• Defining a schedule to run a Metadata Integrator
• Viewing integrator run progress and history
6.1.1.1.1 Checkpointing
SAP BusinessObjects Information Steward can run the SAP BusinessObjects Enterprise Metadata
Integrator for extended periods of time to collect large quantities of objects. If unexpected problems
occur during object collection, Information Steward automatically records warning, error, and failure
incidents in your log file for you to analyze later.
If the run is interrupted (by a network failure, a system shutdown, or some other incident), the next time you run the BusinessObjects Enterprise Metadata Integrator, it restarts from the last safe start point to finish object collection in the least amount of time.
This section describes how to configure the Metadata Integrator for SAP NetWeaver Business
Warehouse.
Note:
Ensure that you selected the SAP NetWeaver Business Warehouse Metadata Integrator when you
installed SAP BusinessObjects Information Steward.
To configure an SAP NetWeaver Business Warehouse integrator source, you must have the Create
or Add permission on the integrator source.
1. Log on to the Central Management Console (CMC) and access the Information Steward area.
2. Click the down arrow next to Manage in the top menu tool bar and select New > Integrator Source.
3. In the Integrator Type drop-down list, select SAP NetWeaver Business Warehouse.
4. On the "New Integrator Source" page, enter the following information.
Option Description
SAProuter String: (Optional) String that contains the information required by SAProuter to set up a connection between the Metadata Integrator and the SAP NetWeaver BW system. The string contains the host name, the service port, and the password, if one was given.
SAP User: Name of the user that will connect to the SAP NetWeaver BW system.
SAP Password: Password for the user that will connect to the SAP NetWeaver BW system.
Language: Language to use for the descriptions of SAP NetWeaver BW objects. Specify the 2-character ISO code for the language (for example, en for English).
5. To verify that Information Steward can connect successfully before you save this source, click Test
connection.
6. Click Save.
The newly configured source appears in the list of Integrator Sources on the right of the page.
Related Topics
• SAP router string information: http://help.sap.com/saphelp_nw70ehp1/helpdata/en/4f/992df1446d11d189700000e8322d00/content.htm
• Running a Metadata Integrator immediately
• Accessing Information Steward for administrative tasks
• Defining a schedule to run a Metadata Integrator
• Viewing integrator run progress and history
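For orientation, an SAProuter route string is built from /H/ (host) and /S/ (service or port) segments, and segments can be chained to route through the SAProuter to the target system. The host names and port below are placeholders, not values from this guide:

```text
/H/saprouter.example.com/S/3299/H/bw-host.example.com
```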
This section describes how to configure the Metadata Integrator for Common Warehouse Metamodel
(CWM).
Note:
Ensure that you selected the CWM Metadata Integrator when you installed SAP BusinessObjects
Information Steward.
To configure the CWM Integrator, you must have the right to add objects in the Integrator Sources
folder.
1. Log on to the Central Management Console (CMC) and access the Information Steward area. For
details, see Accessing Information Steward for administrative tasks.
2. Click the down arrow next to "Manage" in the top menu tool bar and select New > Integrator Source.
3. In the Integrator Type drop-down list, select Common Warehouse Modeling.
4. On the "CWM Integrator Configuration" page, enter the following information.
Option Description
Source Name: Name that you want to use for this source. The maximum length of an integrator source name is 128 characters.
File Name: Name of the file with the CWM content. For example: C:\data\cwm_export.xml
This value is required. The file should be accessible from the computer where the Metadata Management web browser is running.
Note:
Metadata Management copies this file to the Input File Repository Server on SAP BusinessObjects Business Intelligence Platform. Therefore, if the original file is subsequently updated, you must take the following steps to obtain the updates before you run the Integrator again:
• Update the configuration to recopy the CWM file.
a. From the Integrator Sources list, select the CWM integrator source name and click Action > Properties.
The file name displays in the comments under the File Name text box, and the file name has "frs:" prefacing it.
b. Browse to the original file again.
c. Click Save.
• Create a new schedule for the CWM integrator because the old schedule has a copy of the previous file.
a. With the CWM integrator source name still selected in the Integrator list, click Action > Schedules.
b. Select the Recurrence and Parameter options that you want.
c. Click Schedule.
5. Click Save.
The newly configured source appears in the list of Integrator Sources on the right of the page.
Related Topics
• Running a Metadata Integrator immediately
• Defining a schedule to run a Metadata Integrator
• Viewing integrator run progress and history
This section describes how to configure the Metadata Integrator for SAP BusinessObjects Data Federator.
Note:
Ensure that you selected the SAP BusinessObjects Data Federator Metadata Integrator when you
installed SAP BusinessObjects Information Steward.
To configure an SAP BusinessObjects Data Federator integrator source, you must have the Create or
Add right on the integrator source.
1. Log on to the Central Management Console (CMC) and access the Information Steward area. For
details, see Accessing Information Steward for administrative tasks.
2. Click the down arrow next to Manage in the top menu tool bar and select New > Integrator Source.
3. In the Integrator Type drop-down list, select Data Federator.
4. On the "New Integrator Source" page, enter the following information.
Option Description
Name: Name that you want to use for this source. The maximum length of an integrator source name is 128 characters.
Description: (Optional) Text to describe this source.
DF Designer Server Address: Name or IP address of the computer where the Data Federator Designer resides. For example, if you installed the Data Federator Designer on the same computer as the Data Federator Integrator, type localhost.
DF Designer Server Port: Port number for the Data Federator Designer. The default value is 3081.
User name: Name of the user that will connect to the Data Federator Designer.
Password: Password for the user that will connect to the Data Federator Designer.
5. If you want to verify that Metadata Management can connect successfully before you save this
source, click Test connection.
6. Click Save.
The newly configured source appears in the list of Integrator Sources on the right of the page.
Related Topics
• Running a Metadata Integrator immediately
• Defining a schedule to run a Metadata Integrator
• Viewing integrator run progress and history
This section describes how to configure the Metadata Integrator for SAP BusinessObjects Data Services.
Note:
Ensure that you selected the SAP BusinessObjects Data Services Metadata Integrator when you
installed SAP BusinessObjects Information Steward.
To configure the Data Services Integrator, you must have the Create or Add right on the integrator
source.
1. Log on to the Central Management Console (CMC) and access the Information Steward area. For
details, see Accessing Information Steward for administrative tasks.
2. Click the down arrow next to Manage in the top menu tool bar and select New > Integrator Source.
3. In the Integrator Type drop-down list, select Data Services.
4. On the "New Integrator Source" page, enter the BusinessObjects Data Services connection information.
5. Enter the following Data Services information.
Option Description
Name: Name that you want to use for this source. The maximum length of an integrator source name is 128 characters.
Computer Name Name of the computer where the Data Services repository resides.
Datasource, Database Name, or Service name: The name of the database, data source, or service name. Specify the following name for the database type of the Data Services repository:
• DB2 - Data source name
• Microsoft_SQL_Server - Database name
• Oracle - SID/Service name
• Sybase - Database name
Database User: Name of the user that will connect to the Data Services repository.
Database Password: The password for the user that will connect to the Data Services repository.
6. If you want to verify that Metadata Management can connect successfully before you save this source, click Test Connection.
7. Click Save.
The newly configured source appears in the list of Integrator Sources on the right of the window.
Related Topics
• Running a Metadata Integrator immediately
• Defining a schedule to run a Metadata Integrator
• Viewing integrator run progress and history
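The database-type list in step 5 amounts to a simple lookup. The sketch below merely restates that table as hypothetical code (not part of the product):

```python
# Which identifier the "Datasource, Database Name, or Service name" field
# expects, keyed by the database type of the Data Services repository.
REPOSITORY_IDENTIFIER = {
    "DB2": "data source name",
    "Microsoft_SQL_Server": "database name",
    "Oracle": "SID/service name",
    "Sybase": "database name",
}

def identifier_kind(db_type):
    """Return the kind of name to enter for the given repository type."""
    return REPOSITORY_IDENTIFIER[db_type]

print(identifier_kind("Oracle"))  # SID/service name
```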
This section describes how to configure the Metadata Integrator for Meta Integration® Metadata Bridge
(MIMB). For a description of the objects collected by the MIMB Integrator, see the MIMB documentation
at http://www.metaintegration.net/Products/MIMB/Documentation/.
Note:
Ensure that you selected the Meta Integration Metadata Bridge (MIMB) Metadata Integrator when you
installed SAP BusinessObjects Information Steward.
To configure the MIMB Integrator, you must have the Create or Add right on the integrator source.
1. Log on to the Central Management Console (CMC) and access the Information Steward area. For
details, see Accessing Information Steward for administrative tasks.
2. Click the down arrow next to Manage in the top menu tool bar and select New > Integrator Source.
3. In the Integrator Type drop-down list, select Meta Integration Metadata Bridge.
4. On the "New Integrator Source" page, enter values for Name and Description. The maximum length
of an integrator source name is 128 characters.
5. In the Bridge drop-down list, select the type of integrator source from which you want to collect
metadata and follow the instructions on the user interface to configure the connection information.
6. Click Save.
The newly configured source appears in the list of Integrator Sources on the right of the page.
Related Topics
• Running a Metadata Integrator immediately
• Defining a schedule to run a Metadata Integrator
• Viewing integrator run progress and history
This section describes how to configure and run the Metadata Integrator for a DB2, JDBC, Microsoft
SQL Server, MySQL, or Oracle relational database.
Note:
Ensure that you selected the Relational Database System Metadata Integrator when you installed SAP
BusinessObjects Information Steward.
To configure the Relational Database Integrator, you must have the right to add objects in the Integrator Sources folder.
1. Log on to the Central Management Console (CMC) and access the Information Steward area. For
details, see Accessing Information Steward for administrative tasks.
2. To access the "New Integrator Source" page, take one of the following actions:
• Click the left-most icon, "Create an Integrator source", in the top menu bar.
• On the Manage menu, point to New and click Integrator Source.
The "New Integrator Source" page displays.
3. In the Integrator Type drop-down list, select Relational Database.
4. Specify the pertinent connection information for the relational database that you select in Connection Type.
5. Click Save.
The newly configured source appears in the list of Integrator Sources on the right of the page.
Related Topics
• Running a Metadata Integrator immediately
• Defining a schedule to run a Metadata Integrator
• Viewing integrator run progress and history
3. Log on to the Central Management Console (CMC) and access the Information Steward area. For
details, see Accessing Information Steward for administrative tasks.
4. Take one of the following actions to access the "New Integrator Source" page.
• Click the left-most icon, "Create an Integrator source", in the top menu bar.
• On the Manage menu, point to New and click Integrator Source.
The "New Integrator Source" page displays.
5. In the Integrator Type drop-down list, select Relational Database.
6. Specify the following JDBC connection parameters:
7. Click Test connection if you want to verify that Metadata Management can connect successfully
before you save this source.
8. Click Save.
The newly configured source appears in the list of Integrator Sources on the right of the page.
Related Topics
• Configuring sources for Relational Database Metadata Integrator
• Running a Metadata Integrator immediately
• Defining a schedule to run a Metadata Integrator
• Viewing integrator run progress and history
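The table of JDBC connection parameters itself is not reproduced here, but for orientation, a JDBC source is typically identified by a driver class and a connection URL whose shape depends on the driver. The hosts, ports, and database names below are placeholders, not values prescribed by this guide:

```text
jdbc:db2://dbhost:50000/MYDB
jdbc:mysql://dbhost:3306/mydb
jdbc:oracle:thin:@dbhost:1521:ORCL
jdbc:sqlserver://dbhost:1433;databaseName=mydb
```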
2. If you want to configure a universe connection source that uses an ODBC connection, ensure that the ODBC data source exists on the computer where the integrator will run.
3. Log on to the Central Management Console (CMC) and access the Information Steward area. For
details, see Accessing Information Steward for administrative tasks.
The "Information Steward" page opens with the Integrator Sources node selected in the tree on
the left.
4. Take one of the following actions to access the "New Integrator Source" page.
• Click the left-most icon, "Create an Integrator source", in the top menu bar.
• On the Manage menu, point to New and click Integrator Source.
The "New Integrator Source" page displays.
5. In the Integrator Type drop-down list, select Relational Database.
6. Specify the following universe connection parameters:
Option Description
Library Files: The full paths to the Java library files (separated by semicolons) required by the Universe Connection.
Note:
In a distributed deployment, you must set Library Files to the classpath on the computer where the integrator runs.
7. Click Test connection if you want to verify that Metadata Management can connect successfully
before you save this source.
8. Click Save.
The newly configured source appears in the list of Integrator Sources on the right of the page.
Related Topics
• Configuring sources for Relational Database Metadata Integrator
• Running a Metadata Integrator immediately
• Defining a schedule to run a Metadata Integrator
• Viewing integrator run progress and history
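As an illustration of the Library Files format, a semicolon-separated list of full paths on Windows might look like the following. The file names and locations are hypothetical placeholders, not the actual jars your deployment requires:

```text
C:\jdbc\ojdbc6.jar;C:\BusinessObjects\java\lib\connectionserver.jar
```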
You manage integrator sources and instances in the SAP BusinessObjects Information Steward area
of the CMC.
From the list of configured integrator sources, you can select an integrator source and perform a task
from Manage or Actions in the top menu tool bar.
You can perform the following tasks from the Manage menu.
You can perform the following tasks from the Actions menu.
History: View the current and previous executions of this Metadata Integrator source (see Viewing integrator run progress and history).
Properties: View and edit the configuration information for this Metadata Integrator source.
Related Topics
• Viewing and editing an integrator source
• Deleting an integrator source
• Changing log levels
• Changing limits
You can view and modify the definition of an integrator source in its "Properties" dialog box to change
its description, connection information, and other pertinent information for the integrator source.
• To view the definition, you must have the right to View the integrator source.
• To modify the definition, you must have the right to Edit the integrator source.
The "Information Steward" page opens with the Information Steward node selected in the Tree
panel.
2. Expand the "Metadata Management" node in the Tree panel and select Integrator Sources.
3. From the list in the right pane, select the name of the integrator source that you want.
Note:
If you click the integrator source type, you display the version and customer support information for
the integrator.
Related Topics
• Defining a schedule to run a Metadata Integrator
You might want to delete an integrator source in situations such as the following:
• You want to rename your integrator source
Note:
If you rename your integrator source, you lose all the previously collected metadata.
• You no longer need your integrator source
To delete an integrator source, you must belong to the Metadata Management Administrator user group
or have the right to Delete the integrator source.
1. From the Central Management Console (CMC) click Information Steward.
The "Information Steward" page opens with the Information Steward node selected in the Tree
panel.
2. Expand the Metadata Management node.
3. Select the Integrator Sources node.
A list of configured integrator sources appears in the right panel with the date and time each was
last run.
4. Select the integrator source and click Manage > Delete in the top menu tool bar.
Note:
If you delete an integrator source, you also delete the metadata from that source that was stored in
the Metadata Management repository.
Each time you run a metadata integrator or SAP BusinessObjects Information Steward utility, SAP
BusinessObjects Information Steward creates a new instance and log files for it. By default, the maximum
number of instances to keep is 100. When this maximum number is exceeded, SAP BusinessObjects
Enterprise deletes the oldest instance and its associated log file.
Note:
The Purge utility deletes the database log in the Metadata Management repository for each instance
that was deleted.
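The retention policy described above (keep at most a fixed number of instances, deleting the oldest first) can be sketched as follows. The function and names are hypothetical illustrations, not product code:

```python
# Retention sketch: keep at most `limit` instances; when the limit is
# exceeded, drop the oldest instance. In the product, the BI platform
# deletes the instance and its log file, and the Purge utility removes
# the corresponding database log. Names here are hypothetical.

def add_instance(instances, new_instance, limit=100):
    """Append `new_instance`, deleting the oldest entries beyond `limit`."""
    instances = instances + [new_instance]
    while len(instances) > limit:
        instances.pop(0)   # oldest instance (and, conceptually, its logs)
    return instances

history = [f"run_{n}" for n in range(100)]   # already at the default limit
history = add_instance(history, "run_100")
print(len(history))   # 100
print(history[0])     # run_1  (run_0 was oldest and was deleted)
```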
To change the limits that control when integrator source instances are deleted, you must have the Full Control access level on the Metadata Management folder.
1. Log on to the CMC with a user name that belongs to the Metadata Management Administrator or
Administrator user group.
2. From the Central Management Console (CMC) click Information Steward.
The "Information Steward" page opens with the Information Steward node selected in the Tree
panel.
3. Expand the Metadata Management node.
4. Click Actions > Limits in the top menu bar.
The "Limits: Metadata Management" window appears.
5. If you want to change the maximum number of instances to keep (default value 100):
a. Select the check box for the option Delete excess instances when there are more than N
instances.
b. Enter a new number in the box under this option.
c. Click Update to save your changes.
6. If you want to specify a maximum number of instances to keep for a specific user or group:
a. Click the Add button next to Delete excess instances for the following users/groups.
b. Select the user or group name from the "Available users/groups" pane and click >.
c. Click OK.
d. If you want to change the maximum number of instances to keep (default value 100), type a new number under Maximum instance count per object per user.
e. Click Update to save your changes.
7. If you want to specify a maximum number of days to keep instances for a specific user or group:
a. Click the Add button next to Delete instances after N days for the following users or groups.
b. Select the user or group name from the "Available users/groups" pane and click >.
c. Click OK.
d. If you want to change the maximum number of days to keep instances, type a new number under Maximum instance age in days per object per user.
e. Click Update to save your changes.
8. To close the "Limits: Metadata Management" window, click the X in the upper right corner.
Run the Metadata Integrator to collect the metadata for each source that you configured. When you select Integrator Sources under the "Metadata Management" node in the tree panel on the left of the SAP BusinessObjects Information Steward page in the CMC, all configured integrator sources appear. When you select an integrator source from the list in the right pane, you can run it immediately or define a schedule to run it.
Related Topics
• Running a Metadata Integrator immediately
• Defining a schedule to run a Metadata Integrator
5. Click Actions > Run Now in the top menu tool bar.
Tip:
You can also click the icon "Run selected object(s) now" in the icon bar under Manage and Actions.
6. To view the progress of the integrator run, select the integrator source, and click Action > History.
Tip:
If you select Now in the Run object option under Action > Schedule > Recurrence and click
Schedule, the "Integrator History" page automatically displays.
To define a schedule for an integrator source, you must have the right to Schedule the integrator source.
1. From the Central Management Console (CMC) click Information Steward.
The "Information Steward" page opens with the Information Steward node selected in the Tree
panel.
2. Expand the Metadata Management node.
3. Select the Integrator Sources node.
A list of configured integrator sources appears in the right panel with the date and time each was
last run.
4. From the list of configured sources that appears on the right, select the source from which you want
to collect metadata by clicking anywhere on the row except its type.
Note:
If you click the source type, you display the version and customer support information for the metadata
integrator.
8. Choose the additional relevant values for the selected recurrence option. For details, see Recurrence
options.
9. If you want to send notification when the integrator has run, select the Notification node on the left.
For more information about Notification, see the SAP BusinessObjects Business Intelligence
Platform Administrator Guide.
10. If you want to trigger the execution of a Metadata Integrator when an event occurs, select the Events
node on the left. For more information about Events, see the SAP BusinessObjects Business
Intelligence Platform Administrator Guide.
11. Select the Parameters node on the left to change the default values for run-time parameters for the
metadata integrator. For details, see Common run-time parameters for metadata integrators.
12. Click Schedule.
13. If you use impact and lineage reports on the Open > Reports option on Metadata Management tab
in Information Steward, you must recompute the contents of the lineage staging table to incorporate
changes from the Integrator runs. Similar to setting up a regular schedule to run an Integrator, you
can set up a schedule to compute the lineage staging table at regular intervals. For more information,
see Computing and storing lineage information for reporting.
When you schedule an integrator source, you can change the default values of the run-time parameters
on the "Parameters" page in the Central Management Console (CMC). The following sections describe
the runtime parameters that you can set for different integrator sources.
The following table describes the run-time parameters that are applicable to all metadata integrators.
For information about run-time parameters that apply to only specific metadata integrators, see the
topics in Related Topics below.
File Log Level: The Metadata Integrator creates this log in the Business Objects installation directory and copies it to the File Repository Server. You can download this log file after the Metadata Integrator run has completed.
The default logging level for this log is Configuration. Usually you can keep the default logging level. However, if you need to debug your integrator run, you can change the level to log tracing information. For a description of log levels, see Log levels.
Additional Arguments: Optional run-time parameters for the metadata integrator source. For more information, see User collection parameters.
Related Topics
• Metadata collection using the Remote Job Server with SSL
To add metadata to the previous metadata collections, select the Update existing objects and add newly selected objects option. For example, if you collect Web Intelligence documents on the first run and Crystal Reports on the second run, you will see the Web Intelligence documents and Crystal Reports metadata together in the SAP BusinessObjects Metadata Management Explorer. The first time you schedule and run a metadata collection for a specific object type, all metadata for that object is collected. Subsequent runs collect only the changes since the last run.
To delete metadata from previous metadata collections, select the Delete existing objects before
starting object collection option. For example, if you have collected Web Intelligence documents on
a previous run, and then choose to collect Crystal Reports on the next run, you will see only Crystal
Reports metadata in the Metadata Management Explorer.
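The difference between the two options can be sketched as a merge-versus-reset over a set of collected objects. This is hypothetical code, not the product's implementation:

```python
# Sketch of the two collection modes: "update" merges newly collected
# objects into the existing set, while "delete" discards the existing
# set before collecting. All names are hypothetical.

def run_collection(existing, collected, mode):
    if mode == "delete":       # Delete existing objects before starting
        existing = {}
    merged = dict(existing)    # Update existing objects and add new ones
    merged.update(collected)
    return merged

store = run_collection({}, {"webi_doc": 1}, mode="update")
store = run_collection(store, {"crystal_rpt": 1}, mode="update")
print(sorted(store))           # ['crystal_rpt', 'webi_doc']

store = run_collection(store, {"crystal_rpt": 2}, mode="delete")
print(sorted(store))           # ['crystal_rpt']
```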
You can collect metadata from Universe, Public, or Personal folders. Anyone using the SAP
BusinessObjects Metadata Management Explorer can see the contents of these folders. However, you
must have the proper permissions to be able to run collections on the objects in these folders. In your
Personal folder, only you or an administrator can run a collection on those objects.
Note:
SAP BusinessObjects Enterprise Metadata Integrator does not collect metadata from Inboxes or
Categories.
Option Description
Collect Web Intelligence Documents and source Universes: Collects Web Intelligence documents and source universes from Public and/or Personal folders.
Note:
It is recommended that you specify Delete existing objects before starting object collection the first time you run the integrator source, but specify Update existing objects and add newly selected objects for subsequent runs.
6.1.4.2.3 Metadata collection using the Remote Job Server with SSL
You can configure an integrator source on SAP BusinessObjects Information Steward 4.0 to collect
metadata from an SAP BusinessObjects Enterprise XI 3.x system. The Information Steward installation
program installs a Remote Job Server component on the Enterprise XI 3.x system to enable this collection. If
SSL is enabled between the Information Steward 4.0 and the Remote Job Server systems, you must
set the businessobjects.migration run-time parameter when you schedule the integrator source.
When you schedule an integrator source to collect metadata from an SAP BusinessObjects Enterprise
XI 3.x system that has SSL enabled, you must set the businessobjects.migration run-time parameter.
Note:
You cannot run remote integrators if Federal Information Processing Standards (FIPS) mode is enabled
on Enterprise XI 3.x.
To set the run-time parameter for the SAP BusinessObjects Enterprise XI 3.x integrator source when
SSL is enabled:
1. From the Central Management Console (CMC) click Information Steward.
The "Information Steward" page opens with the Information Steward node selected in the Tree
panel.
2. Expand the Metadata Management node and click Integrator Sources.
3. From the list of configured sources that appears on the right, select the source by clicking anywhere
on the row except its type.
Note:
If you click the source type, you display the version and customer support information for the metadata
integrator.
The SAP NetWeaver Business Warehouse metadata integrator provides run-time parameters to adjust
the number of threads to use when collecting metadata from the SAP system and to filter the queries
or workbooks to collect.
When you select the Integrator Sources node in the Tree panel on the left of the SAP BusinessObjects Information Steward page in the CMC, all configured integrator sources display. Next to each source name is the date and time when the Integrator was last run for that source.
To view all runs for a metadata integrator source, you must have the right to View integrator sources:
1. From the list of all configured integrator sources, select the name of the source for which you want to see the history of runs.
2. Click the down arrow next to "Actions" in the top menu tool bar and select History from the drop-down
list.
3. The "Integrator History" page displays the following information:
• All "Schedule Names" for the integrator source.
• Status of each schedule. The possible values are Success, Failed, Running, Paused, Resumed, and Stopped.
• "Start Time", "End time", and "Duration" of each integrator run.
• "Log File" for that integrator run.
4. Click the "View the database log" icon (fifth from the left) in the menu bar or click Actions > Database Log to view the progress messages of the integrator run. For details on logs, see Information Steward logs.
By default, Metadata Management writes high-level messages (such as the number of universes processed and the number of reports processed) to the log. You can change the message level on the configuration page for the integrator source. For details, see Changing log levels.
5. In the top menu bar of the "Integrator History" page, use either the Actions drop-down list or the
icons to perform any of the following options:
6.1.6 Troubleshooting
The following sections tell you how to interpret the warning and error messages that you might see in
the database log and file log for each Metadata Integrator run.
Action: Open the Desktop Intelligence document and edit the data provider to specify a valid universe.
Cause: The SAP BusinessObjects Enterprise Metadata Integrator does not have enough memory to
run.
Action: Increase the value of the MaxPermSize run-time parameter in the JVM Arguments on the "Parameters" page when you schedule the integrator source. For example, enter -XX:MaxPermSize=256m and rerun the metadata integrator.
Action: Always use the fully-qualified column names in the projection list of the SELECT clause.
Action: Fully qualify the column reference or add the tables used by the derived tables to the universe.
The Metadata Integrator treats the universe tables as a system catalog to find the table and column
references.
However, the Metadata Integrators collect the SQL statement and the Metadata Management Explorer
displays it.
Action: Analyze the SQL statement in the Metadata Management Explorer and establish a user-defined
relationship for these tables and columns.
Cause: You do not have sufficient privilege to extract metadata about Web Intelligence documents.
• Have your administrator change your security profile to give you permission to refresh Web Intelligence
documents.
• Run the Metadata Integrator with a different user id that has permission to refresh Web Intelligence
documents.
Cause: If a database connection is not configured for Trusted Authentication in SAP BusinessObjects
Business Intelligence Platform, you must supply the user id and password at runtime. If you try to collect
metadata for a report that uses a non-Trusted connection to the database, the report collection fails.
Action: Configure both your SAP BusinessObjects Business Intelligence Platform server and client to
enable Trusted Authentication. For details, see the SAP BusinessObjects Business Intelligence Platform
Administrator Guide.
Cause: The extract for Web Intelligence documents fails if you create your Web Intelligence documents
with the Refresh on Open option and the computer on which you run the BusinessObjects Enterprise Metadata Integrator does not have a connection to the source database on which the reports are defined.
If the Data Federator Integrator connects successfully to the Data Federator Designer, but Data Federator
returns an error:
1. Log in to the Data Federator Designer.
2. Within Data Federator Designer, you can obtain a more detailed error message.
SAP BusinessObjects Information Steward provides the capability to group metadata sources into source groups, such as Development System, Test System, and Production System. After the groups are defined, you can view impact and lineage diagrams for a specific source group.
Related Topics
• Creating source groups
• Modifying source groups
You must have the Add Objects right on the SAP BusinessObjects Information Steward Source group
folder to create a Metadata Integrator source group.
1. Log on to the Central Management Console (CMC) and access the Information Steward area.
2. Select the Source Groups node in the tree on the left of the page.
3. Access the "Source Group" window in one of the following ways:
• Click the second icon "Create a Source Group" in the menu bar on top.
• In the menu bar on top, click Manage > New > Source Group.
4. Define the configuration on the Source Group page.
a. Enter the Name and Description.
b. Select integrator sources to add to this source group by clicking the check box to the left of each
integrator source name.
c. Click Save.
The new source group name appears on the right side of the page.
You must have the Edit right on the Metadata Integrator source group.
1. Log on to the Central Management Console (CMC) and access the Information Steward area.
2. Select the Source Groups node in the tree on the left of the page.
3. Select the source group that you want to modify.
4. In the menu bar on top, click Actions > Properties.
5. You can change any of the following properties of the source group:
• Name
• Description
• Integrator sources that you want to remove or add to the source group.
6. Click on User Security to do any of the following tasks:
• Add principals to this source group.
• View security for a selected principal.
• Assign security for a selected principal.
7. Click Save.
You must have the Delete right on the Metadata Integrator source group.
1. Log on to the Central Management Console (CMC) and access the Information Steward area.
2. Select the Source Groups node in the tree on the left of the page.
3. Select the source group that you want to delete.
4. In the menu bar on top, click Manage > Delete.
5. Click OK to confirm the deletion.
Cleansing Package Builder Administration
Ownership can be reassigned only by an Information Steward administrator. You can change ownership
only for private cleansing packages. Published cleansing packages are either unowned or linked to
private cleansing packages; therefore, when you change the ownership of a private cleansing package,
the new owner can automatically republish to the linked published cleansing package.
The status of a cleansing package is displayed in the status bar of the "Cleansing Package Tasks"
screen and is also indicated by the cleansing package icon.
It may take some time for a cleansing package with a BUSY status to complete the operation and
change to a READY state. The state of a cleansing package with a BUSY status cannot be changed
by an Information Steward administrator. You can either wait for the operation to complete or delete
the cleansing package.
When a cleansing package is opened for editing, it enters a locked state so that no other user may edit
it. To close a cleansing package, you must either return to the "Cleansing Package Tasks" screen,
switch to another cleansing package, or log off from Cleansing Package Builder. If the browser window
is closed or the computer is shut down without logging off, the cleansing package may become locked.
A cleansing package may become locked when it is in any of the following states:
• OPEN_FOR_READWRITE
• CANCEL_AUTO_ANALYSIS
• CANCEL_PUBLISHING
• AUTO_ANALYSIS
• PUBLISHED
To unlock a cleansing package:
1. (Data steward) When you encounter a locked cleansing package in the "Cleansing
Package Tasks" screen, ask your Information Steward administrator to unlock it.
2. (Information Steward administrator) Change the cleansing package state from LOCKED to ERROR.
a. Log in to the Central Management Console (CMC).
b. Select Information Steward.
c. Expand the "Cleansing Package" node and select Private or Published.
d. Right-click the desired cleansing package, and choose Properties.
e. Change the state to ERROR and click Save.
f. Notify the data steward that the cleansing package state is updated to ERROR.
The data steward must verify the condition of the cleansing package prior to further use.
3. (Data steward) When notified that the cleansing package is unlocked and moved to the ERROR
state, do the following:
a. From the "Cleansing Package Tasks" screen, open the cleansing package.
b. Verify the condition of the cleansing package and that it displays information as expected.
c. Close the cleansing package.
If the cleansing package is returned to a READY state and its condition was as you expected,
you may use it.
If the cleansing package returns to the ERROR state or the condition was not as expected, the
cleansing package is corrupt and should be deleted.
Related Topics
• Cleansing package states and statuses
You can view cleansing package properties, including the state, in the following locations:
• In the "Cleansing Package Tasks" screen.
In Cleansing Package Builder, in the "Cleansing Package Tasks" screen, hover over the desired
cleansing package to display the properties sheet.
• In the Information Steward area of the Central Management Console (must have Information Steward
administrator privileges).
Log in to the Central Management Console (CMC). Select Information Steward. Expand the
"Cleansing Package" node and select Private or Published. Right-click the desired cleansing
package and choose Properties.
The table below describes the possible states for a cleansing package:
State Description
A cleansing package may have one of the following statuses: READY, BUSY, LOCKED, or ERROR. The
status of a cleansing package is displayed in the status bar of the "Cleansing Package Tasks" screen
and is also indicated by the cleansing package icon. The following table shows each status, its
associated icon, and the possible user action:
• READY: Open and edit or view the cleansing package.
• LOCKED: Ensure that you do not have the cleansing package open in a different browser window.
Wait at least 20 minutes for Cleansing Package Builder to automatically close the cleansing package
and restore it to a READY state.
• ERROR: Open the cleansing package and assess its condition. For more information, see Unlocking
a cleansing package.
The state of a cleansing package with a BUSY status cannot be changed by an Information Steward
administrator. You can either wait for the operation to complete or delete the cleansing package.
A cleansing package with a LOCKED status can be unlocked by an Information Steward administrator.
Unlocking a cleansing package changes its status to ERROR. Before further use, the condition of the
cleansing package must be verified by a data steward.
Related Topics
• Unlocking a cleansing package
Information Steward Utilities
SAP BusinessObjects Information Steward provides the following utilities that you manage on the CMC.
Utility: Update Search Index
Description: Recreates Metadata Management search indexes. Information Steward provides a
configured Search Index utility that rebuilds the search indexes across all integrator sources. You
can create additional configurations to recreate search indexes for specific integrator sources.
Configurable Properties: Integrator source (see Modifying utility configurations)
Default Schedule: None (You either create a schedule for a configured utility or run it immediately.)
Related Topics
• Computing and storing lineage information for reporting
• Recreating search indexes on Metadata Management
• Scheduling a utility
• Running a utility on demand
• Monitoring utility executions
• Modifying utility configurations
• Creating a utility configuration
The SAP BusinessObjects Information Steward Repository provides a lineage staging table,
MMT_Alternate_Relationship, that consolidates end-to-end impact and lineage information across all
integrator sources. The Metadata Management module provides pre-defined Crystal Reports from this
table. You can also create your own reports from this table. To view the reports on the Reports option
in the Open drop-down list in the Metadata Management tab, they must be Crystal Reports (see
"Defining custom reports" in the Users Guide).
Before generating reports that rely on the lineage staging table, update its lineage information. You
can either schedule the Compute Lineage Report utility or run it on demand to ensure those reports
contain the latest lineage information.
The following activities can change the lineage information, and it is recommended that you run the
lineage computation after any of these activities occur:
• Run an Integrator to collect metadata from a source system (see "Running a Metadata Integrator"
in the Metadata Management Administration section).
• Change preferences for relationships between objects (see "Changing preferences for relationships"
in the Users Guide). The data in the lineage staging table uses the values in Impact and Lineage
Preferences and Object Equivalency Rules to determine impact and lineage relationships across
different integrator sources.
• Establish or modify a user-defined relationship of type Impact or Same As (see "Establishing
user-defined relationships between objects" in the Users Guide).
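As an illustration only, a custom report could read consolidated rows directly from the lineage staging table. The table name MMT_Alternate_Relationship comes from this guide, but the column names below are invented for the sketch and are not the documented schema; sqlite3 merely stands in for your repository's database driver.

```python
import sqlite3

# Hypothetical sketch: the columns (source_object, target_object,
# relationship_type) are assumptions for illustration, not the real
# schema of MMT_Alternate_Relationship.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MMT_Alternate_Relationship ("
    "source_object TEXT, target_object TEXT, relationship_type TEXT)"
)
conn.execute(
    "INSERT INTO MMT_Alternate_Relationship VALUES "
    "('CUSTOMER_TABLE', 'SalesReport.rpt', 'Lineage')"
)

# A lineage report might list every object that feeds a given Crystal Report.
rows = conn.execute(
    "SELECT source_object FROM MMT_Alternate_Relationship "
    "WHERE target_object = ? AND relationship_type = 'Lineage'",
    ("SalesReport.rpt",),
).fetchall()
print(rows)  # [('CUSTOMER_TABLE',)]
```

Check your actual repository schema before building a real report on this table.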
Related Topics
• Scheduling a utility
• Running a utility on demand
• Monitoring utility executions
• Modifying utility configurations
• Creating a utility configuration
The search feature of the Metadata Management module of SAP BusinessObjects Information Steward
allows you to search for an object that might exist in any metadata integrator source. When you run a
metadata integrator source, Metadata Management updates the search index with any changed
metadata.
You might need to recreate the search indexes in situations such as the following:
• The Search Server was disabled and could not create the index while running a metadata integrator
source.
• The search index is corrupted.
Related Topics
• Scheduling a utility
• Running a utility on demand
• Monitoring utility executions
• Modifying utility configurations
9. Optionally, set the Number of retries to a value other than the default of 0 and change the Retry
interval in seconds from the default value of 1800.
10. If you want to be notified when this utility runs successfully or when it fails, expand Notification,
and fill in the appropriate information. For more information about Notification, see the SAP
BusinessObjects Business Intelligence Platform Administrator Guide.
11. If you want to trigger the execution of this utility when an event occurs, expand Events, and fill in
the appropriate information. For more information about Events, see the SAP BusinessObjects
Business Intelligence Platform Administrator Guide.
12. Click Schedule.
13. If you want this newly created schedule to override the default recurring schedule for the Purge or
Calculate Scorecard utility, delete the old recurring instance.
a. From the list of "Utility Configurations", select the name of the utility whose schedule you want
to delete.
b. Click Actions > History.
c. Select the recurring schedule that you want to delete and click the delete icon in the menu bar.
Note:
To change the recurring schedule directly (instead of creating a new schedule and deleting the old
one), see Rescheduling a utility.
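The Number of retries and Retry interval settings described in step 9 behave like a simple retry loop: 0 retries means a single attempt, and each failed attempt waits the retry interval before the next try. The sketch below only illustrates these semantics; run_with_retries is our own name, not part of the product.

```python
import time

def run_with_retries(task, number_of_retries=0, retry_interval_seconds=1800):
    """Illustrative retry loop: 0 retries = one attempt; after each failed
    attempt, wait retry_interval_seconds before trying again."""
    attempts = number_of_retries + 1
    for attempt in range(attempts):
        try:
            return task()
        except Exception:
            if attempt == attempts - 1:
                raise  # all attempts exhausted
            time.sleep(retry_interval_seconds)

# Example: a task that fails twice and then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

result = run_with_retries(flaky, number_of_retries=2, retry_interval_seconds=0)
print(result)  # ok
```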
Related Topics
• Computing and storing lineage information for reporting
• Recreating search indexes on Metadata Management
• Running a utility on demand
• Monitoring utility executions
• Modifying utility configurations
• Creating a utility configuration
To reschedule a utility:
1. Log in to the Central Management Console (CMC) with a user name that belongs to the Metadata
Management Administrator group or the Administrator group.
2. At the top of the CMC Home screen, select Applications from the navigation list.
3. Select Information Steward Application in the "Applications Name" list.
4. Click Action > Manage Utilities in the top menu tool bar.
5. From the list of "Utility Configurations", select the name of the utility configuration that you want to
reschedule.
6. In the top menu tool bar, click Actions > History.
7. On the "Utility History" screen, select the schedule name that has a schedule status of "Recurring"
and click Reschedule in the top menu bar.
a. Click Recurrence in the navigation tree in the left pane of the "Reschedule" window.
7. On the "Utility Configurations" screen, click the Refresh icon to update the "Last Run" column for
the utility configuration.
Related Topics
• Computing and storing lineage information for reporting
• Recreating search indexes on Metadata Management
• Scheduling a utility
• Monitoring utility executions
• Pending: The utility is scheduled to run one time. When it actually runs, there will be another
instance with status "Running."
• Recurring: The utility is scheduled to recur. When it actually runs, there will be another instance
with status "Running."
c. To close the "Database Log" window, click the X in the upper right corner.
8. To save a copy of a utility log:
a. Scroll to the right of the "Utility History" screen, and click the Download link in the "Log File"
column in the row of the utility instance you want.
b. Click Save.
c. On the "Save As" window, browse to the directory where you want to save the log and optionally
change the default file name.
9. To close the "Utility History" screen, click the X in the upper right corner.
Related Topics
• Computing and storing lineage information for reporting
• Recreating search indexes on Metadata Management
• Scheduling a utility
• Running a utility on demand
• Modifying utility configurations
• Creating a utility configuration
SAP BusinessObjects Information Steward provides a default configuration for each of the utilities. You
can modify the configuration settings for the following utilities:
• Compute Lineage Report utility
The default configuration for the Compute Lineage Report utility has Mode set to Optimized, which
recalculates lineage information in the lineage staging table only for integrator sources that have
changed since the utility was last run. You might want to set Mode to Full to recalculate lineage
information across all integrator sources.
• Update Search Index utility
The default configuration for the Update Search Index utility has Integrator Source set to All
Sources, which rebuilds the search indexes in Metadata Management for all integrator sources.
You might want to set Integrator Source to a specific integrator source to rebuild search indexes
for the metadata collected for only that integrator source.
Note:
The Calculate Scorecard and Purge utilities do not have configuration parameters.
4. Click Action > Manage Utilities in the top menu tool bar.
5. Select the utility whose configuration you want to change.
6. Click Actions > Properties in the top menu tool bar.
The "Utilities Configurations" screen appears.
7. For a Compute Lineage Report utility, you can change the following parameters:
• Description
• Mode
Mode can be set to one of the following values:
• Full mode recalculates all impact and lineage information and repopulates the entire lineage
staging table.
Note:
If you select Full mode, the computation can take a long time to run because it recalculates
impact and lineage information across all integrator sources.
• Optimized mode (the default) recalculates impact and lineage information for
only the integrator sources that contain changes since the last time the computation was run.
For example, if only one Integrator was run, the computation only recalculates impact and
lineage information corresponding to that integrator source and updates the lineage staging
table.
8. For an Update Search Index utility, you can change the following parameters:
• Description
• Integrator Source
• All Sources recreates the search index for all integrator sources that you have configured.
• The specific name of an integrator source that appears in the drop-down list.
9. Click Save.
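The difference between the two modes in step 7 comes down to which integrator sources get recomputed. A minimal sketch of that idea, with function and source names of our own invention:

```python
def compute_lineage(sources, changed, mode="Optimized"):
    """Illustrative only: Full mode recomputes every integrator source;
    Optimized mode recomputes only sources changed since the last run."""
    targets = sources if mode == "Full" else [s for s in sources if s in changed]
    return {s: "recomputed" for s in targets}

# Hypothetical source names for the example.
sources = ["BI_platform", "Data_Services", "Data_Federator"]
optimized = compute_lineage(sources, changed={"Data_Services"})
full = compute_lineage(sources, changed={"Data_Services"}, mode="Full")
print(optimized)  # only the changed source is recomputed
print(full)       # every source is recomputed
```

This is why Full mode can take much longer: its cost grows with the total number of integrator sources, not with the number that changed.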
Related Topics
• Computing and storing lineage information for reporting
• Recreating search indexes on Metadata Management
• Scheduling a utility
• Running a utility on demand
• Monitoring utility executions
• Creating a utility configuration
SAP BusinessObjects Information Steward provides a default configuration for each of the utilities. You
can define another configuration with different settings and still keep the default configuration for the
following utilities:
Note:
You cannot define another configuration for the Calculate Scorecard and Purge utilities.
Related Topics
• Computing and storing lineage information for reporting
• Recreating search indexes on Metadata Management
• Scheduling a utility
• Running a utility on demand
• Monitoring utility executions
• Modifying utility configurations
Server Management
• SAP BusinessObjects Data Services Job Server which executes the Data Insight profiling tasks.
You can create multiple Job Servers, each on a different computer, to use parallel execution for the
profiling tasks.
For a description of these servers and services, see Servers and services.
This section describes how to manage the above servers for Information Steward.
To verify that the SAP BusinessObjects Information Steward servers are running and enabled:
1. From the CMC Home page, go to the "Servers" management area.
2. Expand the Service Categories node and select Enterprise Information Management Servers.
The list of servers in the right pane includes a State column that provides the status for each server
in the list.
3. Verify that the following servers in the Enterprise Information Management Servers category are
“Running” and “Enabled”:
• "EIMAdaptiveProcessingServer"
• "ISJobServer"
4. If an Enterprise Information Management server is not running or enabled, do the following:
a. Select the server name from the list.
b. Open the Actions drop-down menu and select Start Server or Enable Server.
For information about the services that run under Enterprise Information Management Servers, see
the “Architecture” section.
Related Topics
• Services
To verify that the SAP BusinessObjects Information Steward services were added:
1. From the CMC Home page, go to the "Servers" management area.
2. Expand the Service Categories node and select Enterprise Information Management Services.
The right pane lists the following servers:
• "EIMAdaptiveProcessingServer"
• "ISJobServer"
3. Ensure that the relevant services appear for "EIMAdaptiveProcessingServer".
a. Right-click "EIMAdaptiveProcessingServer" and click Stop Server.
b. Right-click "EIMAdaptiveProcessingServer" and click Select Services.
c. Verify that for each Information Steward feature that you installed, the list of services for
EIMAdaptiveProcessingServer includes the following services:
Services
d. If any of the EIMAdaptiveProcessingServer services are not in the list on the right, select the
service name from the "Available services" list on the left, click > to add it, and click OK.
e. Right-click "EIMAdaptiveProcessingServer" and click Start Server.
4. Ensure that the relevant services appear for "ISJobServer".
a. Right-click "ISJobServer" and click Stop Server.
b. Right-click "ISJobServer" and click Select Services.
c. Verify that for each Information Steward feature that you installed, the list of services for
"ISJobServer" includes the services listed in the table in step 3c.
d. If any of the "ISJobServer" services are not in the list on the right, select the service name from
the "Available services" list on the left, click > to add it, and click OK.
e. Right-click "ISJobServer" and click Start Server.
The installation process of SAP BusinessObjects Data Services configures the following services (under
the server EIMAdaptiveProcessingServer) with default settings.
• Metadata Browsing Service
• View Data Service
These services are used by SAP BusinessObjects Information Steward to connect and view data in
profiling sources. You might want to change the configuration settings to more effectively integrate
Information Steward with your hardware, software, and network configurations.
Related Topics
• Metadata Browsing Service configuration parameters
• View Data Service configuration parameters
You can change the following properties of the Metadata Browsing Service.
You can change the following properties of the View Data Service.
The SAP BusinessObjects Data Services Job Server performs the profile and rule tasks on data in Data
Insight connections. The Information Steward Task Server sends Data Insight profile tasks to the Data
Services Job Server, which partitions the data and uses parallel processing to deliver high data throughput
and scalability.
This section contains the tasks to manage Data Services Job Servers for Information Steward.
Related Topics
• Configuring a Data Services Job Server for Data Insight
• Adding Data Services Job Servers for Data Insight
• Displaying job servers for Information Steward
• Removing a job server
If you will run Data Insight profile and rule tasks, you must access the Data Services Server Manager
to create a job server and associate it with the Information Steward repository. This association adds
the job server to a pre-defined Information Steward job server group that Data Insight will use to run
tasks. For details about how job server groups improve performance, see the “Performance and Sizing
Considerations” section in the Administrator Guide.
To configure a job server and associate it with the Information Steward repository:
1. Access the Data Services Server Manager from the Windows Start menu:
Start > Programs > SAP BusinessObjects Data Services XI 4.0 > Data Services Server Manager
2. On the "Job Server" tab, click Configuration Editor.
3. On the "Job Server Configuration Editor" window, click Add.
4. In the "Job Server Properties" window, enter a name for Job Server name.
5. In the "Associated Repositories" section, click Add and fill in the "Repository Information" of the
Information Steward repository that you want to associate with this Job Server.
• Database type
• Database Server name
• Database name
• Username
• Password
6. Click Apply and OK.
The "Job Server Configuration Editor" window now displays the job server you just added.
7. Click Close and Restart to restart the job server with the updated configurations.
For more information about using the Data Services Server Manager, see “Server management” in the
SAP BusinessObjects Data Services Administrator's Guide.
If you installed additional Data Services Job Servers on multiple computers, you can use them to run
Data Insight profile and rule tasks even though Information Steward is not installed on those computers.
On each computer where a Data Services Job Server is installed, you must add a job server to the
pre-defined Information Steward job server group. For details, see Configuring a Data Services Job
Server for Data Insight.
For more information about job server groups, see “Performance and Sizing Considerations” in the
Administrator Guide.
Related Topics
• Displaying job servers for Information Steward
• Removing a job server
To display the job servers that are associated with your SAP BusinessObjects Information Steward
repository:
1. Log in to the Central Management Console (CMC) with a user name that belongs to the Metadata
Management Administrator group or the Administrator group.
2. At the top of the CMC Home screen, select Applications from the navigation list.
3. Select Information Steward Application in the "Applications Name" list.
4. Click Action > View Data Services Job Server in the top menu tool bar.
The "Job Server List" screen displays:
• The name of each Data Services Job Server associated with the Information Steward repository.
• The computer name and port number for each Data Services Job Server.
Related Topics
• Configuring a Data Services Job Server for Data Insight
To remove a job server from the Information Steward job server group on Data Services:
1. On each computer where you installed additional Data Services Job Servers, access the Data Services
Server Manager from the Windows Start menu:
Start > Programs > SAP BusinessObjects Data Services XI 4.0 > Data Services Server Manager
2. On the "Job Server" tab, click Configuration Editor.
3. On the "Job Server Configuration Editor" window, select the name of the job server you want to
delete and click Delete.
4. In the "Job Server Properties" window, in the "Associated Repositories" section, ensure the name
of your Information Steward repository is selected and click Delete.
5. In the "Repository Information" section, enter the password of your Information Steward repository
and click Apply.
6. Click Yes on the prompt that asks if you want to remove persistent cache tables.
7. Click OK to return to the "Job Server Manager" window.
8. Click Close and Restart and then click OK to restart the Data Services job server with the updated
configuration.
Related Topics
• Configuring a Data Services Job Server for Data Insight
• Adding Data Services Job Servers for Data Insight
• Displaying job servers for Information Steward
Performance and Scalability Considerations
The main Information Steward functions that influence performance and sizing are as follows.
Data profiling
Information Steward can perform basic and advanced data profiling operations to collect information
about data attributes such as minimum and maximum values, pattern distribution, data dependency,
uniqueness, address profiling, and so on. These operations require intense, complex computations and
are affected by the amount of data that is processed.
The complexity of rules also affects performance. For example, lookup functions in rule processing
take more time and disk space. Lookup tables are typically small, but large lookup tables can adversely
affect performance.
Metadata integrators
Metadata integrators collect metadata about different objects in the source systems and store it in the
Information Steward repository. If a large amount of metadata is being collected, it can affect performance
and the size of the repository.
Related Topics
• Factors that influence performance and sizing
• Scalability and performance considerations
• Best practices for performance and scalability
10.2 Architecture
The following diagram shows the architectural components for SAP BusinessObjects Business
Intelligence platform, SAP BusinessObjects Data Services, and SAP BusinessObjects Information
Steward. The resource-intensive servers and services are indicated with a red asterisk:
• Data Services Job Server processes data profiling and rule validation tasks in Data Insight.
• Cleansing Package Builder Auto-analysis Service analyzes sample data and generates parsing
rules for Cleansing Package Builder.
• Information Steward Integrator Scheduling Service processes Metadata Integrators.
• Metadata Relationship Service performs lineage and impact analysis.
• Data Services Metadata Browsing Service obtains metadata from Data Insight connections.
• Data Services Viewdata Service obtains the source data from Data Insight connections.
• Web Application Server handles requests from users on the web applications for Information Steward.
For more details about these servers and services, see Servers and services.
Use Information Steward to work on large amounts of data and metadata. The following factors affect
performance and sizing. They determine the required processing power (CPUs), RAM, and hard disk
space (for temporary files during processing), as well as the size of the Information Steward repository.
Related Topics
• Data characteristics and profiling type
• Number of concurrent users
The amount of the data is calculated using the number of records and the number of columns. In general,
the more data that is processed, the more time and resources it requires. This is true for profiling and
rules validation operations. The data may come from one or more sources, multiple tables within a
source, views, or files.
This factor affects the required CPU, RAM, and hard disk space and the Information Steward repository
size. The larger the record size, the more resources are required for efficient processing.
If the data being processed has many columns, it requires more time and resources. Column profiling
on many columns, or on columns with long textual data, also affects performance. In short, the longer
the record, the more resources are required.
Related Topics
• Factors that influence performance and sizing
For distribution profiling such as Value, Pattern, or Word distribution, if the data has many distinct values,
patterns, or words, it requires more resources and time to process.
This factor affects the required CPUs, RAM, and Information Steward repository size. The more distinct
the data, the more resources required.
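To see why distinctness matters, consider a toy sketch of value and pattern distribution profiling. The letter/digit pattern encoding below is an illustration of the general idea, not Information Steward's actual algorithm:

```python
from collections import Counter

def pattern(value: str) -> str:
    # Toy pattern encoding: letters become 'X', digits become '9',
    # everything else is kept as-is.
    return "".join("X" if c.isalpha() else "9" if c.isdigit() else c
                   for c in value)

phones = ["555-1212", "555-3434", "KL5-1212"]
value_dist = Counter(phones)                        # 3 distinct values
pattern_dist = Counter(pattern(p) for p in phones)  # 2 distinct patterns
print(pattern_dist)
```

Each distinct value, pattern, or word becomes an entry the profiler must track and store, so highly distinct data inflates both processing time and repository size.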
Related Topics
• Factors that influence performance and sizing
Information Steward is a multi-user web-based application. As the number of concurrent users increases,
it affects practically all aspects of the application. More users may mean more profiling tasks, more
scorecard views and rule execution, and so on. The key word is concurrent.
If all users run tasks concurrently, it affects the required CPUs, RAM, and hard disk and the Information
Steward repository size. If most of the users are just viewing the scorecard, then the performance
depends more on the web application server where the Information Steward repository is created.
Related Topics
• Factors that influence performance and sizing
Metadata integrators collect metadata from various sources. As the number of metadata sources
increases, it takes more resources and time. This factor affects the required CPUs, RAM, and Information
Steward repository size.
Related Topics
• Factors that influence performance and sizing
If multiple users view impact and lineage information concurrently, it affects the response time.
Related Topics
• Factors that influence performance and sizing
For Cleansing Package Builder, the amount of sample data used to create the cleansing package affects
the performance of the Auto-analysis service. This in turn can affect response time for the user interface.
Related Topics
• Factors that influence performance and sizing
When you create custom cleansing packages, if there are many parsed values per row, it requires more
resources and time to analyze and create parsing rules. For Cleansing Package Builder, the larger the
data set, the more CPU processing power is required. The more parsed values per row, the more RAM
is required.
Related Topics
• Factors that influence performance and sizing
If many users create custom cleansing packages concurrently, it requires more CPUs and RAM.
Related Topics
• Factors that influence performance and sizing
Information Steward uses SAP BusinessObjects Business Intelligence platform and Data Services
platform for most of the heavy computational work. It inherits the service-oriented architecture provided
by these platforms to support a reliable, flexible, highly available, and high performance environment.
Here are some features and recommendations for using the platforms for performance. These are not
mutually exclusive, nor are they sufficient by themselves in all cases. You should employ a combination
of the following approaches to improve the throughput, reliability, and availability of the deployment.
Related Topics
• Resource intensive functions
• Factors that influence performance and sizing
• Best practices for performance and scalability
• Information Steward web application
• Scheduling tasks
• Queuing tasks
• Degree of parallelism
• Grid computing
• Multi-threaded file read
• Data Insight result set optimization
• Performance settings for input data
• Settings to control repository size
• Settings for Metadata Management
• Settings for Cleansing Package Builder
Related Topics
• Scalability and performance considerations
• Information Steward web application
Business Intelligence platform provides a distributed scalable architecture. This means that services
that are needed for a specific functionality can be distributed across machines in the given landscape.
As long as the services are in the same Business Intelligence platform environment, it doesn't matter
which machine they are on; they just need to be in the same CMS cluster. The Information Steward
web application and the Information Steward repositories can be on different machines.
Information Steward uses some Business Intelligence platform services, and also has its own services
that can be distributed across machines for better throughput. The general principle is that if one of the
services needs many resources, then it should be on a different machine. Similarly, if you add capacity
to existing hardware, it can be used for more than one service.
The Data Insight module of Information Steward uses the Data Services Job Server, which supports
distributed processing.
This section offers recommendations on which Information Steward services can be combined on one machine and which should be decoupled.
Related Topics
• Scalability and performance considerations
• Data Insight related services
• Metadata Management related services
• Cleansing Package Builder related services
• Information Steward repository
• Information Steward web application
• Grid computing
The most important part of the processing for Data Insight is done by the Data Services Job Server.
You can install Data Services Job Servers on multiple machines and make them part of the single job
server group that is used for Information Steward. The profiling and rules tasks are distributed by the
Information Steward Job Server to the Data Services Job Server group. The actual tasks are executed
by a specific Data Services Job Server based on the resource availability on that server. So, if one
server is busy, the task can be processed by another server. This way, multiple profiling and rule tasks
can be executed simultaneously.
Related Topics
• Scalability and performance considerations
For Metadata Management, two important services from a performance perspective are the metadata
relationship service and metadata integrators.
Generally, these two services should be on separate servers (or should run at different times, if they
are on the same server). If there are many metadata integrators and they collect a lot of metadata from
different sources, each one of them could be on a separate server.
Schedule processing-intensive integrators to run during non-business hours.
Related Topics
• Scalability and performance considerations
• Information Steward web application
The Auto-analysis service is a resource intensive service in Cleansing Package Builder. The CPU and
RAM requirements depend on the number of parsed values found in the data. Also, if multiple concurrent
users create large cleansing packages, it is recommended that you dedicate one server to the
Auto-analysis service and allocate enough memory to the Java process.
Related Topics
• Scalability and performance considerations
• Information Steward web application
• Common runtime parameters for Information Steward
The Information Steward repository stores all of the metadata collected, profiling and rule results, and
sample data. The repository should be on a separate database server. To avoid resource contention,
the database server should not be the same database server that contains the source data. This may
or may not be on the same database server that hosts the Business Intelligence platform repository.
The Information Steward repository should be on the same subnetwork as the Data Services Job Server
that processes large amounts of data and the metadata integrator that processes the largest amount
of metadata.
Related Topics
• Scalability and performance considerations
• Information Steward web application
Typically, the Information Steward web application is installed on a separate server together with other web applications. No other services or repositories are installed with the web application, so that response time for Information Steward user interface users is not affected by services that process data.
Related Topics
• Scalability and performance considerations
• Information Steward web application
Information Steward runs many tasks that are resource intensive. Business Intelligence platform provides
the ability to schedule them. This ability can be used to distribute the tasks so that the same resources
can be utilized for multiple purposes. The following can be scheduled:
• Profiling tasks
• Rule tasks
• Metadata integrators
• Calculate Scorecard utility
• Compute Lineage report utility
• Purge utility
• Update search index utility
If you schedule these tasks so that they run at different times and when few users access the system,
you can achieve good performance with limited resources. This time slicing is highly recommended for
profiling, rules tasks, and metadata integrators. For example, if you have users that process profiling
tasks on demand during business hours, then the metadata integrators and rules task should be
scheduled during non-business hours.
If there are large profiling jobs, they should be scheduled during non-business hours and ideally on a
dedicated powerful server.
Related Topics
• Scalability and performance considerations
• Scheduling a task
• Scheduling a utility
Information Steward can queue Data Insight tasks when many of them are requested to run at the same time. This behavior depends on the user configuration for the Average Concurrent Tasks option.
Based on this setting and the number of Data Services Job Servers in the group, Information Steward
calculates the total number of tasks allowed to run simultaneously in a given landscape. Only that many
tasks are sent to the Data Services Job Server group for processing. The remaining tasks are queued.
As soon as one of the running tasks finishes, the next task in the queue is processed.
Using this setting, you can control how many Data Insight-related processes are running so that the
resources can be utilized and scheduled for other processes running on the system.
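The admission control described above can be sketched as a small simulation. This is an illustrative model only; the function and variable names are assumptions, not Information Steward APIs:

```python
from collections import deque

def run_with_queue(tasks, avg_concurrent_tasks, num_job_servers):
    """Simulate the task queue: at most avg_concurrent_tasks *
    num_job_servers tasks run at once; the rest wait in FIFO order."""
    max_running = avg_concurrent_tasks * num_job_servers
    queue = deque(tasks)
    running = []
    # Admit tasks up to the computed limit; the remainder stays queued
    # until a running task finishes.
    while queue and len(running) < max_running:
        running.append(queue.popleft())
    return running, list(queue)

# Average Concurrent Tasks = 2 with 2 Job Servers allows 4 tasks at once.
running, queued = run_with_queue(["t1", "t2", "t3", "t4", "t5"], 2, 2)
# running -> ["t1", "t2", "t3", "t4"], queued -> ["t5"]
```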
Related Topics
• Scalability and performance considerations
• Configuration settings
For Data Insight functionality, Information Steward uses the Data Services engine. The Data Services
engine supports parallel processing in multiple ways, one of which is Degree of Parallelism (DOP). The
basic idea is to split a single Data Services job into multiple processing units and utilize available
processing power (CPUs) on a server to work on those processing units in parallel. The distribution of
work is different for profiling vs. rule processing.
Note:
• DOP is only used for Data Insight functionality for column profiling and rule processing. Metadata
Management and Cleansing Package Builder do not use DOP.
• In general, do not set the DOP higher than the number of available CPUs. To fine-tune the performance,
set the DOP value based on the number of concurrent tasks and available hard disk and RAM
resources. Gradually increase the value of DOP to reach an optimal setting. For more information,
see the SAP BusinessObjects Data Services Performance Optimization Guide.
Related Topics
• Scalability and performance considerations
• Column profiling
• Rule processing
• Hard disk requirements
• When to use degree of parallelism
For column profiling, the task is divided proportionally across the columns. The formula is: Number of columns per execution unit = Number of columns / DOP.
For example, if there is a column profiling task for 100 million rows with 20 columns and DOP = 4, the task is broken down into 4 execution units that each work on 5 columns for all 100 million rows. The data is "partitioned" into groups of 5 columns, one group per execution unit.
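The column partitioning in the example above can be sketched as follows. This is a simplified model of the behavior described, not the actual Data Services implementation:

```python
import math

def partition_columns(columns, dop):
    """Split the column list into DOP groups; each execution unit
    profiles its group of columns across all rows."""
    per_unit = math.ceil(len(columns) / dop)  # columns per execution unit
    return [columns[i:i + per_unit] for i in range(0, len(columns), per_unit)]

# 20 columns with DOP = 4 yields 4 execution units of 5 columns each.
units = partition_columns([f"col{i}" for i in range(20)], dop=4)
```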
Related Topics
• Scalability and performance considerations
• Degree of parallelism
Advanced profiling tasks (dependency, redundancy, and uniqueness) require complex sorting operations, so it is important to optimize the degree of parallelism setting. The degree of parallelism setting is used to execute the sorting operations in parallel to increase throughput.
Related Topics
• Scalability and performance considerations
• Degree of parallelism
For rule processing, the work is divided proportionally across the rules (as opposed to the columns for profiling). The formula is: Number of rules per execution unit = Number of rules / DOP.
For example, if there is a rule execution task for 100 million rows with 20 rules and DOP = 4, the task is broken down into 4 execution units that each process 5 rules for all 100 million rows.
Related Topics
• Scalability and performance considerations
• Degree of parallelism
Adding CPUs and increasing DOP does not guarantee improved throughput. When many processes run in parallel, they also share other hardware resources such as RAM and hard disk.
When the Data Services engine processes data, it creates temporary work files in the Pageable Cache
Directory. Naturally, if many processes are running simultaneously, all of them create temporary files
in the same location. Because this directory is accessed by all of the processes simultaneously, there
is a potential for disk contention. In most environments, depending on the hard disk capacity and speed,
you will hit a ceiling after which an increase in DOP will not improve performance proportionately.
Therefore, it is important to enhance all aspects of the hardware at the same time: the number of CPUs, hard disk capacity and speed, and RAM. You should have very efficient disk access to go along with the increased DOP. Make sure the pageable cache directory is set accordingly.
Related Topics
• Scalability and performance considerations
• Degree of parallelism
• Grid computing
• When you have only a few very powerful machines with a lot of processing power, RAM, and fast
disk access.
• When you have a large amount of data for profiling or rule tasks.
Note:
• If you run many profiling tasks simultaneously with DOP > 1, each of them could be split into multiple execution units. For example, 4 tasks with DOP = 4 could result in 16 execution units. There would then be 16 processes competing for resources (CPU, RAM, and hard disk) on the same machine. So it is always a good idea to schedule jobs efficiently or to use multiple Data Services Job Servers.
• DOP is a global setting and affects the entire landscape.
Related Topics
• Scalability and performance considerations
• Degree of parallelism
• Grid computing
You can perform grid computing using the Data Services Job Server group. This group is a logical
group of multiple Data Services Job Servers. When you install Information Steward, you can assign a
single Data Services Job Server group for that Information Steward instance. There are two ways you
can utilize the Job Server group with the Distribution level setting: distribution level table and distribution level sub-table.
A single profiling or rule task can work on one or more “tables”. The term “table” is used here in a general sense to mean a set of records with rows and columns. In reality, the data can come from an RDBMS table, a flat file, an SAP application, and so on.
Note:
DOP and distribution level are global settings and affect the entire landscape.
Related Topics
• Scalability and performance considerations
• Distribution level table
• Distribution level sub-table
• When to use grid computing
When you set the distribution level to Distribution level table, each table of the task is executed on a separate Data Services Job Server in the group. If you have set DOP, it takes effect independently on each machine, so the task for a particular server's table can be further parallelized there.
For example, if a task has 8 tables and is submitted to a Data Services Job Server group with 8 Data Services Job Servers, each Job Server processes one table. If the DOP is set to 4, each Job Server tries to parallelize the work for its table into 4 execution units. There is no interdependency between the different Job Servers; they share no resources.
When a Data Services Job Server group receives a task that involves multiple tables and the distribution level is set to Table, it uses an intelligent algorithm that chooses the Data Services servers based on the available resources. If a particular server is busy, the task is submitted to a relatively less busy Data Services server. These calculations are based on the number of CPUs, RAM, and so on. If you have two Data Services servers, one with many resources and another with few, it is quite possible that the bigger server receives a proportionally higher number of tasks to execute.
You can also have servers chosen purely in a round-robin fashion, in which case each task is submitted to the next available Data Services server. For more information, see the SAP BusinessObjects Data Services Administrator Guide.
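The two server-selection strategies can be illustrated with a short sketch. The capacity weights and the selection rule here are assumptions for illustration; the actual algorithm considers CPUs, RAM, and current load:

```python
from itertools import cycle

def assign_tables(tables, servers, mode="resource"):
    """Illustrative sketch of table-level distribution. 'servers' maps a
    server name to a relative capacity weight (CPU/RAM based). The names
    and weighting scheme are assumptions, not the actual algorithm."""
    assignments = {name: [] for name in servers}
    if mode == "round_robin":
        # Round robin: each table simply goes to the next server in turn.
        for table, name in zip(tables, cycle(servers)):
            assignments[name].append(table)
    else:
        # Resource-based: give the next table to the server with the
        # lowest load relative to its capacity.
        for table in tables:
            name = min(servers, key=lambda s: len(assignments[s]) / servers[s])
            assignments[name].append(table)
    return assignments

a = assign_tables([f"T{i}" for i in range(6)], {"big": 2.0, "small": 1.0})
# The 'big' server ends up with roughly twice as many tables.
```

With a 2:1 capacity ratio, the bigger server receives about twice the tables, matching the proportional behavior described above.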
Related Topics
• Scalability and performance considerations
• Grid computing
Note:
This setting is only effective for column profiling. Use this setting with caution, as it may have a negative
impact on performance of other types of profiling, such as advanced profiling.
When you set the distribution level to Distribution level sub-table, based on the DOP setting Data
Services distributes a single table task across multiple machines in the network. The basic idea is to
split a single task into multiple independent execution units and send them to different Data Services
Job Servers for execution. You can think of this as DOP but across multiple machines rather than
multiple CPUs of a single machine.
For example, suppose you have a column profiling task for 100 million rows and 40 columns, the distribution level is set to Sub-Table, the DOP is 8, and there are 8 Data Services Job Servers in the group. The task is split into 8 execution units of 5 columns each and sent to the 8 Data Services Job Servers, which can then execute them in parallel. There is no sharing of CPU or RAM, but all of the Data Services Job Servers share the same pageable cache directory. You must ensure that this directory location is shared and accessible to all Data Services Job Servers. This location should have a very efficient disk, and the network on which this setup runs should be very fast so that it does not become a bottleneck; otherwise, the gains of parallel processing will be negated.
Related Topics
• Scalability and performance considerations
• Grid computing
The principles are similar to DOP, but with the additional aspects of distribution level.
• Distribution level table: Use when you have many concurrent profiling and rule tasks that work on large amounts of data. Set the distribution level to Table so that individual tasks are sent to different servers.
• Distribution level sub-table: Use when you have only a few column profiling tasks on very large amounts of data. In this case, set the distribution level to Sub-table. It is important that these machines share an efficient hard disk and that they are connected by a fast network.
Related Topics
• Scalability and performance considerations
• Grid computing
SAP BusinessObjects Information Steward allows direct integration with the SAP Business Warehouse
and the SAP ERP Central Component (ECC) system. One of the main benefits is the ability to connect
directly to the production SAP systems to perform data profiling and data quality analysis based on the
actual, most timely data, instead of connecting to a data warehouse, which is loaded infrequently. To
fully utilize this advantage without risking the performance and user experience on the production ECC
system, consider these requirements.
To utilize the back-end resources in the most efficient and sustainable way, the connection user defined for the interaction between Information Steward and the SAP back-end system should be set up for background processing. This is the recommended setup for the connection user.
The option to use a dialog user for the connection between Information Steward and the SAP back-end system should be considered carefully, and only for smaller data sets not exceeding 50,000 records from a medium-width table. In this case, the synchronous processing blocks a dialog process and its resources for the duration of the processing. Using this approach on larger data sets would require changing the heap size for the maximum private memory of a dialog process, and therefore affects the overall memory required by the ECC system. In addition, the extraction of larger data sets will most likely significantly exceed the recommended maximum work-process runtime for a dialog process on the ABAP Web Application Server. The extraction runtime varies with the number of records and the number of columns of the table from which the data is retrieved. Because of the significant resources required and the extended runtime and allocation of a dialog process, using a dialog user for the communication is not recommended.
You can connect to the SAP ECC system and retrieve data for profiling and data quality analysis using
one of the following methods.
• Transfer the data directly via the RFC connection established.
In this case the data is stored in an internal table first and then transferred via the synchronous RFC
connection.
• Asynchronous processing scenario 1.
The extracted data is written into a file to a shared directory for both the ABAP Web Application and
the Information Steward. After the extraction is completed, Information Steward picks up the file for
further processing.
For this first asynchronous processing scenario, specify the SAP Applications connection parameters
Working directory on SAP server and Application Shared Directory, and set Data transfer
method to Shared directory.
• Asynchronous processing scenario 2.
In this case the data is written to a directory on the back-end server, which is made available to
Information Steward via an FTP server connection. Similar to the second scenario, the data files
are then picked up by Information Steward from the directory specified, in this case via the FTP
connection path.
For this second asynchronous processing scenario, specify the Working directory on SAP server
and the FTP parameters when you define the SAP Applications connection, and set Data transfer
method to FTP.
Data transfer via options two and three has slightly longer turnaround times, but consumes fewer resources on the back-end system.
It is recommended that you perform optimized requests for data extraction from the SAP system when
you define Information Steward profiling tasks. This optimization should be considered in all scenarios.
Therefore, specify only a subset of the columns in your profile task and use filters to focus on a specific
data set. The more optimized the request, the faster the extraction and overall process execution. In
this case, Information Steward only requests the data for extraction as defined by the filter. The filter
criteria are passed to the SAP system.
Related Topics
• SAP Applications connection parameters
When reading flat files, the Data Services engine can parallelize the tasks in multiple threads to achieve
higher throughput. With multi-threaded file processing, the Data Services engine reads large chunks
of data and processes them simultaneously. This way, the CPUs are not waiting for the file input/output
operations to finish.
You can set the number of file processing threads to the number of CPUs available. Use this setting
when you have a large file to run profile or rule task on.
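The idea of multi-threaded file reading can be sketched by splitting a file into byte ranges and scanning them concurrently. This mimics the concept only; it is not how the Data Services engine is implemented:

```python
from concurrent.futures import ThreadPoolExecutor
import os
import tempfile

def count_lines_in_chunks(path, num_threads=4):
    """Split the file into byte ranges and scan each range in its own
    thread, so CPUs are not idle waiting on one sequential read."""
    size = os.path.getsize(path)
    chunk = max(1, size // num_threads)
    ranges = [(i, min(i + chunk, size)) for i in range(0, size, chunk)]

    def scan(byte_range):
        start, end = byte_range
        with open(path, "rb") as f:
            f.seek(start)
            return f.read(end - start).count(b"\n")

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return sum(pool.map(scan, ranges))

# Quick self-check against a 100-line temporary file.
with tempfile.NamedTemporaryFile("wb", delete=False, suffix=".txt") as tmp:
    tmp.write(b"row\n" * 100)
lines = count_lines_in_chunks(tmp.name, num_threads=4)
# lines -> 100
```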
Related Topics
• Scalability and performance considerations
This optimization applies to all scheduled profile and rule tasks. Information Steward provides the ability to optimize away redundant Data Insight tasks: if the profile or rule result set for a particular table is already available, the table is not processed again. This is controlled by the Optimization Period setting.
Data in certain tables does not change very often, or the result set only needs to be refreshed at a specified frequency, so there is no need to execute the task again within that period. If the data is not going to change within a profiling task's optimization period, there is no need to process the task again. Similarly, if the scorecards are calculated only on a nightly basis, there is no need to recalculate the score more frequently.
Suppose the Optimization Period is set to 24 hours. A rule task was executed for Table1 and the
result set is already stored. If that same task is run again within 24 hours, it is not processed again.
Imagine another case where a single rule task involves Table1 and Table2. In this case, rules are not
executed on Table1, but they are processed for Table2 because this table does not have a result set
available.
If you want to always get the latest results due to the changing nature of the data, set the Optimization
Period to the expected period of change in data.
Note:
Any profiling or rule tasks that are run on demand do not use this optimization and all of the data is
processed.
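The Optimization Period check amounts to a time-to-live test on stored results. The sketch below is an illustrative model; the cache structure and function names are assumptions, not the repository's actual schema:

```python
import time

def needs_processing(table, result_timestamps, optimization_period_hours=24, now=None):
    """Return True if the table has no stored result set, or the stored
    result is older than the Optimization Period."""
    now = time.time() if now is None else now
    last_run = result_timestamps.get(table)
    if last_run is None:
        return True  # no result set yet: process the table
    return (now - last_run) > optimization_period_hours * 3600

cache = {"Table1": 1000.0}   # Table1's results stored at t = 1000 seconds
now = 1000.0 + 3 * 3600      # a task for Table1 arrives three hours later
skip_t1 = not needs_processing("Table1", cache, 24, now)  # reuse results
run_t2 = needs_processing("Table2", cache, 24, now)       # no results yet
```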
Related Topics
• Scalability and performance considerations
For Data Insight functionality, Information Steward provides the following settings to improve performance.
Related Topics
• Scalability and performance considerations
• Input data settings
These settings are available for both profile and rule tasks when you create a task. Defaults are set in
"Configure Applications".
Filter condition
You can also control exactly which data gets processed using the filter condition. Because the number
of records affects performance, set the filter condition and process only the amount of data required.
Suppose you have 10 million records for all countries and there are 1 million records for the U.S. If you
are interested in profiling data for the U.S. only, you should set the filter for country = US. This way,
only 1 million records are processed.
You can combine this with Max Input Size and Sampling Rate to further improve performance, if
applicable.
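The combined effect of these settings can be estimated with a simple model. The order in which the filter, Max Input Size, and Sampling Rate are applied here is an assumption for back-of-the-envelope sizing, not a statement about engine internals:

```python
def plan_input(total_rows, filter_selectivity=1.0, max_input_size=None, sampling_rate=1):
    """Rough estimate of rows a Data Insight task reads: apply the
    filter, cap at Max Input Size, then keep every Nth sampled row."""
    rows = int(total_rows * filter_selectivity)
    if max_input_size is not None:
        rows = min(rows, max_input_size)
    return rows // sampling_rate

# 10 million rows; a filter such as country = 'US' keeps ~10%; sampling
# every 2nd row then halves the work again.
rows = plan_input(10_000_000, filter_selectivity=0.10, sampling_rate=2)
# rows -> 500000
```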
Related Topics
• Scalability and performance considerations
• Performance settings for input data
• Configuration settings
For Data Insight functionality, Information Steward provides the following settings to control the size of
the Information Steward repository. This depends on the number of records as well, because the more
data, the bigger the potential result set. However, the repository size can be controlled for even very
large amounts of data.
Related Topics
• Scalability and performance considerations
• Profiling
• Rule processing
• Metadata Management
10.4.11.1 Profiling
The following settings affect the size of the Information Steward repository. Choose these numbers
carefully based on what your data domain experts require to understand data. The lower the number,
the smaller the repository.
You can set a high number, but the repository size increases. Also, response time for viewing sample
data is affected because more rows need to be read from the database, transported over the network,
and rendered in the browser.
• Max sample data size: The sample size for each profile attribute.
• Number of distinct values: The number of distinct values to store for the value distribution result.
• Number of patterns: The number of patterns to store for the pattern distribution result.
• Number of words: The number of words to store for the word distribution result.
• Results retention period: Controls the number of days before the profiling results are deleted. The
longer you keep the results, the bigger the repository. For more information, see the Purge utility.
Related Topics
• Scalability and performance considerations
• Settings to control repository size
• Profiling task settings and rule task settings
• Max sample data size: Controls the number of failed records to save for each rule. The higher the number, the more records are available to view as sample data. This results in a larger repository, and the response time for viewing sample failed data is affected.
• Score retention period: Controls the number of days before the scores are deleted. The longer you keep the score data, the larger the repository. For more information, see the Purge utility.
Related Topics
• Scalability and performance considerations
• Settings to control repository size
• Profiling task settings and rule task settings
The size of the Information Steward repository is also controlled by the amount of metadata that is
collected and retained. Optimize the amount by selectively choosing the components that you are
interested in for different metadata integrators.
Related Topics
• Scalability and performance considerations
• Settings to control repository size
1. JVM arguments for metadata integrators that collect a large amount of metadata should be adjusted
for higher memory allocation. It is recommended that you update the JVM parameters on the integrator
parameters page to -Xms1024m -Xmx4096m.
2. Run-time parameters for the maximum number of concurrent processes to collect metadata should
be set to the number of CPUs that can be dedicated for metadata collection. Typically metadata
integrators are installed on independent servers, so you can set it to the number of CPUs on the
server. These parameters for parallel processing may be different for different metadata integrators.
3. Each metadata integrator provides some method of performance improvement specifically for the
type of metadata that it collects. For example, with SAP BusinessObjects Enterprise Metadata
Integrator, you can reduce processing time by selectively choosing different components.
4. The first time the integrator is run, the run-time parameter Update Option for metadata integrators should be set to Delete existing objects before collection. For subsequent runs, change it to Update existing objects and add newly selected objects. For example, with SAP BusinessObjects Enterprise Metadata Integrator, you can first collect metadata only for Web Intelligence documents (by selecting only that component, as described in step 3). For the first run, set the option to Delete existing. In the next run, you may collect all of the Crystal Reports metadata; for that second run, set the option to Update existing.
Related Topics
• Scalability and performance considerations
• Common runtime parameters for Information Steward
The Metadata Management utilities Compute Lineage Report and Update Search index have some
configuration parameters. Run Compute Lineage in Optimized mode so that it is updated incrementally.
Related Topics
• Scalability and performance considerations
If there are many parsed values (more than 20) per row in the data used for Cleansing Package Builder, the Auto-analysis service requires an adjustment to the JVM runtime parameter that controls memory allocation for the Enterprise Information Management Adaptive Processing Server. If the memory cap for Java is left at 1 GB, even though the system has 16 GB, the service will run out of memory. The best practice is to allocate 2-3 GB of memory to the Java services via the -Xmx setting.
Related Topics
• Scalability and performance considerations
Schedule cleansing package processing during non-business hours. Depending on the size of the
cleansing package, publishing can be a time consuming task. SAP-supplied cleansing packages for
Name-Title-Firm are typically very large. If they are updated and published, schedule the processing
during non-business hours. It may be a good idea to have a dedicated server for publishing, if this is a
frequent occurrence.
Related Topics
• Scalability and performance considerations
• When using a distributed environment, enable and run only the servers that are necessary. For more
information, see the Business Intelligence Platform Administrator Guide.
• Use dedicated servers for resource intensive servers like the Data Services Job Server, Metadata
Integrators, and the Cleansing Package Builder Auto-Analysis service.
• Install the Information Steward Web Application on a separate server. The Business Intelligence platform Web Tier must be installed on the same computer as the Information Steward Web Application. If you do not use the bundled Tomcat, you need to manually deploy the Information Steward Web Application.
• If you have many concurrent users, you can use multiple Information Steward web applications with
Load Balancer.
• To obtain a higher throughput, the Information Steward repository should be on a separate computer
but in the same sub-network as the Information Steward Web applications, Enterprise Information
Management Adaptive Processing Server, Information Steward Job Server, and Data Services Job
Server.
• Make sure that the database server for the Information Steward repository is tuned and has enough
resources.
• Allocate enough memory and hard disk space to individual servers as needed.
• Follow good scheduling practices to make sure that resource intensive tasks do not overlap each
other. Schedule them to run during non-business hours so that on-demand request performance is
not affected.
Related Topics
• Installation Guide: Deploying web applications with WDeploy
• SAP BusinessObjects Business Intelligence platform Web Application Deployment Guide: Failover
and load balancing
• Data Insight best practices
• Metadata Management best practices
• Cleansing Package Builder best practices
1. If you expect your Data Insight profiling and rule tasks to consume a large amount of processing resources, deploy the Data Services Job Server on a separate computer.
2. To improve the execution of Data Insight profiling tasks, the Data Services Job Server can be on
multiple computers that are separate from the web application server to take advantage of Data
Services job server groups and parallel execution. You must access the Data Services Server
Manager on each computer to do the following tasks:
• Add a Data Services Job Server and associate it with the Information Steward repository. For
more information, see “Adding Data Services Job Servers for Data Insight” in the Administrator
Guide.
• Specify the path of the pageable cache that will be shared by all job servers in the Pageable
cache directory option.
3. For a predictable distribution of tasks when using multiple Data Services Job Servers, try to ensure
that the hardware and software configurations are homogeneous. This means that they should all
have similar CPU and RAM capacity.
4. Irrespective of using DOP and/or multiple Data Services Job Servers, set the pageable cache
directory on high speed and high capacity disk.
5. If you are processing flat files, store them on a high speed disk so that read performance is good.
6. Process only data that must be processed. Use settings such as Max Input Size, Sampling rate,
and Filter conditions appropriately.
7. When using Information Steward views, use correct join and filter conditions so that you are pulling
in only required rows.
8. Choose only columns that you are interested in profiling or the rules that you are interested in
calculating for the score. Selecting all columns may lead to redundant processing.
9. Perform word distribution profiling on only a few columns. Do not choose it for all columns;
otherwise, performance and the size of the Information Steward repository are affected.
10. Lookup functions in rule processing take more time and disk space. Typically, lookup tables are
small, but large lookup tables can adversely affect performance. Ensure that the tables on which
lookup is performed are small. As an alternative to lookup, you can use the SQL function.
11. Choose the DOP setting and the distribution level carefully. Remember that these are global settings
and affect the entire landscape.
12. Use the distribution level Sub-Table only when doing column profiling on large amounts of data
with many small-capacity Data Services Job Servers. This setting can have an adverse effect on
other types of tasks, so you may want to change it back to Table level after the column profiling
task is done.
13. Store the reference data required for address profiling on a high speed and high capacity disk.
14. Make sure that the database server that contains source data is tuned and has enough resources.
15. Schedule the Purge utility to run during non-business hours. If column profiling and rules are executed
many times a day, schedule the Purge utility to run more than once a day to increase free disk
space in the repository.
Related Topics
• General best practices
• Using SAP applications as a source
• Organizing projects
• User Guide: Creating a rule
1. Choose the data retrieval method (synchronous or asynchronous) based on the data set size and
performance requirements of your SAP system.
2. For smaller data sets, use the synchronous method.
3. For larger data sets, use the asynchronous method, where the data from SAP systems is written
into a file that Information Steward uses.
Background processing on the SAP back end is recommended; it is controlled by the user type of
the connection that the user defined.
Consider dialog processing carefully. In that case, adjust run-time parameters such as the heap size
for the maximum private memory and the maximum work process runtime.
Related Topics
• SAP Applications connection parameters
How scorecards and projects are organized depends on how business users want to view the scorecards,
but this organization can affect response time for those users. If a project contains many scorecards,
details of all the scorecards must be retrieved for viewing. Therefore, it is a good idea to create multiple
projects according to the area of interest and to limit the number of scorecards in each project.
For example, you could create projects for different geographical locations. Within each project, you
could have different scorecards, such as Customer, Vendor, and so on. Or you could create projects
based on Customer, Vendor, and so on, and then have scorecards based on geography.
The organization also helps you decide what data sources you want to use and the filter conditions
involved for profiling and rule execution.
Another benefit of proper organization is that you can control the user security per project and restrict
access to only specific users.
To avoid future problems, the different Data Insight user groups should work together at the beginning
of the project to decide these aspects.
Related Topics
• General best practices
• Profiling and rules
• Using SAP applications as a source
1. Metadata integrators for BusinessObjects Enterprise, Data Services, and SAP Business Warehouse
should be installed on their own dedicated servers if they require long processing times or they run
in overlapping time periods with other metadata integrators or Data Insight tasks.
2. The Metadata Relationship Service and Metadata Search Service can be combined.
3. Additional guidelines to consider for Metadata Relationship Service:
• Should be on a separate computer from the web application server to obtain higher throughput.
• Can be on its own computer, or it can be combined with any Metadata Integrator. The rationale
for this combination is that Metadata Integrators usually run at night or other non-business hours,
and the Metadata Relationship Service runs during normal business hours when users are viewing
relationships (such as impact and lineage) on the Information Steward web application.
4. Another guideline to consider for Metadata Search Service:
• Can be on its own computer, or it can be combined with any Metadata Integrator. The rationale
for this combination is that Metadata Integrators usually run at night or other non-business hours,
and the Search Server runs during normal business hours when users are searching on the
Metadata Management tab of Information Steward.
5. The File Repository Servers should be installed on a server with a high-speed, high-capacity
disk.
6. Adjust runtime parameters correctly.
Related Topics
• General best practices
1. The Cleansing Package Builder Auto-Analysis service should be on a dedicated server to obtain
higher throughput.
2. Use sample data that represents the various patterns in your whole data set. Large amounts of
sample data with too many repeating patterns lead to redundant processing overhead.
3. Set run-time parameters correctly for the Auto-Analysis service; memory requirements are
especially important. If the memory cap for Java is left at the default 1 GB, the service can run out
of memory even on a system with 16 GB of RAM. The best practice is to allocate 2 to 3 GB of
memory to the Java services via the -Xmx setting.
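For illustration, the -Xmx setting appears among the Java options used to launch the service. The command below is a hypothetical sketch: the JAR name is a placeholder, not the actual service binary; only the heap options are the point.

```shell
# Hypothetical launch line; only -Xms/-Xmx matter here.
# -Xmx3g raises the Java heap cap from the 1 GB default to 3 GB.
java -Xms512m -Xmx3g -jar autoanalysis-service.jar
```

In practice, set this value wherever your deployment defines the service's Java options, such as the server's command-line parameters in the Central Management Console.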
Related Topics
• Common runtime parameters for Information Steward
• General best practices
Backing Up and Restoring Metadata Management
This section describes suggested procedures for performing backups and restorations of various objects
associated with this product.
Related Topics
• Migration mechanisms and tools
This application provides an XML export utility, MMObjectExporter.bat, which allows you to export
repository objects to an XML file. The utility is installed on the machine on which you installed this
product. You specify its output by using required and optional command line arguments.
If no objects found in the repository meet the mainObject requirements, the utility exits with an
error message. If some but not all of the objects are found, the utility generates the XML file but
writes a warning message to the log about the objects that were not found.
After the utility runs, this product creates an XML file according to the specified arguments.
Example:
In this example, at a command prompt positioned in the installation directory's subdirectory, the user
enters the following command and arguments:

mmobjectexporter 82 "c:\temp\first exported.xml" boeUser Jane boePassword My1Password mainObject Universe
Use these procedures when you want to back up and restore this application's configurations from one
system to another.
Note:
Use the backup utility in your relational database management system to back up the repository that
contains all MMT_* tables. Use the Lifecycle management console for SAP BusinessObjects BI platform
to back up and restore configurations of this product.
You must have the current version of this product on both the source and target machines.
To back up your configurations, use the Lifecycle management console for SAP BusinessObjects BI
platform to create an output BIAR file for Metadata Management configuration information on the source
SAP BusinessObjects BI platform system:
1. On the "Destination environment" screen, specify a BIAR file that is accessible to both the source
and target SAP BusinessObjects BI platform machines.
2. On the "Select objects to import" screen, select only the Import application folders and objects
option.
3. On the "Select application folders and objects" screen, select Metadata Management.
4. On the "Select objects to import" screen, select the users and groups that have permissions to the
Metadata Management integrator source configurations.
5. On the next screen, select the users and groups that have permissions to the Metadata Management
integrator source configurations.
The following configuration information is backed up as a result of this procedure:
• Metadata Integrator source configuration
• Metadata Management utilities configurations
• Metadata source groups
• Security information (users, groups, and their permissions)
You must have this version of the application on both the source and target machines.
Life Cycle Management
Each phase could involve a different computer in a different environment with different security settings.
For example, the initial test may require only limited sample data and low security, while final testing
may require a full emulation of the production environment including strict security.
In this phase, you define and test the following objects for each module of SAP BusinessObjects
Information Steward:
• Data Insight—Define profile tasks, rules, and scorecards that instruct Information Steward in your
data quality requirements. The software stores the rule definitions so that you can reuse them or
modify them as your system evolves.
• Metadata Management—Define integrator sources, integrator source groups, and integrator source
instances that collect metadata to determine the relationships of data in one source to data in another
source.
After you define the objects, use SAP BusinessObjects Information Steward to test the execution of
your application. At this point, you can test for errors and trace the flow of execution without exposing
production data to any risk. If you discover errors during this phase, you can correct them and retest
the application.
The software provides feedback through trace, error, and monitor logs during this phase.
The testing repository should emulate your production environment as closely as possible, including
scheduling Data Insight tasks and Metadata Integrator runs rather than manually starting them.
In this phase, you set up a schedule in the Central Management Console (CMC) to run your Data Insight
tasks and Metadata Integrator runs as jobs. Evaluate results from production runs and when necessary,
return to the test phase to optimize performance and refine your target requirements.
After you move the software into production, monitor it in the CMC for performance and results. During
production:
• Monitor your Data Insight tasks and Metadata Integrator runs and the time it takes for them to
complete.
The trace and monitoring logs provide information about each task and run.
You can customize the log details. However, the more information you request in the logs, the longer
the task or integrator runs. Balance run time against the information necessary to analyze
performance.
• Check the accuracy of your data.
Lifecycle management console for SAP BusinessObjects BI platform is a web-based tool that enables
you to move BI resources from one system to another system, without affecting the dependencies of
these resources. It also enables you to manage different versions of BI resources, manage dependencies
of BI resources, and roll back a promoted resource to restore the destination system to its previous
state.
You can use the lifecycle management console for SAP BusinessObjects BI platform to move objects
in the Central Management System (CMS) between the same versions. For example, when you move
objects from the test system to the production system, you can accomplish the task through the lifecycle
management console.
You can use the lifecycle management console to move the following Information Steward objects:
• Integrator source configurations
• Metadata Management utilities
• Source groups
• Security information (including users, groups, and their permissions) for Metadata Management,
Metapedia, Data Insight, and Cleansing Package Builder. Most Information Steward security
information is stored at the folder level, so to move all of the security settings from one system to
another, promote each folder. The lifecycle management console has an option that lets you choose
whether to promote a job with its associated security and whether to include application rights.
Using Information Steward import and export functionality, you can move the following objects:
• Data Insight file formats: For more information about importing and exporting file formats, see the
“Data Insight” section of the SAP BusinessObjects Information Steward User Guide.
• Data Insight rules: For more information about managing rules, see the “Data Insight” section of
the SAP BusinessObjects Information Steward User Guide.
• Data Insight views: For more information about importing and exporting views, see the “Views”
section of the SAP BusinessObjects Information Steward User Guide.
• Metapedia terms and categories: For more information about importing and exporting terms and
categories with Excel, see the “Metapedia” section of the SAP BusinessObjects Information Steward
User Guide.
Supportability
For each profiling and rule task and Metadata Integrator run, SAP BusinessObjects Information Steward
writes information in the following logs:
• Database Log - Use the database log as an audit trail. This log is in the Information Steward
Repository. You can view this log while the Metadata Integrator or Data Insight profile or rule task
is running.
The default logging level for the database log is Information, which writes informational messages,
such as the number of reports processed, as well as any warning and error messages. It is recommended
that you keep the database log at a high (less verbose) logging level so that the log does not occupy
a large amount of disk space.
• File Log - Use the file log for more information about a Metadata Integrator or Data Insight
profile or rule task run. The Metadata Integrator creates this log in the Business Objects installation
directory and copies it to the File Repository Server. You can download this log file after the Metadata
Integrator run has completed.
The default logging level for the file log is Configuration which writes static configuration
messages, as well as informational, warning, and error messages. You can change the logging level
for the file log if you want more detailed information. If your logs are occupying a large amount of
space, you can change the maximum number of instances or days to keep logs.
Each logging level logs all messages at that level or higher. Therefore, the default logging level
Information logs informational, warning, and error messages. If you change the logging level to Warning,
SAP BusinessObjects Information Steward logs only warning and error messages. Similarly, if you change
the logging level to Integrator trace, Information Steward logs trace, configuration, informational, warning,
and error messages.
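This cumulative behavior can be sketched in a few lines. The level names follow the descriptions above; the numeric ranking and the function are our illustration, not part of the product:

```python
# Rank the logging levels from most to least verbose (illustrative assumption
# based on the descriptions above).
LEVELS = ["Integrator trace", "Configuration", "Information", "Warning", "Error"]

def is_logged(message_level: str, configured_level: str) -> bool:
    """A message is written if its level is at the configured level or higher."""
    return LEVELS.index(message_level) >= LEVELS.index(configured_level)

# With the default level Information, warning messages are logged,
# but configuration messages are not.
print(is_logged("Warning", "Information"))        # True
print(is_logged("Configuration", "Information"))  # False
```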
You can change the log levels for Metadata Management and Data Insight logs.
To change the Metadata Management log levels, you must have the Schedule right on the integrator
source.
1. On the Information Steward page in the Central Management Console (CMC) , expand the Metadata
Management node, and expand the Integrator Sources node to display all configured integrator
sources.
2. Select the integrator source for which you want to change the logging level by clicking anywhere on
the row except its type.
Note:
If you click the integrator type, you display the version and customer support information for the
integrator.
6. Click Schedule.
Future runs of the recurring schedule for this integrator source will use the logging level you specified.
The "Integrator History" pane displays each schedule in the right panel.
4. Select the schedule name and click the icon for View the database log.
The "Database log" shows the task messages, which are a subset of the messages in the log file.
5. To find specific messages in the "Database log" window, enter a string in the text box and click
Filter.
For example, you might enter error to see if there are any errors.
Note:
• For information about troubleshooting Cleansing Package Builder, see “Troubleshooting” in the
User Guide.
• For information about troubleshooting integrator sources, see Troubleshooting.
6. To close the "Database log" window, click the X in the upper right corner.
You can also view log information for SAP BusinessObjects Data Services and SAP BusinessObjects
Enterprise XI 4.0.
Look for log files associated with job execution, for example errorlog.txt and tracelog.txt.
Appendix
14.1 Glossary
accuracy
The extent to which data objects correctly represent the real-world values for which they
were designed.
address profiling
A process that componentizes and measures address data with dictionary data.
alternate
A substitute spelling or nickname; Mr. is an alternate for Mister.
annotation
User notes added to an object in Metadata Management and Data Insight.
association
A relationship between terms contained in a Metapedia business glossary and metadata
objects.
catalog
A relational object type that, in a relational database management system (RDBMS),
corresponds to a database. Each deployment contains two such object types, one for the
datasources and one for the target tables.
category
The organization system for grouping terms to denote a common functionality. Categories
can contain sub-categories, and you can associate terms to more than one category.
cleansing package
The parsing rules and other information that define how to parse and standardize the data
of a specific data domain.
Cleansing Package Builder
A module of Information Steward that allows a data steward to create and modify cleansing
packages for any data domain. A cleansing package is then used to process the data in
accordance with package guidelines through SAP BusinessObjects Data Services.
cleansing package category
The organization system for grouping terms to denote a common functionality. Cleansing
package categories can contain sub-categories, and you can associate terms to more than
one category.
completeness
The extent to which data is not missing.
conformity
The extent to which data conforms to a specified format.
consistency
The extent to which distinct data instances provide non-conflicting information about the
same underlying data object.
context definition
A method that allows users to specify context when data contains a pattern or contains
parsed values that have a special meaning when used together, such as a range of
acceptable values.
custom attribute
Properties you add to existing metadata objects that, once defined, can be searched for
and viewed.
custom cleansing package
The parsing rules and other information that you have defined in order to parse and
manipulate all types of data including operational and product data.
data insight project
A collaborative space for data stewards and data analysts to assess and monitor the data
quality of a specific domain and for a specific purpose (such as customer quality
assessment, sales system migration, and master data quality monitoring).
data steward
A person who manages data as an asset, is an expert in his data domain and is responsible
for the quality of the data.
database
One or more large structured sets of persistent data, usually associated with software to
update and query the data. A relational database organizes the data, and the relationships
between them, into tables.
datasource schema
The definition of a table’s columns and primary keys.
dependency profiling
A process that determines whether the data in one column or table is based on the results
of another column or table.
dimension
A logical grouping of characteristics within an InfoCube.
directory structure
A hierarchy within Information Steward that is organized into folders that contain four
categories, namely Data Integration, Business Intelligence, Data Modeling, and Relational
Databases.
extract
A process by which Information Steward copies information from source systems and loads
it into the repository.
file format
Flat file definition which includes the column name, data type, delimiter or character width.
This is equivalent to the schema for a relational database table.
impact diagram
A graphical representation of the object(s) that will be affected if you change or remove
other connected objects.
InfoCube
A type of InfoProvider that describes a self-contained dataset, for example, from a business
oriented area.
integrator source
A named set of parameters that describes how a metadata integrator can access a data
source.
integrity
The extent to which data is not missing important relationship linkages.
key data domains
A set of related data objects or key data entities.
lineage diagram
A diagram that shows where the data comes from and what sources provide the data for
this object.
metadata integrator
An application that collects information about objects in a source system and integrates it
in one or more related source systems.
metadata object
A unit of information that the software creates from an object in a source system.
Metapedia
A custom glossary within Information Steward that you use to define and organize terms
and categories related to your business data.
Metapedia category
The organization system for grouping Metapedia terms to denote a common functionality.
Categories can contain sub-categories, and you can associate Metapedia terms to more
than one category.
Metapedia term
A word or phrase that defines a business concept in your organization.
MultiProvider
An SAP NetWeaver Business Warehouse object that combines data from several
InfoProviders and makes it available for reporting.
object equivalency rule
A naming rule that indicates that an object in one source system is the same physical
object in another source system.
object tray
A temporary holding space for objects that you want to export or define a relationship for
in Metadata Management.
Open Hub Destination
An SAP NetWeaver Business Warehouse object within the open hub service that contains
all information about a target system for data in an InfoProvider. The target system can be
external.
parent-child
A hierarchical relationship where one object is subordinate to another. In this hierarchy,
the parent is one level above the child; a parent can have several children, but a child can
have only one parent. For example, a table can have multiple columns, but a column can
belong to only one table.
parsed value
A data string that results from parsing.
parsing rule
A rule that determines how data is classified based on a pattern within the data and how
the data is mapped to specific attributes.
person and firm cleansing package
A cleansing package that parses party or name and firm information such as given name,
family name, prename, title, phone number, and firm or company name.
private cleansing package
A cleansing package that can be viewed or edited only by the user who owns it.
profile
A process that generates attributes about the data such as minimum and maximum values,
pattern distribution and data dependency to help data analysts discover and understand
data anomalies.
profile
To generate attributes about the data.
profile task
A task to profile one or more tables, views and/or flat files. This task can be scheduled or
executed on demand.
properties file
A collection of information that appears on the Report tab of each report; it includes such
information as the name and description of the report, and the source(s) of the information
contained in the report.
published cleansing package
A cleansing package that is either SAP-supplied or created and then is published by a
data steward, is available to all users, and can be used in a Data Services transform.
quality dimensions
A category for rules such as accuracy and completeness. This helps to organize your rules
and provides a score that contributes to the scorecard value.
query
An SAP NetWeaver Business Warehouse object consisting of a combination of
characteristics and key figures (InfoObjects) that allow you to analyze the data in an
InfoProvider.
query views
An SAP NetWeaver Business Warehouse object consisting of a modified view of the data
in a query or an external InfoProvider.
redundancy profiling
A process that measures the amount of repeated data.
rule task
A task to run rules bound to one or more tables, views and/or flat files. This task can be
scheduled or executed on demand.
same as relationship
The association between two objects indicating that they are identical physical objects.
Only objects of the same object type can have this kind of association.
schema
A definition of a table in a relational database.
score
A numerical result calculated by counting the records that pass a rule divided by the total
number of records.
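The definition above amounts to a simple ratio, which can be sketched as follows (the function name is ours, for illustration only):

```python
def rule_score(records_passing: int, total_records: int) -> float:
    """Score = records that pass the rule divided by the total number of records."""
    if total_records <= 0:
        raise ValueError("total_records must be positive")
    return records_passing / total_records

# 950 of 1,000 records pass the rule.
print(rule_score(950, 1000))  # 0.95
```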
scorecard
A high level data quality view of a key data domain based on business data quality
objectives.
scripting language
Expression language used to write validation rules.
server instance
A database, data source, or service in a relational database management system.
source
An object that provides data that is copied or transformed to become part of the target
object.
source group
A set of related integrator sources.
source system
A software application from which SAP BusinessObjects Information Steward extracts and
organizes metadata into directory structures, enabling you to navigate and analyze the
metadata.
standard form
The standardized or normalized form of a variation, which is displayed after cleansing.
sub-category
Within a category, the organization system for grouping terms to denote a common
functionality.
synonym
Another name for an object in the same system. For example, a synonym for a relational
table exists in the same database as the table.
target schema
A set of tables. A project can only contain one target schema, and its name is always target
schema.
timeliness
The extent to which data is sufficiently up-to-date for the task at hand.
transfer rules
An SAP NetWeaver Business Warehouse object that determines how the data for a
DataSource is to be moved to the InfoSource. The uploaded data is transformed using
transfer rules.
transformation
An SAP NetWeaver Business Warehouse object that consists of functions for unloading,
loading, and formatting data between different data sources and data targets that use data
streams.
transformation name
The identity of the universe object, if the target is a measure, or the data flow name, if the
source data was taken from an Extract, Transform, and Load (ETL) system.
uniqueness
The extent to which the data for a set of columns is not repeated.
uniqueness profiling
A process that determines whether the exact piece of data is repeated within the same
column or differing columns.
usage scenario
An example that is typical of the kinds of tasks you’d like to perform with the software.
validation rules
A method that assesses the quality of data in the source system. These rules are bound
to one or more columns to derive a score.
variation
A value that has been assigned to an attribute.
web template
An SAP NetWeaver Business Warehouse object consisting of an HTML document that
determines the structure of a Web application.
workbook
An SAP NetWeaver Business Warehouse object consisting of a Microsoft Excel spreadsheet
with one or more embedded NetWeaver Business Warehouse queries.
214 2011-04-06
Index
A catalog 209 concurrent tasks 175
categories and terms concurrent users
accessing Information Steward 9 importing and exporting 201 performance 168
accuracy 209 category 209 performance factors 170
adaptive job server 19 Central Management Console response time 169
adaptive processing server 19 See CMC 19 configurations, backing up 196
add collected metadata 127 Central Management Server configurations, backing up³ 196
address profiling 209 See CMS 19 configurations, restoring 197
administration chunk of data 209 configure
tasks for Metadata Management cleansing package 209 utility 186
105 Cleansing Package Builder 209 configuring
advanced profiling 176 best practices 191 Common Warehouse Metamodel
alias 209 distributed processing 173 Metadata Integrator 109
alternate 209 performance 186 Metadata Integrator,
annotation 209 performance factors 170 BusinessObjects Enterprise
assigning rights related services 173 106
Metadata Management tasks 59 resource intense 165 Data Federator Metadata Integrator
association 209 scalability 186 111
asynchronous processing security rights 61 Data Services Metadata Integrator
SAP as a source 180 cleansing package category 209 112
Average Concurrent Tasks 175 cleansing packages Meta Integration Metadata Bridge
deleting 139 (MIMB) Metadata Integrator
editing descriptions 140 113
B group rights 61 Metadata Browsing Service 157,
backing up owners 139 158
integrator configurations 196 publishing 187 NetWeaver Business Warehouse
best practices states and statuses 141 Metadata Integrator 108
Cleansing Package Builder 191 unlocking 140 View Data Service 157, 159
Data Insight 188 CMC 19 configuring Metadata Integrator
general 187 CMS 19 JDBC connections 116
Metadata Management 190 collect Relational Database 114
profiling 188 Crystal Reports 127 universe connections 118
projects 190 universes 127 conformity 209
rules 188 Web Intelligence documents 127 connection parameters
SAP as a source 180 column profiling displaying and editing 85
BI platform components degree of parallelism 176 HP Neoview 69
that Information Steward uses 11 Common Warehouse Metamodel IBM DB2 70
Business Intelligence platform Metadata Integrator Informix IDS 71
components configuring 109 Microsoft SQL Server 72
how Information Steward uses 19 completeness 209 MySQL 73
BusinessObjects Enterprise Integrator components Netezza 74
configuring 106 Business Intelligence platform 19 ODBC 74
compute lineage report utility 146 Oracle database 76
configuring 153 SAP Applications 81
C description 145 SAP In-Memory Database 68
modifying configuration 152 SAP NetWeaver Business
calculate scorecard utility monitoring 151 Warehouse 80
description 145 rescheduling 149 connections
monitoring 151 run now 150 for Data Insight 66
rescheduling 149 scheduling 148 to application 79
run now 150 computing Relationship Table 146 to database 66
scheduling 148
215 2011-04-06
Index
216 2011-04-06
Index
Information Steward job server group
    description 161
Information Steward repository
    editing user and password 63
Information Steward users 36
information workflows 22
Informix IDS connection parameters 71
input data
    performance 183
    scalability 183
    settings 183
installing metadata integrators 105
integrator configurations
    backing up 196
    restoring 197
integrator source 209
Integrator source configurations
    moving 201
integrator sources
    changing limits for 122
    configuring 105
    managing tasks 119
    run-time parameters 125
    types of 105
    viewing and editing 120
integrity 209

J

JDBC connection sources
    configuring 116
job server
    configuring on Data Services 161
job server group
    adding job server 162
    deleting a job server 163
Job Server Group
    for Information Steward 161
JVM
    runtime parameters 187

K

key data domains 209
keystore and truststore
    SSL setup for Remote Job Server 30

L

lifecycle management console 201
lineage
    response time 169
    viewing from InfoView for Web Intelligence documents 106
lineage analysis
    resource intense 165
lineage diagram 209
lineage staging table
    recalculate lineage information 153
logs, viewing 131

M

match standard 209
Meta Integration Metadata Bridge (MIMB) Metadata Integrator
    configuring 113
metadata browsing
    resource intense 165
Metadata Browsing Service
    changing properties 158
metadata integrator 209
    runtime parameters 185
    scheduling 124
Metadata Integrator history
    viewing 131
Metadata Integrator logs
    viewing 131
Metadata Integrator, Common Warehouse Model
    configuring 109
metadata integrators
    options to schedule 90
    resource intense 165
    runtime parameters 97
    when you install Information Steward 105
Metadata Integrators
    definition 15
    running 123
    running immediately 123
    tasks 119
Metadata Management
    administration tasks 105
    best practices 190
    distributed processing 173
    lineage staging table 146
    performance 185
    performance factors 169
    pre-defined user groups 54
    related services 173
    repository 185
    scalability 185
    test phase 199
    updating search indexes 147
Metadata Management search indexes
    rescheduling update of 149
    scheduling update of 148
Metadata Management utilities
    moving 201
metadata object 209
metadata sources
    performance factors 169
Metapedia 209
metapedia category 209
metapedia term 209
Metapedia terms and categories
    importing and exporting 201
Microsoft SQL Server connection parameters 72
migrating
    application configurations 199
    Information Steward objects using lifecycle management console 201
migration tools 200
MMT_Alternate_Relationship table
    description 146
    scheduling computation of 148
modifying source groups 136
moving
    Information Steward objects using lifecycle management console 201
multi-threaded files
    performance 182
    scalability 182
multiple users
    performance 168
MultiProvider 209
MySQL connection parameters 73

N

Netezza connection parameters 74
NetWeaver Business Warehouse Metadata Integrator
    configuring 108
NetWeaver BW integrator source
    run-time parameters 130
notification server
    alerts 91, 94
    configuring for processing 91
    configuring for rules 94
    email notification 91, 94

O

object collection
    selective 127
object equivalency rule 209
universe connection sources
    configuring 118
unlocking
    cleansing packages 140
update search index utility 147
    configuring 153
    description 145
    modifying configuration 152
    monitoring 151
    run now 150
usage scenario 209
user data
    securing 28
user groups
    Data Insight 39
    Information Steward 35
user rights
    Data Insight objects 38
users
    adding to Information Steward groups 37
    assigning rights 48, 50
    concurrent 168
    denying rights 49
    multiple 168
utilities
    changing limits for 122
    descriptions 145
    distributed processing 174
    options to schedule 90
utility
    configuring 186

V

validation rule
    resource intense 165
validation rules 209
value 209
variation 209
View Data Service
    changing properties 159
viewing data
    resource intense 165
viewing impact and lineage
    response time 169
viewing log for utility 151
views
    importing and exporting 201
views, Data Insight
    security rights 46

W

web application server 19
Web application server 13, 14
    administration 14
Web Intelligence documents
    viewing lineage from InfoView 106
web template 209
workbook 209
workflow
    adding table to project 22
    scheduling and running a profile task 23
    scheduling and running an integrator source 23