
SAP BusinessObjects Information Steward Administrator Guide

■ SAP BusinessObjects Information Steward 4.0 (14.0.0)

2011-04-06
Copyright © 2011 SAP AG. All rights reserved. SAP, R/3, SAP NetWeaver, Duet, PartnerEdge, ByDesign, SAP
Business ByDesign, and other SAP products and services mentioned herein as well as their respective
logos are trademarks or registered trademarks of SAP AG in Germany and other countries. Business
Objects and the Business Objects logo, BusinessObjects, Crystal Reports, Crystal Decisions, Web
Intelligence, Xcelsius, and other Business Objects products and services mentioned herein as well
as their respective logos are trademarks or registered trademarks of Business Objects S.A. in the
United States and in other countries. Business Objects is an SAP company. All other product and
service names mentioned are the trademarks of their respective companies. Data contained in this
document serves informational purposes only. National product specifications may vary. These materials
are subject to change without notice. These materials are provided by SAP AG and its affiliated
companies ("SAP Group") for informational purposes only, without representation or warranty of any
kind, and SAP Group shall not be liable for errors or omissions with respect to the materials. The
only warranties for SAP Group products and services are those that are set forth in the express
warranty statements accompanying such products and services, if any. Nothing herein should be
construed as constituting an additional warranty.

Contents

Chapter 1 Getting Started........................................................................................................................9


1.1 Product overview.....................................................................................................................9
1.2 Accessing Information Steward for administrative tasks...........................................................9

Chapter 2 Architecture...........................................................................................................................11
2.1 Architecture overview............................................................................................................11
2.1.1 Servers and services..............................................................................................................12
2.2 Information Steward on Business Intelligence platform components.......................................19
2.3 Information workflows............................................................................................................22
2.3.1 Adding a table to a Data Insight project..................................................................................22
2.3.2 Profiling data .........................................................................................................................23
2.3.3 Scheduling and running a Metadata Management integrator source.......................................23
2.3.4 Creating a custom cleansing package with Cleansing Package Builder...................................24

Chapter 3 Securing SAP BusinessObjects Information Steward...........................................................27


3.1 Security Overview..................................................................................................................27
3.2 Enterprise security.................................................................................................................27
3.2.1 Securing user data for Information Steward...........................................................................28
3.2.2 Storage of sensitive information.............................................................................................28
3.2.3 Configuring the Remote Job Server for SSL..........................................................................29
3.2.4 Reverse proxy servers...........................................................................................................32

Chapter 4 Users and Groups Management...........................................................................................33


4.1 Users and Groups overview...................................................................................................33
4.2 Information Steward pre-defined users and groups................................................................33
4.3 Managing users in Information Steward..................................................................................35
4.3.1 Creating users for Information Steward .................................................................................36
4.3.2 Adding users and user groups to Information Steward groups................................................37
4.3.3 Denying access......................................................................................................................38
4.4 User rights in Data Insight......................................................................................................38
4.4.1 Data Insight pre-defined user groups......................................................................................39
4.4.2 Type-specific rights for Data Insight objects ..........................................................................40


4.4.3 Customizing rights on Data Insight objects.............................................................................48


4.5 User rights in Metadata Management....................................................................................53
4.5.1 Metadata Management pre-defined user groups....................................................................54
4.5.2 Type-specific rights for Metadata Management objects..........................................................54
4.5.3 Assigning users to specific Metadata Management objects...................................................59
4.6 User rights in Cleansing Package Builder...............................................................................61
4.6.1 Group rights for cleansing packages......................................................................................61
4.7 User rights for Information Steward administrative tasks .......................................................62
4.7.1 Viewing and editing repository information.............................................................................63

Chapter 5 Data Insight Administration...................................................................................................65


5.1 Administration overview for Data Insight................................................................................65
5.2 Data Insight Connections.......................................................................................................66
5.2.1 Defining a Data Insight connection to a database...................................................................66
5.2.2 Defining a Data Insight connection to an application...............................................................79
5.2.3 Defining a Data Insight connection to a file.............................................................................84
5.2.4 Displaying and editing Data Insight connection parameters....................................................85
5.2.5 Deleting a Data Insight connection.........................................................................................86
5.3 Data Insight projects..............................................................................................................87
5.3.1 Creating a project..................................................................................................................87
5.3.2 Editing a project description...................................................................................................88
5.3.3 Deleting a project...................................................................................................................88
5.4 Data Insight tasks..................................................................................................................89
5.4.1 Scheduling a task...................................................................................................................89
5.4.2 Recurrence options................................................................................................................90
5.4.3 Configuring for task completion notification ...........................................................................91
5.4.4 Rule threshold notification .....................................................................................................94
5.4.5 Monitoring a task...................................................................................................................95
5.4.6 Pausing and resuming a schedule...........................................................................................96
5.4.7 Common runtime parameters for Information Steward...........................................................97
5.5 Configuration settings............................................................................................................98
5.5.1 Profiling task settings and rule task settings...........................................................................99
5.5.2 Configuring profiling tasks and rule tasks.............................................................................103

Chapter 6 Metadata Management Administration...............................................................................105


6.1 Administration overview for Metadata Management ............................................................105
6.1.1 Configuring sources for Metadata Integrators......................................................................105
6.1.2 Managing integrator sources and instances ........................................................................119
6.1.3 Running a Metadata Integrator ............................................................................................123
6.1.4 Changing run-time parameters for integrator sources...........................................................125


6.1.5 Viewing integrator run progress and history ........................................................................131


6.1.6 Troubleshooting ..................................................................................................................132
6.1.7 Grouping Metadata Integrator sources ................................................................................135

Chapter 7 Cleansing Package Builder Administration.......................................139


7.1 Changing ownership of a cleansing package........................................................................139
7.2 Deleting a cleansing package...............................................................................................139
7.3 Changing the description of a cleansing package.................................................................140
7.4 Unlocking a cleansing package.............................................................................................140
7.5 Cleansing package states and statuses...............................................................................141

Chapter 8 Information Steward Utilities...............................................................................................145


8.1 Utilities overview..................................................................................................................145
8.1.1 Computing and storing lineage information for reporting.......................................................146
8.1.2 Recreating search indexes on Metadata Management.........................................................147
8.2 Scheduling a utility...............................................................................................................148
8.3 Rescheduling a utility............................................................................................................149
8.4 Running a utility on demand..................................................................................................150
8.5 Monitoring utility executions.................................................................................................151
8.6 Modifying utility configurations.............................................................................................152
8.7 Creating a utility configuration..............................................................................................153

Chapter 9 Server Management............................................................................................................155


9.1 Server management overview..............................................................................................155
9.2 Verifying Information Steward servers are running...............................................................155
9.3 Verifying Information Steward services ...............................................................................156
9.4 Configuring Metadata Browsing Service and View Data Service .........................................157
9.4.1 Metadata Browsing Service configuration parameters..........................................................158
9.4.2 View Data Service configuration parameters........................................................................159
9.5 Job server group..................................................................................................................161
9.5.1 Configuring a Data Services Job Server for Data Insight......................................................161
9.5.2 Adding Data Services Job Servers for Data Insight..............................................................162
9.5.3 Displaying job servers for Information Steward....................................................................162
9.5.4 Removing a job server..........................................................................................................163

Chapter 10 Performance and Scalability Considerations......................................................................165


10.1 Resource intensive functions...............................................................................................165
10.2 Architecture.........................................................................................................................166
10.3 Factors that influence performance and sizing......................................................................167
10.3.1 Data Insight..........................................................................................................................167


10.3.2 Metadata Management........................................................................................................169


10.3.3 Cleansing Package Builder...................................................................................................169
10.4 Scalability and performance considerations..........................................................................170
10.4.1 Scalability levels in deployment............................................................................................171
10.4.2 Distributed processing.........................................................................................................172
10.4.3 Scheduling tasks .................................................................................................................174
10.4.4 Queuing tasks......................................................................................................................175
10.4.5 Degree of parallelism...........................................................................................................175
10.4.6 Grid computing....................................................................................................................178
10.4.7 Using SAP applications as a source.....................................................................................180
10.4.8 Multi-threaded file read........................................................................................................182
10.4.9 Data Insight result set optimization.......................................................................................182
10.4.10 Performance settings for input data......................................................................................183
10.4.11 Settings to control repository size........................................................................................184
10.4.12 Settings for Metadata Management.....................................................................................185
10.4.13 Settings for Cleansing Package Builder................................................................................186
10.5 Best practices for performance and scalability.....................................................................187
10.5.1 General best practices.........................................................................................................187
10.5.2 Data Insight best practices...................................................................................................188
10.5.3 Metadata Management best practices.................................................................................190
10.5.4 Cleansing Package Builder best practices............................................................................191

Chapter 11 Backing Up and Restoring Metadata Management.............................................................193


11.1 Backing up and restoring overview.......................................................................................193
11.2 Exporting objects to XML.....................................................................................................193
11.2.1 To export objects to XML.....................................................................................................193
11.3 Backing up and restoring configurations...............................................................................196
11.3.1 Backing up configurations....................................................................................................196
11.3.2 Restoring configurations......................................................................................................197

Chapter 12 Life Cycle Management......................................................................................................199


12.1 Migration basics...................................................................................................................199
12.1.1 Development process phases..............................................................................................199
12.2 Migration mechanisms and tools..........................................................................................200
12.2.1 Moving objects using the lifecycle management console .....................................................201
12.2.2 Exporting and importing objects using Information Steward..................................................201

Chapter 13 Supportability......................................................................................................................203
13.1 Information Steward logs.....................................................................................................203
13.1.1 Log levels............................................................................................................................203


13.1.2 Changing log levels..............................................................................................................204


13.1.3 Viewing logs........................................................................................................................205
13.1.4 Viewing additional logs.........................................................................................................207

Chapter 14 Appendix.............................................................................................................................209
14.1 Glossary..............................................................................................................................209

Index 215


Getting Started

1.1 Product overview

Because operational systems change frequently, controlling data quality becomes critical when you publish
business reports. SAP BusinessObjects Information Steward provides data profiling and validation rule
features that you can use to assess and improve the quality and structure of your source data.


1.2 Accessing Information Steward for administrative tasks

You perform administrative tasks for SAP BusinessObjects Information Steward on the Central
Management Console (CMC) of the SAP BusinessObjects Business Intelligence platform.
1. Access the CMC in one of the following ways:
• Select the Central Management Console from the program group on the Windows Start menu:
Start > Programs > SAP BusinessObjects XI 4.0 > SAP BusinessObjects Enterprise > SAP
BusinessObjects Enterprise Central Management Console.
• Type the CMC URL, which includes the name of the web server machine, directly into your browser:
http://webserver:8080/BOE/CMC

Replace webserver with the name of the web server machine. If you changed the default virtual
directory on the web server, adjust the URL accordingly. If necessary, change the default port
number to the one you specified when you installed the Business Intelligence platform.
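
For example, assuming a web server named biserver that listens on port 8081 (both values are
placeholders for your own deployment, not defaults from this guide), the URL would be:
http://biserver:8081/BOE/CMC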

2. Log in to the Central Management Console (CMC) with a user name that belongs to one or more of
the following administration groups:
• Data Insight Administrator
• Metadata Management Administrator
• Administrator


For details, see "To log on to the CMC from your browser" in the BusinessObjects Enterprise
Administrator's Guide.
3. On the CMC Home page, access Information Steward in one of the following ways:
• Click the Information Steward link under the "Organize" area.
• Click the Information Steward tab on the left of your screen.
• Select the Information Steward option from the drop-down list at the top of the CMC Home
page.


Architecture

2.1 Architecture overview

Information Steward uses the SAP BusinessObjects Business Intelligence platform to manage user
security, to schedule integrator sources, tasks, and utilities, to manage integrator sources, and to provide
on-demand services. Information Steward uses Data Services to run profiling and rule tasks, to browse metadata, and to view data.

The following diagram shows the architectural components for SAP BusinessObjects Business
Intelligence platform, SAP BusinessObjects Data Services, and SAP BusinessObjects Information
Steward.
Note:
The diagram shows only the servers and services in the Business Intelligence platform and Data Services
that are relevant to Information Steward.


Related Topics
• Servers and services
• Data Services Job Server
• Web Application Server

2.1.1 Servers and services

SAP BusinessObjects Business Intelligence platform (BI platform) uses the terms server and service
to refer to the two types of software running on a BI platform machine.

The term “server” is used to describe an operating system level process (on some systems, this is
referred to as a daemon) hosting one or more services. For example, the Enterprise Information
Management Adaptive Processing Server and Information Steward Job Server are servers. A server
runs under a specific operating system account and has its own PID.

A “service” is a server subsystem that performs a specific function. The service runs within the memory
space of its server under the process ID of the parent container (server). For example, the Information
Steward Task Scheduling Service is a subsystem that runs within the Information Steward Job Server.


A “node” is a collection of BI platform servers running on the same host. One or more nodes can be on
a single host.

BI platform servers can be installed on a single machine, spread across different machines
on an intranet, or separated over a wide area network (WAN).

2.1.1.1 Data Services Job Server

The Job Server component of SAP BusinessObjects Data Services is required for the following reasons:
• The Data Services Job Server must already be installed on this computer because it provides the
following system management tools that are required during the first installation of Information
Steward:
• Repository Manager
The Repository Manager creates the required Data Insight objects in the Information Steward
repository. The Information Steward installer invokes the Repository Manager automatically to
create the repository the first time the installer is run.
• Server Manager
The Server Manager creates the Information Steward job server group and job servers and
associates them to the Information Steward repository.

To add job servers to the Information Steward job server group, you must manually invoke the
Server Manager. For details, see “Adding a job server for Data Insight” in the Installation Guide.

• The Data Services Job Server provides the engine processes that perform the Data Insight profiling
and rule tasks. The engine processes use parallel execution and in-memory processing to deliver
high data throughput and scalability.

Note:
When you install Data Services, ensure that you select Job Server under the Server component
in the "Select Feature" window of the Data Services installer.
In addition, you need to select MDS (Metadata Browsing Service) and VDS (View Data Service) during the Data
Services installation. These two options are not checked by default.

2.1.1.2 Web Application Server

SAP BusinessObjects Information Steward is deployed on:


• The Central Management Console (CMC) through which you perform administrative tasks for
Information Steward
• A web application server through which you access the Information Steward modules:


• Data Insight
• Metadata Management
• Metapedia
• Cleansing Package Builder

Note:
The Information Steward web application must be installed on the same web application server as
the SAP BusinessObjects Business Intelligence platform web applications.
For specific version compatibility, refer to the Product Availability Matrix available at
http://service.sap.com/PAM.

2.1.1.2.1 Administration
You use the Central Management Console (CMC) to perform SAP BusinessObjects Information Steward
administrative tasks such as the following:
• Define Data Insight connections and projects
• Configure and run metadata integrators
• Define source groups to subset the metadata when viewing relationships such as Same As, Impact,
and Lineage
• Administer user security for the modules of Information Steward: Data Insight, Metadata Management,
and Cleansing Package Builder
• Configure application settings that affect the behavior and performance of Data Insight profile and
rule tasks
• Schedule Data Insight profile and rule tasks
• Run or schedule Information Steward utilities
For more information, see the SAP BusinessObjects Information Steward Administrator Guide.

2.1.1.2.2 SAP BusinessObjects Information Steward web application


The SAP BusinessObjects Information Steward web application is a web-based interface through which users of each
module can perform tasks to analyze data and its metadata.
• Data Insight users can perform tasks such as:
• Profile data in tables or files and analyze the resulting profile attributes
• Define rules and set up data quality scorecards
• View data quality score trend and view sample data that failed each rule
• View data quality impact on dependent data sources
• Metadata Management users can perform tasks such as:
• View metadata from different sources and search for objects without needing to know the source
or application in which they exist
• Add annotations to an object and define custom attributes and values for an object
• Run pre-defined reports that answer typical business questions such as "Which Universe objects
are not used in my reports?" or "Which reports are using my tables?"
• View impact and lineage of objects within the same source or within different sources.


• Impact analysis - Allows you to identify which objects will be affected if you change or remove
other connected objects.
• Lineage analysis - Allows you to trace back from a target object to the source object.
• Metapedia users can perform tasks such as:
• Define Metapedia terms related to your business data and organize the terms into categories.
• Cleansing Package Builder users can perform tasks such as:
• Define cleansing packages to parse and standardize data
• Publish a cleansing package and export it to SAP BusinessObjects Data Services where users
can import it to generate a base Data Cleanse transform that can be included in jobs to cleanse
your data

For more information, see the SAP BusinessObjects Information Steward User Guide.

2.1.1.3 Metadata integrators

Metadata integrators are programs that do the following:

• Collect metadata from source systems and store it in the SAP BusinessObjects Information Steward repository.
• Can be scheduled to run at regular intervals.
• Update the existing metadata on each run.
• Can run on one or more job servers for load balancing and high availability.
• Run as separate processes.

SAP BusinessObjects Information Steward provides the following metadata integrators:

SAP BusinessObjects Enterprise Metadata Integrator
Collects metadata about objects such as universes, Crystal Reports, Web Intelligence documents,
and Desktop Intelligence documents.

SAP NetWeaver Business Warehouse Metadata Integrator
Collects metadata about objects such as Queries, InfoProviders, InfoObjects, Transformations, and
DataSources from an SAP NetWeaver Business Warehouse system.

Common Warehouse Model (CWM) Metadata Integrator
Collects metadata about objects such as catalogs, schemas, and tables from the CWM Relational
Package.

SAP BusinessObjects Data Federator Metadata Integrator
Collects metadata about objects such as projects, catalogs, datasources, and mapping rules from
a Data Federator repository.

SAP BusinessObjects Data Services Metadata Integrator
Collects metadata about objects such as source tables, target tables, and column mappings from
a Data Services repository.

Meta Integration Metadata Bridge (MIMB) Metadata Integrator (also known as MITI Integrator)
Collects metadata from other third-party sources such as the following:
• Data Modeling metadata such as Sybase PowerDesigner, Embarcadero ER/Studio, and Oracle Designer
• Extract, Transform, and Load (ETL) metadata such as Oracle Warehouse Builder and Microsoft
SQL Server Integration Services (SSIS)
• OLAP and BI metadata such as IBM DB2 Cube Views, Oracle OLAP, and Cognos BI Reporting

Relational databases Metadata Integrator
Collects metadata from relational database management systems (RDBMS), which can be DB2,
MySQL, Oracle, SQL Server, Java Database Connectivity (JDBC), or a BusinessObjects Universe
connection. Collected metadata includes the definition of objects such as tables, views, synonyms,
and aliases.

You can also obtain third-party metadata integrators for other data sources. For more information about
third-party metadata integrators, see http://www.metaintegration.net/Products/MIMB/SupportedTools.html.

Related Topics
• User Guide: section "BusinessObjects Enterprise objects"
• User Guide: "SAP NetWeaver Business Warehouse metadata"
• User Guide: section "Data Modeling metadata"
• User Guide: section "BusinessObjects Data Federator objects"
• User Guide: section "BusinessObjects Data Services objects"
• http://www.metaintegration.net/Products/MIMB/SupportedTools.html
• http://www.metaintegration.net/Products/MIMB/Documentation/
• User Guide: section "Relational Database metadata"

2.1.1.4 Services


The following table describes each of the services that are pertinent to SAP BusinessObjects Information
Steward.

Table 2-2: Services pertinent to Information Steward

Each entry below lists the service, the server that the service runs on, a description of the service,
and deployment comments.

Service: Cleansing Package Builder Auto-analysis Service
Runs on: Enterprise Information Management Adaptive Processing Server
Description: Performs the data analysis when you create a custom cleansing package in Cleansing
Package Builder. The data analysis uses the information the user provides in the custom cleansing
package wizard, along with statistical analysis, to create an abstract version of the records. The
abstracted records are then grouped, and each group is processed with data inference algorithms
to create the suggestions that the user sees in Design mode of Cleansing Package Builder.
Deployment comments: An Adaptive Processing Server of SAP BusinessObjects Business Intelligence
platform must already be installed on this computer.

Service: Cleansing Package Builder Core Service
Runs on: Enterprise Information Management Adaptive Processing Server
Description: Performs all the main functionality of Cleansing Package Builder, such as creating and
opening cleansing packages, Design mode, and Advanced mode.
Deployment comments: An Adaptive Processing Server of SAP BusinessObjects Business Intelligence
platform must already be installed on this computer.

Service: Cleansing Package Builder Publishing Service
Runs on: Enterprise Information Management Adaptive Processing Server
Description: Converts a published cleansing package to the reference data format used by Data
Services. This process can take a significant period of time for large cleansing packages.
Deployment comments: An Adaptive Processing Server of SAP BusinessObjects Business Intelligence
platform must already be installed on this computer.

Service: Information Steward Administrator Task Service
Runs on: Enterprise Information Management Adaptive Processing Server
Description: Performs tasks on Data Services such as testing rules, deleting objects (connection,
datastore, workflow) from Data Services, and exporting rules to Data Services.
Deployment comments: An Adaptive Processing Server of SAP BusinessObjects Business Intelligence
platform must already be installed on this computer.

Service: Data Services Metadata Browsing Service
Runs on: Enterprise Information Management Adaptive Processing Server
Description: Provides Data Insight users the capability to browse and import metadata from different
data sources. Data sources include relational database systems (such as Oracle and Microsoft SQL
Server) and applications (such as SAP ECC and Oracle Applications).
Deployment comments: The Data Services Metadata Browsing Service is a component of SAP
BusinessObjects Data Services, which must already be installed on this computer. An Adaptive
Processing Server of SAP BusinessObjects Business Intelligence platform must already be installed
on this computer.

Service: Data Services View Data Service
Runs on: Enterprise Information Management Adaptive Processing Server
Description: Provides the capability to view the external data in Data Insight connections in Information
Steward.
Deployment comments: The Data Services View Data Service is a component of SAP BusinessObjects
Data Services, which must already be installed on this computer. An Adaptive Processing Server of
SAP BusinessObjects Business Intelligence platform must already be installed on this computer.

Service: Information Steward Metadata Relationship Service
Runs on: Enterprise Information Management Adaptive Processing Server
Description: Computes metadata object relationships (such as data lineage and change impact
analysis). The Metadata Management module of Information Steward uses the results of these
computed relationships to display metadata relationship diagrams.
Deployment comments: It is recommended that you install the Metadata Relationship Service on a
different computer than the web application server. An Adaptive Processing Server of SAP
BusinessObjects Business Intelligence platform must already be installed on that computer. You can
deploy multiple Metadata Relationship Services for load balancing and availability.

Service: Information Steward Metadata Search Service
Runs on: Enterprise Information Management Adaptive Processing Server
Description: Allows you to find an object that exists in any integrator source while viewing metadata
on the Metadata Management module of Information Steward. It uses the Lucene Search Engine.
The Metadata Search Service constructs the search index during the execution of the Metadata
Integrators, and the File Repository Server stores the compressed search index. The Metadata Search
Service also updates the search index for changes to Metapedia terms and categories, custom
attributes, and annotations.
Deployment comments: You can deploy multiple search services for load balancing and availability.
If the Metadata Search Service is not available during the construction and update processes, the
search might return incorrect results. For these situations, Information Steward provides a utility to
reconstruct the search index. For details, see "Recreating search indexes on Metadata Management"
in the Administrator Guide.

Service: Information Steward Task Scheduling Service
Runs on: Information Steward Job Server
Description: Processes scheduled Data Insight profile and rule tasks in the Central Management
Console (CMC).
Deployment comments: An Adaptive Job Server of SAP BusinessObjects Business Intelligence platform
must already be installed on this computer.

Service: Information Steward Integrator Scheduling Service
Runs on: Information Steward Job Server
Description: Processes scheduled Metadata Management integrator sources in the Central Management
Console (CMC).
Deployment comments: An Adaptive Job Server of SAP BusinessObjects Business Intelligence platform
must already be installed on this computer.

Service: Information Steward Integrator Service
Runs on: Enterprise Information Management Adaptive Processing Server
Description: Helper service that is used for testing connections to integrator sources.
Deployment comments: It is installed with the integrators.

2.2 Information Steward on Business Intelligence platform components


The following table describes how SAP BusinessObjects Information Steward uses each pertinent SAP
BusinessObjects Business Intelligence platform (BI platform) component.

Web Application Server
Deploys Information Steward on:
• The Central Management Console (CMC), through which administrative tasks for Information Steward
are performed
• A web application server, through which you access the Information Steward modules: Data Insight,
Metadata Management, Metapedia, and Cleansing Package Builder

Central Management Console (CMC)
Manages the following for SAP BusinessObjects Information Steward:
• Enterprise Information Management Adaptive Processing Server (EIMAdaptiveProcessingServer)
and its services
• Information Steward Job Server (ISJobServer) and its services
• Metadata Management module: integrator source configurations and source group configurations
• Data Insight module: connections, projects, and profile and rule tasks
• Cleansing Package Builder module: cleansing package owner
• Information Steward utilities
• User security (authentication and authorization)
• Repository and application settings

Central Management Server (CMS)
Maintains a database of information about your BI platform system. The data stored by the CMS
includes information about users and groups, security levels, schedule information, BI platform content,
and servers. For more information about the CMS, see the SAP BusinessObjects Business Intelligence
Platform Administrator's Guide.
The following Information Steward objects are stored in the CMS:
• Integrator source configurations
• Source groups
• Utility configurations
• Data Insight connections
• Projects
• Tasks
Note:
Because integrator source configurations and source group definitions are stored in the CMS, you
can use the Upgrade management tool to move them from one version of the CMS to another. The
schedules and rights information are considered dependencies of these configurations. For details,
see the Upgrade Guide and the "Lifecycle Management" section in the Administrator Guide.

Adaptive Job Server
The Information Steward Job Server uses the Adaptive Job Server to execute profiling tasks and
integrator tasks. The server may host the following services for Information Steward:
• Information Steward Task Scheduling Service
• Information Steward Integrator Scheduling Service

Adaptive Processing Server
Required during Information Steward installation to create the Enterprise Information Management
Adaptive Processing Server. The EIM Adaptive Processing Server uses the BI platform Adaptive
Processing Server to host the following services:
• Metadata Relationship Service
• Metadata Search Service
• Metadata Integrator Service
• Data Services Metadata Browsing Service
• Data Services View Data Service
• Information Steward Administrator Task Service
• Cleansing Package Builder Core Service
• Cleansing Package Builder Auto-analysis Service
• Cleansing Package Builder Publishing Service

File Repository Server
Stores files associated with:
• A published cleansing package. The stored information can be accessed by Data Services.
• History logs for Data Insight task and metadata integrator execution
• The search index for metadata integrator objects

2.3 Information workflows

When tasks are performed in SAP BusinessObjects Information Steward, such as adding a table to a
Data Insight project, running a Metadata Management integrator source, or creating a cleansing package,
information flows through SAP BusinessObjects Business Intelligence platform services, SAP BusinessObjects
Data Services, and SAP BusinessObjects Information Steward. The servers and services within each
of these software products communicate with each other to accomplish a task. For an overview of the
servers and services, see Architecture overview.

The following sections describe some of these process flows as they happen across the SAP BusinessObjects
Business Intelligence platform, SAP BusinessObjects Data Services, and SAP BusinessObjects
Information Steward.

2.3.1 Adding a table to a Data Insight project

This workflow describes the process of adding a table to a Data Insight project.
1. The user selects Add > Tables on the "Workspace Home" page of the Data Insight tab to access the
"Browse Metadata" window.
2. The web application server passes the request to the Central Management Server (CMS), which returns
a list of the connections that the user has permission to view.
3. If the user has the appropriate rights to view the selected connection, the CMS sends the request
to the Data Services Metadata Browsing Service.
4. The Data Services Metadata Browsing Service obtains the metadata from the connection and sends
the metadata to the web application server.
5. The web application server displays the metadata in the Data Insight "Browse Metadata" window.
6. When the user selects a table and clicks Add to Project, the web application stores the metadata
in the Information Steward repository.


2.3.2 Profiling data

This workflow describes the process of running a profile task in Data Insight. Running a validation rule
task follows similar steps.
1. The user selects the name of a table or file on the "Workspace Home" page of the Data Insight tab and clicks
Profile.
2. The user chooses the tables to profile in the "Workspace Home" of the Data Insight tab.
3. The user saves the task and schedules it to run.
4. The web application server passes the request to the Central Management Server (CMS).
5. The Information Steward web application determines from the CMS system if the user has the right
to run profile tasks on the connection that contains the table or file.
6. The CMS determines if the user has the right to create a profile task for the connection and
the right to schedule the task. If so, the task is scheduled in the CMS system.
7. When the scheduled time arrives, the CMS sends the task information to the Information Steward
Task Scheduling Service.
8. The Information Steward Task Scheduling Service sends the profile task to the Data Services Job
Server.
9. The Data Services Job Server partitions the profile task based on the performance application
settings.
10. The Data Services Job Server executes the profile task and stores the results in the Information
Steward repository.
11. The web application server displays the profile results in the Data Insight "Workspace Home" window.

2.3.3 Scheduling and running a Metadata Management integrator source

This workflow describes the process of scheduling and running a Metadata Management integrator
source to collect metadata.
1. The user schedules an integrator source on the Central Management Console (CMC) and the request
is sent to the CMS system.
2. The CMS determines if the user has the appropriate rights to schedule the integrator source.
3. If the user has the appropriate rights to schedule the object, the CMS commits the scheduled
integrator request to the CMS system.
4. When the scheduled time arrives, the CMS finds a suitable Information Steward Job Server, based
on the job server group associated with the integrator, and passes the job to it.
If the integrator source is an SAP BusinessObjects Enterprise 3.x system, the process contacts the
registered Remote Job Server and passes along the integrator process information.


5. The integrator process collects metadata and stores the metadata in the Information Steward
repository.
6. The integrator process also generates the Metadata Management search index files and loads them
to the Input File Repository Server.
7. After uploading the search index files, the integrator source notifies the Metadata Management
search service.
8. The Metadata Management search service downloads the generated index files and consolidates
them into a master index file.
9. The Information Steward Integrator Scheduling Service updates the CMS with the job status.

2.3.4 Creating a custom cleansing package with Cleansing Package Builder

This workflow describes the process of creating and publishing a custom cleansing package in Cleansing
Package Builder.
1. In Information Steward, the user clicks the Cleansing Package Builder tab.
2. The Cleansing Package Builder (CPB) application sends the user's login information to the CPB
Web Service.
3. The CPB Web Service sends the information to the Enterprise Information Management (EIM)
Adaptive Processing server.
The EIM Adaptive Processing Server runs on the Business Intelligence platform.
4. The EIM Adaptive Processing Server determines which rights the user has in CPB.
5. The information is sent back through the CPB Web Service to the CPB application.
The user sees the cleansing packages that they have the rights to view.
6. In the "Cleansing Packages Tasks" screen, the user selects New Cleansing Package > Custom
Cleansing Package to start creating a cleansing package.
The user provides the necessary information and sample data to create the cleansing package.
7. The CPB application sends the information through the CPB Web Service to the CPB Core Service,
using the BusinessObjects Enterprise SDK mechanism.
The CPB Core Service handles the main functions of CPB. The CPB Core Service runs on the EIM
Adaptive Processing Server.
8. The CPB Core Service sends the response back through the CPB Web Service to the CPB
application.
The new cleansing package is created in CPB.
9. The application communicates with the CPB Auto-Analysis Service through the CPB Web Service.
The CPB Auto-Analysis Service analyzes the data to create suggestions of standard forms and
variations. The CPB Auto-Analysis Service runs on the EIM Adaptive Processing Server.
10. When the user has finished refining the cleansing package, the user clicks Publish on the "Cleansing
Packages Tasks" screen.
11. The CPB application communicates with the CPB Publishing Service through the CPB Web Service.


The CPB Publishing Service assists in the cleansing package's conversion to the reference data
format used by Data Services. The CPB Publishing Service runs on the Enterprise Information
Management (EIM) Adaptive Processing Server.
12. The published cleansing package information is sent to the Input File Repository, where it is stored
and can be accessed by Data Services.
The Input File Repository runs on the Business Intelligence platform.
13. Data Services communicates directly with the Business Intelligence platform to sync with the published
cleansing packages.


Securing SAP BusinessObjects Information Steward

3.1 Security Overview

SAP BusinessObjects Information Steward uses the security framework that the SAP BusinessObjects
Business Intelligence platform (BI platform) provides.
The BI platform architecture addresses the many security concerns that affect today's businesses and
organizations. The current release supports features such as distributed security, single sign-on, resource
access security, granular object rights, and third-party authentication in order to protect against
unauthorized access.

For details about how BI platform addresses enterprise security concerns, see the “Securing Information
platform services” section of the SAP BusinessObjects Business Intelligence Platform Administrator's
Guide.
For details about user groups and granular object rights, see Information Steward pre-defined users
and groups.

The following topics detail how Information Steward uses the enterprise security features provided by
BI platform.

Related Topics
• Securing user data for Information Steward
• Storage of sensitive information
• Configuring the Remote Job Server for SSL
• Reverse proxy servers

3.2 Enterprise security

SAP BusinessObjects Information Steward is a web-based application that uses the enterprise security
provided by the SAP BusinessObjects Business Intelligence platform. This section details the ways in which
Information Steward takes advantage of the following SAP BusinessObjects Enterprise security features:
• Secure access to user data
• Storage of sensitive information


• Secure connections
• Reverse proxy servers

Related Topics
• Securing user data for Information Steward
• Storage of sensitive information
• Configuring the Remote Job Server for SSL
• Reverse proxy servers

3.2.1 Securing user data for Information Steward

SAP BusinessObjects Information Steward has access to the following data which might contain sensitive
information:
• Source data in Data Insight connections on which users run profile and rule tasks
• Sample data from profiling results that Data Insight stores in the Information Steward repository
• Sample data that failed validation rules that Data Insight stores in the Information Steward repository
• All data that failed validation rules that a user chooses to store in a database accessed through a
Data Insight connection.

The Database Administrator (DBA) secures the data in these databases by managing user permissions
on them:
• Data Insight connections for profiling
• Information Steward repository
• Data Insight connections for all data that failed validation rules

In addition, the Data Insight Administrator or the Administrator controls access to the data by using the
Central Management Console (CMC) to manage the following rights on Data Insight connections:
• View Data
• Profile/Rule permission
• View Sample Data
• Export Data
For more information, see User rights in Data Insight.

3.2.2 Storage of sensitive information

Information Steward uses SAP BusinessObjects Enterprise cryptography, which is designed to
protect sensitive data stored in the CMS repository. Sensitive data includes user credentials, data
source connectivity data, and any other InfoObjects that store passwords. This data is encrypted to
ensure privacy, keep it free from corruption, and maintain access control. For more information, see
the "Overview of SAP BusinessObjects Enterprise data security" section of the SAP BusinessObjects
Enterprise Administrator's Guide.
Encryption of sensitive information, such as passwords, is done in the following Information Steward
areas:
• Information Steward repositories
• Metadata Integrator sources
• Data Insight connections

3.2.3 Configuring the Remote Job Server for SSL

The SAP BusinessObjects Enterprise Metadata Integrator in SAP BusinessObjects Information Steward
4.0 can collect metadata from an SAP BusinessObjects Enterprise XI 3.x system by using the Remote
Job Server. When you install Information Steward, you install the Remote Job Server component on
the computer where the Enterprise XI 3.x system resides. Then you use the Information Steward Service
Configuration to configure the Remote Job Server. For information about installing the Remote Job
Server on the SAP BusinessObjects Enterprise XI 3.x system, see “Remote Job Server Installation” in
the Installation Guide.

If you are using the Secure Sockets Layer (SSL) protocol for all network communication between clients
and servers in your SAP BusinessObjects Enterprise XI 3.x and SAP BusinessObjects Business
Intelligence platform 4.0 deployments, you can use SSL for the network communication between the
Remote Job Server and the Metadata Integrator on Information Steward. In this environment:
• The server is the Remote Job Server on the SAP BusinessObjects Enterprise XI 3.x system. To enable SSL,
the server must have both keystore and truststore files defined.
• The client is the Metadata Integrator on Information Steward 4.0. To enable SSL, the client must
use the same truststore and password as the server.

To set up SSL between the Remote Job Server and the Metadata Integrator, you need to perform
the following tasks:
• Create keystore and truststore files for the Remote Job Server and copy the truststore file to the
SAP BusinessObjects Business Intelligence platform 4.0 system.
• Configure the location of the SAP BusinessObjects XI 3.x SSL certificates and key file names (obtained from
the Server Intelligence Agent (SIA)).

Related Topics
• Creating the keystore and truststore files for the Remote Job Server
• Configuring the SSL protocol for the Remote Job Server


3.2.3.1 Creating the keystore and truststore files for the Remote Job Server

To set up the SSL protocol for communication with the Remote Job Server, use the keytool command to:
• Create a certificate and store it in a keystore file on the computer where SAP BusinessObjects
Enterprise XI 3.x resides
• Create a trust certificate and store it in a truststore file on the computer where SAP BusinessObjects
Enterprise XI 3.x resides
• Copy the trust certificate into a truststore file on the computer where SAP BusinessObjects Business
Intelligence Platform 4.0 resides
1. On the computer where you installed SAP BusinessObjects Enterprise XI 3.x, generate a keystore
file and export it:
a. Open a cmd window and go to the directory where Metadata Management configuration files are
stored for Information Steward.
For example, type the following command:
cd C:\Program Files (x86)\SAP BusinessObjects\InformationSteward\MM\config

b. Generate the keystore file using the keytool command.


For example, type the following command to generate a keystore file with the name is.keystore
and password mypwstore:
%JAVA_HOME%\bin\keytool -genkey -keystore is.keystore
-keyalg RSA -dname "CN=localhost,OU=ICC,O=ICC,L=PA,ST=CA,C=US"
-keypass mypwkey -storepass mypwstore

The value for keypass protects the generated key.


This command creates the key and certificate in the keystore file in the C:\Program Files
(x86)\SAP BusinessObjects\InformationSteward\MM\config directory.
c. Export the certificate from the keystore file using the keytool command.
For example, type the following command to export to the file isClient.cer:
%JAVA_HOME%\bin\keytool -export -keystore is.keystore
-file isClient.cer -keypass mypwkey -storepass mypwstore

d. Import the certificate from the export file to create the truststore file.
For example, type the following command to import the certificate into a truststore file named
is.truststore.keystore:
%JAVA_HOME%\bin\keytool -import -v -trustcacerts
-file isClient.cer -keystore is.truststore.keystore
-keypass mypwkey -storepass mypwstore

This command stores the certificate in the truststore file in the InformationSteward\MM\Config
directory.

2. Copy the truststore file to the computer where SAP BusinessObjects Business Intelligence Platform
4.0 resides.

a. Ensure that the is.truststore.keystore file is in a directory that is accessible to both the
computer where SAP BusinessObjects Enterprise XI 3.x is installed and the computer where
SAP BusinessObjects Business Intelligence Platform 4.0 is installed.
b. Copy the is.truststore.keystore file to the directory where Metadata Management
configuration files are stored for Information Steward.
For example:
C:\Program Files (x86)\SAP BusinessObjects\InformationSteward\MM\config
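
Note:
As an optional sanity check, you can list the contents of the keystore and truststore files with the
keytool command to confirm that the certificate was generated and imported correctly. The file names
and passwords below match the examples in the previous steps; substitute your own values if they differ.
%JAVA_HOME%\bin\keytool -list -v -keystore is.keystore -storepass mypwstore
%JAVA_HOME%\bin\keytool -list -v -keystore is.truststore.keystore -storepass mypwstore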

3.2.3.2 Configuring the SSL protocol for the Remote Job Server

After you create a key and certificate on the Remote Job Server computer and store them in a secure
location, you need to provide the Information Steward Service Configuration on SAP BusinessObjects
XI 3.x with the secure location.

To configure the SSL protocol in the CCM:


1. Obtain the SSL certificate and key file names from the Server Intelligence Agent (SIA) properties.
a. Access the Central Configuration Manager (CCM) from the Windows Start > Programs menu
under your version of SAP BusinessObjects XI 3.x:
SAP BusinessObjects Information Steward 4.0 > Central Configuration Manager
b. In the CCM, right-click Server Intelligence Agent and choose Properties.
c. Obtain the values for the following options:
• SSL Certificates Folder
• Server SSL Certificate File
• SSL Trusted Certificate File
• SSL Private Key File
• SSL Private Key File Passphrase File
For more information about these options, see “Configuring servers for SSL” in the SAP
BusinessObjects Business Intelligence Platform Administrator's Guide.
d. Click Cancel to close the "Server Intelligence Agent Properties" window.
2. Access the Information Steward Service Configuration tool from the Windows Start > Programs
menu under your version of SAP BusinessObjects XI 3.x:
SAP BusinessObjects Information Steward 4.0 > Information Steward Service Configuration.
3. In the "SAP BusinessObjects Information Steward Service Configuration" window, click the "Protocol"
tab.
4. Make sure Enable SSL is selected.
5. For the following options, enter the values for the SIA that you obtained from the CCM in step 1c
above:
• SSL Certificates Folder
• Server SSL Certificate File
• SSL Trusted Certificate File
• SSL Private Key File
• SSL Private Key File Passphrase File
6. For Keystore File and Keystore Password, enter the values that you specified in the -keystore
and -storepass options of the keytool -genkey command in step 1b in Creating the keystore
and truststore files for the Remote Job Server.
7. For Truststore File and Truststore Password, enter the values that you specified in the -keystore
and -storepass options of the keytool -import command in step 1d in Creating the keystore
and truststore files for the Remote Job Server.
8. Click Apply.
9. Restart the Remote Job Server.
a. Click the "General" tab in "SAP BusinessObjects Information Steward Service Configuration"
window.
b. Click Stop.
c. Click Start.

Note:
To run an integrator source with SSL enabled on the Remote Job Server, set the following run-time
JVM parameter:
-Dbusinessobjects.migration=on
For more details, see Metadata collection using the Remote Job Server with SSL.

3.2.4 Reverse proxy servers

SAP BusinessObjects Information Steward can be deployed in an environment with one or more reverse
proxy servers. A reverse proxy server is typically deployed in front of the web application servers in
order to hide them behind a single IP address. This configuration routes all Internet traffic that is
addressed to private web application servers through the reverse proxy server, hiding private IP
addresses.

Because the reverse proxy server translates the public URLs to internal URLs, it must be configured
with the URLs of the Information Steward web applications that are deployed on the internal network.

For information about supported reverse proxy servers and how to configure them, see “Information
platform services and reverse proxy servers” and “Configuring reverse proxy servers for Information
platform” in the SAP information platform services Administrator's Guide.
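
For illustration only, a minimal reverse proxy mapping on Apache HTTP Server with mod_proxy might look
like the following. The context path /BOE and the internal host name are placeholders; use the actual
context paths and host of the web application server where your Information Steward web applications
are deployed.
ProxyPass        /BOE http://internal-webapp-host:8080/BOE
ProxyPassReverse /BOE http://internal-webapp-host:8080/BOE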

Users and Groups Management

4.1 Users and Groups overview

The Central Management System (CMS) manages security information, such as user accounts, group
memberships, and object rights that define user and group privileges. When a user attempts an action
on an Information Steward object, the CMS authorizes the action only after it verifies that the user's
account or group membership has sufficient privileges.

Use SAP BusinessObjects Business Intelligence platform (formerly known as BusinessObjects Enterprise)
security to create users and authorize user access to the objects and actions within the Information
Steward modules:
• Data Insight
• Metadata Management
• Metapedia (under Metadata Management in the Central Management Console)
• Cleansing Package Builder

Information Steward provides pre-defined user groups that have specific rights on objects unique to
each module. These user groups enable you to grant rights to multiple users by adding the users to a
group instead of modifying the rights for each user account individually. You also have the ability to
create your own user groups.

Related Topics
• Information Steward pre-defined users and groups
• Managing users in Information Steward
• User rights in Data Insight
• User rights in Metadata Management
• User rights in Cleansing Package Builder

4.2 Information Steward pre-defined users and groups

SAP BusinessObjects Information Steward provides pre-defined user groups for each module (Data
Insight, Metadata Management, Cleansing Package Builder) to facilitate managing security on the
objects within each module.

The following diagram provides an overview of the pre-defined Information Steward user groups and
their relation to the Administrator group in SAP BusinessObjects Business Intelligence platform.
• The Administrator group in Business Intelligence platform is for users that:
• Create users and custom groups for all Information Steward modules
• Have rights to perform all tasks within all Information Steward modules.
• Grant users access to cleansing packages
• The Data Insight Administrator is for users that grant users and groups access to connections and
projects (by default, all pre-defined Data Insight groups are granted access to all connections and
projects). A Data Insight Administrator also has access to all Data Insight actions.
• The Metadata Management Administrator grants users access to integrator sources. The Metadata
Management Administrator also has access to all Metadata Management actions.
• Within each Information Steward module, additional user groups have specific rights for the objects
within that module. For example, the Data Insight Analyst group can create profile tasks and rules,
but only the Data Insight Rule Approver can approve rules.

Subsequent topics describe these pre-defined groups and the rights they have on the specific objects.

Related Topics
• Data Insight pre-defined user groups
• Type-specific rights for Data Insight objects
• Metadata Management pre-defined user groups
• Type-specific rights for Metadata Management objects

4.3 Managing users in Information Steward

This section contains the steps to create users and add them to Information Steward groups.

4.3.1 Creating users for Information Steward

Create user accounts and assign them to groups or assign rights to control their access to objects in
SAP BusinessObjects Information Steward.
To create users:
1. Log on to the Central Management Console (CMC) with a user name that belongs to the Administrator
group.
2. At the CMC home page, click Users and Groups.
3. Click Manage > New > New User.
4. To create a user:
a. Select your authentication type from the Authentication Type list.
Information Steward can use the following user authentication:
• Enterprise (default)
• LDAP
• Windows Active Directory (AD)
• SAP ERP and Business Warehouse (BW)
b. Type the account name, full name, email, and description information.
Tip:
Use the description area to include extra information about the user or account.
c. Specify the password information and settings.
5. To create a user that will log on using a different authentication type, select the appropriate option
from the Authentication Type list, and type the account name.
6. Specify how to designate the user account according to options stipulated by your SAP
BusinessObjects Business Intelligence platform license agreement.
If your license agreement is based on user roles, select one of the following options:
• BI Viewer: access to Business Intelligence platform applications for all accounts under the BI
Viewer role is defined in the license agreement. Users are restricted to access application
workflows that are defined for the BI Viewer role. Access rights are generally limited to viewing
business intelligence documents. This role is typically suitable for users who consume content
through Business Intelligence platform applications.
• BI Analyst: access to SAP BusinessObjects Enterprise applications for all accounts under the
BI Analyst role is defined in the license agreement. Users can access all application workflows
that are defined for the BI Analyst role. Access rights include viewing and modifying business
intelligence documents. This role is typically suitable for users who create and modify content
for Business Intelligence platform applications.
If your license agreement is not based on user roles, specify a connection type for the user account.

• Choose Concurrent User if this user belongs to a license agreement that states the number of
users allowed to be connected at one time.
• Choose Named User if this user belongs to a license agreement that associates a specific user
with a license. Named user licenses are useful for people who require access to BusinessObjects
Enterprise regardless of the number of other people who are currently connected.

7. Click Create & Close.


The user is added to the system and is automatically added to the Everyone group. An inbox is
automatically created for the user, as is an Enterprise alias. You can now add the user to a group or
specify rights for the user.

Related Topics
• Adding users and user groups to Information Steward groups
• BusinessObjects Enterprise Administrator's Guide: Managing users and groups

4.3.2 Adding users and user groups to Information Steward groups

Groups are collections of users who share the same rights to different objects. SAP BusinessObjects
Information Steward provides groups for Data Insight and Metadata Management, such as Data Insight
Analyst group and Metadata Management User group.

To add a user to a group:


1. Log on to the Central Management Console (CMC) with a user name that belongs to the Administrator
group.
2. At the CMC home page, click Users and Groups.
3. Select the User List or Group List node in the navigation tree.
4. Select the name of the user or user group in the right panel.
5. Click Actions > Join Group.
6. On the "Join Group" dialog box, select the Group List node in the navigation tree.
7. Select one or more names from the "Available Groups" list, and click > to place them in the
"Destination Group(s)" list.
8. Click OK.

Related Topics
• Data Insight pre-defined user groups
• Group rights for connections
• Group rights for projects
• Group rights for tasks
• Metadata Management pre-defined user groups

4.3.3 Denying access

To deny access to Information Steward for an individual user or group:


1. Log on to the Central Management Console (CMC) with a user name that is a member of either the
Administrator group or the Data Insight Administrator group.
2. At the CMC home page, click Information Steward.
3. Click Manage > Security > User Security.
4. On the "User Security" page, click Add Principals.
5. In the list of "Available users/groups", select the name of each user or group that you want to deny
access, and click the > button to move the names to the "Selected users/groups" list.
6. Click Add and Assign Security.
7. Click the Advanced tab.
8. Click the Add/Remove Rights link.
9. On the General rights list, deny the "View objects" right.
10. Click Apply or OK.

4.4 User rights in Data Insight

The Data Insight module of SAP BusinessObjects Information Steward contains the following objects
that have specific rights that allow various actions on them.
• Connections through which users view data sources and import tables and files to profile the data.
In addition to rights to the connection, a user must also be granted permission on the source data:
For database connections, the Database Administrator must grant privileges on the tables to the
user.

For file connections, the users that run the following services must have permissions on the directory
where the file resides:
• Information Steward Web Application Server (for example, Tomcat)
• Data Services service
• Server Intelligence Agent that runs EIMAdaptiveProcessingServer and ISJobServer

• Views that can join tables and files from multiple connections.
• Projects that contain profile tasks, rule tasks, and scorecards in specific business areas, such as
HR or Sales.
• Profile tasks that collect profile attributes to help you determine the quality and structure of the data.
• Rule tasks that validate the data according to your business and quality rules.

The following diagram shows a pictorial view of users and groups who are granted rights to access
Connections, Projects, and Tasks.

Related Topics
• Data Insight pre-defined user groups
• Type-specific rights for Data Insight objects
• Customizing rights on Data Insight objects
• Managing users in Information Steward

4.4.1 Data Insight pre-defined user groups

SAP BusinessObjects Information Steward provides the following pre-defined Data Insight user groups
to facilitate the assignment of rights on connections, projects, and tasks. These groups enable you to
change the rights for multiple users in one place (a group) instead of modifying the rights for each user
account individually.

The following table describes the pre-defined user groups in ascending order of rights.

Table 4-1: Data Insight User Groups

Data Insight User
Users that can only view the connections, projects, source data, profile results, sample profile data,
rules, sample data that failed rules, and scorecard results.

Data Insight Analyst
Users that have all the rights of a Data Insight User, plus the following rights:
• Add tables and files to a project
• Remove tables and files from a project
• Create, edit, and delete profile tasks and rule tasks
• Create, edit, and delete rules; bind rules
• Schedule profile tasks and rule tasks

Data Insight Rule Approver
Users that have all the rights of a Data Insight Analyst, plus the right to approve and reject rules.

Data Insight Scorecard Manager
Users that have all the rights of a Data Insight Analyst, plus the right to create and edit scorecards
that consist of rules for specific business areas called Key Data Domains.

Data Insight Administrator
Users that have all the above rights on Data Insight objects, plus the following rights:
• Configure, edit, delete, run, schedule, and view the history of Information Steward utilities
• Create, edit, and delete Data Insight connections and projects
• Configure Information Steward application settings
• Change the Information Steward repository user and password

Related Topics
• Group rights for Data Insight folders and objects in the CMC
• Group rights for connections
• User rights for views
• Group rights for projects
• Group rights for tasks

4.4.2 Type-specific rights for Data Insight objects

Rights are the base units for controlling user access to the objects, users, applications, servers, and
other features in SAP BusinessObjects Enterprise.

40 2011-04-06
Users and Groups Management

“Type-specific rights” are rights that affect specific object types only, such as Data Insight connections,
Data Insight projects, profile tasks, or rule tasks.

Rights are set on objects, such as a Data Insight connection or project, rather than on the "principals"
(the users and groups) who access them. By default, the pre-defined Data Insight user groups are
granted access to newly created connections and projects. If you want some users to access only
certain connections and projects, then do not add them to a pre-defined group, but add their user names
to the list of principals for each individual connection and project and assign the appropriate type-specific
rights. For example, to give a user access to a particular connection, you add the user to the list of
principals who have access to the connection.

Type-specific rights consist of the following:


• General rights for the object type
These rights are identical to general global rights (for example, the right to add, delete, or edit an
object), but you set them on specific object types to override the general global rights settings.
• Specific rights for the object type
These rights are available for specific object types only. For example, the right to view data appears
for Data Insight connections but not for Data Insight projects.

For more information, see "How rights work in BusinessObjects Enterprise" in the BusinessObjects
Enterprise Administrator's Guide.

Related Topics
• Group rights for connections
• Group rights for projects
• Group rights for tasks

4.4.2.1 Group rights for Data Insight folders and objects in the CMC

Each pre-defined Data Insight user group provides specific rights on the folders and objects in the CMC,
as the following table shows.

Table 4-2: Rights for Data Insight folders in the CMC

CMC folder or object | Right name | Description | Data Insight Administrator | Data Insight Scorecard Manager | Data Insight Rule Approver | Data Insight Analyst | Data Insight User
Data Insight folder | View objects | View Data Insight folder on the CMC | Yes | Yes | Yes | Yes | Yes
Data Insight folder | Modify Rights | Modify the rights users have to Data Insight | Yes | No | No | No | No
Connections folder | View objects | View connections in Connections folder | Yes | Yes | Yes | Yes | Yes
Connections folder | Add objects to folder | Create connection | Yes | No | No | No | No
Connections folder | Modify Rights | Modify the rights users have to connections | Yes | No | No | No | No
Data Insight Connection | View objects | View connection properties | Yes | Yes | Yes | Yes | No
Data Insight Connection | Edit objects | Edit connection properties | Yes | No | No | No | No
Data Insight Connection | Delete objects | Delete connection | Yes | No | No | No | No
Data Insight Connection | Modify Rights | Modify the rights users have to Data Insight connections | Yes | No | No | No | No
Projects folder | View objects | View Projects folder | Yes | Yes | Yes | Yes | Yes
Projects folder | Add objects | Create project | Yes | No | No | No | No
Projects folder | Modify Rights | Modify the rights users have to projects | Yes | No | No | No | No
Projects | View objects | View project properties | Yes | Yes | Yes | Yes | No
Projects | Edit objects | Edit project properties | Yes | No | No | No | No
Projects | Delete objects | Delete project | Yes | No | No | No | No
Projects | Modify Rights | Modify the rights users have to projects | Yes | No | No | No | No
Profile task or Rule task | View objects | View task properties | Yes | Yes | Yes | Yes | Yes
Profile task or Rule task | View instance | View task history and logs | Yes | Yes | Yes | Yes | No
Profile task or Rule task | Schedule | Schedule task | Yes | Yes | Yes | Yes | No
Profile task or Rule task | Reschedule | Change the schedule of task | Yes | No | No | No | No
Profile task or Rule task | Delete objects | Delete task | Yes | Yes | Yes | Yes | No
Profile task or Rule task | Modify Rights | Modify the rights users have to tasks | Yes | No | No | No | No

Related Topics
• User rights for Information Steward administrative tasks

4.4.2.2 Group rights for connections

SAP BusinessObjects Information Steward provides pre-defined Data Insight user groups that have
specific rights on connections, as the following table shows. You can add users to these groups to
control their rights on connections.

On the "Data Insight" tab of Information Steward:


• A connection is visible in the "Browse Metadata" window if a user has the right to view a connection.
• Tables and Views are visible if the user has the View right on the connection.

Table 4-3: Rights for Data Insight Connections on Information Steward

Right | Description | Data Insight Administrator | Data Insight Scorecard Manager | Data Insight Rule Approver | Data Insight Analyst | Data Insight User
View objects | View connections and tables, browse metadata in the connection | Yes | Yes | Yes | Yes | Yes
View Data | View external data in the connection | Yes | Yes | Yes | Yes | Yes
View Sample Data | View profile sample data and sample data that failed rules | Yes | Yes | Yes | Yes | Yes
Export Data | Export viewed data, profile sample data, and sample data that failed rules | Yes | Yes | Yes | Yes | Yes
Profile/Rule permission | Create profile tasks and rule tasks | Yes | Yes | Yes | Yes | No

Note:
• Rights to a Data Insight connection are granted to users or groups when the Data Insight Administrator
adds them to the "Principals" list for the connection. For more information, see Adding users and
user groups to Information Steward groups.
• The Administrator and the Data Insight Administrator can create, edit, and delete connections in the
CMC. See Group rights for Data Insight folders and objects in the CMC.
• For a database connection, the Database Administrator must grant the Data Insight user access to
the tables.
• For a file connection, the users that run the following services must have permissions on the directory
where the file resides:
• Information Steward Web Application Server (for example, Tomcat)
• Data Services service
• Server Intelligence Agent that runs EIMAdaptiveProcessingServer and ISJobServer
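
For the file-connection requirement in the note above, the directory permissions are granted with
operating system tools rather than in the CMC. As a minimal illustration on Windows, assuming the
profiling files reside in C:\ProfileData and the services above run under a hypothetical account named
MYDOMAIN\svc_is, a command such as the following grants read access on the directory and its contents:
icacls "C:\ProfileData" /grant "MYDOMAIN\svc_is:(OI)(CI)R"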

Related Topics
• User rights in Data Insight
• Data Insight pre-defined user groups
• Assigning users to specific Data Insight objects

4.4.2.3 Group rights for projects

Each pre-defined Data Insight user group provides specific rights on projects on Information Steward,
as the following table shows.

Table 4-4: Rights for Data Insight Projects on Information Steward

Right | Description | Data Insight Administrator | Data Insight Scorecard Manager | Data Insight Rule Approver | Data Insight Analyst | Data Insight User
View objects | View project; view profile results; view rule results; view rules; view scorecard results | Yes | Yes | Yes | Yes | Yes
Edit objects | Add and remove tables and files in a project; add, edit, and remove views | Yes | Yes | Yes | Yes | No
Add objects | Create profile tasks; create rule tasks; copy views | Yes | Yes | Yes | Yes | No
Manage rule | Create, edit, and remove rules; bind rules | Yes | Yes | Yes | Yes | No
Import | Import rules; import views | Yes | Yes | Yes | Yes | No
Manage scorecard | Add, edit, and remove Key Data Domains | Yes | Yes | No | No | No
Approve rule | Approve or reject rules on Information Steward | Yes | No | Yes | No | No

On the "Data Insight" tab of Information Steward:


• A project is visible if a user has the right to view a project.
• The Add Tables and Remove buttons are enabled in the Workspace if a user has the right to edit
a project.
• The Bind and Unbind buttons are enabled on the "Rule" tab if a user has the right to manage rules.
• The Profile and Calculate Score buttons are enabled in the Workspace if a user has the right to
add objects to the project.

Note:
Only the Administrator and Data Insight Administrator can create, edit, and delete projects in the CMC.
See Group rights for Data Insight folders and objects in the CMC.

4.4.2.4 User rights for views

The rights each user has on a view are inherited from the rights the user has on the connections that
comprise the view.

For example, suppose View1 is comprised of the following connections and tables:
• ConnectionA, Table1
• ConnectionB, Table2

Suppose User1 has the Edit right on ConnectionA but not on ConnectionB. Therefore, User1 cannot
edit View1 because the denied Edit right is inherited from ConnectionB.

Similarly, if User2 has the Edit right on ConnectionA and ConnectionB, then User2 can edit View1.

This inheritance applies to all of the rights on views, as the following table shows.

Table 4-5: Rights for Data Insight Views on Information Steward

Right | Description | Required inherited right
View objects | The view name and columns are visible in the Workspace Home window | View objects right on all source connections
View Data | Look at the source data in each table or file that comprises the view | View Data right on all source connections
View Sample Data | View profile sample data and sample data that failed rules | View Sample Data right on all source connections
Profile/Rule permission | Create profile tasks and rule tasks | Profile/Rule right on all source connections
Export Data | Export viewed data, profile sample data, and sample data that failed rules | Export Data right on all source connections

Note:
The following actions require rights on the project:
• To add, edit, or remove views, a user must have the Edit objects right on the project.
• To copy a view, a user must have the Add objects right on the project.

4.4.2.5 Group rights for tasks

Each pre-defined Data Insight user group provides specific rights on profile tasks and rule tasks in
Information Steward, as the following table shows.

Table 4-6: Rights for Data Insight Tasks on Information Steward

Right | Description | Data Insight Administrator | Data Insight Scorecard Manager | Data Insight Rule Approver | Data Insight Analyst | Data Insight User
View objects | View the task in the "Task" tab of the Workspace | Yes | Yes | Yes | Yes | Yes
Edit objects | Edit a profile task; edit a rule task or score calculator task | Yes | Yes | Yes | Yes | No
Delete objects | Delete a profile or rule task | Yes | Yes | Yes | Yes | No
Schedule | Run a profile task; run a rule task | Yes | Yes | Yes | Yes | No

Related Topics
• Group rights for Data Insight folders and objects in the CMC

4.4.3 Customizing rights on Data Insight objects

To facilitate user management, assign users to a pre-defined Information Steward user group. By default,
Information Steward assigns all pre-defined user groups to all Data Insight connections and projects.
However, you might want to limit a user's access, for example to:
• Only view and profile data in a subset of connections
• Only create rules on a subset of projects
• Only create scorecard and approve rules on a subset of projects
• Restrict project access
• Restrict access to Data Insight

To limit access, take one of the following actions:


• Add the user or group to only a subset of connections. In addition, create projects for specific business
areas, such as HR or Sales, and assign only certain users to access these projects to create profile
tasks and rule tasks.
• Deny a right from an existing user group for a specific connection and specific project.

Related Topics
• Assigning users to specific Data Insight objects
• Denying user rights to specific Data Insight objects
• Data Insight pre-defined user groups

4.4.3.1 Denying user rights to specific Data Insight objects

By default, the pre-defined Data Insight user groups are added to the access list of connections and
projects when you create them. However, you might want to deny one or a small subset of users access
to a specific Data Insight connection and project.

To allow a user or group to access all but one specific Data Insight connection and project:
1. Log on to the Central Management Console (CMC) with a user name that is a member of either the
Administrator group or the Data Insight Administrator group.
2. Add the user name to a pre-defined Data Insight group because the user would still have access to
most connections and projects. For details, see Adding users and user groups to Information Steward
groups
3. At the CMC home page, click Information Steward.
4. Select the object type.
• To deny rights to a connection:
• Select the Connections node in the Tree panel.
• Select the connection name in the right panel.
• To deny rights to a project:
• Expand the Data Insight node, and expand the Projects node in the Tree panel.
• Select the project name in the Tree panel.
• To deny rights to a profile or rule task:
• Expand the Data Insight node, and expand the Projects node in the Tree panel.
• Select the project name in the Tree panel.
• Select the task name in the right panel.

5. Click Manage > Security > User Security.


6. On the "User Security" page, click Add Principals.
7. In the list of "Available users/groups", select the name of each user or group that you want to deny
rights to this connection or project, and click the > button to move the names to the "Selected
users/groups" list.
Note:
To select multiple names, hold down the Ctrl key when you click each name.

8. Click Add and Assign Security.


9. Click the "Advanced" tab.
10. Click the Add/Remove Rights link.
11. To deny rights to this connection:
a. Expand the "Application" node and select "Data Insight Connection".
b. Click the Denied column for each right that you want to deny this user or group.

For example, to deny the right to view and profile the data in this connection, click the Denied column
for the following "Specific Rights for Data Insight Connection":
• Profile/Rule permission
• View Data
• View Sample Data
12. To deny project rights:
a. Expand the "Application" node and select "Data Insight Project".
b. Click the Denied column for each right that you want to deny this user or group.
For example, to deny the right to create scorecards and approve rules in this project, click the Denied
column for the following "Specific Rights for Data Insight Project":
• Approve Rule
• Manage Rule
13. To deny profile or rule task rights:
a. Expand the "Application" node and select "Information Steward Profiler Task".
b. Click the Override General Global column and the Denied column for each right that you want
to deny this user or group.
For example, to deny the right to schedule a profile task or rule task, click the Override General
Global column and the Denied column for the following "General Rights for Data Insight Profiler
Task":
• Reschedule instances that the user owns
• Schedule document to run
• View document instances
14. Click OK and verify that the list under "Right Name" does not display the rights you denied.
15. Click the name of the principal you just added, click View Security, and verify that the list of rights
that you denied has the red icon in the "Status" column.
16. Click OK and close the "User Security" window.

Related Topics
• Data Insight pre-defined user groups

4.4.3.2 Assigning users to specific Data Insight objects

You might want to limit a user's or group's access to only specific Data Insight connections, projects,
and tasks. In this case, you would add the user or group to the list of principals for each specific Data
Insight object (instead of adding to a pre-defined Data Insight user group).

Note:
You must assign the user to both the connection and the project to be able to add tables or files, create
profile tasks, and create rules.

To add a user or group to a specific Data Insight connection and project:


1. Log on to the Central Management Console (CMC) with a user name that is a member of either the
Administrator group or the Data Insight Administrator group.
2. At the CMC home page, click Information Steward.
3. Select the object type.
• For a connection:
a. Select the Connections node in the Tree panel.
b. Select the connection name in the right panel.
• For a project:
a. Expand the Data Insight node, and expand the Projects node in the Tree panel.
b. Select the project name in the Tree panel.
• For a profile task or rule task:
a. Expand the Data Insight node, and expand the Projects node in the Tree panel.
b. Select the task name in the right panel.

4. Click Manage > Security > User Security.


5. On the "User Security" page, click Add Principals.
6. In the list of "Available users/groups":
a. Select the name of each user or group that you want to authorize to this connection or project
or task.
Note:
To select multiple names, hold down the Ctrl key when you click each name.
b. Click the > button to move the names to the "Selected users/groups" list.
7. Click Add and Assign Security.
8. Click the "Advanced" tab.
9. Click the Add/Remove Rights link.
10. To assign connection rights:
a. Expand the "Application" node and select "Data Insight Connection".
b. Click the Granted column for each right that you want this user or group to have.
For example, to grant the right to view and profile the data in this connection, click the Granted
column for the following "Specific Rights for Data Insight Connection":
• Profile/Rule permission
• View Data
• View Sample Data
11. To assign project rights:
a. Expand the "Application" node and select "Data Insight Project".
b. Click the Granted column for each right that you want this user or group to have.
For example, to grant the right to create scorecards and approve rules in this project, click the Granted
column for the following "Specific Rights for Data Insight Project":
• Approve Rule
• Manage Rule

12. To assign profile or rule task rights:


a. Expand the "Application" node and select "Data Insight Profiler Task".
b. Click the Override General Global column and the Granted (green check mark icon) column
for each right that you want this user or group to have.
For example, to grant the right to schedule a profile task or rule task, click the Override General
Global column and the Granted column for the following "General Rights for Data Insight Profiler
Task":
• Reschedule instances that the user owns
• Schedule document to run
• View document instances
13. Click OK and verify that the list under "Right Name" displays the rights you just added.
14. Click OK and verify that the list of principals includes the name or names you just added.
15. Close the "User Security" window.

Related Topics
• Data Insight pre-defined user groups

4.4.3.3 Restricting rights of a user group to one project

Whenever a project is created, all pre-defined Data Insight user groups are automatically added to its
principal list. This feature facilitates user rights management when you want the same user or same
group of users to have rights on all projects.

You might want to limit rights of a subset of users to only one project. For example, you might want to
limit the Manage Scorecards right on the Human Resources project to only User A, and you want only
User B to have the Manage Scorecards right on the Finance project.

To restrict the Manage Scorecards right to a specific user for each project:
1. Create User A and User B. For details, see Creating users for Information Steward .
2. Add User A and User B to the Data Insight Analyst user group. For details, see Adding users and
user groups to Information Steward groups.
This Data Insight Analyst user group has all of the rights of the Data Insight Scorecard Manager
user group except the Manage Scorecard right, which you will grant to specific users within a project
in subsequent steps.
3. Create the Human Resources project and the Finance project. For details, see Creating a project.
4. To grant User A the Manage Scorecard right on the Human Resources project:
a. Log on to the Central Management Console (CMC) with a user name that is a member of either
the Administrator group or the Data Insight Administrator group.
b. At the CMC home page, click Information Steward, expand the Data Insight node, and expand
the Projects node in the Tree panel.
c. Select the Human Resources project and click Manage > Security > User Security.

d. On the "User Security" page, click Add Principals.


e. In the list of "Available users/groups", select User A and click the > button to move the names to
the "Selected users/groups" list.
f. Click Add and Assign Security.
g. Click the Advanced tab.
h. Click the Add/Remove Rights link.
i. Expand the "Application" node and click "Information Steward Project".
j. Click the Granted column for the "Manage Scorecards" right under "Specific Rights for Data
Insight Project".
k. Click OK and verify that the list under "Right Name" displays the "Manage Scorecards" right.
l. Click OK and close the "User Security" window.
5. Repeat steps 4a through 4l for User B in the Finance project.

Related Topics
• Creating users for Information Steward
• Adding users and user groups to Information Steward groups
• Denying user rights to specific Data Insight objects

4.5 User rights in Metadata Management

The Metadata Management module of SAP BusinessObjects Information Steward contains the following
objects that have object-specific rights that allow various actions on them.
• Metadata Management application through which users can view relationships (such as Same As,
Impact, and Lineage) between integrator sources.
• Integrator Sources through which users collect metadata.
• Integrator Source Groups to subset the metadata when viewing relationships.
• Metapedia through which users define terms related to their business data and organize the terms
into categories.

The CMS manages security information, such as user accounts, group memberships, and object rights
that define user and group privileges. When a user attempts an action on a Metadata Management
object, the CMS authorizes the action only after it verifies that the user's account or group membership
has sufficient privileges.

Related Topics
• Metadata Management pre-defined user groups
• Group rights for Metadata Management objects in Information Steward
• Group rights for Metadata Management folders and objects
• Assigning users to specific Metadata Management objects
• Managing users in Information Steward

4.5.1 Metadata Management pre-defined user groups

SAP BusinessObjects Information Steward provides the following Metadata Management user groups
to enable you to change the rights for multiple users in one place (a group) instead of modifying the
rights for each user account individually.

Table 4-7: Metadata Management pre-defined user groups

Metadata Management User
Users that can only view metadata in the Metadata Management tab of Information Steward.

Metadata Management Data Steward
Users that have all the rights of a Metadata Management User, plus the following rights:
• Create and edit annotations
• Create custom attributes and edit values of custom attributes
• Define Metapedia categories and terms

Metadata Management Administrator
Users that have all the rights of a Metadata Management Data Steward, plus the following rights:
• Create, edit, and delete Metadata Management integrator sources and source groups
• Run and schedule Metadata Integrators
• Configure, edit, delete, schedule, and view the history of Information Steward utilities

Related Topics
• Information Steward pre-defined users and groups

4.5.2 Type-specific rights for Metadata Management objects

Type-specific rights affect only specific object types, such as integrator sources or Metapedia objects.
The following topics describe type-specific rights for each Information Steward object in the CMC and
SAP BusinessObjects Information Steward.

Related Topics
• Group rights for Metadata Management folders and objects

• Group rights for Metadata Management objects in Information Steward

4.5.2.1 Group rights for Metadata Management folders and objects

Each pre-defined Metadata Management user group provides specific rights on the folders and objects
in the Central Management Console (CMC), as the following table shows.

Table 4-8: Rights for Metadata Management folders in the CMC

CMC folder or object | Right name | Description | Metadata Management Administrator | Metadata Management Data Steward | Metadata Management User
Metadata Management folder | View objects | View Metadata Management folder | Yes | Yes | Yes
Metadata Management folder | Edit objects | Change limits for integrator source runs | Yes | No | No
Metadata Management folder | Modify Rights | Manage user security for Metadata Management | Yes | No | No
Integrator Source folder | View objects | View integrator sources, their run history, their logs, and so on | Yes | Yes | Yes
Integrator Source folder | Add objects to the folder | Configure new integrator sources | Yes | No | No
Integrator Source folder | Modify Rights | Modify the rights users have to integrator sources | Yes | No | No
Integrator Sources | View instance | View the history of metadata integrator runs | Yes | No | No
Integrator Sources | Add objects to the folder | Create integrator source configurations | Yes | No | No
Integrator Sources | Edit objects | Edit the properties of integrator sources | Yes | No | No
Integrator Sources | Delete objects | Delete integrator sources, integrator source instances, and the integrator source schedule | Yes | No | No
Integrator Sources | Schedule | Schedule and set run-time parameters for an integrator source | Yes | No | No
Integrator Sources | Reschedule | Change the schedule or run-time parameters for an integrator source | Yes | No | No
Integrator Sources | Modify Rights | Modify the rights users have to integrator sources | Yes | No | No
Source Groups | View objects | View source groups and access their properties | Yes | No | No
Source Groups | Add objects to the folder | Create integrator source groups | Yes | No | No
Source Groups | Edit objects | Edit integrator source groups | Yes | No | No
Source Groups | Delete objects | Delete integrator source groups | Yes | No | No
Source Groups | Modify Rights | Modify the rights users have to source groups | Yes | No | No
Metapedia folder | View objects | View terms and categories on Information Steward | Yes | Yes | No

4.5.2.2 Group rights for Metadata Management objects in Information Steward

Each pre-defined Metadata Management user group provides specific rights on objects in Information
Steward, as the following table shows.

Table 4-9: Rights for Metadata Management objects in Information Steward

Information Steward tab | Right name | Description | Metadata Management Administrator | Metadata Management Data Steward | Metadata Management User
Metadata Management tab | View objects | View all of the metadata objects and their relationships; view the custom attributes and values; view the Preferences page; search all metadata sources; view Metadata Management lineage from the View Lineage option on the "Documents" tab of BI Launch Pad | Yes | Yes | Yes
Metadata Management tab | Edit objects | Update the Preferences page; edit custom attributes and their display order; edit custom attribute values; edit annotations; edit user-defined relationships between objects | Yes | Yes | No
Metadata Management tab | Add objects | Create new custom attributes and their display order; associate custom attributes to an object type; update the Preferences page; create custom attribute values; create annotations; create user-defined relationships between objects | Yes | Yes | No
Metadata Management tab | Delete objects | Delete custom attributes; delete annotations; delete user-defined relationships between objects | Yes | Yes | No
Metapedia tab | View | View the "Metapedia" tab; view categories; view terms; export to Excel | Yes | Yes | No
Metapedia tab | Add objects to the folder, Edit objects, Delete objects | Create categories; create terms; import from Excel; edit categories; edit terms (including approval); add terms to categories; relate terms; associate objects to a term; delete related terms; delete associated objects; delete associated terms; delete categories; delete terms | Yes | Yes | No

4.5.3 Assigning users to specific Metadata Management objects

By default, the pre-defined Metadata Management user groups are added to the access list of integrator
sources and source groups when you create them. You might want to allow only certain users to
configure integrator sources, define source groups, or define Metapedia categories and terms. In these
cases, you would add the user or group to the specific Metadata Management object's access list
(instead of adding to a pre-defined Metadata Management user group).
1. Log on to the Central Management Console (CMC) with a user name that is a member of either the
Administrator group or the Metadata Management Administrator group.
2. At the CMC home page, click Information Steward.
3. Expand the Metadata Management node in the Tree panel.
4. Select the object type.
• For an integrator source:
a. Select the Integrator Sources node in the Tree panel.

b. Select the integrator source name in the right panel.


• For Metapedia categories and terms:
a. Select the Metapedia node in the Tree panel.
• For a source group:
a. Expand the Source Groups node in the Tree panel.
b. Select the source group name in the right panel.

5. Click Manage > Security > User Security.


6. On the "User Security" page, click Add Principals.
7. In the list of "Available users/groups":
a. Select the name of each user or group that you want to authorize for this integrator source,
Metapedia, or source group.
Note:
To select multiple names, hold down the Ctrl key when you click each name.
b. Click the > button to move the names to the "Selected users/groups" list.
8. Click Add and Assign Security.
9. Click the "Advanced" tab.
10. Click the Add/Remove Rights link.
11. To assign integrator source rights:
a. Expand the "Application" node and select "Metadata Management Integrator configuration".
b. Click the Override General Global column and the Granted (green check mark icon) column
for each right that you want this user or group to have.
For example, to grant the right to schedule and view integrator instances, click the Override General
Global column and the Granted column for the following "General Rights for Metadata Management
Integrator configuration ":
• Pause and resume document instances
• Schedule document to run
• View document instances
12. To assign Metapedia rights:
a. Expand the "Content" node and select "Folder".
b. Click the Granted column for each right that you want this user or group to have.
For example, to grant the right to create, edit, and delete Metapedia categories and terms, click the
Granted column for the following "General Rights for Folder":
• Add objects to folder
• Delete objects
• Edit objects
• View objects
13. To assign source group rights:
a. Expand the "Application" node and select "Metadata Management Source Group".
b. Click the Override General Global column and the Granted (green check mark icon) column
for each right that you want this user or group to have.

For example, to grant the right to schedule and view integrator instances, click the Override General
Global column and the Granted column for the following "General Rights for Metadata Management
Source Group":
• Pause and resume document instances
• Schedule document to run
• View document instances
14. Click OK and verify that the list under "Right Name" displays the rights you just added.
15. Click OK and verify that the list of principals includes the name or names you just added.
16. Close the "User Security" window.

4.6 User rights in Cleansing Package Builder

The Cleansing Package Builder module of SAP BusinessObjects Information Steward contains the
following objects to which you control access.
• Private cleansing packages: Private cleansing packages are viewed or edited by the user who owns
them and are listed under My Cleansing Packages. Private cleansing packages include those created
by using the New Cleansing Package Wizard or by importing a published cleansing package.
• Published cleansing packages: Published cleansing packages are cleansing packages included
with SAP BusinessObjects Information Steward or cleansing packages which a data steward created
and then published. Published cleansing packages are available to all users and can be used in an
SAP BusinessObjects Data Services Data Cleanse transform or imported and used as the basis for
a new cleansing package.

Related Topics
• Group rights for cleansing packages
• Managing users in Information Steward

4.6.1 Group rights for cleansing packages

The Administrator and the pre-defined Cleansing Package Builder User groups can perform specific
actions on cleansing packages, as the following table shows.

Table 4-10: Rights for Cleansing Package Builder

Right | Description | Software in which this action is performed | Administrator | Cleansing Package Builder User
Create | Create cleansing packages | Cleansing Package Builder | Yes | Yes
Edit | Change, delete, and rename your own private cleansing packages | Cleansing Package Builder | Yes | Yes
Publish | Publish cleansing packages | Cleansing Package Builder | Yes | Yes
View | Browse your own private cleansing packages and all published cleansing packages | Cleansing Package Builder | Yes | Yes
Access the CMC | Log in to the CMC | Central Management Console | Yes | No
Control Cleansing Package Builder server | Start and stop the Cleansing Package Builder Service | Central Management Console | Yes | No
Set up users and groups | Create users and add users to groups | Central Management Console | Yes | No
Reassign | Reassign ownership of private cleansing packages | Central Management Console | Yes | No
Delete | Delete private and published cleansing packages | Central Management Console | Yes | No
View all | See all private and published cleansing packages | Central Management Console | Yes | No
Edit | Change the description, owner, and state of cleansing packages | Central Management Console | Yes | No

4.7 User rights for Information Steward administrative tasks

The following table describes the Information Steward actions in the "Applications" area of the CMC.
To perform any of these Information Steward actions, a user must belong to one of the following groups:
• Administrator
• Data Insight Administrator
• Metadata Management Administrator

Table 4-11: Actions for Information Steward in the CMC "Applications" area

Item | Action | Description
Data Services Job Server | View | View the list of job servers in the Information Steward job server group
Information Steward Utilities | View instance | View the history of utility runs
Information Steward Utilities | Create | Create a utility instance
Information Steward Utilities | Edit | Edit run-time parameters for a utility
Information Steward Utilities | Delete | Delete a utility instance
Information Steward Utilities | Schedule | Define the schedule for a utility
Information Steward Utilities | Reschedule | Change the schedule of a utility
Information Steward Settings | Edit | Change configuration settings for Data Insight Profiling, Rules, or Performance
Information Steward repository | Edit | Update the repository password

Related Topics
• Job server group
• Utilities overview
• Configuration settings
• Viewing and editing repository information
• Group rights for Data Insight folders and objects in the CMC
• Group rights for Metadata Management folders and objects

4.7.1 Viewing and editing repository information

You might want to view or edit the SAP BusinessObjects Information Steward repository connection
information for situations such as the following:
• View connection information, such as database type and server name.

• Change the database user and password for the repository.


• If the database type of the repository is an Oracle RAC database, change the connection string to
add another server or tune parameters for failover.

To view or edit the Information Steward repository:


1. Log on to the Central Management Console (CMC) with a user name that is a member of the
Administrator group.
2. Select Applications from the navigation list at the top of the CMC Home page.
3. In the "Applications Name" list, select Information Steward.
4. Click Action > Configure Repository.
The connection information for the Information Steward repository was defined at installation time.
a. For most of the database types, you can only view the connection information here.
b. If the database type is Oracle RAC, you can modify the connection string here if you want to add
another server or tune parameters for failover.
5. You can change the user name and password for the Information Steward repository.
Caution:
You must have the appropriate credentials to access the Information Steward database. After you
change the user name and password, you must restart the Web Application Server and CMS.

Data Insight Administration

5.1 Administration overview for Data Insight

Each deployment of SAP BusinessObjects Information Steward supports multiple users in one or more
Data Insight projects to assess and monitor the quality of data from various sources.
• A project is a collaborative workspace for data stewards and data analysts to assess and monitor
the data quality of a specific domain and for a specific purpose (such as customer quality assessment,
sales system migration, and master data quality monitoring).
• A connection defines the parameters for Information Steward to access a data source. A data source
can be a relational database, application, or file.

Data Insight users in a project can perform tasks such as browse tables and files in connections, add
tables of interest to the project, profile the data and execute rules to measure the data quality, create
data quality scorecards and monitor data quality. For more details, see the "Data Insight" section in the
User Guide.
Before Data Insight users can perform the above tasks, administrators must do the following tasks on
the Central Management Console (CMC) of SAP BusinessObjects Business Intelligence Platform:
• Define connections to data sources
• Create projects
• Add user names to pre-defined Data Insight user groups to access connections and projects
Tip:
Users of a project should be given access to the same set of connections so that they can collaborate
on the same set of data within a project.
• Optionally, define connections to save all data that failed rules for rule tasks. Each rule task can use
a different failed data connection.

After a Data Insight user creates tasks, the Data Insight Administrator can do the following tasks:
• Create schedules to run the profile task and rule task at regular intervals.
• Modify the default schedule to run utilities if the frequency of the profile and rule tasks warrants it.
• Modify application-level settings to change the default configuration for Data Insight.
Tip:
It is recommended that the Data Insight Administrator be someone who is familiar with the CMC.

Related Topics
• Data Insight Connections


• Data Insight projects


• User rights in Data Insight
• Scheduling a utility
• Configuration settings

5.2 Data Insight Connections

Data Insight provides the capability to define the following types of connections:
• For data profiling
You can configure connections to the following types of data sources to collect profile attributes that
can help you determine the data quality and structure:
• Databases such as Microsoft SQL Server, IBM DB2, Oracle, MySQL, Informix IDS, Sybase ASE,
and ODBC
• Applications such as SAP Business Suite and SAP NetWeaver Business Warehouse
• Text files

• For failed data


You can configure connections to databases to store all data that fail your quality validation rules.
Each rule task can use a different failed data connection. For more information, see “Viewing failed
rule data” in the User Guide.

Note:
For a complete list of supported databases, applications, and their versions, see the Platform Availability
Matrix available at http://service.sap.com/PAM.

Related Topics
• Defining a Data Insight connection to a database
• Defining a Data Insight connection to an application
• Defining a Data Insight connection to a file
• Displaying and editing Data Insight connection parameters

5.2.1 Defining a Data Insight connection to a database

You define a Data Insight connection to a database for any of the following purposes:
• Access data that you want to profile, run validation rules on, and calculate quality scorecards for, to determine the quality and structure of the data.


• Store the data that failed quality validation rules.

To create a Data Insight connection to a database source:


1. Ensure that you have the proper privileges on the source database.
• If you will profile and run validation rules on the data, you must have privileges to read the
metadata and data from the source tables.
• If you will store failed data, you must have privileges to create, modify, and delete tables, and to
create stored procedures.
2. Log on to the Central Management Console (CMC) with a user name that belongs to the Data Insight
Administrator group or that has the Create right on Connections in Information Steward.
3. At the CMC home page, click Information Steward.
4. Select the Connections node in the Tree panel.
5. Click Manage > New > Data Insight Connection.
6. On the "Create Connection" page, enter the following information.

Option | Description
Connection Name | The name that you want to use for this Data Insight connection.
• Maximum length is 64 characters
• Can be multi-byte
• Case insensitive
• Can include underscores and spaces
• Cannot include other special characters: ?!@#$%^&*()-+={}[]:";'/\|.,`~
You cannot change the name after you save the connection.
Description | (Optional) Text to describe this profile source.

7. In the Connection Type drop-down list, select the Database connection value.
8. In the Purpose of connection drop-down list, select one of the following options:
• For data profiling if you want to profile the data in this connection
• For failed data if you want to use this connection to store all the data that fail specific quality
validation rules.
9. In the Database Type drop-down list, select the database that contains the data that you want to
profile or that will store the data that fail validation rules.
10. Enter the relevant connection information for the database type.
11. If you want to verify that Information Steward can connect successfully before you save this profile
connection, click Test connection.
12. Click Save.
Note:
After you save the connection, you cannot change its name, connection type, purpose, or the connection parameters that uniquely identify a database.
The newly configured connection appears in the list on the right of the "Information Steward" page.


After you create a connection, you must authorize users to it so that they can perform tasks such as viewing the data, running profile tasks, and running validation rules on the data.
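Before you define the connection in the CMC, you can optionally confirm that the database account has the required read privileges. The following Python sketch is illustrative only and uses hypothetical names (an ODBC data source SALES_DW, a user is_profiler, and a table CUSTOMER); it is not part of Information Steward.

import pyodbc

# Hypothetical pre-check: verify that the account can read metadata and data
# from a source table before you create the Data Insight connection in the CMC.
conn = pyodbc.connect("DSN=SALES_DW;UID=is_profiler;PWD=secret")
cursor = conn.cursor()
for column in cursor.columns(table="CUSTOMER"):       # read table metadata
    print(column.column_name, column.type_name)
cursor.execute("SELECT * FROM CUSTOMER WHERE 1=0")    # confirms SELECT privilege without fetching rows
conn.close()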

Related Topics
• User rights in Data Insight
• SAP In-Memory Database connection parameters
• IBM DB2 connection parameters
• Informix database connection parameters
• Microsoft SQL Server connection parameters
• MySQL database connection parameters
• Netezza connection parameters
• ODBC connection parameters
• Oracle database connection parameters
• Sybase ASE connection parameters
• Sybase IQ connection parameters
• Teradata connection parameters

5.2.1.1 SAP In-Memory Database connection parameters

SAP In-Memory Database option | Possible values | Description
Database version | SAP In-Memory Database <version number> | Select the version of your SAP In-Memory Database client. This is the version of SAP In-Memory Database that this Data Insight connection accesses.
Data Source | Refer to the requirements of your database. | Select or type the Data Source Name defined in the ODBC Administrator for connecting to your database.
User name | The value is specific to the database server and language. | Enter the user name of the account through which Information Steward accesses the database.
Password | The value is specific to the database server and language. | Enter the user's password.
Language | default; select the correct language for your database server. | Language of the data in the database.
Client Code page | See "Supported locales and encodings" in the SAP BusinessObjects Data Services Reference Guide. | Code page of the database client.
Server code page | See "Supported locales and encodings" in the SAP BusinessObjects Data Services Reference Guide. | Code page of the database server.

5.2.1.2 HP Neoview connection parameters

ODBC option | Values | Description
Database version | HP Neoview <version number> | Select the version of your HP Neoview client. This is the version of HP Neoview that this profile connection accesses.
Data Source Name | Refer to the requirements of your database. | Select or type the Data Source Name defined in the ODBC Administrator for connecting to the database you want to profile.
User name | The value is specific to the database server and language. | Enter the user name of the account through which the software accesses the database. Note: If you use the Neoview utility Nvtencrsrv for storing the encrypted words in the security file when using Neoview Transporter, enter the encrypted user name.
Password | The value is specific to the database server and language. | Enter the user's password. Note: If you use the Neoview utility Nvtencrsrv for storing the encrypted words in the security file when using Neoview Transporter, enter the encrypted password.
Language | default; select the correct language for your database server. | Language of the data in the database.
Client Code page | See "Supported locales and encodings" in the SAP BusinessObjects Data Services Reference Guide. | Code page of the database client.
Server code page | See "Supported locales and encodings" in the SAP BusinessObjects Data Services Reference Guide. | Code page of the database server.

5.2.1.3 IBM DB2 connection parameters

DB2 option | Possible values | Description
Database version | DB2 UDB <version number> | Select the version of your DB2 client. This is the version of DB2 that this Data Insight connection accesses.
Data Source Name | Refer to the requirements of your database. | Type the data source name defined in DB2 for connecting to your database.
User name | The value is specific to the database server and language. | Enter the user name of the account through which SAP Information Steward accesses the database.
Password | The value is specific to the database server and language. | Enter the user's password.
Language | default; select the correct language for your database server. | Language of the data in the database.
Client Code page | See "Supported locales and encodings" in the SAP BusinessObjects Data Services Reference Guide. | Code page of the database client.
Server code page | See "Supported locales and encodings" in the SAP BusinessObjects Data Services Reference Guide. | Code page of the database server.

5.2.1.4 Informix database connection parameters

Informix option | Possible values | Description
Database version | Informix IDS <version number> | Select the version of your Informix client. This is the version of Informix that this profile connection accesses.
Data Source Name | Refer to the requirements of your database. | Type the Data Source Name defined in the ODBC Administrator.
User name | The value is specific to the database server and language. | Enter the user name of the account through which SAP Information Steward accesses the database.
Password | The value is specific to the database server and language. | Enter the user's password.
Language | default; select the correct language for your database server. | Language of the data in the database.
Client Code page | See "Supported locales and encodings" in the SAP BusinessObjects Data Services Reference Guide. | Code page of the database client.
Server code page | See "Supported locales and encodings" in the SAP BusinessObjects Data Services Reference Guide. | Code page of the database server.


5.2.1.5 Microsoft SQL Server connection parameters

To use Microsoft SQL Server as a profile source when SAP BusinessObjects Information Steward is
running on a UNIX platform, you must use an ODBC driver, such as the DataDirect ODBC driver.
For more information about how to obtain the driver, see the Platforms Availability Report (PAR) available in the SAP BusinessObjects Support > Documentation > Supported Platforms section of the SAP Service Marketplace: http://service.sap.com/bosap-support.

Microsoft SQL Server option | Possible values | Description
Database version | Microsoft SQL Server <version number> | Select the version of your SQL Server client. This is the version of SQL Server that this profile source accesses.
Server Name | Computer name, fully qualified domain name, or IP address | Enter the name of the machine where the SQL Server instance is located.
Database Name | Refer to the requirements of your database. | Enter the name of the database to which the profiler connects.
User Name | The value is specific to the database server and language. | Enter the user name of the account through which Information Steward accesses the database. Note: Information Steward supports both Windows authentication and Microsoft SQL Server authentication.
Password | The value is specific to the database server and language. | Enter the user's password.
Language | default; select the correct language for your database server. | Language of the data in the database.
Client Code page | See "Supported locales and encodings" in the SAP BusinessObjects Data Services Reference Guide. | Code page of the database client.
Server code page | See "Supported locales and encodings" in the SAP BusinessObjects Data Services Reference Guide. | Code page of the database server.

5.2.1.6 MySQL database connection parameters

MySQL option | Values | Description
Database version | MySQL <version number> | Select the version of your MySQL client. This is the version of MySQL that this profile connection accesses.
Data Source Name | Refer to the requirements of your database. | Select or type the Data Source Name defined in the ODBC Administrator for connecting to the database you want to profile.
User name | The value is specific to the database server and language. | Enter the user name of the account through which the software accesses the database.
Password | The value is specific to the database server and language. | Enter the user's password.
Language | default; select the correct language for your database server. | Language of the data in the database.
Client Code page | See "Supported locales and encodings" in the SAP BusinessObjects Data Services Reference Guide. | Code page of the database client.
Server code page | See "Supported locales and encodings" in the SAP BusinessObjects Data Services Reference Guide. | Code page of the database server.


5.2.1.7 Netezza connection parameters

Netezza option | Values | Description
Database version | Netezza NPS <version number> | Select the version of your Netezza client. This is the version of Netezza that this profile connection accesses.
Data Source Name | Refer to the requirements of your database. | Select or type the Data Source Name defined in the ODBC Administrator for connecting to the database you want to profile.
User name | The value is specific to the database server and language. | Enter the user name of the account through which the software accesses the database.
Password | The value is specific to the database server and language. | Enter the user's password.
Language | default; select the correct language for your database server. | Language of the data in the database.
Client Code page | See "Supported locales and encodings" in the SAP BusinessObjects Data Services Reference Guide. | Code page of the database client.
Server code page | See "Supported locales and encodings" in the SAP BusinessObjects Data Services Reference Guide. | Code page of the database server.

5.2.1.8 ODBC connection parameters


ODBC option | Values | Description
Data Source Name | Refer to the requirements of your database. | Select or type the Data Source Name defined in the ODBC Administrator for connecting to the database you want to profile.
Additional information | Alphanumeric characters and underscores, or blank | Enter information for any additional parameters that the data source supports (parameters that the data source's ODBC driver and database support). Use the format: <parameter1=value1; parameter2=value2>
User name | The value is specific to the database server and language. | Enter the user name of the account through which the software accesses the database.
Password | The value is specific to the database server and language. | Enter the user's password.
Language | default; select the correct language for your database server. | Language of the data in the database.
Client Code page | See "Supported locales and encodings" in the SAP BusinessObjects Data Services Reference Guide. | Code page of the database client.
Server code page | See "Supported locales and encodings" in the SAP BusinessObjects Data Services Reference Guide. | Code page of the database server.

Table 5-10: Connection

ODBC option | Values | Description
Additional connection information | Alphanumeric characters and underscores, or blank | Enter information for any additional parameters that the data source supports (parameters that the data source's ODBC driver and database support). Use the format: <parameter1=value1; parameter2=value2>


5.2.1.9 Oracle database connection parameters

Oracle option | Possible values | Description
Database version | Oracle <version number> | Select the version of your Oracle client. This is the version of Oracle that this profile connection accesses.
Database Connection name | Refer to the requirements of your database. | Enter an existing Oracle connection through which the software accesses sources defined in this profile connection.
User Name | The value is specific to the database server and language. | Enter the user name of the account through which the software accesses the database.
Password | The value is specific to the database server and language. | Enter the user's password.
Language | default; select the correct language for your database server. | Language of the data in the database.
Client Code page | See "Supported locales and encodings" in the SAP BusinessObjects Data Services Reference Guide. | Code page of the database client.
Server code page | See "Supported locales and encodings" in the SAP BusinessObjects Data Services Reference Guide. | Code page of the database server.
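The Database Connection name is typically an Oracle net service name that the client resolves, for example through tnsnames.ora. As a purely hypothetical illustration, if the client's tnsnames.ora defines an entry named ORCL_PROD, you would enter ORCL_PROD here.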

5.2.1.10 Sybase ASE connection parameters


Sybase ASE option | Possible values | Description
Database version | Sybase ASE <version number> | Select the version of your Sybase ASE client. This is the version of Sybase that this profile connection accesses.
Server Name | Computer name | Enter the name of the computer where the Sybase ASE instance is located.
Database Name | Refer to the requirements of your database. | Enter the name of the database to which the profiler connects.
User name | The value is specific to the database server and language. | Enter the user name of the account through which the software accesses the database.
Password | The value is specific to the database server and language. | Enter the user's password.
Language | default; select the correct language for your database server. | Language of the data in the database.
Client Code page | See "Supported locales and encodings" in the SAP BusinessObjects Data Services Reference Guide. | Code page of the database client.
Server code page | See "Supported locales and encodings" in the SAP BusinessObjects Data Services Reference Guide. | Code page of the database server.

5.2.1.11 Sybase IQ connection parameters

Sybase IQ option | Possible values | Description
Database version | Currently supported versions | Select the version of your Sybase IQ client. This is the version of Sybase IQ that this datastore accesses.
Data Source Name | Refer to the requirements of your database. | Select or type the Data Source Name defined in the ODBC Administrator for connecting to your database.
User Name | The value is specific to the database server and language. | Enter the user name of the account through which the software accesses the database.
Password | The value is specific to the database server and language. | Enter the user's password.
Language | default; select the correct language for your database server. | Language of the data in the database.
Client Code page | See "Supported locales and encodings" in the SAP BusinessObjects Data Services Reference Guide. | Code page of the database client.
Server code page | See "Supported locales and encodings" in the SAP BusinessObjects Data Services Reference Guide. | Code page of the database server.

5.2.1.12 Teradata connection parameters

Teradata option | Possible values | Description
Database version | Teradata <version number> | Select the version of your Teradata client. This is the version of Teradata that this datastore accesses.
Data Source Name | Refer to the requirements of your database. | Type the Data Source Name defined in the ODBC Administrator for connecting to your database.
User Name | The value is specific to the database server and language. | Enter the user name of the account through which the software accesses the database.
Password | The value is specific to the database server and language. | Enter the user's password.
Language | default; select the correct language for your database server. | Language of the data in the database.
Client Code page | See "Supported locales and encodings" in the SAP BusinessObjects Data Services Reference Guide. | Code page of the database client.
Server code page | See "Supported locales and encodings" in the SAP BusinessObjects Data Services Reference Guide. | Code page of the database server.

5.2.2 Defining a Data Insight connection to an application

You must define a connection to any application that contains data that you want to profile to determine
the quality and structure of the data.

To create a connection to an application system:


1. Log on to the Central Management Console (CMC) with a user name that has the following
authorizations:
• Authorization to read data on the source application system. For authorizations to SAP
Applications, see “SAP user authorizations” in the SAP BusinessObjects Data Services
Supplement for SAP.
• Either belongs to the Data Insight Administrator group or has the Create right on the Connections node.
2. At the CMC home page, click Information Steward.
3. Select the Connections node in the navigation tree on the left.
4. Click Manage > New > Data Insight connection.
5. On the "Create Connection" page, enter the following information.


Option | Description
Connection Name | The name that you want to use for this Data Insight source.
• Maximum length is 64 characters
• Can be multi-byte
• Case insensitive
• Can include underscores and spaces
• Cannot include other special characters: ?!@#$%^&*()-+={}[]:";'/\|.,`~
You cannot change the name after you save the connection.
Description | (Optional) Text to describe this Data Insight source.

6. In the Connection Type drop-down list, select the Application connection value.
7. In the Application Type drop-down list, select one of the following applications that contains the
data you want to profile:
• SAP NetWeaver Business Warehouse
• SAP Applications
For the specific components from which Information Steward can profile data, see the Product Availability Matrix available at http://service.sap.com/PAM.
8. Enter the relevant connection information for the application type.
9. If you want to verify that Information Steward can connect successfully before you save this profile
connection, click Test connection.
10. Click Save.
The newly configured connection appears in the list of Profile Connections on the right of the
"Information Steward" page.

After you create a connection, you must authorize users to it so that they can perform tasks such as viewing the data, running profile tasks, and running validation rules on the data.

Related Topics
• User rights in Data Insight
• SAP Applications connection parameters
• SAP NetWeaver Business Warehouse connection parameters

5.2.2.1 SAP NetWeaver Business Warehouse connection parameters

The SAP NetWeaver Business Warehouse connection has the same options as the SAP Applications connection type.


Related Topics
• SAP Applications connection parameters

5.2.2.2 SAP Applications connection parameters

To connect to SAP Applications, the Data Insight module of Information Steward uses the same
connections as Data Services. For more information, see the SAP BusinessObjects Data Services
Supplement for SAP.

SAP Applications option | Possible values | Description
Server Name | Computer name, fully qualified domain name, or IP address | Name of the remote SAP application computer (host) to which the software connects.
Client Number | 000-999 | The three-digit SAP client number. Default is 800.
System Number | 00-99 | The two-digit SAP system number. Default is 00.
User Name | Alphanumeric characters and underscores | Enter the name of the account through which the software accesses the SAP application server.
Password | Alphanumeric characters, underscores, and punctuation | Enter the user's password.
Application Language | E - English, G - German, F - French, J - Japanese | Select the login language from the drop-down list. You can enter a customized SAP language in this option. For example, you can type S for Spanish or I for Italian.
Code Page | See the section "Supported locales and encodings" in the SAP BusinessObjects Data Services Reference Guide.
ABAP execution option | Not applicable | Information Steward does not take advantage of this feature in release 4.0.
Execute in background (batch) | Not applicable | Information Steward does not take advantage of this feature in release 4.0. Dialog processing or background processing is defined by the user type of the SAP user configured for the communication between Information Steward and the SAP server.
Working directory on SAP server | Directory path | A directory on the SAP application server where the software can write intermediate files. This directory also stores the transport file used by the FTP and shared-directory data transfer methods. By default, the value in this field uses the value that was typed into the Application server field. For example, if the value sap01 was typed into the Application server field, the value of Working directory on SAP server becomes \\sap01\.
Generated ABAP directory | Not applicable | Information Steward does not take advantage of this feature in release 4.0.
Security Profile | Not applicable | Information Steward does not take advantage of this feature in release 4.0.
Number of connection retries | Positive integer | The number of times the software tries to establish a connection with the SAP application server. Defaults to 3.
Interval between retries (sec) | Positive integer | The time delay in seconds between connection retries. Defaults to 10.
Data transfer method | None, Shared directory, FTP | Define how to retrieve data from the SAP application server to the SAP BusinessObjects Data Services server:
• None: Synchronous data transfer. If the SAP user used for the connection is of user type Dialog, the amount of data that can be transferred is bound to dialog process restrictions. Users of type System/Communication can also perform synchronous Remote Function Call (RFC) data transfers using the background processes and resources available.
• Shared directory: Asynchronous processing. Data is written into a file and saved in a shared directory to which the SAP server and Information Steward both have access. If you select this data transfer method, the Application Shared Directory option appears.
• FTP: Asynchronous processing. Information Steward (through Data Services) obtains the data from an FTP folder. If you select this data transfer method, the FTP options appear (Local directory, FTP relative path, and so forth).
For performance information, see Using SAP applications as a source.
Application Shared Directory | Directory path | If you selected the Shared directory data transfer method, specify the SAP application server directory that stores data extracted by SAP BusinessObjects Data Services, and to which both Data Services and the SAP application server have direct access. After the extraction is completed, Information Steward picks up the file for further processing.
FTP | These options are visible if you selected the FTP data transfer method.
Local directory | Directory path | If you selected the FTP data transfer method, select a client-side directory to which data from the SAP application server downloads.
FTP relative path | Directory path | Indicate the path from the FTP root directory to the SAP server's working directory. When you select FTP, this directory is required.
FTP host name | Computer (host) name, fully qualified domain name, or IP address | Must be defined to use FTP.
FTP login user name | Alphanumeric characters and underscores | Must be defined to use FTP.
FTP login password | Alphanumeric characters and underscores, or blank | Enter the FTP password.

5.2.3 Defining a Data Insight connection to a file

You must define a connection to a text file that contains data that you want to profile to determine the
quality and structure of the data.

To create a Data Insight connection to a file source:


1. Ensure that your file meets the following requirements:
• The users running the following services must have permission on the shared directory where
the file resides:
• Web Application Server (for example, Tomcat)
• Data Services
• Server Intelligence Agent
• The file must be located on a shared location that is accessible to the Data Services Job Server
and the View Data Server which are components of SAP BusinessObjects Data Services.
2. Go to the Information Steward area of the CMC.
3. Click Manage > New > Connection.
4. On the "Create Connection" page, enter the following information.

Option | Description
Connection Name | The name that you want to use for this data source.
• Maximum length is 64 characters
• Can be multi-byte
• Case insensitive
• Can include underscores and spaces
• Cannot include other special characters: ?!@#$%^&*()-+={}[]:";'/\|.,`~
You cannot change the name after you save the connection.
Description | (Optional) Text to describe this data source.

5. In the Connection Type drop-down list, select the File connection option.
6. Enter the path for the file in Directory Path.
7. Click Save.
The name of the newly configured connection appears in the list on the right pane of the "Information
Steward" page.

After you create a connection, you must authorize users to it so that they can perform tasks such as viewing the data, running profile tasks, and running validation rules on the data.

Related Topics
• User rights in Data Insight

5.2.4 Displaying and editing Data Insight connection parameters

Situations when you might want to view or change Data Insight connection parameters include:
• You need to view the connection parameters on a development or test system so that you can
recreate the Data Insight connection when you move to a production system.
• You have several source systems and you want to ensure that you are connecting to the appropriate source.

To edit the connection parameters of a Data Insight connection:


1. Ensure that you have Edit rights on the connection or you are a member of the Data Insight
Administrator group.
2. Go to the Information Steward area of the CMC.
3. Expand the Data Insight node in the Tree panel, and select Connections.
4. In the list of connections on the right, select the name of your profiling connection, click "Action" in
the top menu tool bar, and select Properties.
5. Type your changes in the following fields:
• Description
• Database version
• User Name
• Password
• Language
• Client Code Page
• Server Code Page


6. If you want to verify that Information Steward can connect successfully before you save this Data
Insight connection, click Test connection.
7. Click Save.
The edited description appears in the list of connections on the right pane of the "Information Steward" page.

5.2.5 Deleting a Data Insight connection

The following table shows the Data Insight objects that you can delete from the Central Management Console (CMC).
For information about dependencies when deleting connections and projects on the Central Management
Console (CMC), see “Deleting a connection” and “Deleting a project” in the Administrator Guide.
Note:
You cannot delete a connection if a table or file is being used in a project. You must remove the table
or file from all projects on Information Steward before you can delete the connection or project in the
CMC.

Table 5-18: Delete object dependents and dependencies

Data Insight object to delete from CMC | Object dependencies that prevent deletion because they are used in another project | Dependent objects that will also be deleted
Profile Task | none | Profile Task Instance
Connection | You must delete each table from the Workspace of each project in Data Insight. | Referenced project, Table metadata
Project | A warning message appears, but you can click OK to go ahead and delete all dependent objects. | Referenced project, Rule, View

1. Log on to the Central Management Console (CMC) with a user name that belongs to the Data Insight
Administrator group or that has the Create right on Connections in Information Steward.
2. At the CMC home page, click Information Steward.
3. Click the Connections node in the Tree panel.
4. From the list in the right pane, select the name of the connection and click Manage > Delete.
5. To confirm that you want to delete this connection, click OK in the warning pop-up window.


6. If the following message appears, you must delete each table from the "Workspace" of each Data Insight project listed in the message.
The connection cannot be deleted because it is referenced by the following:
Project: projectname

5.3 Data Insight projects

A project is a collaborative workspace for data stewards and data analysts to assess and monitor the
data quality of a specific domain and for a specific purpose (such as customer quality assessment,
sales system migration, and master data quality monitoring).

5.3.1 Creating a project

Create a Data Insight project in the Central Management Console (CMC) to allow your users to define
the project's tasks to profile and validate data in SAP BusinessObjects Information Steward.
1. Log on to the CMC with a user name that belongs to the Data Insight Administrator group or that
has the Create right on Projects in Information Steward.
2. At the CMC home page, click Information Steward.
The "Information Steward" page opens with the Information Steward node selected in the Tree
panel.
3. Expand the Data Insight node in the Tree panel, and select Projects.
4. Click Manage > New > Data Insight Project.
5. On the "New Profiling Project" window, enter the following information.

Option | Description
Name | The name that you want to use for this profile project.
• Maximum length is 64 characters
• Can be multi-byte
• Case insensitive
• Can include underscores and spaces
• Cannot include other special characters: ?!@#$%^&*()-+={}[]:";'/\|.,`~
You cannot change the name after you save the project.
Description | (Optional) Text to describe this profile project.

6. Click Save.


Note:
After you save the project, you cannot change its name.
The new project appears in the list of projects on the right pane of the "Information Steward" page.

After you create a project, you must grant users rights to perform actions such as creating profile and rule tasks, running these tasks, or creating scorecards.

Related Topics
• User rights in Data Insight

5.3.2 Editing a project description

To edit a Data Insight project, you must have Edit rights on the project or be a member of the Data
Insight Administrator group.

To edit the Description field of your profile project:


1. Log on to the Central Management Console (CMC) with a user name that belongs to the Data Insight
Administrator group or that has the Edit right on Projects in Information Steward.
2. At the CMC home page, click Information Steward.
3. Expand the Data Insight node in the Tree panel, and select Projects.
4. In the list of projects on the right, select the name of your profiling project and click Action > Prop
erties.
5. Type the changes you want in the "Description" field.
6. Click Save to see the edited description in the list of projects on the right pane of the "Information
Steward" page.

5.3.3 Deleting a project

To delete a Data Insight project:


1. Log on to the Central Management Console (CMC) with a user name that belongs to the Data Insight
Administrator group or that has the Create right on Projects in Information Steward.
2. At the CMC home page, click Information Steward.
3. Expand the Data Insight node in the Tree panel, and select Projects.
4. From the list in the right pane, select the name of the project and click Manage > Delete.
5. To confirm that you want to delete this project, click OK in the warning pop-up window.


Caution:
When you delete a project, you delete all the contents of the project which include unapproved rules,
tasks, scorecards, profile results, sample data, views, and so forth.

5.4 Data Insight tasks

The Data Insight module of SAP BusinessObjects Information Steward provides the capability to create the following types of tasks that you can run immediately or schedule on the CMC to run later or on a recurring basis:
• Profile tasks to generate profile attributes about data tables and files
• Rule tasks to execute rules bound to table columns to measure the data quality

5.4.1 Scheduling a task

The Data Insight Administrator can schedule a profile or rule task for a given project in the Central
Management Console (CMC). A Data Insight user must have already created the profile task or rule
task on Information Steward (see “Creating a profile task” or “Creating a rule task and setting an alert”
in the User Guide).

You would schedule a profile task or rule task for reasons including the following:
• Set up a recurring time to run a profile task or rule task.
• Specify a specific time to run the profile task or rule task.

To define or modify a schedule for a Data Insight task:


1. Log on to the CMC with a user name that belongs to the Data Insight Administrator group or that
has the Create right on Projects in Information Steward.
2. At the CMC home page, click Information Steward.
The "Information Steward" page opens with the Information Steward node selected in the Tree
panel.
3. In the Tree panel, expand the Data Insight node, and then expand the Projects node.
4. Select the name of your project in the Tree panel.
5. Select the name of the task from the list on the right panel.
6. Click Action > Schedule in the top menu tool bar.
The "Instance Title" pane appears on the right in the "Schedule" window.
7. If you do not want the default value for Instance Title, change the title to a value you want.
8. To define when and how often to run the task:


a. Click Recurrence in the navigation tree in the left pane of the "Schedule" window.
b. Select the frequency in the Run object drop-down list.
c. Select the additional relevant values for the recurrence option.
For a list of the recurrence options and the additional values, see Recurrence options.
d. Optionally, set the Number of retries to a value other than the default 0 and change the Retry interval in seconds from the default value 1800.
9. If you want to trigger the execution of this task when an event occurs, expand Events, and fill in the appropriate information. For more information about Events, see the SAP BusinessObjects Business Intelligence Platform Administrator Guide.
10. If you want to change the default values for the runtime parameters for the task, click the Parameters node on the left. For parameter descriptions, see Common runtime parameters for Information Steward.
11. Click Schedule.

Related Topics
• Pausing and resuming a schedule

5.4.2 Recurrence options

When you schedule an integrator source or an SAP BusinessObjects Information Steward utility, you
can choose the frequency to run it in the Recurrence option. The following table describes each
recurrence option and shows the additional relevant values that you must select for each recurrence
option.

Recurrence Option | Description
Now | The utility will run when you click Schedule.
Once | The utility will run once only. Select the values for Start Date/Time and End Date/Time.
Hourly | The utility will run every N hours and X minutes. Select the values for Hour(N), Minute(X), Start Date/Time, and End Date/Time.
Daily | The utility will run once every N days. Select the value for Days(N).
Weekly | The utility will run once every week on the selected days. Select the days of the week, Start Date/Time, and End Date/Time.
Monthly | The utility will run once every N months. Select the values for Month(N), Start Date/Time, and End Date/Time.
Nth Day of Month | The utility will run on the Nth day of each month. Select the values for Day(N), Start Date/Time, and End Date/Time.
1st Monday of Month | The utility will run on the first Monday of each month. Select the values for Start Date/Time and End Date/Time.
Last Day of Month | The utility will run on the last day of each month. Select the values for Start Date/Time and End Date/Time.
X Day of Nth Week of the Month | The utility will run on the X day of the Nth week of each month. Select the values for Week(N), Day(X), Start Date/Time, and End Date/Time.
Calendar | The utility will run on the days you specified as "run" days on a calendar you have created in the "Calendars" management area of the CMC.
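For example (a hypothetical schedule), to refresh profile results every weeknight you could select the Weekly recurrence option, check Monday through Friday as the days of the week, and set the Start Date/Time to an hour when the source systems are lightly used.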

5.4.3 Configuring for task completion notification


The Data Insight Administrator or someone in the Administrator group can configure the server to
provide email notifications in the Central Management Console (CMC). A Data Insight user must have
already created the profile task or rule task on Information Steward (see “Creating a profile task” or
“Creating a rule task and setting an alert” in the User Guide).

When a task is scheduled to run via the Central Management Console (CMC), you can be notified whether
the task completed successfully or with errors. A profiling task is considered to be in error when profiling
any of the tables fails, either because of an infrastructure error, or due to invalid source information
such as an invalid connection, table, column and so on.

The calculate score task may only fail when a table in the task was unable to generate its score, either
due to an infrastructure error or due to invalid source information such as an invalid connection, table,
column and so on.

To configure the notification server for completed processing:


1. In the CMC Home window, click Information Steward.
2. Choose Data Insight > Projects > <your project name>, and then highlight the profiling or calculate score task that you want to schedule and be notified of the completion.
3. Choose Actions > Schedule and set up the scheduling information.
4. Choose Notification and expand Email notification. Define whether you want to receive notification
when the job is completed successfully and/or failed. You can override the default configurations.

Option | Description
Use Job Server defaults | Select to use the settings already defined in the Job Server.
Set values to be used here | The following options override those defined in the Job Server.
"From" | Provide the return email address.
"To" | Specify the email address of the alert subscriber.
"CC" | Specify which recipient(s) should receive carbon copies of alerts sent through email.
"Subject" | Specify the default subject heading used in emails containing system alerts.
"Message" | Specify the default message to include in emails containing system alerts.

5. In the CMC Home window, click Servers. Select the ISJobServer. If more than one is available,
configure each one.
6. Choose Manage > Properties, and then click Destination in the navigation pane.
7. Select Email from the Destination drop-down list and then click Add.
8. To set up a notification server for completed processing, you must enter information into the Domain,
Host, and Port fields. All other fields are optional. The following table describes the fields on the
Destination page:


Option | Description
Domain (required) | Enter the fully qualified domain of the SMTP server.
Host (required) | Enter the name of the SMTP server.
Port (required) | Enter the port that the SMTP server is listening on. (The standard SMTP port is 25.)
Authentication | Select Plain or Login if the job server must be authenticated using one of these methods in order to send email.
User Name | Provide the Job Server with a user name that has permission to send email and attachments through the SMTP server.
Password | Provide the Job Server with the password for the SMTP server.
From | Provide the return email address. Users can override this default when they schedule an object.
To | Specify the email address of the alert subscriber. Note: It is recommended that you keep %SI_EMAIL_ADDRESS%. If you specify a specific email address or recipient, all system alerts are sent to that address by default.
CC | Specify which recipient(s) should receive carbon copies of alerts sent through email.
Subject | Specify the default subject heading used in emails containing system alerts.
Message | Specify the default message to include in emails containing system alerts.
Add placeholder | You can add placeholder variables to the message body using the Add placeholder list. For example, you can add the report title, author, or the URL for the viewer in which you want the email recipient to view the report.
Add Attachment | Not used in this scenario.

9. Click Save.
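Before you save the destination, you can optionally confirm that the SMTP details you plan to enter (Domain, Host, Port, and the authentication account) are reachable. The following Python sketch is illustrative only and uses hypothetical values (host mail.example.com, port 25, a login account, and sample addresses); it is not an Information Steward component.

import smtplib

# Hypothetical check of the SMTP settings you plan to enter on the Destination page.
server = smtplib.SMTP("mail.example.com", 25)
server.login("notifier", "secret")   # only needed when Authentication is set to Plain or Login
server.sendmail(
    "is-admin@example.com",          # corresponds to the From field
    "steward@example.com",           # corresponds to the To field
    "Subject: SMTP check\n\nTest message from the notification host.",
)
server.quit()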

Related Topics
• Rule threshold notification


5.4.4 Rule threshold notification

The Data Insight Administrator or someone in the Administrator group can configure the server to
provide email notifications in the Central Management Console (CMC). A Data Insight user must have
already created the profile task or rule task on Information Steward (see “Creating a profile task” or
“Creating a rule task and setting an alert” in the User Guide).

When a calculate score task is created, you have the option to provide an email address for notification when the rule score falls below the low threshold setting. You must configure the notification server before processing the task so the server has the correct information to send in the alert. You'll receive an email when the task is complete. To configure the notification server to receive processing information:
1. In the CMC Home window, click Servers.
2. Select Server List and then highlight the AdaptiveProcessingServer. If you have more than one
available, you must configure all of them.
3. Choose Manage > Properties and then select Destination in the navigation pane.
4. Select Email from the Destination drop-down list and then click Add.
5. To set up a notification server for rules, you must enter information into the Domain, Host, Port and
From fields. All other fields are optional. The following table describes the fields on the Destination
page:

Option | Description
Domain (required) | Enter the fully qualified domain of the SMTP server.
Host (required) | Enter the name of the SMTP server.
Port (required) | Enter the port that the SMTP server is listening on. (The standard SMTP port is 25.)
Authentication | Select Plain or Login if the job server must be authenticated using one of these methods in order to send email.
User Name | Provide the Job Server with a user name that has permission to send email and attachments through the SMTP server.
Password | Provide the Job Server with the password for the SMTP server.
From (required) | Provide the return email address. Users can override this default when they schedule an object.
To | Not used in this scenario. The email address specified when creating the task is used.
CC | Not used in this scenario.
Subject | Not used in this scenario.
Message | Not used in this scenario.
Add placeholder | You can add placeholder variables to the message body using the Add placeholder list. For example, you can add the report title, author, or the URL for the viewer in which you want the email recipient to view the report.
Add Attachment | Not used in this scenario.

6. Click Save.

Related Topics
• Configuring for task completion notification

5.4.5 Monitoring a task

To check the status or progress of a profile task or rule task:


1. Log on to the Central Management Console (CMC) with a user name that belongs to the Data Insight Administrator group or the Administrator group.
2. At the CMC home page, click Information Steward.
The "Information Steward" page opens with the Information Steward node selected in the Tree
panel.
3. In the Tree panel, expand the Data Insight node, and then expand the Projects node.
4. Select the name of your project in the Tree panel.
A list of tasks appears in the right panel with the date and time each was last run.
5. To update the "Last Run Status" and "Last Run" columns, click the Refresh icon.
6. To view the history of a task, select its name and click Action > History in the top menu tool bar.
The "Data Insight Task history" pane idsplays each instance the task was executed with the status,
start time end time, and duration. The "Schedule Status" column can contain the following values:

Schedule Status | Description
Failed | Task did not complete successfully.
Pending | Task is scheduled to run one time. When it actually runs, the status changes to "Running."
Recurring | Task is scheduled to recur. When it actually runs, there will be another instance with status "Running."
Running | Task is currently executing.
Success | Task completed successfully.

7. To see the progress of a task instance:


a. Select the instance name and click the icon for View the database log in the top menu bar of
the "Data Insight Task History" page.
The "Database Log" window shows the task messages.
Note:
The "Database Log" shows a subset of the messages in the log file.
b. To find specific messages in the "Database Log" window, enter a string in the text box and click
Filter.
For example, you might enter error to see if there are any errors.
c. To close the "Database Log" window, click the X in the upper right corner.
8. To save a copy of a task log:
a. Scroll to the right of the "Data Insight Task History" page, and click the Download link in the "Log File" column in the row of the task instance you want.
b. Click Save.
c. On the "Save As" window, browse to the directory where you want to save the log and change
the default file name if you want.
Note:
This downloaded log file contains more messages than the "Database Log" because its default
logging level is set lower.

9. To close the "Data Insight Task History" page, click the X in the upper right corner.

Related Topics
• Scheduling a task
• Information Steward logs
• Log levels
• Changing log levels

5.4.6 Pausing and resuming a schedule


You pause a recurring schedule for a task when you do not want to run it at its regularly scheduled time
until you resume the schedule.

To pause a schedule:
1. Log on to the CMC with a user name that belongs to the Data Insight Administrator group or that
has the Create right on Projects in Information Steward.
2. At the CMC home page, click Information Steward.
The "Information Steward" page opens with the Information Steward node selected in the Tree
panel.
3. In the Tree panel, expand the Data Insight node, and then expand the Projects node.
4. Select the name of your project in the Tree panel.
5. Select the name of the task from the list on the right panel.
6. Click Action > History in the top menu tool bar.
A list of instances for the task appears in the right pane.
7. Select the task instance that has the value "Recurring" in the "Schedule Status" column and click Action > Pause in the top menu tool bar.
8. When you are ready to resume the recurring schedule, select the task instance that has the value "Paused" in the "Schedule Status" column and click Action > Resume in the top menu tool bar.

5.4.7 Common runtime parameters for Information Steward

When you schedule a profile task, rule task, or integrator source, you can change the default values of
the runtime parameters in the Parameters option when you schedule the instance. The following table
describes the runtime parameters that are applicable to all metadata integrators, profile tasks, and rule
tasks. For information about runtime parameters that apply to only specific metadata integrators, see
the topics in Related Topics below.

Runtime parameter | Description
Database Log Level | This log is in the SAP BusinessObjects Information Steward repository. You can view this log while a Data Insight task or Metadata Integrator is running. The default logging level is Information. Usually you can keep the default logging level. However, if you need to provide more detailed information about your integrator run, you can change the level to log tracing information. For a description of log levels, see "Log levels".
File Log Level | A Data Insight task or Metadata Integrator creates this log in the Business Objects installation directory and copies it to the File Repository Server. You can download this log file after the Data Insight task or Metadata Integrator run has completed. The default logging level for this log is Configuration. Usually you can keep the default logging level. However, if you need to debug your task or integrator run, you can change the level to log tracing information. For instructions to change log levels, see "Changing log levels".
JVM Arguments | The Information Steward Job Server creates a Java process to perform the profile task, rule task, or metadata collection. Use the JVM Arguments parameter to configure runtime parameters for the Java process. For example, if there are many parsed values per row in the data being used by Cleansing Package Builder, you might want to provide more memory than the default value for the -Xmx argument.
Additional Arguments | Optional runtime parameters for the metadata integrator source or Data Insight task. For more information, see "mm_admin_runtime_parm_boe_integ.dita#icc14.0.0_topic_45BF540E11CB43F78E2AB9FE71384A8E".
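As a hypothetical illustration of the JVM Arguments parameter, entering -Xmx2048m would raise the maximum Java heap for the process to 2 GB; the value that is appropriate for your environment depends on the data volume and the memory available on the Job Server host.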

Related Topics
• JVM runtime parameters

5.5 Configuration settings

The "Information Steward Settings" window in Central Management Console allows you to configure
data profiling and rule settings in Data Insight. These settings control the behavior and performance of
data profiling tasks and rule tasks. Some settings provide default values for parameters that you can
override when you define a profiling task or rule task on Data Insight.

Related Topics
• Configuring profiling tasks and rule tasks
• Profiling task settings and rule task settings


5.5.1 Profiling task settings and rule task settings

The following parameters in the "Information Steward Settings" window control the behavior of profiling
tasks, rule tasks, and their performance.
Note:
• The parameters Max input size, Max sample data size, and Optimization period affect the amount
of data to process. The more data to process, the more resources are required for efficient processing.
For performance considerations, see Input data settings.
• The parameters Max sample data size, Number of distinct values, Number of patterns, Number
of words, and Results retention period affect the size of the Information Steward repository. If
you increase these values, the repository size also increases and you might need to free space
more often. For more information, see Settings to control repository size.


Data Insight subcategory: "Profiling" (Parameter | Default value | Description)
Max input size | -1 | Maximum number of rows to profile. The default value -1 processes all records. If the tables you profile are very large and you want to reduce memory consumption, specify a maximum number of rows to profile.
Input sampling rate | 1 | Rate at which rows are read. Set 1 to read every row, set 2 to read every second row, and so on.
Max sample data size | 25 | Maximum number of records to save for each profile attribute.
Number of distinct values | 25 | Number of distinct values to store for the data distribution result.
Number of patterns | 25 | Number of patterns to store for the pattern distribution result.
Number of words | 25 | Number of words to store for the word distribution result.
Results retention period | 90 | Duration in days before the profiling results are deleted from the Information Steward repository by the Purge utility. For more information about the Purge utility, see Utilities overview. The default of 90 days is approximately 3 months.
Optimization period | 24 | Number of hours before profiling results will be replaced for a table by a scheduled task. Set to 0 to refresh results with every task run.
Address Reference Data directory | Directory location for the reference data for address profiling.

100 2011-04-06
Data Insight Administration

Data Insight sub- Default


Parameter Description
category value

Maximum number of rows to validate for


rules.
The default value -1 processes all
records.

If the tables you process are very large


Maximum input size -1
and you want to reduce memory consump-
tion, specify a maximum number of rows
to validate for rules.

For performance considerations, see In-


put data settings.

Number of rows to read before a row is


validated for rules.
Input sampling rate 1
For performance considerations, see In-
put data settings.

Maximum number of failed records to


Max sample data size 100
save for each rule.

"Rules" Number of days before the scores are


purged.
Score retention period 1095
The default of 1095 days is approximately
3 years.

Number of hours before a new score will


be calculated for a table by a scheduled
task. Set 0 to always refresh immediately.
Optimization period 1
For performance considerations, see
Data Insight result set optimization.

Default value for Low Threshold when


binding a rule and defining a scorecard.
Default low threshold 5
You can override this for an individual rule
or scorecard. The range is 0 to 10.

Default value for High Threshold when


binding a rule and defining a scorecard.
Default high threshold 8
You can override this for individual rule
or scorecard. The range is 0 to 10.

101 2011-04-06
Data Insight Administration

Data Insight sub- Default


Parameter Description
category value

Number of parallel processing threads for


Degree of parallelism 4
a task.

Average number of profile or rule tasks


that can execute in parallel in the Data
Services Job Server Group for Informa-
tion Steward.

Information Steward multiplies this value


times the number of Job Servers available
Average concurrent tasks 5 in the Job Server Group. . For example,
if Average concurrent tasks is set to 5
and the Job Server Group contains 2 job
servers, then 10 tables or sub-tables can
be processed in parallel.

"Performance" For more information, see Data Insight


related services .

Maximum number of threads for file pro-


File processing threads 2
cessing.

Directory location to use for caching if


profiling and rule tasks require more
memory.

Pageable cache directory Ensure that you specify a pageable cache


directory that: contains enough disk space
for the amount of data you plan to profile.
For more information, see Hard disk re-
quirements .

Task distribution level Table

102 2011-04-06
Data Insight Administration

Data Insight sub- Default


Parameter Description
category value

Level to distribute the execution of a pro-


file or rule task in a Data Services job
server group. The term “table” in this
context means “table, file, or view”.
Table: The entire table executes on one
job server.

Sub-table: The table is partitioned to ex-


ecute in parallel on multiple job servers.

Note:
If you specify Sub-table, ensure that:
• You set up a pageable cache directory
on a shared directory that is accessi-
ble to all job servers in the job server
group.
• The network is efficient between the
job servers in the group.
• The requirements for a Data Services
job server group are met.
For more information, see Distribution
level sub-table .
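For example, if the Job Server Group contains job servers on two hosts and you set Task distribution level to Sub-table, the Pageable cache directory could point to a shared location such as \\fileserver\is_pageable_cache (an illustrative UNC path) that both hosts can read and write.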

Related Topics
• Configuring profiling tasks and rule tasks

5.5.2 Configuring profiling tasks and rule tasks

1. Log on to the Central Management Console (CMC) with a user name that belongs to the Data Insight
Administrator group or Administrator group.
2. Select Applications from the navigation list at the top of the CMC Home page.
3. In the Application Name list, select the Information Steward application.
4. Click Actions > Configure Application to display the "Information Steward Settings" window.
5. Keep or change the parameter values listed in the "Information Steward Settings" window.
6. Click Save to apply the changed settings. Changes that affect the user interface will only be visible
to users once they log out and log back in.
To reset the settings to their default values, click Reset. To cancel the changes made to the settings,
click Cancel.


Related Topics
• Profiling task settings and rule task settings


Metadata Management Administration

6.1 Administration overview for Metadata Management

The Metadata Management module of SAP BusinessObjects Information Steward collects metadata
about objects from different source systems and stores the metadata in a repository. Source systems
include Business Intelligence (SAP BusinessObjects Enterprise and SAP NetWeaver Business
Warehouse), Data Modeling, Data Integration (SAP BusinessObjects Data Services and SAP
BusinessObjects Data Federator), and Relational Database systems.

When you access Information Steward as a user that belongs to the Metadata Management Administrator
group, you can perform the following tasks in the CMC:
• Configure integrator sources from which to collect metadata (see Configuring sources for Metadata
Integrators)
• Run Integrators to collect metadata (see Running a Metadata Integrator )
• View the status of Metadata Integrator runs (see Viewing integrator run progress and history )
• Organize Metadata Integrator Sources into groups for relationship analysis (see Grouping Metadata
Integrator sources )
• Manage user security of Metadata Integrator sources, source groups, and Metapedia (see Type-specific
rights for Metadata Management objects)
• Compute and store end-to-end impact and lineage information for Reporting (see Computing and
storing lineage information for reporting)
• Manage the Metadata Management search indexes (see Recreating search indexes on Metadata
Management)

6.1.1 Configuring sources for Metadata Integrators

SAP BusinessObjects Metadata Integrators collect metadata from repository sources that you configure,
and they populate the SAP BusinessObjects Information Steward repository with the collected metadata.

When you install Information Steward, you can select the following Metadata Integrators:
• SAP BusinessObjects Enterprise Metadata Integrator
• SAP NetWeaver Business Warehouse Metadata Integrator
• Common Warehouse Metamodel (CWM) Metadata Integrator
• SAP BusinessObjects Data Federator Metadata Integrator


• SAP BusinessObjects Data Services Metadata Integrator


• Relational Database Metadata Integrator
• Meta Integration Metadata Bridge (MIMB) Metadata Integrator - also known as MITI Integrator

6.1.1.1 Configuring sources for SAP BusinessObjects Enterprise Metadata


Integrator

This section describes how to configure the SAP BusinessObjects Enterprise Metadata Integrator for
the SAP BusinessObjects Enterprise repository, which is managed by the SAP BusinessObjects Central
Management Server (CMS). This Integrator collects metadata for Universes, Crystal Reports, Web
Intelligence documents, and Desktop Intelligence documents.

Note:
Ensure that you selected the BusinessObjects Enterprise Metadata Integrator when you installed SAP
BusinessObjects Information Steward.
To configure the BusinessObjects Enterprise Integrator, you must belong to the Metadata Management
Administrator group or have the Add Objects right on the Integrator Sources folder.
1. Log on to the Central Management Console (CMC) and access the Information Steward area.
2. Click Manage > New > Integrator Source in the top menu tool bar.
3. In the Integrator Type drop-down list, select BusinessObjects Enterprise.
4. On the "New Integrator Source" page, enter the following information.

• Name: Name that you want to use for this metadata integrator source. The maximum length of an integrator source name is 128 characters.
• Description: (Optional) Text to describe this metadata integrator source.
• CMS Server Name: Host name of the CMS (Central Management Server). This value is required. The CMS is responsible for maintaining a database of information about your BusinessObjects Enterprise system. The data stored by the CMS includes information about users and groups, security levels, BusinessObjects Enterprise content, and servers. For more information about the CMS, see the SAP BusinessObjects Business Intelligence Platform Administrator's Guide.
  Note: The version of BusinessObjects Enterprise installed on the Metadata Management host must match the version of BusinessObjects Enterprise that this CMS manages.
• User Name: The CMS user name to connect to the CMS server. The default value is Administrator. If you want a user other than Administrator to run the Metadata Integrator, change the value to the appropriate name.
• Password: The password to connect to the CMS server to register and run the Metadata Integrator.
• Authentication Method: The process that the CMS uses to verify the identity of a user who attempts to access the system. The default value is Enterprise. See the Business Intelligence Platform Administrator's Guide for available modes.
• InfoView Integration User: The name of the "BI Launchpad" (formerly known as InfoView) user to invoke the Information Steward lineage diagrams when View Lineage is selected for each document in the "Documents List" of "BI Launchpad". The default value is Administrator.
• Password: The password of the InfoView user to connect to Information Steward to display the lineage diagram for a document in InfoView.

5. If you want to verify that Information Steward can connect successfully before you save this source,
click Test connection.
6. Click Save.
The newly configured source appears in the list of Integrator Sources on the right of the Information
Steward page.

Related Topics
• Running a Metadata Integrator immediately
• Defining a schedule to run a Metadata Integrator
• Viewing integrator run progress and history

6.1.1.1.1 Checkpointing
SAP BusinessObjects Information Steward can run the SAP BusinessObjects Enterprise Metadata
Integrator for extended periods of time to collect large quantities of objects. If unexpected problems
occur during object collection, Information Steward automatically records warning, error, and failure
incidents in your log file for you to analyze later.

As additional failure management, Information Steward uses an automatic checkpointing mechanism
with preset "safe start" points to ensure that processing restarts from the nearest "safe start" point
(instead of from the beginning of the job). Regardless of the reason for the failure (power outage, accidental
shutdown, or some other incident), the next time you run the BusinessObjects Enterprise Metadata
Integrator, it restarts from the safe start point to finish object collection in the least amount of time.

6.1.1.2 Configuring sources for SAP NetWeaver Business Warehouse Metadata


Integrator

This section describes how to configure the Metadata Integrator for SAP NetWeaver Business
Warehouse.

Note:
Ensure that you selected the SAP NetWeaver Business Warehouse Metadata Integrator when you
installed SAP BusinessObjects Information Steward.
To configure an SAP NetWeaver Business Warehouse integrator source, you must have the Create
or Add permission on the integrator source.
1. Log on to the Central Management Console (CMC) and access the Information Steward area.
2. Click the down arrow next to Manage in the top menu tool bar and select New > Integrator Source.
3. In the Integrator Type drop-down list, select SAP NetWeaver Business Warehouse.
4. On the "New Integrator Source" page, enter the following information.

• Name: Name that you want to use for this integrator source. The maximum length of an integrator source name is 128 characters.
• Description: (Optional) Text to describe this source.
• Connection Type: One of the following connection types for this source: Custom Application Server or Group/Server Selection.
• Application Server: SAP Application Server host name when Connection Type is Custom Application Server.
• Message Server: SAP NetWeaver BW Message Server host name when Connection Type is Group/Server Selection.
• Group/Server: SAP group name when Connection Type is Group/Server Selection.
• Client: ID number for the SAP NetWeaver BW client.
• System Number: Number for the SAP NetWeaver BW system.
• SAProuter String: (Optional) String that contains the information required by SAProuter to set up a connection between the Metadata Integrator and the SAP NetWeaver BW system. The string contains the host name, the service port, and the password, if one was given.
• SAP User: Name of the user that will connect to the SAP NetWeaver BW system.
• SAP Password: Password for the user that will connect to the SAP NetWeaver BW system.
• Language: Language to use for the descriptions of SAP NetWeaver BW objects. Specify the 2-character ISO code for the language (for example, en for English).

5. To verify that Information Steward can connect successfully before you save this source, click Test
connection.
6. Click Save.
The newly configured source appears in the list of Integrator Sources on the right of the page.
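The SAProuter String option described above uses the standard SAP route-string syntax of /H/ (host) and /S/ (service or port) segments. For example, a route through a SAProuter host listening on the default port 3299 might look like the following (the host name is illustrative):
/H/saprouter.example.com/S/3299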

Related Topics
• SAP router string information: http://help.sap.com/saphelp_nw70ehp1/helpda
ta/en/4f/992df1446d11d189700000e8322d00/content.htm
• Running a Metadata Integrator immediately
• Accessing Information Steward for administrative tasks
• Defining a schedule to run a Metadata Integrator
• Viewing integrator run progress and history

6.1.1.3 Configuring sources for Common Warehouse Metamodel (CWM) Metadata


Integrator

This section describes how to configure the Metadata Integrator for Common Warehouse Metamodel
(CWM).

Note:
Ensure that you selected the CWM Metadata Integrator when you installed SAP BusinessObjects
Information Steward.


To configure the CWM Integrator, you must have the right to add objects in the Integrator Sources
folder.
1. Log on to the Central Management Console (CMC) and access the Information Steward area. For
details, see Accessing Information Steward for administrative tasks.
2. Click the down arrow next to "Manage" in the top menu tool bar and select New > Integrator Source.
3. In the Integrator Type drop-down list, select Common Warehouse Modeling.
4. On the "CWM Integrator Configuration" page, enter the following information.

• Source Name: Name that you want to use for this source. The maximum length of an integrator source name is 128 characters.
• Description: (Optional) Text to describe this source.
• File Name: Name of the file with the CWM content. For example: C:\data\cwm_export.xml. This value is required. The file should be accessible from the computer where the Metadata Management web browser is running. Click the Browse button to find the file.
  Note:
  Metadata Management copies this file to the Input File Repository Server on SAP BusinessObjects Business Intelligence Platform. Therefore, if the original file is subsequently updated, you must take the following steps to obtain the updates before you run the Integrator again:
  • Update the configuration to recopy the CWM file.
    a. From the Integrator Sources list, select the CWM integrator source name and click Action > Properties. The file name displays in the comments under the File Name text box, and the file name has "frs:" prefacing it.
    b. Browse to the original file again.
    c. Click Save.
  • Create a new schedule for the CWM integrator because the old schedule has a copy of the previous file.
    a. With the CWM integrator source name still selected in the Integrator list, click Action > Schedules.
    b. Select the Recurrence and Parameter options that you want.
    c. Click Schedule.

5. Click Save.
The newly configured source appears in the list of Integrator Sources on the right of the page.


Related Topics
• Running a Metadata Integrator immediately
• Defining a schedule to run a Metadata Integrator
• Viewing integrator run progress and history

6.1.1.4 Configuring sources for SAP BusinessObjects Data Federator Metadata


Integrator

This section describes how to configure the Metadata Integrator for SAP BusinessObjects Data Federator.
Note:
Ensure that you selected the SAP BusinessObjects Data Federator Metadata Integrator when you
installed SAP BusinessObjects Information Steward.
To configure an SAP BusinessObjects Data Federator integrator source, you must have the Create or
Add right on the integrator source.
1. Log on to the Central Management Console (CMC) and access the Information Steward area. For
details, see Accessing Information Steward for administrative tasks.
2. Click the down arrow next to Manage in the top menu tool bar and select New > Integrator Source.
3. In the Integrator Type drop-down list, select Data Federator.
4. On the "New Integrator Source" page, enter the following information.

• Name: Name that you want to use for this source. The maximum length of an integrator source name is 128 characters.
• Description: (Optional) Text to describe this source.
• DF Designer Server Address: Name or IP address of the computer where the Data Federator Designer resides. For example, if you installed the Data Federator Designer on the same computer as the Data Federator Integrator, type localhost.
• DF Designer Server Port: Port number for the Data Federator Designer. The default value is 3081.
• User name: Name of the user that will connect to the Data Federator Designer.
• Password: Password for the user that will connect to the Data Federator Designer.

5. If you want to verify that Metadata Management can connect successfully before you save this
source, click Test connection.
6. Click Save.
The newly configured source appears in the list of Integrator Sources on the right of the page.


Related Topics
• Running a Metadata Integrator immediately
• Defining a schedule to run a Metadata Integrator
• Viewing integrator run progress and history

6.1.1.5 Configuring sources for SAP BusinessObjects Data Services Metadata


Integrator

This section describes how to configure the Metadata Integrator for SAP BusinessObjects Data Services.
Note:
Ensure that you selected the SAP BusinessObjects Data Services Metadata Integrator when you
installed SAP BusinessObjects Information Steward.
To configure the Data Services Integrator, you must have the Create or Add right on the integrator
source.
1. Log on to the Central Management Console (CMC) and access the Information Steward area. For
details, see Accessing Information Steward for administrative tasks.
2. Click the down arrow next to Manage in the top menu tool bar and select New > Integrator Source.
3. In the Integrator Type drop-down list, select Data Services.
4. The "New Integrator Source" page displays the BusinessObjects Data Services options.
5. Enter the following Data Services information.

• Name: Name that you want to use for this source. The maximum length of an integrator source name is 128 characters.
• Description: (Optional) Text to describe this source.
• Database Type: The database type of the Data Services repository. The available database types are: DB2, Microsoft SQL Server, MySQL, Oracle, and Sybase.
• Computer Name: Name of the computer where the Data Services repository resides.
• Database Port Number: Port number of the database.
• Datasource, Database Name, or Service name: The name of the database, data source, or service name. Specify the following name for the database type of the Data Services repository:
  • DB2 - Data source name
  • Microsoft SQL Server - Database name
  • Oracle - SID/Service name
  • Sybase - Database name
• Database User: Name of the user that will connect to the Data Services repository.
• Database Password: The password for the user that will connect to the Data Services repository.

6. If you want to verify that Metadata Management can connect successfully before you save this
source, click Test Connection.
7. Click Save.
The newly configured source appears in the list of Integrator Sources on the right of the window.

Related Topics
• Running a Metadata Integrator immediately
• Defining a schedule to run a Metadata Integrator
• Viewing integrator run progress and history

6.1.1.6 Configuring sources for Meta Integration Metadata Bridge Metadata


Integrator

This section describes how to configure the Metadata Integrator for Meta Integration® Metadata Bridge
(MIMB). For a description of the objects collected by the MIMB Integrator, see the MIMB documentation
at http://www.metaintegration.net/Products/MIMB/Documentation/.

Note:
Ensure that you selected the Meta Integration Metadata Bridge (MIMB) Metadata Integrator when you
installed SAP BusinessObjects Information Steward.
To configure the MIMB Integrator, you must have the Create or Add right on the integrator source.
1. Log on to the Central Management Console (CMC) and access the Information Steward area. For
details, see Accessing Information Steward for administrative tasks.
2. Click the down arrow next to Manage in the top menu tool bar and select New > Integrator Source.
3. In the Integrator Type drop-down list, select Meta Integration Metadata Bridge.


4. On the "New Integrator Source" page, enter values for Name and Description. The maximum length
of an integrator source name is 128 characters.
5. In the Bridge drop-down list, select the type of integrator source from which you want to collect
metadata and follow the instructions on the user interface to configure the connection information.
6. Click Save.
The newly configured source appears in the list of Integrator Sources on the right of the page.

Related Topics

• Running a Metadata Integrator immediately
• Defining a schedule to run a Metadata Integrator
• Viewing integrator run progress and history

6.1.1.7 Configuring sources for Relational Database Metadata Integrator

This section describes how to configure and run the Metadata Integrator for a DB2, JDBC, Microsoft
SQL Server, MySQL, or Oracle relational database.

Note:
Ensure that you selected the Relational Database System Metadata Integrator when you installed SAP
BusinessObjects Information Steward.
To configure the Relational Database Integrator, you must have the right to add objects in the Integrator Sources folder.
1. Log on to the Central Management Console (CMC) and access the Information Steward area. For
details, see Accessing Information Steward for administrative tasks.
2. To access the "New Integrator Source" page, take one of the following actions:
• Click the left-most icon, "Create an Integrator source", in top menu bar.
• On the Manage menu, point to New and click Integrator Source.
The "New Integrator Source" page displays.
3. In the Integrator Type drop-down list, select Relational Database.
4. Specify the pertinent connection information for the relational database that you select in Connection
Type.

• Name: Name that you want to use for this source. The maximum length of an integrator source name is 128 characters.
• Description: (Optional) Text to describe this source.
• Connection Type: The type of database for which you want to collect metadata. Select one of the following database types:
  • Universe Connection for secure connections defined in the CMS. For options specific to a universe connection source, see Configuring sources for universe connections.
  • DB2
  • Microsoft SQL Server
  • MySQL
  • Oracle. For information to download and install a JDBC driver for an Oracle database, see "Extra requirements for Oracle" in the Installation Guide.
  • JDBC (Java Database Connectivity) for databases such as Teradata. For options specific to a JDBC source, see Configuring sources for JDBC connections.
• Connections: The name of the Central Management System (CMS) connection. You must select a value for Connections when Connection Type is set to Universe Connection. The drop-down list displays the secure connections defined in the CMS.
• Computer name: Host name on which the database server is running.
• Database port number: Port number of the database.
• Database Name, or Service name: The name of your DB2 database, Microsoft SQL Server database, MySQL database, or Oracle database service (SID).
• Database User: The name of the user or owner of the database or data source.
• Database Password: The password of the user for the database or data source.
• Table Schema: (Optional) Specify the name of the schema that you want to import from this source database. If you do not specify a schema name:
  • Metadata Management imports all available schemas for SQL Server or DB2.
  • Metadata Management uses the user name to import the schema for Oracle.
  Table Schema applies to the following connection types: DB2, Microsoft SQL Server, and Oracle.

5. Click Save.
The newly configured source appears in the list of Integrator Sources on the right of the page.

Related Topics
• Running a Metadata Integrator immediately
• Defining a schedule to run a Metadata Integrator
• Viewing integrator run progress and history

6.1.1.7.1 Configuring sources for JDBC connections


If you plan to use a JDBC source (such as Teradata) for the Relational Database Metadata Integrator,
do the following steps:
1. Obtain the JDBC driver from your database server web site or utilities CD.
2. Unzip the JDBC driver into a folder such as the following:
c:\temp\teradata

3. Log on to the Central Management Console (CMC) and access the Information Steward area. For
details, see Accessing Information Steward for administrative tasks.
4. Take one of the following actions to access the "New Integrator Source" page.
• Click the left-most icon, "Create an Integrator source", in top menu bar.
• On the Manage menu, point to New and click Integrator Source.
The "New Integrator Source" page displays.
5. In the Integrator Type drop-down list, select Relational Database.
6. Specify the following JDBC connection parameters:


• Name: Name that you want to use for this source.
• Description: (Optional) Text to describe this source.
• Connection Type: Select JDBC from the drop-down list.
• Driver: Name of the JDBC driver class that you obtained in step 1 above.
• URL: URL address that specifies the JDBC connection to the database.
• Catalog: (Optional) The name of the catalog in the Teradata database.
• Database User: Name of the user or owner of the database.
• Database Password: Password for the user of the database.
• Table Schema: (Optional) Specify the name of the schema that you want to import from this source database.
• Library files: The jar files, separated by semicolons. For example: c:\temp\teradata\tdgssjava.jar;c:\temp\teradata\terajdbc4.jar;c:\temp\teradata\tdgssconfig.jar
  Note:
  In a distributed deployment, you must set Library Files to the classpath on the computer where the integrator runs.

7. Click Test connection if you want to verify that Metadata Management can connect successfully
before you save this source.
8. Click Save.
The newly configured source appears in the list of Integrator Sources on the right of the page.
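Test connection verifies these values from within Information Steward. If you need to check a driver, URL, and credentials independently (for example, while troubleshooting a driver classpath problem), a small standalone test along these lines can help. This is only a sketch: the driver class name, URL, and credentials shown are illustrative assumptions for a Teradata source, and you must run it with the same jar files that you listed under Library files on the classpath.

import java.sql.Connection;
import java.sql.DriverManager;

public class JdbcSourceCheck {
    public static void main(String[] args) throws Exception {
        // Illustrative values; substitute the Driver, URL, Database User, and
        // Database Password that you entered for the integrator source.
        String driverClass = "com.teradata.jdbc.TeraDriver"; // assumption
        String url = "jdbc:teradata://dbhost.example.com";   // assumption
        String user = "dbuser";
        String password = "secret";

        // Load the driver class from the jars listed under Library files.
        Class.forName(driverClass);

        // If this call succeeds, the driver, URL, and credentials are usable.
        try (Connection connection = DriverManager.getConnection(url, user, password)) {
            System.out.println("Connected to: " + connection.getMetaData().getDatabaseProductName());
        }
    }
}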

Related Topics
• Configuring sources for Relational Database Metadata Integrator
• Running a Metadata Integrator immediately
• Defining a schedule to run a Metadata Integrator
• Viewing integrator run progress and history


6.1.1.7.2 Configuring sources for universe connections


The Relational Database Integrator can collect metadata from secured universe connections that use
JDBC and ODBC. For the most current list of supported universe connection types, refer to the Release
Notes.
To configure a universe connection source that uses a JDBC or ODBC connection:
1. If you want to configure a universe connection source that uses a JDBC connection, perform the
following steps:
a. Obtain the JDBC driver from your database server web site or utilities CD.
b. Unzip the JDBC driver into a folder such as the following:
c:\temp\teradata

2. If you want to configure a universe connection source that uses an ODBC connection, ensure that
the ODBC Datasource exists in the computer where the integrator will run.
3. Log on to the Central Management Console (CMC) and access the Information Steward area. For
details, see Accessing Information Steward for administrative tasks.
The "Information Steward" page opens with the Integrator Sources node selected in the tree on
the left.
4. Take one of the following actions to access the "New Integrator Source" page.
• Click the left-most icon, "Create an Integrator source", in top menu bar.
• On the Manage menu, point to New and click Integrator Source.
The "New Integrator Source" page displays.
5. In the Integrator Type drop-down list, select Relational Database.
6. Specify the following universe connection parameters:


• Name: Name that you want to use for this source.
• Description: (Optional) Text to describe this source.
• Connection Type: Select Universe Connection from the drop-down list.
• Connections: The name of the Central Management System (CMS) connection. The drop-down list displays the secure connections defined in the CMS.
• Table Schema: (Optional) Specify the name of the schema that you want to import from this source database. If you do not specify a schema name:
  • Metadata Management imports all available schemas for SQL Server or DB2.
  • Metadata Management uses the user name to import the schema for Oracle.
• Library Files: The full paths to the Java library files required by the universe connection, separated by semicolons (for example, the Teradata driver jar files shown for JDBC connections above).
  Note:
  In a distributed deployment, you must set Library Files to the classpath on the computer where the integrator runs.

7. Click Test connection if you want to verify that Metadata Management can connect successfully
before you save this source.
8. Click Save.
The newly configured source appears in the list of Integrator Sources on the right of the page.

Related Topics
• Configuring sources for Relational Database Metadata Integrator
• Running a Metadata Integrator immediately
• Defining a schedule to run a Metadata Integrator
• Viewing integrator run progress and history

6.1.2 Managing integrator sources and instances

You manage integrator sources and instances in the SAP BusinessObjects Information Steward area
of the CMC.

From the list of configured integrator sources, you can select an integrator source and perform a task
from Manage or Actions in the top menu tool bar.

You can perform the following tasks from the Manage menu.


• New: Create a new Integrator Source or Source Group.
• Security: Manage user security for Integrator Sources, Source Groups, or Metapedia objects.
• Refresh: Obtain the latest Integrator Sources information.
• Delete: Delete this source configuration (see Deleting an integrator source) and its associated schedules, source runs, and logs.
• Purge: Remove all integrator source runs. This option keeps the source configuration, file logs, and schedules.

You can perform the following tasks from the Actions menu.

• History: View the current and previous executions of this Metadata Integrator source (see Viewing integrator run progress and history).
• Properties: View and edit the configuration information for this Metadata Integrator source.
• Schedule: Run the Metadata Integrator at regular intervals (see Defining a schedule to run a Metadata Integrator).
• Run now: Run the Metadata Integrator immediately (see Running a Metadata Integrator immediately).

Related Topics
• Viewing and editing an integrator source
• Deleting an integrator source
• Changing log levels
• Changing limits

6.1.2.1 Viewing and editing an integrator source

You can view and modify the definition of an integrator source in its "Properties" dialog box to change
its description, connection information, and other pertinent information for the integrator source.
• To view the definition, you must have the right to View the integrator source.
• To modify the definition, you must have the right to Edit the integrator source.

1. From the Central Management Console (CMC) click Information Steward.


The "Information Steward" page opens with the Information Steward node selected in the Tree
panel.
2. Expand the "Metadata Management" node in the Tree panel and select Integrator Sources.
3. From the list in the right pane, select the name of the integrator source that you want.
Note:
If you click the integrator source type, you display the version and customer support information for
the integrator.

4. Access the "Properties" dialog box in one of the following ways:


• Double-click the row for the integrator source.
• Click Actions > Properties in the top menu tool bar.
5. You can change any property for the integrator source, except its type and name.
6. To verify the database connection information on this "Integrator Source Properties" dialog box,
click Test connection.
7. Click Save to save your changes to the configuration.
8. To change parameters such as Log Level or Update Option (BusinessObjects Enterprise Integrator
only), expand the Schedule node in the tree on the left and click Parameters.

Related Topics
• Defining a schedule to run a Metadata Integrator

6.1.2.2 Deleting an integrator source

You might want to delete an integrator source in situations such as the following:
• You want to rename your integrator source
Note:
If you rename your integrator source, you lose all the previously collected metadata.
• You no longer need your integrator source
To delete an integrator source, you must belong to the Metadata Management Administrator user group
or have the right to Delete the integrator source.
1. From the Central Management Console (CMC) click Information Steward.
The "Information Steward" page opens with the Information Steward node selected in the Tree
panel.
2. Expand the Metadata Management node.
3. Select the Integrator Sources node.
A list of configured integrator sources appears in the right panel with the date and time each was
last run.


4. Select the integrator source and click Manage > Delete in the top menu tool bar.
Note:
If you delete an integrator source, you also delete the metadata from that source that was stored in
the Metadata Management repository.

5. Reply to the confirmation prompt.

6.1.2.3 Changing limits

Each time you run a metadata integrator or SAP BusinessObjects Information Steward utility, SAP
BusinessObjects Information Steward creates a new instance and log files for it. By default, the maximum
number of instances to keep is 100. When this maximum number is exceeded, SAP BusinessObjects
Enterprise deletes the oldest instance and its associated log file.

Note:
The Purge utility deletes the database log in the Metadata Management repository for each instance
that was deleted.
To change the limits to delete integrator source instances, you must have Full Control access level
on the Metadata Management folder.
1. Log on to the CMC with a user name that belongs to the Metadata Management Administrator or
Administrator user group.
2. From the Central Management Console (CMC) click Information Steward.
The "Information Steward" page opens with the Information Steward node selected in the Tree
panel.
3. Expand the Metadata Management node.
4. Click Actions > Limits in the top menu bar.
The "Limits: Metadata Management" window appears.
5. If you want to change the default maximum number of instances to keep (100):
a. Select the check box for the option Delete excess instances when there are more than N
instances.
b. Enter a new number in the box under this option.
c. Click Update to save your changes.
6. If you want to specify a maximum number of instances to keep for a specific user or group:
a. Click the Add button next to Delete excess instances for the following users/groups.
b. Select the user or group name from the "Available users/groups" pane and click >.
c. Click OK.
d. If you want to change the default maximum number of instances to keep (100), type a new number
under Maximum instance count per object per user.
e. Click Update to save your changes.
7. If you want to specify a maximum number of days to keep instances for a specific user or group:


a. Click the Add button next to Delete instances after N days for the following users or groups
.
b. Select the user or group name from the "Available users/groups" pane and click >.
c. Click OK.
d. If you want to change the maximum number of instances to keep (default value 100), type a new
number under Maximum instance count per object per user.
e. Click Update to save your changes.
8. To close the "Limits: Metadata Management" window, click the X in the upper right corner.

6.1.3 Running a Metadata Integrator

Run the Metadata Integrator to collect the metadata for each source that you configured. When you
select Integrator Sources under the "Metadata Management" node in the tree panel on the left of the
SAP BusinessObjects Information Steward page in the CMC, all configured integrator sources appear.
When you select an integrator source from the list in the right pane, you can run it immediately or
define a schedule to run it.

Related Topics
• Running a Metadata Integrator immediately
• Defining a schedule to run a Metadata Integrator

6.1.3.1 Running a Metadata Integrator immediately

To run a Metadata Integrator immediately:


1. Log on to the CMC with a user name that belongs to the Metadata Management Administrator or
Administrator user group.
2. From the Central Management Console (CMC) click Information Steward.
The "Information Steward" page opens with the Information Steward node selected in the Tree
panel.
3. Expand the Metadata Management node and click Integrator Sources.
4. From the list of configured sources that appears on the right, select the integrator source that you
want by clicking anywhere on the row except its type.
Note:
If you click the integrator source type, you display the version and customer support information for
the integrator. If you double-click the row, you open the "Properties" dialog box for the integrator
source.


5. Click Actions > Run Now in the top menu tool bar.
Tip:
You can also click the icon "Run selected object(s) now" in the icon bar under Manage and Actions.

6. To view the progress of the integrator run, select the integrator source, and click Action > History.
Tip:
If you select Now in the Run object option under Action > Schedule > Recurrence and click
Schedule, the "Integrator History" page automatically displays.

7. Click the Refresh icon to update the status.


For more details about the "Integrator History" page, see Viewing integrator run progress and history
.
8. If you use impact and lineage reports on the Reports option in the Open drop-down menu in the
"Metadata Management" tab of Information Steward, you must recompute the contents of the lineage
staging table to incorporate changes from the Integrator runs. For more information, see Computing
and storing lineage information for reporting.

6.1.3.2 Defining a schedule to run a Metadata Integrator

To run a Metadata Integrator at regular intervals, define a schedule for it.

To define a schedule for an integrator source, you must have the right to Schedule the integrator source.
1. From the Central Management Console (CMC) click Information Steward.
The "Information Steward" page opens with the Information Steward node selected in the Tree
panel.
2. Expand the Metadata Management node.
3. Select the Integrator Sources node.
A list of configured integrator sources appears in the right panel with the date and time each was
last run.
4. From the list of configured sources that appears on the right, select the source from which you want
to collect metadata by clicking anywhere on the row except its type.
Note:
If you click the source type, you display the version and customer support information for the metadata
integrator.

5. Click Actions > Schedule.


The "Instance Title" pane of the "Schedule" page appears with the name of the configured source.
6. If you do not want the default value for Instance Title, change it to a unique name that describes
this schedule.
7. Select the Recurrence node on the left to choose the frequency in the Run object drop-down list.


8. Choose the additional relevant values for the selected recurrence option. For details, see Recurrence
options.
9. If you want to send notification when the integrator has run, select the Notification node on the left.
For more information about Notification, see the SAP BusinessObjects Business Intelligence
Platform Administrator Guide.
10. If you want to trigger the execution of a Metadata Integrator when an event occurs, select the Events
node on the left. For more information about Events, see the SAP BusinessObjects Business
Intelligence Platform Administrator Guide.
11. Select the Parameters node on the left to change the default values for run-time parameters for the
metadata integrator. For details, see Common run-time parameters for metadata integrators.
12. Click Schedule.
13. If you use impact and lineage reports on the Open > Reports option on Metadata Management tab
in Information Steward, you must recompute the contents of the lineage staging table to incorporate
changes from the Integrator runs. Similar to setting up a regular schedule to run an Integrator, you
can set up a schedule to compute the lineage staging table at regular intervals. For more information,
see Computing and storing lineage information for reporting.

6.1.4 Changing run-time parameters for integrator sources

When you schedule an integrator source, you can change the default values of the run-time parameters
on the "Parameters" page in the Central Management Console (CMC). The following sections describe
the runtime parameters that you can set for different integrator sources.

To change the run-time parameters for an integrator source:


1. From the Central Management Console (CMC) click Information Steward.
2. Expand the "Metadata Management" node.
3. Select the Integrator Sources node, and select your integrator source.
4. Click Actions > Schedule.
5. On the "Schedule" window, click Parameters in the navigation tree.
6. In the text box for JVM Arguments or Additional Arguments, enter the parameter and value you
want to change.
7. Click Recurrence in the navigation tree to schedule this integrator source. For more information,
see Recurrence options.
8. Click Schedule.

6.1.4.1 Common run-time parameters for metadata integrators


The following table describes the run-time parameters that are applicable to all metadata integrators.
For information about run-time parameters that apply to only specific metadata integrators, see the
topics in Related Topics below.

• Database Log Level: This log is in the SAP BusinessObjects Metadata Management Repository. You can view this log while the Metadata Integrator is running. The default logging level is Information. Usually you can keep the default logging level. However, if you need to provide more detailed information about your integrator run, you can change the level to log tracing information. For a description of log levels, see Log levels.
• File Log Level: The Metadata Integrator creates this log in the SAP BusinessObjects installation directory and copies it to the File Repository Server. You can download this log file after the Metadata Integrator run has completed. The default logging level for this log is Configuration. Usually you can keep the default logging level. However, if you need to debug your integrator run, you can change the level to log tracing information. For a description of log levels, see Log levels.
• JVM Arguments: The Metadata Management Job Server creates a Java process to perform the metadata collection. Use the JVM Arguments parameter to configure run-time parameters for the Java process. For example, if the metadata source is very large, you might want to provide more memory than the default.
• Additional Arguments: Optional run-time parameters for the metadata integrator source. For more information, see User collection parameters.

Related Topics
• Metadata collection using the Remote Job Server with SSL

6.1.4.2 Run-time parameters for SAP BusinessObjects Enterprise integrator


source

6.1.4.2.1 User collection parameters


The SAP BusinessObjects Enterprise metadata integrator provides the following run-time parameter
to adjust memory usage when collecting user permissions.


• collectUserPermissions (default value: false): Enable or disable user permissions collection. Specify true to enable user permissions collection.
  Note:
  If you specify true, increase the memory with the -Xmx parameter in the JVM Arguments. If the amount of memory is available, you can set this value as high as -Xmx1500m, where "m" indicates megabytes.
  Set this parameter in Additional Arguments on the "Parameters" page.

6.1.4.2.2 Selective CMS object collection


Selective collection of CMS metadata through the SAP BusinessObjects Enterprise Metadata Integrator
reduces processing time. To view a complete picture of the information in your BusinessObjects
Enterprise system, run the integrator multiple times, specifying a different component for each run.
Also, you might want to incrementally collect CMS metadata for a large SAP BusinessObjects Enterprise
deployment by using selective collection.

To add metadata to the previous metadata collections, select the Update existing objects and add
newly selected objects option. For example, if you collected Web Intelligence documents on the first
run and you collect Crystal Reports on the second run, you will see the Web Intelligence documents
and Crystal Reports metadata together in the SAP BusinessObjects Metadata Management Explorer.
The first time you schedule and run a metadata collection for a specific object type, all metadata for
that object type is collected. Subsequent runs only collect changes since the last run.

To delete metadata from previous metadata collections, select the Delete existing objects before
starting object collection option. For example, if you have collected Web Intelligence documents on
a previous run, and then choose to collect Crystal Reports on the next run, you will see only Crystal
Reports metadata in the Metadata Management Explorer.

You can collect metadata from Universe, Public, or Personal folders. Anyone using the SAP
BusinessObjects Metadata Management Explorer can see the contents of these folders. However, you
must have the proper permissions to be able to run collections on the objects in these folders. In your
Personal folder, only you or an administrator can run a collection on those objects.

Note:
SAP BusinessObjects Enterprise Metadata Integrator does not collect metadata from Inboxes or
Categories.


• Folder Name Expression: Specifies the names of the folders that you want in the collection using a Java Regular Expression. For example:
  • an asterisk (*) means that all folders are collected.
  • folderName collects metadata within a specific folder. It will also collect metadata in any associated subfolders.
  • folderName* an asterisk at the end of a string value includes all folders with that string.
  • folderName|folderName a pipe (also called vertical bar separator) between folder names includes one or the other folder. For example, Sales|Finance means either the Sales or the Finance folder.
  • ^(?!folderName$) excludes the folder with this specific folder name.
  • ^(?!folderName) excludes any folders that start with this specific folder name.
  • (?!folderName) excludes any folders that have this specific folder name within the name.
• Collect Universes: Collect the SAP BusinessObjects Enterprise universe metadata in the folders specified in the Folder Name Expression option.
  Note:
  If you uncheck this option, and choose to collect any report that uses a universe, the integrator collects the universe metadata as well.
• Collect Web Intelligence Documents and source Universes: Collects Web Intelligence documents and source universes from Public and/or Personal folders.
• Collect Desktop Intelligence Documents and source Universes: Collects Desktop Intelligence documents and source universes from Public and/or Personal folders.
• Collect Crystal Reports and associated Universes: Collects Crystal Reports and associated universes from Public and/or Personal folders.
• Update Option: This option only appears for an SAP BusinessObjects Enterprise Metadata Integrator.
  • Delete existing objects before starting object collection: This choice is the default and collects all metadata for the integrator source.
  • Update existing objects and add newly selected objects: This choice only collects metadata about objects that are new or have changed since the last time the integrator source was run.
  Note:
  It is recommended that you specify Delete existing objects before starting object collection the first time you run the integrator source, but specify Update existing objects and add newly selected objects for subsequent runs.
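The Folder Name Expression values above are standard Java regular expressions, so you can try an expression against a few representative folder names before scheduling a collection. The sketch below assumes find()-style matching and uses illustrative folder names; it is not part of the integrator itself.

import java.util.regex.Pattern;

public class FolderNameExpressionCheck {
    public static void main(String[] args) {
        // Exclude any folder whose name starts with "Sales" (see the examples above).
        Pattern expression = Pattern.compile("^(?!Sales)");
        String[] folderNames = {"Sales", "Sales Reports", "Finance"};
        for (String name : folderNames) {
            boolean matched = expression.matcher(name).find();
            System.out.println(name + " -> " + (matched ? "collected" : "excluded"));
        }
    }
}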

6.1.4.2.3 Metadata collection using the Remote Job Server with SSL
You can configure an integrator source on SAP BusinessObjects Information Steward 4.0 to collect
metadata from an SAP BusinessObjects Enterprise XI 3.x system. The Information Steward installation
program installs a Remote Job Server component on the Enterprise XI 3.x system to enable this collection.
When you schedule an integrator source to collect metadata from an Enterprise XI 3.x system that has
SSL enabled between the Information Steward 4.0 and Remote Job Server systems, you must set the
businessobjects.migration run-time parameter.

Note:
You cannot run remote integrators if Federal Information Processing Standards (FIPS) mode is enabled
on Enterprise XI 3.x.
To set the run-time parameter for the SAP BusinessObjects Enterprise XI 3.x integrator source when
SSL is enabled:
1. From the Central Management Console (CMC) click Information Steward.
The "Information Steward" page opens with the Information Steward node selected in the Tree
panel.
2. Expand the Metadata Management node and click Integrator Sources.


3. From the list of configured sources that appears on the right, select the source by clicking anywhere
on the row except its type.
Note:
If you click the source type, you display the version and customer support information for the metadata
integrator.

4. Click Actions > Schedule.


5. On the "Schedule" window, click Parameters in the navigation tree.
6. In the text box for JVM Arguments, enter the following parameter:
-Dbusinessobjects.migration=on
7. Click Recurrence in the navigation tree to schedule this integrator source. For more information,
see Recurrence options.
8. Click Schedule.

6.1.4.3 Run-time parameters for an SAP NetWeaver BW integrator source

The SAP NetWeaver Business Warehouse metadata integrator provides run-time parameters to adjust
the number of threads to use when collecting metadata from the SAP system and to filter the queries
or workbooks to collect.

• Number of Threads: Specifies the number of threads to use when collecting metadata from the SAP system. You might want to increase the number of threads if your SAP NetWeaver BW system has a large number of objects and it has available work processes. The default value is 5. In a multiprocessor environment, you can increase this value. The number of threads is limited by the number of processors configured on the SAP NetWeaver BW server. If the number of threads is greater than the available processors, the thread is put on a queue and will be processed when a processor becomes available.
• Query Name Expression: Specifies the names of the queries that you want in the collection using a Java Regular Expression. For example:
  • * an asterisk (*) means that all queries are collected.
  • queryName* an asterisk at the end of a string value includes all queries with that string.
  • queryName|queryName a pipe (also called vertical bar separator) between query names includes one or the other query. For example, Sales|Finance means either the Sales or the Finance query.
  • ^(?!queryName$) excludes the query with this specific query name.
  • ^(?!queryName) excludes any queries that start with this specific query name.
  • (?!queryName) excludes any queries that have this specific query name within the name.
• Workbook Name Expression: Specifies the names of the workbooks that you want the integrator source to collect using a Java Regular Expression. Examples are similar to those for Query Name Expression.

6.1.5 Viewing integrator run progress and history

When you select the Integrator Sources in the Tree panel on the left on the SAP BusinessObjects
Information Steward page in the CMC, all configured integrator sources display. Next to each source
name is the date and time when the Integrator was last run for that source.

To view all runs for a metadata integrator source, you must have the right to View integrator sources:
1. From the list of all configured integrator sources, select the name of the source that you want to see
the history of runs.
2. Click the down arrow next to "Actions" in the top menu tool bar and select History from the drop-down
list.
3. The "Integrator History" page displays the following information:
• All "Schedule Names" for the integrator source.


• Status of each schedule. The possible values are Success, Failed, Running, Paused, Resumed,
and Stopped.
• "Start Time", "End time", and "Duration" of each integrator run.
• "Log File" for that integrator run.
4. Click the "View the database log" icon (fifth from the left) in the menu bar or click Actions > Database
Log to view the progress messages of the integrator run. For details on logs, see Information Steward
logs
By default, Metadata Management writes high-level messages (such as number of universe processed
and number of reports processed) to the log. You can change the message level on the configuration
page for the integrator source. For details, see Changing log levels.
5. In the top menu bar of the "Integrator History" page, use either the Actions drop-down list or the
icons to perform any of the following options:

• Run Now: Run this integrator immediately.
• Stop: Stop the execution of this integrator instance if it is currently running.
• Pause: Pause the execution of this integrator instance if it has a status of Pending or Recurring.
• Resume: Resume the execution of this integrator instance if it is currently paused.
• Reschedule: Define a new schedule to execute this integrator.
• Database log: View messages that indicate which metadata objects have been collected from the integrator source.
• Refresh: Refresh the status of this integrator instance.
• Delete: Delete this integrator instance from the history list.

6.1.6 Troubleshooting

The following sections tell you how to interpret the warning and error messages that you might see in
the database log and file log for each Metadata Integrator run.

6.1.6.1 Crystal Report message

Crystal Report [reportname]. Unable to find class for universe object [objectname]. Universe class cannot be uniquely identified for object [objectname]. Data association cannot be established directly through the object. Reference will be established directly to the column.

Cause: The BusinessObjects Enterprise Metadata Integrator cannot uniquely identify a universe object
that is used to create a Crystal Report when the object has the same name as another object in a
different universe class. Therefore, the Integrator cannot establish the correct relationship between the
universe object and the report. However, the Integrator can establish a relationship between the source
table or column and the report because the SQL parser can find the source column used by the universe
object.

Action: Name the objects uniquely across the classes in a universe.

6.1.6.2 DeskI document error

DeskI document [reportname]. Universe [ ] not found.


Cause: Data providers in the Desktop Intelligence document refer to an invalid or non-existent universe.

Action: Open the Desktop Intelligence document and edit the data provider to specify a valid universe.

6.1.6.3 Out of memory error

Error occurred during initialization of VM


Could not reserve enough space for object heap

Cause: The SAP BusinessObjects Enterprise Metadata Integrator does not have enough memory to
run.

Action: Decrease the value of the MaxPermSize run-time parameter in the JVM Arguments on the
"Parameters" page when you schedule the integrator source. For example, enter
-XX:MaxPermSize=256m and rerun the metadata integrator.

6.1.6.4 Parsing failure

Parsing failure. Unable to find columns for SQL with select *.


Cause: The BusinessObjects Enterprise Metadata Integrator cannot collect the metadata for a universe
derived table if the SQL used in the derived table is of the form SELECT * FROM TABLE.


Action: Always use fully qualified column names in the projection list of the SELECT clause, for example SELECT CUSTOMER.ID, CUSTOMER.NAME FROM CUSTOMER instead of SELECT * FROM CUSTOMER.

6.1.6.5 Parsing failure for derived table

Parsing failure for derived Table <table_name>. Unable to find table associated with column <column_name>

Cause: The SQL parser in the BusinessObjects Enterprise Metadata Integrator requires column names
in a derived table to be qualified by the table name: table_name.column_name. If you do not qualify
the column name, the Metadata Integrator cannot associate the column to the correct table.

Action: Fully qualify the column reference (for example, write ORDERS.CUSTOMER_ID rather than CUSTOMER_ID in the derived table SQL) or add the tables used by the derived tables to the universe. The Metadata Integrator treats the universe tables as a system catalog to find the table and column references.

6.1.6.6 Unable to parse SQL

Unable to parse SQL ... <error message>


Cause: The SQL parser in Metadata Management has limited parsing capabilities to extract column
names and table names to build the relationships. For example, if a Metadata Integrator fails to parse
the SQL for a view, it cannot build the source-target relationship between the view and the table or
tables upon which the view is based.

However, the Metadata Integrators collect the SQL statement and the Metadata Management Explorer
displays it.

Action: Analyze the SQL statement in the Metadata Management Explorer and establish a user-defined
relationship for these tables and columns.

6.1.6.7 Unable to retrieve SQL to parse

Unable to retrieve SQL to parse. <error message>


The cause of this warning message can be one of the following:

Cause: You do not have sufficient privilege to extract metadata about Web Intelligence documents.

Action: Take one of the following actions:

• Have your administrator change your security profile to give you permission to refresh Web Intelligence
documents.
• Run the Metadata Integrator with a different user id that has permission to refresh Web Intelligence
documents.

Cause: If a database connection is not configured for Trusted Authentication in SAP BusinessObjects
Business Intelligence Platform, you must supply the user id and password at runtime. If you try to collect
metadata for a report that uses a non-Trusted connection to the database, the report collection fails.

Action: Configure both your SAP BusinessObjects Business Intelligence Platform server and client to
enable Trusted Authentication. For details, see the SAP BusinessObjects Business Intelligence Platform
Administrator Guide.

Cause: The extraction of Web Intelligence documents fails if you create your Web Intelligence documents with the Refresh on Open option and the computer on which you run the BusinessObjects Enterprise Metadata Integrator does not have a connection to the source database on which the reports are defined.

Action: Take one of the following actions:


• Run the BusinessObjects Enterprise Metadata Integrator on the computer where BusinessObjects Enterprise is installed.
• Define the database connection on the computer where you run the BusinessObjects Enterprise Metadata Integrator.

6.1.6.8 Connection with Data Federator Designer

If the Data Federator Integrator connects successfully to the Data Federator Designer but Data Federator returns an error:
1. Log in to the Data Federator Designer.
2. Within the Data Federator Designer, obtain a more detailed error message.

6.1.7 Grouping Metadata Integrator sources

SAP BusinessObjects Information Steward lets you group metadata integrator sources into groups such as Development System, Test System, and Production System. After the groups are defined, you can view impact and lineage diagrams for a specific source group.

Related Topics
• Creating source groups
• Modifying source groups

6.1.7.1 Creating source groups

You must have the Add Objects right on the SAP BusinessObjects Information Steward Source group
folder to create a Metadata Integrator source group.
1. Log on to the Central Management Console (CMC) and access the Information Steward area.
2. Select the Source Groups node in the tree on the left of the page.
3. Access the "Source Group" window in one of the following ways:
• Click the second icon "Create a Source Group" in the menu bar on top.
• In the menu bar on top, click Manage > New > Source Group.
4. Define the configuration on the Source Group page.
a. Enter the Name and Description.
b. Select integrator sources to add to this source group by clicking the check box to the left of each
integrator source name.
c. Click Save.

The new source group name appears on the right side of the page.

6.1.7.2 Modifying source groups

You must have the Edit right on the Metadata Integrator source group.
1. Log on to the Central Management Console (CMC) and access the Information Steward area.
2. Select the Source Groups node in the tree on the left of the page.
3. Select the source group that you want to modify.
4. In the menu bar on top, click Actions > Properties.
5. You can change any of the following properties of the source group:
• Name
• Description
• Integrator sources that you want to remove or add to the source group.
6. Click on User Security to do any of the following tasks:
• Add principals to this source group.
• View security for a selected principal.
• Assign security for a selected principal.
7. Click Save.

6.1.7.3 Deleting source groups

You must have the Delete right on the Metadata Integrator source group.

1. Log on to the Central Management Console (CMC) and access the Information Steward area.
2. Select the Source Groups node in the tree on the left of the page.
3. Select the source group that you want to delete.
4. In the menu bar on top, click Manage > Delete.
5. Click OK to confirm the deletion.

Cleansing Package Builder Administration

7.1 Changing ownership of a cleansing package

Ownership can be reassigned only by an Information Steward administrator. You can change ownership
only for private cleansing packages. Published cleansing packages are either unowned or linked to private cleansing packages; therefore, when you change the ownership of a private cleansing package, the new owner can automatically republish to the linked cleansing package.

To reassign ownership of a cleansing package:


1. Log in to the Central Management Console (CMC) as an administrator.
2. At the CMC home page, click Information Steward.
3. In the left pane, expand the Cleansing Packages node and then click Private.
4. In the right pane, right-click the cleansing package whose owner you want to reassign. In the pop-up
menu, choose Properties to open the Properties window.
Other ways of accessing the Properties window:
• Choose Actions > Properties.
• Click the link in the "Name" or "Kind" column for the desired cleansing package.
5. In the Properties window, choose a different owner in the "Owner" drop-down list, and click Save.
The new owner is now assigned to this cleansing package.

7.2 Deleting a cleansing package

To delete a private or published cleansing package:


1. Log in to the Central Management Console (CMC) as an administrator.
2. At the CMC home page, click Information Steward.
3. In the left pane, expand the Cleansing Packages node and then click either Private or Published.
4. In the right pane, right-click the cleansing package that you want to delete. In the pop-up menu,
choose Manage > Delete.
5. In the confirmation window that opens, click OK.
The selected cleansing package is deleted.

7.3 Changing the description of a cleansing package

You can edit the description of published or private cleansing packages.

To edit the description:


1. Log in to the Central Management Console (CMC) as an administrator.
2. At the CMC home page, click Information Steward.
3. In the left pane, click Cleansing Packages and then click Private or Published.
4. In the right pane, right-click the cleansing package whose description you want to edit and select
Properties.
Other ways of accessing the Properties window:
• Choose Actions > Properties.
• Click the link in the "Name" or "Kind" column for the desired cleansing package.
5. In the Properties window, edit the description, and click Save.

7.4 Unlocking a cleansing package

The status of a cleansing package is displayed in the status bar of the "Cleansing Package Tasks"
screen and is also indicated by the cleansing package icon.

It may take some time for a cleansing package with a BUSY status to complete the operation and
change to a READY state. The state of a cleansing package with a BUSY status cannot be changed
by an Information Steward administrator. You can either wait for the operation to complete or delete
the cleansing package.

When a cleansing package is opened for editing, it enters a locked state so that no other user may edit
it. To close a cleansing package, you must either return to the "Cleansing Package Tasks" screen,
switch to another cleansing package, or log off from Cleansing Package Builder. If the browser window
is closed or the computer is shut down without logging off, the cleansing package may become locked.

A cleansing package may become locked when it is in any of the following states:
• OPEN_FOR_READWRITE
• CANCEL_AUTO_ANALYSIS
• CANCEL_PUBLISHING
• AUTO_ANALYSIS
• PUBLISHED
To unlock a cleansing package:

1. (Data steward) When you encounter a locked cleansing package in the "Cleansing Package Tasks" screen, ask your Information Steward administrator to unlock it.
2. (Information Steward administrator) Change the cleansing package state from LOCKED to ERROR.
a. Log in to the Central Management Console (CMC).
b. Select Information Steward.
c. Expand the "Cleansing Package" node and select Private or Published.
d. Right-click the desired cleansing package, and choose Properties.
e. Change the state to ERROR and click Save.
f. Notify the data steward that the cleansing package state is updated to ERROR.
The data steward must verify the condition of the cleansing package prior to further use.

3. (Data steward) When notified that the cleansing package is unlocked and moved to the ERROR state, do the following:
a. From the "Cleansing Package Tasks" screen, open the cleansing package.
b. Verify the condition of the cleansing package and that it displays information as expected.
c. Close the cleansing package.
If the cleansing package is returned to a READY state and its condition was as you expected,
you may use it.
If the cleansing package returns to the ERROR state or the condition was not as expected, the
cleansing package is corrupt and should be deleted.

Related Topics
• Cleansing package states and statuses

7.5 Cleansing package states and statuses

You can view cleansing package properties, including the state, in the following locations:
• In the "Cleansing Package Tasks" screen.
In Cleansing Package Builder, in the "Cleansing Package Tasks" screen, hover over the desired
cleansing package to display the properties sheet.
• In the Information Steward area of the Central Management Console (must have Information Steward
administrator privileges).
Log in to the Central Management Console (CMC). Select Information Steward. Expand the
"Cleansing Package" node and select Private or Published. Right-click the desired cleansing
package and choose Properties.

The table below describes the possible states for a cleansing package:

READY: Cleansing package is in good condition and available for editing or viewing.

CREATE: Cleansing package is being created.

OPEN_FOR_READ: Cleansing package is being browsed and is in view-only mode.

OPEN_FOR_READWRITE: Cleansing package is open for editing.

PUBLISHING: Cleansing package is in the process of being published. Wait for publishing to complete.

AUTO_ANALYSIS: The auto-analysis process is performed after a custom cleansing package is initially created.

ERROR: The ERROR state occurs in two situations: either 1) the cleansing package is possibly corrupt, or 2) the cleansing package was previously locked and an Information Steward administrator changed the state to ERROR in an attempt to make the cleansing package usable again. In either situation, a data steward must open the cleansing package, check the condition, and close the cleansing package. If the cleansing package is returned to a READY state, it may be used. If the cleansing package is still in an ERROR state, it is corrupt and should be deleted.

CANCEL_AUTO_ANALYSIS: Data steward canceled the auto-analysis process. It may take some time for the cleansing package to move to the READY state.

CANCEL_PUBLISHING: Data steward canceled the publishing process. It may take some time for the cleansing package to move to the READY state.

A cleansing package may have one of the following statuses: READY, BUSY, LOCKED, or ERROR. The status of a cleansing package is displayed in the status bar of the "Cleansing Package Tasks" screen and is also indicated by the cleansing package icon. The following table shows each status and the possible user actions:

READY: Open and edit or view the cleansing package.

BUSY: Wait for the process (auto-analysis or publishing) to complete, or cancel the process.

LOCKED: When a cleansing package is opened for editing, its status changes to LOCKED so that no other user may edit it.
• Ensure that you do not have the cleansing package open in a different browser window.
• Wait at least 20 minutes for Cleansing Package Builder to automatically close the cleansing package and restore it to a READY state.
• Contact your Information Steward administrator to unlock the cleansing package from the Central Management Console (CMC). Unlocking a cleansing package changes the cleansing package to an ERROR state and status. For more information, see Unlocking a cleansing package.

ERROR: Open the cleansing package and assess its condition. For more information, see Unlocking a cleansing package.

The state of a cleansing package with a BUSY status cannot be changed by an Information Steward
administrator. You can either wait for the operation to complete or delete the cleansing package.

A cleansing package with a LOCKED status can be unlocked by an Information Steward administrator.
Unlocking a cleansing package changes its status to ERROR. Before further use, the condition of the
cleansing package must be verified by a data steward.

Related Topics
• Unlocking a cleansing package

Information Steward Utilities

8.1 Utilities overview

SAP BusinessObjects Information Steward provides the following utilities that you manage on the CMC.

Calculate Scorecard
Description: Calculates scores of key data domains regularly for data quality scorecards.
Note: You can also recalculate scorecards immediately from the Data Insight tab in Information Steward when you select Now for the Show score as of option.
Configurable properties: None
Default schedule: Daily (You can change this default schedule.)

Compute Lineage Report
Description: Computes and stores end-to-end impact and lineage information across all integrator sources for the Reports option in the Open drop-down list in the Metadata Management module. Information Steward provides a configured Compute Lineage Report utility that computes impact and lineage for only the integrator sources that contain changes since the last time the computation was run. You can also create another configuration to compute impact and lineage across all integrator sources.
Configurable properties: Mode (See Modifying utility configurations)
Default schedule: None (You either create a schedule for a configured utility or run it immediately.)

Purge
Description: Increases disk space in the repository in the following ways:
1. Deletes database logs after integrator sources have been deleted.
2. Deletes profile results, scores, and sample data that have exceeded the configured retention period.
3. Purges logically deleted information, which includes sample data for profiling and rule tasks.
Note: If you want to delete profile results, scores, and sample data for an individual table or file before the retention period has been reached, see "Deleting profile results from the repository" in the User Guide.
Configurable properties: None
Default schedule: Daily (You can change this default schedule.)

Update Search Index
Description: Recreates Metadata Management search indexes. Information Steward provides a configured Update Search Index utility that rebuilds the search indexes across all integrator sources. You can create additional configurations to recreate search indexes for specific integrator sources.
Configurable properties: Integrator Source (See Modifying utility configurations)
Default schedule: None (You either create a schedule for a configured utility or run it immediately.)

Related Topics
• Computing and storing lineage information for reporting
• Recreating search indexes on Metadata Management
• Scheduling a utility
• Running a utility on demand
• Monitoring utility executions
• Modifying utility configurations
• Creating a utility configuration

8.1.1 Computing and storing lineage information for reporting

The SAP BusinessObjects Information Steward Repository provides a lineage staging table,
MMT_Alternate_Relationship, that consolidates end-to-end impact and lineage information across all
integrator sources. The Metadata Management module provides pre-defined Crystal Reports from this table. You can also create your own reports from this table. To appear under the Reports option in the Open drop-down list on the Metadata Management tab, reports must be Crystal Reports (see "Defining custom reports" in the Users Guide).

Before generating reports that rely on this lineage staging table, you should update the lineage information
in the lineage staging table. You can either schedule or run the Compute Lineage Report utility on
demand to ensure those reports contain the latest lineage information.
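
If you want to confirm that the Compute Lineage Report utility has populated the staging table before you build reports on it, a minimal JDBC sketch such as the following can help. This example is not part of Information Steward; the class name is arbitrary, and the JDBC URL, user name, and password are placeholders for your Information Steward repository database. Only the table name MMT_Alternate_Relationship comes from this guide.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class LineageStagingRowCount {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details: replace with the JDBC URL and
        // credentials of your Information Steward repository database.
        String url = "jdbc:yourdatabase://host:port/repository";
        try (Connection conn = DriverManager.getConnection(url, "user", "password");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "SELECT COUNT(*) FROM MMT_Alternate_Relationship")) {
            if (rs.next()) {
                System.out.println("Rows in lineage staging table: " + rs.getLong(1));
            }
        }
    }
}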

The following activities can change the lineage information, and it is recommended that you run the
lineage computation after any of these activities occur:
• Run an Integrator to collect metadata from a source system (see "Running a Metadata Integrator"
in the Metadata Management Administration section).
• Change preferences for relationships between objects (see "Changing preferences for relationships"
in the Users Guide). The data in the lineage staging table uses the values in Impact and Lineage
Preferences and Object Equivalency Rules to determine impact and lineage relationships across
different integrator sources.
• Establish or modify a user-defined relationship of type Impact or Same As (see "Establishing
user-defined relationships between objects" in the Users Guide).

Related Topics
• Scheduling a utility
• Running a utility on demand
• Monitoring utility executions
• Modifying utility configurations
• Creating a utility configuration

8.1.2 Recreating search indexes on Metadata Management

The search feature of the Metadata Management module of SAP BusinessObjects Information Steward
allows you to search for an object that might exist in any metadata integrator source. When you run a
metadata integrator source, Metadata Management updates the search index with any changed
metadata.

You might need to recreate the search indexes in situations such as the following:
• The Search Server was disabled and could not create the index while running a metadata integrator
source.
• The search index is corrupted.

Related Topics
• Scheduling a utility
• Running a utility on demand
• Monitoring utility executions
• Modifying utility configurations

8.2 Scheduling a utility

You would schedule a utility for reasons including the following:


• To change the default frequency that the Calculate Scorecard utility is run to generate rule results
for the data quality trend graphs in Data Insight.
• To change the default frequency that the Purge utility is run to increase space in the Information Steward repository.
• To schedule the Compute Lineage Report utility for Reports on Metadata Management.
• To schedule the Update Search Index utility in Metadata Management.

To define or modify a schedule for a utility:


1. Log in to the Central Management Console (CMC) with a user name that belongs to the Metadata
Management Administrator group or the Administrator group.
2. At the top of the CMC Home screen, select Applications from the navigation list.
3. Select Information Steward Application in the "Applications Name" list.
4. Click Action > Manage Utilities in the top menu tool bar.
5. From the list of "Utility Configurations", select the name of the utility configuration that you want to
schedule.
6. In the top menu tool bar, click Actions > Schedule.
7. If you do not want the default value for Instance Title:
a. Click Instance Title in the navigation tree in the left pane of the "Schedule" window.
b. Change the title to a value you want.
8. To define the frequency to execute this utility:
a. Click Recurrence in the navigation tree in the left pane of the "Schedule" window.
b. Select the frequency in the Run object drop-down list.
c. Select the additional relevant values for the recurrence option.
For a list of the recurrence options and the additional values, see Recurrence options.

9. Optionally, set the Number of retries to a value other than the default 0 and change the Retry
interval in seconds from the default value 1800.
10. If you want to be notified when this utility runs successfully or when it fails, expand Notification,
and fill in the appropriate information. For more information about Notification, see the SAP
BusinessObjects Business Intelligence Platform Administrator Guide.
11. If you want to trigger the execution of this utility when an event occurs, expand Events, and fill in
the appropriate information. For more information about Events, see the SAP BusinessObjects
Business Intelligence Platform Administrator Guide.
12. Click Schedule.
13. If you want this newly created schedule to override the default recurring schedule for the Purge or
Calculate Scorecard utility, delete the old recurring instance.

a. From the list of "Utility Configurations", select the name of the utility whose schedule you want
to delete.
b. Click Actions > History.
c. Select the recurring schedule that you want to delete and click the delete icon in the menu bar.
Note:
To change the recurring schedule directly (instead of creating a new and deleting the old one), see
Rescheduling a utility.

Related Topics
• Computing and storing lineage information for reporting
• Recreating search indexes on Metadata Management
• Running a utility on demand
• Monitoring utility executions
• Modifying utility configurations
• Creating a utility configuration

8.3 Rescheduling a utility

You would reschedule a utility for reasons including the following:


• Change the default frequency that the Calculate Scorecard utility is run to generate rule results for
the data quality trend graphs in Data Insight.
• Change the default frequency that the Purge utility is run to increase space in the Information Steward repository.
• If you set up a recurring schedule for the Compute Lineage Report utility and you want to change the schedule to compute the lineage information for Reports on Metadata Management.
• If you set up a recurring schedule for the Update Search Index utility and you want to change the schedule to rebuild the search indexes in Metadata Management.

To reschedule a utility:
1. Log in to the Central Management Console (CMC) with a user name that belongs to the Metadata Management Administrator group or the Administrator group.
2. At the top of the CMC Home screen, select Applications from the navigation list.
3. Select Information Steward Application in the "Applications Name" list.
4. Click Action > Manage Utilities in the top menu tool bar.
5. From the list of "Utility Configurations", select the name of the utility configuration that you want to
reschedule.
6. In the top menu tool bar, click Actions > History.
7. On the "Utility History" screen, select the schedule name with that has a schedule status of "Recurring"
and click Reschedule in the top menu bar.
a. Click Recurrence in the navigation tree in the left pane of the "Reschedule" window.

b. Select the frequency in the Run object drop-down list.


c. Select the additional relevant values for the recurrence option.
For a list of the recurrence options and the additional values, see Recurrence options.
d. If you want to provide a different name for this schedule, click Instance Title in the navigation
tree and enter the name.
8. Click Schedule.
The newly created "Recurring" schedule appears on the "Utility History" screen.
9. Delete the original "Recurring" schedule:
a. From the list on the "Utility History" window, select the original "Recurring" schedule.
b. Click the Delete icon.

8.4 Running a utility on demand

To run an Information Steward utility on demand:


1. Log in to the Central Management Console (CMC) with a user name that belongs to one or more of the following administration groups:
• Data Insight Administrator
• Metadata Management Administrator
• Administrator
2. At the top of the CMC Home screen, select Applications from the navigation list.
3. Select Information Steward Application in the "Applications Name" list.
4. Click Action > Manage Utilities in the top menu tool bar.
The list of configured utilities displays with the date and time each was last run.
5. From the "Utility Configurations" list, select the name of the utility you want to run.
6. In the top menu tool bar, click Action > Run Now.
Caution:
If an instance of the lineage report utility is still running, do not start another lineage report utility run.
Starting another utility run might cause deadlocks or delays. The same behavior might occur if you
stop the utility and start another instance right away because the process might still be running in
the repository.

7. On the "Utility Configurations" screen, click the Refresh icon to update the "Last Run" column for
the utility configuration.

Related Topics
• Computing and storing lineage information for reporting
• Recreating search indexes on Metadata Management
• Scheduling a utility
• Monitoring utility executions
• Modifying utility configurations
• Creating a utility configuration

8.5 Monitoring utility executions

To look at the status or progress of a utility run:


1. Log in to the Central Management Console (CMC) with a user name that belongs to one or more of the following administration groups:
• Data Insight Administrator
• Metadata Management Administrator
• Administrator
2. At the top of the CMC Home screen, select Applications from the navigation list.
3. Select Information Steward Application in the "Applications Name" list.
4. Click Action > Manage Utilities in the top menu tool bar.
The list of configured utilities displays with the date and time each was last run.
5. Click the Refresh icon to update the "Last Run" column on the "Utility Configurations" screen.
6. To view the status of the utility run, select the utility configuration name and click Action > History.
The "Schedule Status" column can contain the following values:

Failed: The utility did not complete successfully.
Pending: The utility is scheduled to run one time. When it actually runs, there will be another instance with status "Running."
Recurring: The utility is scheduled to recur. When it actually runs, there will be another instance with status "Running."
Running: The utility is currently executing.
Success: The utility completed successfully.

7. To see the progress of a utility instance:


a. Select the instance name and click the icon for View the database log in the top menu bar of
the "Utility History" screen.
The "Database Log" window shows the utility messages.
b. To find specific messages in the "Database Log" window, enter a string in the text box and click
Filter.
For example, you might enter error to see if there are any errors.

c. To close the "Database Log" window, click the X in the upper right corner.
8. To save a copy of a utility log:
a. Scroll to the right of the "Utility History" screen, and click the Download link in the "Log File"
column in the row of the utility instance you want.
b. Click Save.
c. On the "Save As" window, browse to the directory where you want to save the log and optionally
change the default file name.
9. To close the "Utility History" screen, click the X in the upper right corner.

Related Topics
• Computing and storing lineage information for reporting
• Recreating search indexes on Metadata Management
• Scheduling a utility
• Running a utility on demand
• Modifying utility configurations
• Creating a utility configuration

8.6 Modifying utility configurations

SAP BusinessObjects Information Steward provides a default configuration for each of the utilities. You
can modify the configuration settings for the following utilities:
• Compute Lineage Report utility
The default configuration for the Compute Lineage Report utility has Mode set to Optimized which
recalculates lineage information in the lineage staging table for only integrator sources that have
changed since the utility was last run. You might want to set Mode to Full to recalculate lineage
information across all integrator sources.
• Update Search Index utility
The default configuration for the Update Search Index utility has Integrator Source set to All
Sources, which rebuilds the search indexes in Metadata Management for all integrator sources.
You might want to set Integrator Source to a specific integrator source to rebuild search indexes
for the metadata collected for only that integrator source.

Note:
The Calculate Scorecard and Purge utilities do not have configuration parameters.

To change the configuration of a utility:


1. Log in to the Central Management Console (CMC) with a user name that belongs to the Metadata Management Administrator group or the Administrator group.
2. Select Applications from the navigation list at the top of the CMC Home screen.
3. Select Information Steward in the "Applications Name" list.

4. Click Action > Manage Utilities in the top menu tool bar.
5. Select the utility whose configuration you want to change.
6. Click Actions > Properties in the top menu tool bar.
The "Utilities Configurations" screen appears.
7. For a Compute Lineage Report utility, you can change the following parameters:
• Description
• Mode
Mode can be set to one of the following values:
• Full mode recalculates all impact and lineage information and repopulates the entire lineage
staging table.
Note:
If you select Full mode, the computation can take a long time to run because it recalculates
impact and lineage information across all integrator sources.
• Optimized mode (the default) recalculates impact and lineage information for only the integrator sources that contain changes since the last time the computation was run.
For example, if only one Integrator was run, the computation only recalculates impact and
lineage information corresponding to that integrator source and updates the lineage staging
table.

8. For an Update Search Index utility, you can change the following parameters:
• Description
• Integrator Source
• All Sources recreates the search index for all integrator sources that you have configured.
• The specific name of an integrator source that appears in the drop-down list.

9. Click Save.

Related Topics
• Computing and storing lineage information for reporting
• Recreating search indexes on Metadata Management
• Scheduling a utility
• Running a utility on demand
• Monitoring utility executions
• Creating a utility configuration

8.7 Creating a utility configuration

SAP BusinessObjects Information Steward provides a default configuration for each of the utilities. You
can define another configuration with different settings and still keep the default configuration for the
following utilities:

• Compute Lineage Report utility


The default configuration for the Compute Lineage Report utility has Mode set to Optimized which
recalculates lineage information in the lineage staging table for only integrator sources that have
changed since the utility was last run. You might want to configure another Compute Lineage Report
utility with Mode set to Full to recalculate lineage information across all integrator sources.
• Update Search Index utility
The default configuration for the Update Search Index utility has Integrator Source set to All
Sources which rebuilds the search indexes in Metadata Management for all integrator sources. You
might want to configure another Update Search Index utility with Integrator Source set to only one
of the integrator sources to rebuild indexes for the metadata collected for only that integrator source.

Note:
You cannot define another configuration for the Calculate Scorecard and Purge utilities.

To define a new configuration for a utility:


1. Log in to the Central Management Console (CMC) with a user name that belongs to any of the following groups:
• Metadata Management Administrator
• Administrator
2. Select Applications from the navigation list at the top of the CMC Home screen.
3. Select Information Steward in the "Applications Name" list.
4. Click Action > Manage Utilities in the top menu tool bar.
5. On the "Utilities Configurations" screen, click Manage > New Utility Configuration... in the top
menu tool bar.
6. In the Utility Type drop-down list, select the utility you want to create a new configuration for.
7. Type a Name and Description for the utility.
8. If you want to recalculate the entire impact and lineage information across all integrator sources,
change the Mode default value from Optimized to Full.
9. If you want to rebuild indexes for the metadata collected for only one integrator source, select its name from the Integrator Source drop-down list.
10. Click Save.
The new configuration name appears on the "Utility Configurations" screen.

Related Topics
• Computing and storing lineage information for reporting
• Recreating search indexes on Metadata Management
• Scheduling a utility
• Running a utility on demand
• Monitoring utility executions
• Modifying utility configurations

Server Management

9.1 Server management overview

SAP BusinessObjects Information Steward uses the following types of servers:


• SAP BusinessObjects Business Intelligence platform servers that are collections of services running
under a Server Intelligence Agent (SIA) on a host. Information Steward uses the following servers
and services:
Enterprise Information Management Adaptive Processing Server which has the following services:
• Cleansing Package Builder Auto-analysis Service
• Cleansing Package Builder Core Service
• Cleansing Package Builder Publishing Service
• Information Steward Administrative Task Service
• Metadata Relationship Service
• Metadata Search Service
• Data Services Metadata Browsing Service
• Data Services View Data Service

Information Steward Job Server which has the following services:


• Information Steward Task Scheduling Service
• Information Steward Integrator Scheduling Service

• SAP BusinessObjects Data Services Job Server which executes the Data Insight profiling tasks.
You can create multiple Job Servers, each on a different computer, to use parallel execution for the
profiling tasks.
For a description of these servers and services, see Servers and services.

This section describes how to manage the above servers for Information Steward.

9.2 Verifying Information Steward servers are running

To verify that the SAP BusinessObjects Information Steward servers are running and enabled:
1. From the CMC Home page, go to the "Servers" management area.
2. Expand the Service Categories node and select Enterprise Information Management Servers.

The list of servers in the right pane includes a State column that provides the status for each server
in the list.
3. Verify that the following Enterprise Information Management servers are "Running" and "Enabled".
• "EIMAdaptiveProcessingServer"
• "ISJobServer"
4. If an Enterprise Information Management server is not running or enabled, do the following:
a. Select the server name from the list.
b. Open the Actions drop-down menu and select Start Server or Enable Server.

For information about the services that run under the Enterprise Information Management servers, see the "Architecture" section.

Related Topics
• Services

9.3 Verifying Information Steward services

To verify that the SAP BusinessObjects Information Steward services were added:
1. From the CMC Home page, go to the "Servers" management area.
2. Expand the Service Categories node and select Enterprise Information Management Services.
The right pane lists the following servers:
• "EIMAdaptiveProcessingServer"
• "ISJobServer"
3. Ensure that the relevant services appear for "EIMAdaptiveProcessingServer".
a. Right-click "EIMAdaptiveProcessingServer" and click Stop Server.
b. Right-click "EIMAdaptiveProcessingServer" and click Select Services.
c. Verify that for each Information Steward feature that you installed, the list of services for
EIMAdaptiveProcessingServer includes the following services:

Feature installed: Information Steward Task Server
Services on the EIM Adaptive Processing Server: Information Steward Admin Task Service
Services on the Information Steward Job Server (ISJobServer): Information Steward Task Scheduling Service; Information Steward Integrator Scheduling Service

Feature installed: Metadata Relationship Service
Services on the EIM Adaptive Processing Server: Information Steward Metadata Relationship Service
Services on the Information Steward Job Server (ISJobServer): None

Feature installed: Metadata Search Service
Services on the EIM Adaptive Processing Server: Search Service
Services on the Information Steward Job Server (ISJobServer): None

Feature installed: Cleansing Package Builder Service
Services on the EIM Adaptive Processing Server: Cleansing Package Builder Core service; Cleansing Package Builder Auto-analysis service; Cleansing Package Builder Publishing service
Services on the Information Steward Job Server (ISJobServer): None

Feature installed: All Metadata Integrators
Services on the EIM Adaptive Processing Server: None
Services on the Information Steward Job Server (ISJobServer): Information Steward Integrator Scheduling Service; Information Steward Integrator Service

d. If any of the EIMAdaptiveProcessingServer services are not in the list on the right, select the
service name from the "Available services" list on the left, click > to add it, and click OK.
e. Right-click "EIMAdaptiveProcessingServer" and click Start Server.
4. Ensure that the relevant services appear for "ISJobServer".
a. Right-click "ISJobServer" and click Stop Server.
b. Right-click "ISJobServer" and click Select Services.
c. Verify that for each Information Steward feature that you installed, the list of services for
"ISJobServer" includes the services listed in the table in step 3c.
d. If any of the "ISJobServer" services are not in the list on the right, select the service name from
the "Available services" list on the left, click > to add it, and click OK.
e. Right-click "ISJobServer" and click Start Server.

9.4 Configuring Metadata Browsing Service and View Data Service

The installation process of SAP BusinessObjects Data Services configures the following services (under
the server EIMAdaptiveProcessingServer) with default settings.
• Metadata Browsing Service
• View Data Service
These services are used by SAP BusinessObjects Information Steward to connect and view data in
profiling sources. You might want to change the configuration settings to more effectively integrate
Information Steward with your hardware, software, and network configurations.

157 2011-04-06
Server Management

1. Go to the "Servers" management area of the CMC.


2. Expand "Service Categories" in the tree panel and select "Enterprise Information Management
Services".
3. Double-click computername.EIMAdaptiveProcessingServer in the list in the right pane.
4. On the "Properties" window, scroll down until you find the service whose settings you want to change.
5. Make the changes you want, then click Save or Save & Close.
Note:
Not all changes occur immediately. If a setting cannot change immediately, the "Properties" window displays both the current setting (in red text) and the updated setting. When you return to the Servers management area, the server is marked as Stale. When you restart the server, it uses the updated settings from the Properties dialog box and the Stale flag is removed from the server.

Related Topics
• Metadata Browsing Service configuration parameters
• View Data Service configuration parameters

9.4.1 Metadata Browsing Service configuration parameters

You can change the following properties of the Metadata Browsing Service.

Service Name
Description: Name of the service configuration.
Possible values: An alphanumeric string with a maximum length of 64. The Service Name cannot contain any spaces. Default value: MetadataBrowsingService

Maximum Data Source Connections
Description: Maximum number of data source connections that can be opened at any time under a service instance.
Possible values: Integer. Default value: 200

Retry attempts to launch Service Provider
Description: Maximum number of attempts to launch a new service provider when there is contention to access a shared service provider.
Default value: 1

Stateful Connection Timeout (seconds)
Description: Maximum duration for which a stateful connection is open. Stateful connections include SAP Applications and SAP BW Source.
Default value: 1200

Stateless Connection Timeout (seconds)
Description: Maximum duration for which a stateless connection is open. Stateless connections include all relational database sources.
Default value: 1200

Recycle Threshold
Description: Maximum number of requests that are processed by a service before the Data Services backend engine is recycled to free memory that was allocated for metadata browsing.
Default value: 50000

Enable Trace
Description: Enable or disable logging of trace messages to the log file.
Default: Enabled

Collect Connection Statistics
Description: Enable or disable the collection of statistics for each open connection.
Default: Enabled

Listener Port
Description: Port number used to communicate with the Data Services backend engine. If you change the port number, you must restart the EIMAdaptiveProcessingServer for the change to take effect.
Possible values: A four-digit port number that is not currently in use. Default value: 4010

JMX Connector Port
Description: Port number used for the JMX Connector. If you change the port number, you must restart the EIMAdaptiveProcessingServer for the change to take effect.
Possible values: A four-digit port number that is not currently in use. Default value: 4011

9.4.2 View Data Service configuration parameters

You can change the following properties of the View Data Service.

Service Name
Description: Name of the service configuration.
Possible values: An alphanumeric string with a maximum length of 64. The Service Name cannot contain any spaces. Default value: ViewData

Listener Port
Description: Port number used to communicate with the Data Services backend engine. If you change the port number, you must restart the EIMAdaptiveProcessingServer for the change to take effect.
Possible values: Four-digit integer. Default value: 4012

JMX Connector Port
Description: Port number used for the JMX Connector. If you change the port number, you must restart the EIMAdaptiveProcessingServer for the change to take effect.
Possible values: Four-digit integer. Default value: 4013

Batch Size (kilobytes)
Description: Size of the data to be stored in a view data response.
Possible values: Minimum value: 1000. Maximum value: 50000. Default value: 1000

Minimum Shared Service Providers
Description: Minimum number of shared Data Services backend engines that need to be launched at the startup time of the service.
Default value: 1

Maximum Shared Service Providers
Description: Maximum number of shared Data Services backend engines that can be launched to service view data requests.
Default value: 5

Maximum Dedicated Service Providers
Description: Maximum number of dedicated Data Services backend engines that can be launched at any instant of time.
Default value: 10

Recycle Threshold
Description: Maximum number of requests that are processed by a service before the Data Services backend engine is recycled to free memory that was allocated for viewing data.
Possible values: Any integer. Default value: 200

Number of attempts to launch service provider
Description: Number of attempts to be made to try launching the Data Services backend engine instance.
Default value: 1

Maximum idle time for shared service provider (minutes)
Description: Maximum number of minutes that a Data Services backend engine can remain without processing any requests. After this time is exceeded, the Data Services backend engine is shut down.
Default value: 120

Enable Trace
Description: Specifies whether to enable or disable logging of trace messages to the log file.
Default: Enabled

9.5 Job server group

The SAP BusinessObjects Data Services Job Server performs the profile and rule tasks on data in Data Insight connections. The Information Steward Task Server sends Data Insight profile tasks to the Data Services Job Server, which partitions the data and uses parallel processing to deliver high data throughput and scalability.

This section contains the tasks to manage Data Services Job Servers for Information Steward.

Related Topics
• Configuring a Data Services Job Server for Data Insight
• Adding Data Services Job Servers for Data Insight
• Displaying job servers for Information Steward
• Removing a job server

9.5.1 Configuring a Data Services Job Server for Data Insight

If you will run Data Insight profile and rule tasks, you must access the Data Services Server Manager
to create a job server and associate it with the Information Steward repository. This association adds
the job server to a pre-defined Information Steward job server group that Data Insight will use to run
tasks. For details about how job server groups improve performance, see the “Performance and Sizing
Considerations” section in the Administrator Guide.

To configure a job server and associate it with the Information Steward repository:
1. Access the Data Services Server Manager from the Windows Start menu:
Start > Programs > SAP BusinessObjects Data Services XI 4.0 > Data Services Server Manager
2. On the "Job Server" tab, click Configuration Editor.
3. On the "Job Server Configuration Editor" window, click Add.
4. In the "Job Server Properties" window, enter a name for Job Server name.
5. In the "Associated Repositories" section, click Add and fill in the "Repository Information" of the
Information Steward repository that you want to associate with this Job Server.
• Database type
• Database Server name
• Database name
• Username
• Password
6. Click Apply and OK.

The "Job Server Configuration Editor" window now displays the job server you just added.
7. Click Close and Restart to restart the job server with the updated configurations.
For more information about using the Data Services Server Manager, see "Server management" in the SAP BusinessObjects Data Services Administrator's Guide.

9.5.2 Adding Data Services Job Servers for Data Insight

If you installed additional Data Services Job Servers on multiple computers, you can use them to run
Data Insight profile and rule tasks even though Information Steward is not installed on those computers.

On each computer where a Data Services Job Server is installed, you must add a job server to the pre-defined Information Steward job server group. For details, see Configuring a Data Services Job Server for Data Insight.

For more information about job server groups, see “Performance and Sizing Considerations” in the
Administrator Guide.

Related Topics
• Displaying job servers for Information Steward
• Removing a job server

9.5.3 Displaying job servers for Information Steward

To display the job servers that are associated with your SAP BusinessObjects Information Steward
repository:
1. Log in to the Central Management Console (CMC) with a user name that belongs to the Metadata
Management Administrator group or the Administrator group.
2. At the top of the CMC Home screen, select Applications from the navigation list.
3. Select Information Steward Application in the "Applications Name" list.
4. Click Action > View Data Services Job Server in the top menu tool bar.
The "Job Server List" screen displays:
• The name of each Data Server Job Server associated with the Information Steward repository.
• The computer name and port number for each Data Server Job Server.

Related Topics
• Configuring a Data Services Job Server for Data Insight

9.5.4 Removing a job server

To remove a job server from the Information Steward job server group on Data Services:
1. On each computer where you installed additional Data Services Job Servers, access the Data Services Server Manager from the Windows Start menu:
Start > Programs > SAP BusinessObjects Data Services XI 4.0 > Data Services Server Manager
2. On the "Job Server" tab, click Configuration Editor.
3. On the "Job Server Configuration Editor" window, select the name of the job server you want to
delete and click Delete.
4. In the "Job Server Properties" window, in the "Associated Repositories" section, ensure the name
of your Information Steward repository is selected and click Delete.
5. In the " Repository Information" section, enter the password of your Information Steward repository
and click Apply.
6. Click Yes on the prompt that asks if you want to remove persistent cache tables.
7. Click OK to return to the "Job Server Manager" window.
8. Click Close and Restart and then click OK to restart the Data Services job server with the updated
configuration.

Related Topics
• Configuring a Data Services Job Server for Data Insight
• Adding Data Services Job Servers for Data Insight
• Displaying job servers for Information Steward

Performance and Scalability Considerations

10.1 Resource intensive functions

The main Information Steward functions that influence performance and sizing are as follows.

Data profiling
Information Steward can perform basic and advanced data profiling operations to collect information about data attributes such as minimum and maximum values, pattern distribution, data dependency, uniqueness, address profiling, and so on. These operations require intense, complex computations and are affected by the amount of data that is processed.

Validation rule processing


Information Steward validates source data against business rules to monitor data quality and generate scores. This is also a computation-heavy process, and performance is affected by the number of records and the number of rules.

The complexity of rules also affects performance. For example, lookup functions in rule processing take more time and disk space. Typically, lookup tables are small, but large lookup tables can adversely affect performance.

Cleansing Package Builder


The Auto-analysis service for Cleansing Package Builder runs after a custom cleansing package is initially created. It analyzes the sample data that was provided during the steps of the custom cleansing package wizard. The analysis includes grouping similar records and generating parsing rules based on the combinations of parsed values found in the data. With a large number of records and parsed values, this becomes very computation and memory intensive.

Metadata integrators
Metadata integrators collect metadata about different objects in the source systems and store it in the
Information Steward repository. If a large amount of metadata is being collected, it can affect performance
and the size of the repository.

Lineage and impact analysis


After the metadata is collected, Information Steward can do end-to-end lineage and impact calculations.
Typically, these are periodically run as utilities to capture the latest information. If the amount of metadata
is large, this calculation can affect performance.

Metadata browsing and view data


You can browse metadata for tables and columns and also view source data. If a table has a large number of columns, the View Data operation is affected because a lot of data must be accessed, transferred, and displayed in the user interface.

Related Topics
• Factors that influence performance and sizing
• Scalability and performance considerations
• Best practices for performance and scalability

10.2 Architecture

The following diagram shows the architectural components for SAP BusinessObjects Business
Intelligence platform, SAP BusinessObjects Data Services, and SAP BusinessObjects Information
Steward. The resource-intensive servers and services are indicated with a red asterisk:
• Data Services Job Server processes data profiling and rule validation tasks in Data Insight.
• Cleansing Package Builder Auto-analysis Service analyzes sample data and generates parsing
rules for Cleansing Package Builder.
• Information Steward Integrator Scheduling Service processes Metadata Integrators.
• Metadata Relationship Service performs lineage and impact analysis.
• Data Services Metadata Browsing Service obtains metadata from Data Insight connections.
• Data Services Viewdata Service obtains the source data from Data Insight connections.
• Web Application Server handles requests from users on the web applications for Information Steward.
For more details about these servers and services, see Servers and services.


10.3 Factors that influence performance and sizing

Information Steward is designed to work on large amounts of data and metadata. The following factors affect performance and sizing: they determine the required processing power (CPUs), RAM, and hard disk space (for temporary files during processing), as well as the size of the Information Steward repository.

Related Topics
• Data characteristics and profiling type
• Number of concurrent users

10.3.1 Data Insight


10.3.1.1 Amount of data

The amount of data is calculated using the number of records and the number of columns. In general, the more data that is processed, the more time and resources it requires. This is true for both profiling and rule validation operations. The data may come from one or more sources, multiple tables within a source, views, or files.

This factor affects the required CPU, RAM, and hard disk space and the Information Steward repository size. The larger the record size, the more resources are required for efficient processing.

If the data being processed has many columns, it requires more time and resources. If you are doing column profiling on many columns, or if the columns being processed contain long textual data, performance is affected. In short, the longer the record, the more resources are required.

Related Topics
• Factors that influence performance and sizing

10.3.1.2 Data characteristics and profiling type

For distribution profiling such as Value, Pattern, or Word distribution, if the data has many distinct values,
patterns, or words, it requires more resources and time to process.

This factor affects the required CPUs, RAM, and Information Steward repository size. The more distinct
the data, the more resources required.

Related Topics
• Factors that influence performance and sizing

10.3.1.3 Number of concurrent users

Information Steward is a multi-user web-based application. As the number of concurrent users increases,
it affects practically all aspects of the application. More users may mean more profiling tasks, more
scorecard views and rule execution, and so on. The key word is concurrent.


If all users run tasks concurrently, this affects the required CPUs, RAM, and hard disk and the Information Steward repository size. If most of the users are just viewing scorecards, then performance depends more on the web application server and the database server where the Information Steward repository is created.

Related Topics
• Factors that influence performance and sizing

10.3.2 Metadata Management

10.3.2.1 Number of metadata sources

Metadata integrators collect metadata from various sources. As the number of metadata sources
increases, it takes more resources and time. This factor affects the required CPUs, RAM, and Information
Steward repository size.

Related Topics
• Factors that influence performance and sizing

10.3.2.2 Number of concurrent users

If multiple users view impact and lineage information concurrently, it affects the response time.

Related Topics
• Factors that influence performance and sizing

10.3.3 Cleansing Package Builder


10.3.3.1 Amount of sample data

For Cleansing Package Builder, the amount of sample data used to create the cleansing package affects
the performance of the Auto-analysis service. This in turn can affect response time for the user interface.

Related Topics
• Factors that influence performance and sizing

10.3.3.2 Data characteristics

When you create custom cleansing packages, if there are many parsed values per row, it requires more
resources and time to analyze and create parsing rules. For Cleansing Package Builder, the larger the
data set, the more CPU processing power is required. The more parsed values per row, the more RAM
is required.

Related Topics
• Factors that influence performance and sizing

10.3.3.3 Number of concurrent users

If many users create custom cleansing packages concurrently, more CPUs and RAM are required.

Related Topics
• Factors that influence performance and sizing

10.4 Scalability and performance considerations


Information Steward uses SAP BusinessObjects Business Intelligence platform and Data Services
platform for most of the heavy computational work. It inherits the service-oriented architecture provided
by these platforms to support a reliable, flexible, highly available, and high performance environment.

Here are some features and recommendations for using the platforms for performance. These are not mutually exclusive, nor are they sufficient by themselves in all cases. Employ a combination of the following approaches to improve the throughput, reliability, and availability of the deployment.

Related Topics
• Resource intensive functions
• Factors that influence performance and sizing
• Best practices for performance and scalability
• Information Steward web application
• Scheduling tasks
• Queuing tasks
• Degree of parallelism
• Grid computing
• Multi-threaded file read
• Data Insight result set optimization
• Performance settings for input data
• Settings to control repository size
• Settings for Metadata Management
• Settings for Cleansing Package Builder

10.4.1 Scalability levels in deployment

A typical Information Steward environment can be scaled at the following levels:


1. Web tier level: Deploy multiple instances of Information Steward web application for high availability
and large numbers of concurrent users.
2. Business Intelligence platform and Information Steward services level: Deploy multiple Business
Intelligence platform services and/or Information Steward services in a distributed environment for
load balancing and scalability. For example, you can deploy multiple metadata integrators on different
servers for load balancing. You can also have multiple Information Steward task servers on different
servers for high availability.
3. Data Services Job Server level: A Data Services Job Server group used for a given Information
Steward deployment can have one or more Data Services Job Servers added to it. As the need for
Data Services processing increases (for Data Insight operations), you can scale by adding more
Data Services Job Servers to the Job Server group.

Related Topics
• Scalability and performance considerations
• Information Steward web application


10.4.2 Distributed processing

Business Intelligence platform provides a distributed scalable architecture. This means that services
that are needed for a specific functionality can be distributed across machines in the given landscape.
As long as the services are in the same Business Intelligence platform environment, it doesn't matter
which machine they are on; they just need to be in the same CMS cluster. The Information Steward
web application and the Information Steward repositories can be on different machines.

Information Steward uses some Business Intelligence platform services, and also has its own services
that can be distributed across machines for better throughput. The general principle is that if one of the
services needs many resources, then it should be on a different machine. Similarly, if you add capacity
to existing hardware, it can be used for more than one service.

The Data Insight module of Information Steward uses the Data Services Job Server, which supports distributed processing.

This section offers some recommendations on which Information Steward services can be combined and which should be decoupled.

Related Topics
• Scalability and performance considerations
• Data Insight related services
• Metadata Management related services
• Cleansing Package Builder related services
• Information Steward repository
• Information Steward web application
• Grid computing

10.4.2.1 Data Insight related services

The most important part of the processing for Data Insight is done by the Data Services Job Server.
You can install Data Services Job Servers on multiple machines and make them part of the single job
server group that is used for Information Steward. The profiling and rules tasks are distributed by the
Information Steward Job Server to the Data Services Job Server group. The actual tasks are executed
by a specific Data Services Job Server based on the resource availability on that server. So, if one
server is busy, the task can be processed by another server. This way, multiple profiling and rule tasks
can be executed simultaneously.

Related Topics
• Scalability and performance considerations


• Information Steward web application

10.4.2.2 Metadata Management related services

For Metadata Management, two important services from a performance perspective are the metadata
relationship service and metadata integrators.

Generally, these two services should be on separate servers (or should run at different times, if they are on the same server). If there are many metadata integrators and they collect a lot of metadata from different sources, each one of them could be on a separate server. Schedule processing-intensive integrators to run during non-business hours.

Related Topics
• Scalability and performance considerations
• Information Steward web application

10.4.2.3 Cleansing Package Builder related services

The Auto-analysis service is a resource-intensive service in Cleansing Package Builder. The CPU and RAM requirements depend on the number of parsed values found in the data. If multiple concurrent users create large cleansing packages, it is recommended that you dedicate one server to the Auto-analysis service and allocate enough memory to the Java process.

Related Topics
• Scalability and performance considerations
• Information Steward web application
• Common runtime parameters for Information Steward

10.4.2.4 Information Steward repository

The Information Steward repository stores all of the metadata collected, profiling and rule results, and sample data. The repository should be on a separate database server. To avoid resource contention, the database server should not be the same database server that contains the source data. It may or may not be the same database server that hosts the Business Intelligence platform repository.

The Information Steward repository should be on the same subnetwork as the Data Services Job Server that processes large amounts of data and the metadata integrator that processes the largest amount of metadata.

Related Topics
• Scalability and performance considerations
• Information Steward web application

10.4.2.5 Information Steward web application

Typically, the Information Steward web application is installed on a separate server together with other web applications. No other services or repositories should be installed with the web application, so that response time for Information Steward user interface users is not affected by services that process data.

Related Topics
• Scalability and performance considerations
• Information Steward web application

10.4.3 Scheduling tasks

Information Steward runs many tasks that are resource intensive. Business Intelligence platform provides
the ability to schedule them. This ability can be used to distribute the tasks so that the same resources
can be utilized for multiple purposes. The following can be scheduled:
• Profiling tasks
• Rule tasks
• Metadata integrators
• Calculate Scorecard utility
• Compute Lineage report utility
• Purge utility
• Update search index utility

If you schedule these tasks so that they run at different times and when few users access the system, you can achieve good performance with limited resources. This time slicing is highly recommended for profiling tasks, rule tasks, and metadata integrators. For example, if you have users who run profiling tasks on demand during business hours, then the metadata integrators and rule tasks should be scheduled during non-business hours.

If there are large profiling jobs, they should be scheduled during non-business hours and ideally on a
dedicated powerful server.

Related Topics
• Scalability and performance considerations
• Scheduling a task
• Scheduling a utility

10.4.4 Queuing tasks

Information Steward can queue Data Insight tasks when many tasks are requested to run at the same time. This behavior depends on the configuration of the Average Concurrent Tasks option. Based on this setting and the number of Data Services Job Servers in the group, Information Steward calculates the total number of tasks allowed to run simultaneously in a given landscape. Only that many tasks are sent to the Data Services Job Server group for processing. The remaining tasks are queued. As soon as one of the running tasks finishes, the next task in the queue is processed.

Using this setting, you can control how many Data Insight-related processes are running so that the
resources can be utilized and scheduled for other processes running on the system.
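
For illustration only, the following Python sketch mimics the queuing behavior described above. The function, the multiplication of the two values, and the task names are assumptions for the example, not product code.

from collections import deque

def dispatch(tasks, num_job_servers, avg_concurrent_tasks):
    # Hypothetical cap: job servers in the group multiplied by the
    # Average Concurrent Tasks setting; the product's actual formula may differ.
    max_running = num_job_servers * avg_concurrent_tasks
    queued = deque(tasks)
    running = [queued.popleft() for _ in range(min(max_running, len(queued)))]
    return running, list(queued)

running, waiting = dispatch(["task%d" % i for i in range(10)],
                            num_job_servers=2, avg_concurrent_tasks=3)
# Six tasks are sent to the Data Services Job Server group; the remaining four
# stay in the queue until a running task finishes.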

Related Topics
• Scalability and performance considerations
• Configuration settings

10.4.5 Degree of parallelism

For Data Insight functionality, Information Steward uses the Data Services engine. The Data Services
engine supports parallel processing in multiple ways, one of which is Degree of Parallelism (DOP). The
basic idea is to split a single Data Services job into multiple processing units and utilize available
processing power (CPUs) on a server to work on those processing units in parallel. The distribution of
work is different for profiling vs. rule processing.

Note:
• DOP is only used for Data Insight functionality for column profiling and rule processing. Metadata
Management and Cleansing Package Builder do not use DOP.


• In general, do not set the DOP higher than the number of available CPUs. To fine-tune performance, set the DOP value based on the number of concurrent tasks and the available hard disk and RAM resources. Gradually increase the value of DOP to reach an optimal setting. For more information, see the SAP BusinessObjects Data Services Performance Optimization Guide.

Related Topics
• Scalability and performance considerations
• Column profiling
• Rule processing
• Hard disk requirements
• When to use degree of parallelism

10.4.5.1 Column profiling

For column profiling, the task is distributed proportionately across the columns. The formula is: Number of columns per execution unit = Number of Columns / DOP.

For example, if there is a column profiling task for 100 million rows with 20 columns, and DOP = 4, the task is broken down into execution units that work on 5 columns each for all 100 million rows. The data is "partitioned" into groups of 5 columns for each execution unit to work on.

Using multiple CPUs simultaneously causes the throughput to increase proportionately.
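
As a rough illustration of the partitioning arithmetic (not product code; the helper and column names are hypothetical), the following Python sketch shows how 20 columns and DOP = 4 yield 4 execution units of 5 columns each:

def column_partitions(columns, dop):
    # Each execution unit profiles roughly len(columns) / dop columns
    # across all rows of the table.
    per_unit = -(-len(columns) // dop)  # ceiling division
    return [columns[i:i + per_unit] for i in range(0, len(columns), per_unit)]

units = column_partitions(["col%d" % i for i in range(1, 21)], dop=4)
# 4 execution units, each covering 5 of the 20 columns.

Rule processing distributes work in the same way, except that rules rather than columns are divided among the execution units.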

Related Topics
• Scalability and performance considerations
• Degree of parallelism

10.4.5.2 Advanced profiling

Advanced profiling tasks (dependency, redundancy, and uniqueness) require complex sorting operations, so it is important to optimize the degree of parallelism setting. The degree of parallelism setting is used to execute sorting operations in parallel to increase throughput.

Related Topics
• Scalability and performance considerations
• Degree of parallelism


10.4.5.3 Rule processing

For rule processing, the work is divided based on the number of rules (as opposed to the number of columns for profiling). The formula is: Number of rules per execution unit = Number of rules / DOP.

For example, if there is a rule execution task for 100 million rows with 20 rules, and DOP = 4, the task is broken down into execution units that process 5 rules each for all 100 million rows.

Related Topics
• Scalability and performance considerations
• Degree of parallelism

10.4.5.4 Hard disk requirements

Adding CPUs and increasing DOP does not guarantee improved throughput. When many processes run in parallel, they also share other hardware resources such as RAM and hard disk.

When the Data Services engine processes data, it creates temporary work files in the Pageable Cache
Directory. Naturally, if many processes are running simultaneously, all of them create temporary files
in the same location. Because this directory is accessed by all of the processes simultaneously, there
is a potential for disk contention. In most environments, depending on the hard disk capacity and speed,
you will hit a ceiling after which an increase in DOP will not improve performance proportionately.

Therefore, it is important to enhance all aspects of the hardware at the same time: the number of CPUs, hard disk capacity, disk speed, and RAM. You should have very efficient disk access to go along with the increased DOP. Make sure the pageable cache directory is set accordingly.

Related Topics
• Scalability and performance considerations
• Degree of parallelism
• Grid computing

10.4.5.5 When to use degree of parallelism

Higher DOP should be employed in the following conditions:


• When you have only a few very powerful machines with a lot of processing power, RAM, and fast
disk access.
• When you have a large amount of data for profiling or rule tasks.

Note:
• If you run many profiling tasks simultaneously with DOP > 1, each of them could be split into multiple execution units. For example, 4 tasks and DOP 4 could result in 16 execution units. Now there are 16 processes competing for resources (CPU, RAM, and hard disk) on the same machine. So it is always a good idea to schedule jobs efficiently or to use multiple Data Services Job Servers.
• DOP is a global setting and affects the entire landscape.

Related Topics
• Scalability and performance considerations
• Degree of parallelism
• Grid computing

10.4.6 Grid computing

You can perform grid computing using the Data Services Job Server group, which is a logical group of multiple Data Services Job Servers. When you install Information Steward, you can assign a single Data Services Job Server group for that Information Steward instance. There are two ways you can utilize the Job Server group with the Distribution level setting: distribution level table and distribution level sub-table.

A single profiling or rule task can work on one or more “tables”. The term “table” is used in a general
sense of the number of records with rows and columns. In reality, this can come from an RDBMS table,
a flat file, an SAP application, and so on.

Note:
DOP and distribution level are global settings and affect the entire landscape.

Related Topics
• Scalability and performance considerations
• Distribution level table
• Distribution level sub-table
• When to use grid computing


10.4.6.1 Distribution level table

When you set the distribution level to Distribution level table, each table of the task is executed on a separate Data Services Job Server in the group. If you have set DOP, it is effective on the independent machines, and one task on a particular server could be further distributed.

For example, if you have 8 tables in a task and it is submitted to a Data Services Job Server group with 8 Data Services Job Servers, then each Data Services Job Server processes one table. If the DOP is set to 4, then each Data Services Job Server tries to parallelize the task for its particular table into 4 execution units. There is no interdependency between the different job servers; they share no resources.

When a Data Services Job Server group receives a task that involves multiple tables and the distribution level is set to Table, it uses an intelligent algorithm that chooses the Data Services Job Servers based on the available resources. If a particular server is busy, then the task is submitted to a relatively less busy Data Services Job Server. These calculations are based on the number of CPUs, RAM, and so on. If you have two Data Services Job Servers, one with many resources and another with few resources, it is quite possible that the bigger server gets a proportionally higher number of tasks to execute.

You can also choose servers based purely in a "round robin" fashion, in which case the task is submitted
to the next available Data Services Job Server. For more information, see the SAP BusinessObjects Data
Services Administrator Guide.
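
The exact selection algorithm is internal to Data Services. The Python sketch below only contrasts a resource-weighted choice with a round-robin choice; the capacity score and server fields are invented for the example.

from itertools import cycle

servers = [{"name": "js1", "free_cpus": 2, "free_ram_gb": 8},
           {"name": "js2", "free_cpus": 6, "free_ram_gb": 24}]

def pick_least_busy(job_servers):
    # Hypothetical score combining free CPUs and free RAM.
    return max(job_servers, key=lambda s: s["free_cpus"] + s["free_ram_gb"] / 4)

print(pick_least_busy(servers)["name"])   # js2: the server with more free resources

round_robin = cycle(servers)
print(next(round_robin)["name"], next(round_robin)["name"])   # js1 js2: alternates regardless of load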

Related Topics
• Scalability and performance considerations
• Grid computing

10.4.6.2 Distribution level sub-table

Note:
This setting is only effective for column profiling. Use this setting with caution, as it may have a negative
impact on performance of other types of profiling, such as advanced profiling.

When you set the distribution level to Distribution level sub-table, Data Services distributes a single table task across multiple machines in the network based on the DOP setting. The basic idea is to
split a single task into multiple independent execution units and send them to different Data Services
Job Servers for execution. You can think of this as DOP but across multiple machines rather than
multiple CPUs of a single machine.

For example, suppose you have a column profiling task for 100 million rows and 40 columns, the distribution level is set to Sub-Table, the DOP is 8, and there are 8 Data Services Job Servers in the group. The task is split into 8 execution units of 5 columns each and sent to the 8 Data Services Job Servers, which can then execute them in parallel. There is no sharing of CPU or RAM, but all of the Data Services Job Servers share the same pageable cache directory. You must ensure this directory location is shared and accessible to all Data Services Job Servers. This location should have a very efficient disk, and the network on which this setup is done should be very fast so that it does not become a bottleneck; otherwise, the gains of parallel processing will be negated.

Related Topics
• Scalability and performance considerations
• Grid computing

10.4.6.3 When to use grid computing

The principles are similar to DOP, but with the additional aspects of distribution level.
• Distribution level table: Use when you have many concurrent profiling and rule tasks that work on large amounts of data. Set the distribution level to Table, so that individual tasks are sent to different servers.
• Distribution level sub-table: Use when you have very few column profiling tasks on very large amounts of data. In this case, use distribution level "Sub-table". It is important that these machines share an efficient hard disk and that they are connected by a fast network.

Related Topics
• Scalability and performance considerations
• Grid computing

10.4.7 Using SAP applications as a source

SAP BusinessObjects Information Steward allows direct integration with the SAP Business Warehouse
and the SAP ERP Central Component (ECC) system. One of the main benefits is the ability to connect
directly to the production SAP systems to perform data profiling and data quality analysis based on the
actual, most timely data, instead of connecting to a data warehouse, which is loaded infrequently. To
fully utilize this advantage without risking the performance and user experience on the production ECC
system, consider these requirements.

To utilize the back-end resources in the most efficient and sustainable way, the connection user defined for the interaction between Information Steward and the SAP back-end system should be linked to background processing. This is the recommended setup for the connection user.


The option to use a dialog user for the connection between Information Steward and the SAP back-end system should be considered carefully, and only for smaller data sets not exceeding 50,000 records from a medium-width table. In this case, the synchronous processing blocks a dialog process and its resources for the duration of the processing. Using this approach on larger data sets would require changing the heap size for the maximum private memory of a dialog process and therefore has an impact on the overall memory required by the ECC system. In addition, the extraction of larger sets of data will most likely exceed the recommended maximum work process runtime for a dialog process on the ABAP Web Application Server significantly. Depending on the number of records and the number of columns of the table from which the data is retrieved, the extraction process runtime will vary. Due to the significant resources required and the extended runtime and allocation of a dialog process, using a dialog user for the communication is not recommended.

You can connect to the SAP ECC system and retrieve data for profiling and data quality analysis using
one of the following methods.
• Transfer the data directly via the RFC connection established.
In this case the data is stored in an internal table first and then transferred via the synchronous RFC
connection.
• Asynchronous processing scenario 1.
The extracted data is written into a file to a shared directory for both the ABAP Web Application and
the Information Steward. After the extraction is completed, Information Steward picks up the file for
further processing.

For this first asynchronous processing scenario, specify the SAP Applications connection parameters
Working directory on SAP server and Application Shared Directory, and set Data transfer
method to Shared directory.
• Asynchronous processing scenario 2.
In this case the data is written to a directory on the back-end server, which is made available to Information Steward via an FTP server connection. As in the first asynchronous scenario, the data files are then picked up by Information Steward from the specified directory, in this case via the FTP connection path.

For this second asynchronous processing scenario, specify the Working directory on SAP server
and the FTP parameters when you define the SAP Applications connection, and set Data transfer
method to FTP.

Data transfer via the two asynchronous scenarios has slightly longer turnaround times, but consumes fewer resources on the back-end system.

It is recommended that you make optimized requests for data extraction from the SAP system when you define Information Steward profiling tasks. This optimization should be considered in all scenarios. Specify only a subset of the columns in your profiling task and use filters to focus on a specific data set. The more optimized the request, the faster the extraction and overall process execution. Information Steward requests only the data for extraction as defined by the filter; the filter criteria are passed to the SAP system.

Related Topics
• SAP Applications connection parameters
• SAP application connection best practices
• General best practices
• Profiling and rules
• Organizing projects

10.4.8 Multi-threaded file read

When reading flat files, the Data Services engine can parallelize the tasks in multiple threads to achieve
higher throughput. With multi-threaded file processing, the Data Services engine reads large chunks
of data and processes them simultaneously. This way, the CPUs are not waiting for the file input/output
operations to finish.

You can set the number of file processing threads to the number of CPUs available. Use this setting when you have a large file on which to run a profiling or rule task.

Related Topics
• Scalability and performance considerations

10.4.9 Data Insight result set optimization

This optimization applies to all scheduled profiling and rule tasks. Information Steward provides the ability to optimize redundant tasks for Data Insight: if the profiling or rule result set for a particular table is already available, the table is not processed again. This behavior is controlled by the Optimization Period setting. Data in certain tables does not change very often, or the result set only needs to be refreshed at a specified frequency, so there is no need to execute the task again within that time period. If the data is not going to change within a profiling task's optimization period, there is no need to process the task again. Similarly, if the scorecards are calculated only on a nightly basis, there is no need to recalculate the score more often.

Suppose the Optimization Period is set to 24 hours. A rule task was executed for Table1 and the
result set is already stored. If that same task is tried again within 24 hours, it is not processed again.
Imagine another case where a single rule task involves Table1 and Table2. In this case, rules are not
executed on Table1, but they are processed for Table2 because this table does not have a result set
available.

If you want to always get the latest results due to the changing nature of the data, set the Optimization
Period to the expected period of change in data.
Note:
Any profiling or rule tasks that are run on demand do not use this optimization and all of the data is
processed.
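
A minimal sketch of this kind of check, assuming a simple timestamp comparison (the actual mechanism inside Information Steward may differ):

from datetime import datetime, timedelta

def needs_processing(last_result_time, optimization_period_hours, now=None):
    # Skip a scheduled task for a table whose result set is younger than the
    # optimization period; process the table otherwise.
    now = now or datetime.now()
    if last_result_time is None:
        return True
    return now - last_result_time > timedelta(hours=optimization_period_hours)

# Table1 has results from 6 hours ago and the optimization period is 24 hours -> skipped.
print(needs_processing(datetime.now() - timedelta(hours=6), 24))   # False
# Table2 has no stored results -> processed.
print(needs_processing(None, 24))                                  # True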


Related Topics
• Scalability and performance considerations

10.4.10 Performance settings for input data

For Data Insight functionality, Information Steward provides the following settings to improve performance.

Related Topics
• Scalability and performance considerations
• Input data settings

10.4.10.1 Input data settings

These settings are available for both profile and rule tasks when you create a task. Defaults are set in
"Configure Applications".

Max Input Size and Input Sampling Rate


When you create a rule or profiling task, specify the maximum number of rows and the rate at which you want to process them. Because processing time and resource requirements are proportional to the number of records being processed, these settings are very important. Set these numbers only to what is required for the task. For example, if you know that there are 100 million rows in a table and you want to get a sense of your data profile quickly, set the Max Input Size to 1 million rows and the Sampling Rate to 100. Every 100th record will be processed, up to a maximum of 1 million records.
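
Conceptually, the combination of the two settings behaves like the following Python sketch (illustrative only; the actual sampling is performed by the Data Services engine):

from itertools import islice

def sample_rows(rows, max_input_size, sampling_rate):
    # Take every Nth record (N = sampling rate), stopping once the
    # maximum input size has been reached.
    every_nth = (row for i, row in enumerate(rows) if i % sampling_rate == 0)
    return islice(every_nth, max_input_size)

subset = list(sample_rows(range(1_000_000), max_input_size=1_000, sampling_rate=100))
# 1,000 rows are processed out of 1,000,000: every 100th record, capped at the maximum.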

Filter condition
You can also control exactly which data gets processed using the filter condition. Because the number
of records affects performance, set the filter condition and process only the amount of data required.

Suppose you have 10 million records for all countries and there are 1 million records for the U.S. If you
are interested in profiling data for the U.S. only, you should set the filter for country = US. This way,
only 1 million records are processed.

You can combine this with Max Input Size and Sampling Rate to further improve performance, if
applicable.

Related Topics
• Scalability and performance considerations
• Performance settings for input data
• Configuration settings


10.4.11 Settings to control repository size

For Data Insight functionality, Information Steward provides the following settings to control the size of the Information Steward repository. The repository size also depends on the number of records, because the more data, the bigger the potential result set. However, the repository size can be controlled even for very large amounts of data.

Related Topics
• Scalability and performance considerations
• Profiling
• Rule processing
• Metadata Management

10.4.11.1 Profiling

The following settings affect the size of the Information Steward repository. Choose these numbers
carefully based on what your data domain experts require to understand data. The lower the number,
the smaller the repository.

You can set a high number, but the repository size increases. Also, response time for viewing sample
data is affected because more rows need to be read from the database, transported over the network,
and rendered in the browser.
• Max sample data size: The sample size for each profile attribute.
• Number of distinct values: The number of distinct values to store for the value distribution result.
• Number of patterns: The number of patterns to store for the pattern distribution result.
• Number of words: The number of words to store for the word distribution result.
• Results retention period: Controls the number of days before the profiling results are deleted. The
longer you keep the results, the bigger the repository. For more information, see the Purge utility.

Related Topics
• Scalability and performance considerations
• Settings to control repository size
• Profiling task settings and rule task settings


10.4.11.2 Rule processing

Max sample data size: This setting controls the number of failed records to save for each rule. The
higher the number, the more records that are available to view as sample data. This results in a larger
repository size, and the response time for viewing sample failed data is affected.

Score retention period: This setting controls the number of days before the scores are deleted. The
longer you keep the score data, the larger the repository size. For more information, see the Purge
utility.

Related Topics
• Scalability and performance considerations
• Settings to control repository size
• Profiling task settings and rule task settings

10.4.11.3 Metadata Management

The size of the Information Steward repository is also controlled by the amount of metadata that is
collected and retained. Optimize the amount by selectively choosing the components that you are
interested in for different metadata integrators.

Related Topics
• Scalability and performance considerations
• Settings to control repository size

10.4.12 Settings for Metadata Management

10.4.12.1 Runtime parameter for metadata integrators

Here is a list of parameters that affect performance:


1. JVM arguments for metadata integrators that collect a large amount of metadata should be adjusted for higher memory allocation. It is recommended that you update the JVM parameters on the integrator parameters page to -Xms1024m -Xmx4096m.
2. Run-time parameters for the maximum number of concurrent processes to collect metadata should be set to the number of CPUs that can be dedicated to metadata collection. Typically, metadata integrators are installed on independent servers, so you can set it to the number of CPUs on the server. These parameters for parallel processing may be different for different metadata integrators.
3. Each metadata integrator provides some method of performance improvement specific to the type of metadata that it collects. For example, with the SAP BusinessObjects Enterprise Metadata Integrator, you can reduce processing time by selectively choosing different components.
4. The first time the integrator is run, the run-time parameter Update Option for metadata integrators should be set to Delete existing objects before collection. For subsequent runs, change it to Update existing objects and add newly selected objects. For example, for the SAP BusinessObjects Enterprise Metadata Integrator, you can first collect metadata only for Web Intelligence documents (by selecting only that component as specified in step 3). For the first run, set the option to Delete existing. In the next run, you might collect all of the Crystal Reports metadata; for that second run, set the option to Update existing.

Related Topics
• Scalability and performance considerations
• Common runtime parameters for Information Steward

10.4.12.2 Utility configurations

The Metadata Management utilities Compute Lineage Report and Update Search index have some
configuration parameters. Run Compute Lineage in Optimized mode so that it is updated incrementally.

Related Topics
• Scalability and performance considerations

10.4.13 Settings for Cleansing Package Builder

10.4.13.1 JVM runtime parameters


If there are many parsed values (more than 20) per row in the data being used for Cleansing Package Builder, then the Auto-Analysis service requires an adjustment to the JVM runtime parameter that controls memory allocation for the Enterprise Information Management Adaptive Processing Server (EIMAPS). If the memory cap for Java is left at 1 GB, even though the system has 16 GB, the service will run out of memory. The best practice is to allocate 2-3 GB of memory to the Java services via the -Xmx setting.
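
For example, on a server with sufficient RAM, the Java startup parameters for the service might be adjusted as follows (illustrative values only; choose them based on your data volume and available memory):

-Xms1024m -Xmx3072m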

Related Topics
• Scalability and performance considerations

10.4.13.2 Publishing cleansing packages

Schedule cleansing package processing during non-business hours. Depending on the size of the
cleansing package, publishing can be a time consuming task. SAP-supplied cleansing packages for
Name-Title-Firm are typically very large. If they are updated and published, schedule the processing
during non-business hours. It may be a good idea to have a dedicated server for publishing, if this is a
frequent occurrence.

Related Topics
• Scalability and performance considerations

10.5 Best practices for performance and scalability

10.5.1 General best practices

• When using a distributed environment, enable and run only the servers that are necessary. For more
information, see the Business Intelligence Platform Administrator Guide.
• Use dedicated servers for resource intensive servers like the Data Services Job Server, Metadata
Integrators, and the Cleansing Package Builder Auto-Analysis service.
• Install the Information Steward Web Application on a separate server. The Business Intelligence
platform Web Tier must be installed on the same computer as the Information Steward Web
Application. If you do not have Tomcat or Bobcat, you need to manually deploy the Information
Steward Web Application.


• If you have many concurrent users, you can use multiple Information Steward web applications with a load balancer.
• To obtain a higher throughput, the Information Steward repository should be on a separate computer
but in the same sub-network as the Information Steward Web applications, Enterprise Information
Management Adaptive Processing Server, Information Steward Job Server, and Data Services Job
Server.
• Make sure that the database server for the Information Steward repository is tuned and has enough
resources.
• Allocate enough memory and hard disk space to individual servers as needed.
• Follow good scheduling practices to make sure that resource intensive tasks do not overlap each
other. Schedule them to run during non-business hours so that on-demand request performance is
not affected.

Related Topics
• Installation Guide: Deploying web applications with WDeploy
• SAP BusinessObjects Business Intelligence platform Web Application Deployment Guide: Failover
and load balancing
• Data Insight best practices
• Metadata Management best practices
• Cleansing Package Builder best practices

10.5.2 Data Insight best practices

10.5.2.1 Profiling and rules

1. If you expect your Data Insight profiling and rule tasks to consume a large amount of processing,
deploy the Data Services Job Server on a separate computer.
2. To improve the execution of Data Insight profiling tasks, the Data Services Job Server can be on
multiple computers that are separate from the web application server to take advantage of Data
Services job server groups and parallel execution. You must access the Data Services Server
Manager on each computer to do the following tasks:
• Add a Data Services Job Server and associate it with the Information Steward repository. For
more information, see “Adding Data Services Job Servers for Data Insight” in the Administrator
Guide.
• Specify the path of the pageable cache that will be shared by all job servers in the Pageable
cache directory option.


3. For a predictable distribution of tasks when using multiple Data Services Job Servers, try to ensure
that the hardware and software configurations are homogeneous. This means that they should all
have similar CPU and RAM capacity.
4. Irrespective of whether you use DOP and/or multiple Data Services Job Servers, set the pageable cache directory on a high-speed, high-capacity disk.
5. If you are processing flat files, store them on a high speed disk so that read performance is good.
6. Process only data that must be processed. Use settings such as Max Input Size, Sampling rate,
and Filter conditions appropriately.
7. When using Information Steward views, use correct join and filter conditions so that you are pulling
in only required rows.
8. Choose only columns that you are interested in profiling or the rules that you are interested in
calculating for the score. Selecting all columns may lead to redundant processing.
9. Perform word distribution profiling on only a few columns. Do not choose this option for all columns; otherwise, performance and the size of the Information Steward repository are affected.
10. If you have lookup functions in rule processing, it takes more time and disk space. Typically, lookup
tables are small, but if they are big, lookup tables can adversely affect performance. Ensure that
the tables on which lookup is performed are small. As an alternative to lookup, you can use the SQL
function.
11. Choose the DOP setting and the distribution level carefully. Remember that these are global settings
and affect the entire landscape.
12. Use the Sub-Table distribution level only when doing column profiling on large amounts of data with many small-capacity Data Services Job Servers. This setting can have an adverse effect on other types of tasks, so you may want to change it back to Table level after that column profiling task is done.
13. Store the reference data required for address profiling on a high speed and high capacity disk.
14. Make sure that the database server that contains source data is tuned and has enough resources.
15. Schedule the Purge utility to run during non-business hours. If column profiling and rule tasks are executed many times a day, try to schedule the Purge utility to run more than once a day, so that it can free up disk space in the repository.

Related Topics
• General best practices
• Using SAP applications as a source
• Organizing projects
• User Guide: Creating a rule

10.5.2.2 SAP application connection best practices

1. Choose the data retrieval method (synchronous or asynchronous) based on the data set size and
performance requirements of your SAP system.
2. For smaller data sets, use the synchronous method.


3. For larger data sets, use the asynchronous method, where the data from SAP systems is written
into a file that Information Steward uses.

It is recommended to use background processing on the SAP back-end, which is controlled by the user type of the connection user that you defined.

Dialog processing should be considered carefully. In this case, adjust run-time parameters such as the heap size for the maximum private memory and the maximum work process runtime.

Related Topics
• SAP Applications connection parameters

10.5.2.3 Organizing projects

How scorecards and projects are organized depends on how business users want to view the scorecards, but this organization can affect the response time for those users. If a project contains many scorecards, details of all the scorecards must be retrieved for viewing. So it is a good idea to create multiple projects according to the area of interest and to have a limited number of scorecards in each project.

For example, you could create projects for different geographical locations. Within each project, you
could have different scorecards, such as Customer, Vendor, and so on. Or you could create projects
based on Customer, Vendor, and so on, and then have scorecards based on geography.

The organization also helps you decide what data sources you want to use and the filter conditions
involved for profiling and rule execution.

Another benefit of proper organization is that you can control the user security per project and restrict
access to only specific users.

To avoid future problems, different Data Insight user groups should work together in the beginning of
the project to decide these aspects.

Related Topics
• General best practices
• Profiling and rules
• Using SAP applications as a source

10.5.3 Metadata Management best practices


1. Metadata integrators for BusinessObjects Enterprise, Data Services, and SAP Business Warehouse should be installed on their own dedicated servers if they require long processing times or if they run in overlapping time periods with other metadata integrators or Data Insight tasks.
2. The Metadata Relationship Service and Metadata Search Service can be combined.
3. Additional guidelines to consider for Metadata Relationship Service:
• Should be on a separate computer from the web application server to obtain higher throughput.
• Can be on its own computer, or it can be combined with any Metadata Integrator. The rationale
for this combination is that Metadata Integrators usually run at night or other non-business hours,
and the Metadata Relationship Service runs during normal business hours when users are viewing
relationships (such as impact and lineage) on the Information Steward web application.
4. Another guideline to consider for Metadata Search Service:
• Can be on its own computer, or it can be combined with any Metadata Integrator. The rationale
for this combination is that Metadata Integrators usually run at night or other non-business hours,
and the Search Server runs during normal business hours when users are searching on the
Metadata Management tab of Information Steward.
5. The File Repository Servers should be installed on a server with a high speed and high capacity
disk.
6. Adjust runtime parameters correctly.

Related Topics
• General best practices

10.5.4 Cleansing Package Builder best practices

1. The Cleansing Package Builder Auto-Analysis service should be on a dedicated server to obtain higher throughput.
2. Use sample data that represents the various patterns in your whole data set. Large amounts of sample data with too many repeating patterns lead to redundant processing overhead.
3. Run-time parameters should be set correctly for the Auto-Analysis service. Memory requirements are especially important: if enough memory is not made available to the process, it will run out of memory. For example, if the memory cap for Java is left at 1 GB, even though the system has 16 GB, the service will run out of memory. The best practice is to allocate 2-3 GB of memory to the Java services via the -Xmx setting.

Related Topics
• Common runtime parameters for Information Steward
• General best practices


Backing Up and Restoring Metadata Management

11.1 Backing up and restoring overview

This section describes suggested procedures for performing backups and restorations of various objects
associated with this product.

Related Topics
• Migration mechanisms and tools

11.2 Exporting objects to XML

This application provides an XML export utility, MMObjectExporter.bat, which allows you to export
repository objects to an XML file. The utility is installed on the machine on which you installed this
product. You specify its output by using required and optional command line arguments.

11.2.1 To export objects to XML

1. From a command prompt, navigate to the installation directory's subdirectory location.


2. Invoke the XML export utility by typing and running the command MMObjectExporter with the
appropriate arguments from the following table.


Option Description

configuration (Required) Represents the id number of the integrator source configuration to be exported. You can find the configuration id number at the end of the URL displayed in the status bar of the SAP BusinessObjects Metadata Management Explorer when you move the pointer over the integrator source name. Include only the configuration id number for this argument. Do not include the word configuration.
filename (Required) Represents the name of the XML file you want this product to
create, including the full path. If this argument contains blank spaces, you
must enclose the argument in quotation marks. Include only the full path
and file name for this argument. Do not include the word filename.
boeServer (Optional) Represents the name of the computer that contains the CMS
server. The default value is localhost. Include both the argument name
and the value, separated by a space.
boeAuthentication (Optional) Represents the logon authentication method. The default value
is secEnterprise. Include both the argument name and the value, separated
by a space.
boeUser (Optional) Represents the CMS user name to connect to the CMS server.
The default value is Administrator. Include both the argument name and
the value, separated by a space.
boePassword (Optional) Represents the password to connect to the CMS server. The
default is no password (the empty string ""). Include both the argument
name and the value, separated by a space.
mainObject (Optional) Represents the object type to be exported. The default value is
hierarchy. Include both the argument name and the value, separated by
a space. If you do not specify this argument, then the utility finds the top
level object types from the integrator's <object-hierarchy> element.
By default, all other objects related to the object or objects specified by
this argument are also included in the XML output. However, this argument
has an optional name list to limit the object selection to a specific set of
objects. The object type is limited to those shown in the object type XML
specification's <object-type> element type attribute in the file Object
Types.xml, and must be spelled and cased the same. The type attribute
is also used to specify the logical name of the table containing objects of
that type.

If there are no objects found in the repository that meet the mainObject
requirements, then the utility exits with an error message. If some but not
all of the objects are found, the utility generates the XML file but writes a
warning message about the objects that were not found to the log.

includeNulls (Optional) Represents a boolean value to indicate whether to include properties with null values. The default value is TRUE. Include both the argument name and the value, separated by a space.


includeParents (Optional) Represents a boolean value to indicate whether to include parents of objects. The default value is TRUE. Include both the argument name and the value, separated by a space.
includeChildren (Optional) Represents a boolean value to indicate whether to include children of objects. The default value is TRUE. Include both the argument name and the value, separated by a space.
includeRelationships (Optional) Represents a boolean value to indicate whether to include object
relationships. The default value is TRUE. Include both the argument name
and the value, separated by a space.
validate (Optional) Represents a boolean value to indicate whether to validate the
exported XML file. The default value is TRUE. Include both the argument
name and the value, separated by a space.
xsdUrl (Optional) Represents location of the file ObjectExportSchema.xsd.
The location of the file is added to the exported XML file as the attribute
schemaLocation. Other programs can validate and interpret the exported
XML file by reading the value of that attribute. Use this argument to override
the default name or the default location in order to specify a network-accessible location for the XSD file.

After the utility runs, this product creates an XML file according to the specified arguments.

Example:
In this example, at a command prompt positioned in the installation directory's subdirectory, the user
enters the following command and arguments:
mmobjectexporter 82 "c:\temp\first exported.xml"
boeUser Jane boePassword My1Password
mainObject Universe

The entries represent the following:


Entry Description

mmobjectexporter Invokes the utility.


82 Uses the configuration argument and represents in this example the ID
number of the configuration for export to XML.
"c:\temp\first exported.xml" Uses the filename argument and represents the path and name of the
created XML file. Here the argument is in quotation marks because the
file name contains a space.
boeUser Jane Uses the boeUser argument to specify the name of the CMS user.
boePassword My1Password Uses the boePassword argument to specify the password for the CMS user.
mainObject Universe Uses the mainObject argument to select all Universes in the configuration.
Alternatively, the argument could be more specific. For example, using
the syntax mainObject "Universe=MyUniverse" selects only the Universe
named MyUniverse. Using the syntax mainObject "Universe=MyUniverse,YourUniverse" selects the two Universes named MyUniverse and
YourUniverse. The parents and children of these objects would also be
included, because the arguments for includeParents and includeChildren
are not invoked, and so they default to TRUE.

11.3 Backing up and restoring configurations

Use these procedures when you want to backup and restore this application's configurations from one
system to another.

Note:
Use the backup utility in your Relational Database Management System to back up the repository that
contains all MMT_* tables. Use the Lifecycle management console for SAP BusinessObjects BI platform
to back up and restore configurations of this product.

11.3.1 Backing up configurations

You must have the current version of this product on both the source and target machines.

To back up your configurations, use the Lifecycle management console for SAP BusinessObjects BI
platform to create an output BIAR file for Metadata Management configuration information on the source
SAP BusinessObjects BI platform system:


1. On the "Destination environment" screen, specify a BIAR file that is accessible to both the source
and target SAP BusinessObjects BI platform machines.
2. On the "Select objects to import" screen, select only the Import application folders and objects
option.
3. On the "Select application folders and objects" screen, select Metadata Management.
4. On the "Select objects to import" screen, select the users and groups that have permissions to the
Metadata Management integrator source configurations.
5. On the next screen, select the users and groups that have permissions to the Metadata Management
integrator source configurations.
The following configuration information is backed up as a result of this procedure:
• Metadata Integrator source configuration
• Metadata Management utilities configurations
• Metadata source groups
• Security information (users, groups, and their permissions)

11.3.2 Restoring configurations

You must have this version of the application on both the source and target machines.

To restore the configurations on the target system:


1. Use the Lifecycle management console for SAP BusinessObjects BI platform to import the generated
BIAR file.
2. Do the following steps for each imported integrator source.
a. In Information Steward, go to the "Metadata Management" area and select the "Integrator Sources"
node.
The list of integrator source configurations displays by default.
b. Double-click the name of each integrator source configuration to open the Properties page and
enter the password for the source system.
For security purposes, the password is not stored in the backup information.
c. Save the Properties page.
3. Restart the system and Web application.


Life Cycle Management

12.1 Migration basics

About this section


Migration as it relates to SAP BusinessObjects Information Steward is the process of moving application
configurations from a test phase into production. The software supports simple and complex application
migration through all development phases.

12.1.1 Development process phases

The application development process typically involves two phases:


• Test phase
• Production phase
You can use SAP BusinessObjects Information Steward in both phases. Because each phase might
require a different repository to control environment differences, the software provides controlled
mechanisms for moving objects from phase to phase.

Each phase could involve a different computer in a different environment with different security settings.
For example, the initial test may require only limited sample data and low security, while final testing
may require a full emulation of the production environment including strict security.

12.1.1.1 Test phase

In this phase, you define and test the following objects for each module of SAP BusinessObjects
Information Steward:
• Data Insight—Define profile tasks, rules, and scorecards that instruct Information Steward in your
data quality requirements. The software stores the rule definitions so that you can reuse them or
modify them as your system evolves.


• Metadata Management—Define integrator sources, integrator source groups, and integrator source
instances that collect metadata to determine the relationships of data in one source to data in another
source.

After you define the objects, use SAP BusinessObjects Information Steward to test the execution of
your application. At this point, you can test for errors and trace the flow of execution without exposing
production data to any risk. If you discover errors during this phase, you can correct them and retest
the application.

The software provides feedback through trace, error, and monitor logs during this phase.

The testing repository should emulate your production environment as closely as possible, including
scheduling Data Insight tasks and Metadata Integrator runs rather than manually starting them.

12.1.1.2 Production phase

In this phase, you set up a schedule in the Central Management Console (CMC) to run your Data Insight
tasks and Metadata Integrator runs as jobs. Evaluate results from production runs and when necessary,
return to the test phase to optimize performance and refine your target requirements.

After you move the software into production, monitor it in the CMC for performance and results. During
production:
• Monitor your Data Insight tasks and Metadata Integrator runs and the time it takes for them to
complete.

The trace and monitoring logs provide information about each task and run.

You can customize the log details. However, the more information you request in the logs, the longer
the task or integrator run takes. Balance run time against the information necessary to analyze
performance.
• Check the accuracy of your data.

To enhance or correct your jobs:


1. Make changes in your test environment.
2. Repeat the object testing.
3. Move changed objects back into production.

12.2 Migration mechanisms and tools

SAP BusinessObjects Information Steward provides the following migration mechanisms:


• Lifecycle management console for SAP BusinessObjects BI platform


• Information Steward export and import

12.2.1 Moving objects using the lifecycle management console

Lifecycle management console for SAP BusinessObjects BI platform is a web-based tool that enables
you to move BI resources from one system to another system, without affecting the dependencies of
these resources. It also enables you to manage different versions of BI resources, manage dependencies
of BI resources, and roll back a promoted resource to restore the destination system to its previous
state.
You can use the lifecycle management console for SAP BusinessObjects BI platform to move objects
in the Central Management Server (CMS) between systems of the same version. For example, when you move
objects from the test system to the production system, you can accomplish the task through the lifecycle
management console.

You can use the lifecycle management console to move the following Information Steward objects:
• Integrator source configurations
• Metadata Management utilities
• Source groups
• Security information (including users, groups, and their permissions) for Metadata Management,
Metapedia, Data Insight, and Cleansing Package Builder. Most Information Steward security
information is stored at the folder level, so to move all of the security settings from one system to
another, promote each folder. The lifecycle management console has an option that lets you choose
whether to promote a job with its associated security and whether to include application rights.

12.2.2 Exporting and importing objects using Information Steward

Using Information Steward import and export functionality, you can move the following objects:


• Data Insight file formats: For more information about importing and exporting file formats, see the "Data Insight" section of the SAP BusinessObjects Information Steward User Guide.
• Data Insight rules: For more information about managing rules, see the "Data Insight" section of the SAP BusinessObjects Information Steward User Guide.
• Data Insight views: For more information about importing and exporting views, see the "Views" section of the SAP BusinessObjects Information Steward User Guide.
• Metapedia terms and categories: For more information about importing and exporting terms and categories with Excel, see the "Metapedia" section of the SAP BusinessObjects Information Steward User Guide.


Supportability

13.1 Information Steward logs

For each profiling and rule task and Metadata Integrator run, SAP BusinessObjects Information Steward
writes information in the following logs:
• Database Log - Use the database log as an audit trail. This log is stored in the Information Steward
repository. You can view this log while the Metadata Integrator or Data Insight profile or rule task
is running.
The default logging level for the database log is Information, which writes informational messages
(such as the number of reports processed) as well as any warning and error messages. Keep the
logging level for the database log at a high (less verbose) level so that the log does not occupy a
large amount of disk space.

• File Log - Use the file log for more detailed information about a Metadata Integrator or Data Insight
profile or rule task run. The Metadata Integrator creates this log in the Business Objects installation
directory and copies it to the File Repository Server. You can download this log file after the Metadata
Integrator run completes.
The default logging level for the file log is Configuration, which writes static configuration
messages as well as informational, warning, and error messages. You can change the logging level
for the file log if you want more detailed information. If your logs occupy a large amount of disk
space, you can change the maximum number of instances or days to keep logs.

13.1.1 Log levels

Each logging level logs all messages at that level or higher. Therefore, the default logging level
Information logs informational, warning, and error messages. If you change the logging level to Warning,
SAP BusinessObjects Information Steward logs warning and error messages. Similarly, if you change
the logging level to Integrator trace, Information Steward logs trace, configuration, informational, warning,
and error messages.

• Off: Turn off logging of all messages.
• Error: Log messages that indicate a serious failure.
• Warning: Log messages that indicate a potential problem.
• Information: Log informational messages.
• Configuration: Log static configuration messages.
• Integrator trace: Log integrator tracing information.
• SQL trace: Log SQL tracing information.
• System trace: Log highly detailed tracing information.
• All: Log all messages.
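
The cumulative behavior of these levels can be illustrated with a short sketch. The numeric ranks below are assumptions made only for this example; they are not the values Information Steward uses internally.

# Illustrative only: a level setting admits messages at that level or higher severity.
# The ranks are example values, not Information Steward internals.
LEVEL_RANK = {
    "All": 0,
    "System trace": 1,
    "SQL trace": 2,
    "Integrator trace": 3,
    "Configuration": 4,
    "Information": 5,
    "Warning": 6,
    "Error": 7,
}

def is_logged(message_level: str, configured_level: str) -> bool:
    """Return True if a message at message_level is written under configured_level."""
    if configured_level == "Off":
        return False
    if configured_level == "All":
        return True
    return LEVEL_RANK[message_level] >= LEVEL_RANK[configured_level]

# With the default Information level, warnings and errors are kept,
# while configuration and trace messages are suppressed.
assert is_logged("Error", "Information")
assert is_logged("Warning", "Information")
assert not is_logged("Configuration", "Information")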

13.1.2 Changing log levels

You can change the log levels for Metadata Management and Data Insight logs.

13.1.2.1 Changing Metadata Management log levels

To change the Metadata Management log levels, you must have the Schedule right on the integrator
source.
1. On the Information Steward page in the Central Management Console (CMC), expand the Metadata
Management node, and expand the Integrator Sources node to display all configured integrator
sources.
2. Select the integrator source for which you want to change the logging level by clicking anywhere on
the row except its type.
Note:
If you click the integrator type, you display the version and customer support information for the
integrator.

3. Select Action > Schedule in the top menu tool bar.


4. Click the Parameters node in the tree on the left.
5. From the drop-down list, select the logging level that you want for Database Log Level or File Log
Level.

6. Click Schedule.
Future runs of the recurring schedule for this integrator source will use the logging level you specified.

13.1.2.2 Changing Data Insight log levels

To change the Data Insight log levels, do the following:


1. On the Information Steward page in the Central Management Console (CMC), expand the Data
Insight node, and expand the Projects node.
2. Select the project source, and select the project for which you want to change the log level by clicking
anywhere on the row except its type.
3. Select Action > Schedule in the top menu tool bar.
4. Click the Parameters node in the tree on the left.
5. From the drop-down list, select the log level that you want for Database Log Level or File Log
Level.
6. Click Schedule.
Future runs of the recurring schedule for this Data Insight profile or rule task will use the logging level
you specified.

13.1.3 Viewing logs

You can view Metadata Management and Data Insight logs.

13.1.3.1 Viewing integrator source logs

To view Metadata Management integrator source logs:


1. From the Central Management Console (CMC), click Information Steward.
The "Information Steward" page opens with the Information Steward node selected in the Tree
panel.
2. Expand the Metadata Management node and select the Integrator Sources node.
A list of configured integrator sources appears in the right panel with the date and time each was
last run.
3. Select the integrator source and click Action > History in the top menu tool bar.

The "Integrator History" pane displays each schedule in the right panel.
4. Select the schedule name and click the icon for View the database log.
The "Database log" shows the task messages which are a subset of the message in the log file.
5. To find specific messages in the "Database log" window, enter a string in the text box and click
Filter.
For example, you might enter error to see if there are any errors.
Note:
• For information about troubleshooting Cleansing Package Builder, see “Troubleshooting” in the
User Guide.
• For information about troubleshooting integrator sources, see Troubleshooting.
6. To close the "Database log" window, click the X in the upper right corner.

13.1.3.2 Viewing Data Insight task logs

To view Data Insight task logs:


1. From the Central Management Console (CMC), click Information Steward.
The "Information Steward" page opens with the Information Steward node selected in the Tree
panel.
2. In the Tree panel, expand the Data Insight node.
3. Expand the Projects node.
4. Select the name of your project in the Tree panel.
A list of tasks appears in the right panel with the date and time each was last run.
5. Select the task and click Action > History in the top menu tool bar.
The "Data Insight Task history" pane displays each instance the task was executed.
6. Select the instance name and click the icon for View the database log.
The "Database log" shows the task messages which are a subset of the message in the log file.
7. To find specific messages in the "Database log" window, enter a string in the text box and click
Filter.
For example, you might enter error to see if there are any errors.
Note:
For information about troubleshooting Cleansing Package Builder, see “Troubleshooting” in the User
Guide.

8. To close the "Database log" window, click the X in the upper right corner.


13.1.4 Viewing additional logs

You can also view log information for SAP BusinessObjects Data Services and SAP BusinessObjects
Enterprise XI 4.0.

Data Services log files


The log files are located in the Data Services log directory, for example: C:\Program Files
(x86)\SAP BusinessObjects\Data Services\log.

You can also find log files in the following locations:

C:\Program Files (x86)\SAP BusinessObjects\Data Services\log\MetadataService

C:\Program Files (x86)\SAP BusinessObjects\Data Services\log\ViewdataService

Look for log files associated with job execution, for example, errorlog.txt and tracelog.txt.

BusinessObjects Enterprise Cleansing Package Builder log files


The log files are in the platform Log directory where CMS is installed. For example, C:\Program
Files (x86)\SAP BusinessObjects\SAP BusinessObjects Enterprise XI 4.0\logging

Look for the following log files:


• InformationSteward.RelationshipService.log
• InformationSteward.SearchService.log
• InformationSteward.IntegratorService.log
• InformationSteward.SchedulingService.log
• InformationSteward.Administrator.log
• InformationSteward.Explorer.log

On machines with only Web Applications, the log files (InformationSteward.Administrator.log and
InformationSteward.Explorer.log) are stored in the Web Application temp directory. For example,
C:\Program Files (x86)\SAP BusinessObjects\Tomcat6\temp\ICC

Cleansing Package Builder log files


All Cleansing Package Builder service log details can be found in the EIM Adaptive Processing Server
trace log under <BOE Install>\SAP BusinessObjects Enterprise XI 4.0\logging

For example: C:\Program Files (x86)\SAP BusinessObjects\SAP BusinessObjects Enterprise XI 4.0\logging\pjs_<BOE Node Name>EIMAdaptiveProcessingServer_trace.00000X.glf
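
When you need to check several of these files at once, the review can be scripted. The following sketch is a minimal example, assuming a Windows host with the default logging directory shown above; it scans the Information Steward service logs and the EIM Adaptive Processing Server trace logs for lines that mention errors. Adjust the directory and the search string for your environment.

# Minimal sketch: scan Information Steward and EIM APS log files for error lines.
# The directory below is the example path from this section; adjust it to match
# your installation.
import glob
import os

LOG_DIR = r"C:\Program Files (x86)\SAP BusinessObjects\SAP BusinessObjects Enterprise XI 4.0\logging"

PATTERNS = [
    "InformationSteward.*.log",                  # Information Steward service and web application logs
    "*EIMAdaptiveProcessingServer_trace.*.glf",  # Cleansing Package Builder trace logs
]

for pattern in PATTERNS:
    for path in sorted(glob.glob(os.path.join(LOG_DIR, pattern))):
        with open(path, "r", encoding="utf-8", errors="replace") as log_file:
            for line_number, line in enumerate(log_file, start=1):
                if "error" in line.lower():
                    print(f"{os.path.basename(path)}:{line_number}: {line.rstrip()}")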


Appendix

14.1 Glossary

accuracy
The extent to which data objects correctly represent the real-world values for which they
were designed.
address profiling
A process that componentizes and measures address data with dictionary data.
alternate
A substitute spelling or nickname; Mr. is an alternate for Mister.
annotation
User notes added to an object in Metadata Management and Data Insight.
association
A relationship between terms contained in a Metapedia business glossary and metadata
objects.
catalog
A relational object type that, in a relational database management system (RDBMS),
corresponds to a database. Each deployment contains two such object types, one for the
datasources and one for the target tables.
category
The organization system for grouping terms to denote a common functionality. Categories
can contain sub-categories, and you can associate terms to more than one category.
cleansing package
The parsing rules and other information that define how to parse and standardize the data
of a specific data domain.
Cleansing Package Builder
A module of Information Steward that allows a data steward to create and modify cleansing
packages for any data domain. A cleansing package is then used to process the data in
accordance with package guidelines through SAP BusinessObjects Data Services.
cleansing package category
The organization system for grouping terms to denote a common functionality. Cleansing
package categories can contain sub-categories, and you can associate terms to more than
one category.

completeness
The extent to which data is not missing.
conformity
The extent to which data conforms to a specified format.
consistency
The extent to which distinct data instances provide non-conflicting information about the
same underlying data object.
context definition
A method that allows users to specify context when data contains a pattern or contains
parsed values that have a special meaning when used together, such as a range of
acceptable values.
custom attribute
Properties you add to existing metadata objects that, once defined, can be searched for
and viewed.
custom cleansing package
The parsing rules and other information that you have defined in order to parse and
manipulate all types of data including operational and product data.
data insight project
A collaborative space for data stewards and data analysts to assess and monitor the data
quality of a specific domain and for a specific purpose (such as customer quality
assessment, sales system migration, and master data quality monitoring).
data steward
A person who manages data as an asset, is an expert in his data domain and is responsible
for the quality of the data.
database
One or more large structured sets of persistent data, usually associated with software, to
update and query the data. A relational database organizes the data, and relationships
between them, into tables.
datasource schema
The definition of a table’s columns and primary keys.
dependency profiling
A process that determines whether the data in one column or table is based on the results
of another column or table.
dimension
A logical grouping of characteristics within an InfoCube.
directory structure
A hierarchy within Information Steward that is organized into folders that contain four
categories, namely Data Integration, Business Intelligence, Data Modeling, and Relational
Databases.
extract
A process by which Information Steward copies information from source systems and loads
it into the repository.
file format
Flat file definition which includes the column name, data type, delimiter or character width.
This is equivalent to the schema for a relational database table.
impact diagram
A graphical representation of the object(s) that will be affected if you change or remove
other connected objects.
InfoCube
A type of InfoProvider that describes a self-contained dataset, for example, from a business
oriented area.
integrator source
A named set of parameters that describes how a metadata integrator can access a data
source.
integrity
The extent to which data is not missing important relationship linkages.
key data domains
A set of related data objects or key data entities.
lineage diagram
A diagram that shows where the data comes from and what sources provide the data for
this object.
metadata integrator
An application that collects information about objects in a source system and integrates it
in one or more related source systems.
metadata object
A unit of information that the software creates from an object in a source system.
Metapedia
A custom glossary within Information Steward that you use to define and organize terms
and categories related to your business data.
Metapedia category
The organization system for grouping Metapedia terms to denote a common functionality.
Categories can contain sub-categories, and you can associate Metapedia terms to more
than one category.
Metapedia term
A word or phrase that defines a business concept in your organization.
MultiProvider
An SAP NetWeaver Business Warehouse object that combines data from several
InfoProviders and makes it available for reporting.
object equivalency rule
A naming rule that indicates that an object in one source system is the same physical
object in another source system.

object tray
A temporary holding space for objects that you want to export or define a relationship for
in Metadata Management.
Open Hub Destination
An SAP NetWeaver Business Warehouse object within the open hub service that contains
all information about a target system for data in an InfoProvider. The target system can be
external.
parent-child
A hierarchical relationship where one object is subordinate to another. In this hierarchy,
the parent is one level above the child; a parent can have several children, but a child can
have only one parent. For example, a table can have multiple columns, but a column can
belong to only one table.
parsed value
A data string that results from parsing.
parsing rule
A rule that determines how data is classified based on a pattern within the data and how
the data is mapped to specific attributes.
person and firm cleansing package
A cleansing package that parses party or name and firm information such as given name,
family name, prename, title, phone number, and firm or company name.
private cleansing package
A cleansing package that can be viewed or edited only by the user who owns it.
profile
A process that generates attributes about the data such as minimum and maximum values,
pattern distribution and data dependency to help data analysts discover and understand
data anomalies.
profile
To generate attributes about the data.
profile task
A task to profile one or more tables, views and/or flat files. This task can be scheduled or
executed on demand.
properties file
A collection of information that appears on the Report tab of each report; it includes such
information as the name and description of the report, and the source(s) of the information
contained in the report.
published cleansing package
A cleansing package that is either SAP-supplied or created and then is published by a
data steward, is available to all users, and can be used in a Data Services transform.
quality dimensions
A category for rules such as accuracy and completeness. This helps to organize your rules
and provides a score that contributes to the scorecard value.

query
An SAP NetWeaver Business Warehouse object consisting of a combination of
characteristics and key figures (InfoObjects) that allow you to analyze the data in an
InfoProvider.
query views
An SAP NetWeaver Business Warehouse object consisting of a modified view of the data
in a query or an external InfoProvider.
redundancy profiling
A process that measures the amount of repeated data.
rule task
A task to run rules bound to one or more tables, views, and/or flat files. This task can be
scheduled or executed on demand.
same as relationship
The association between two objects indicating that they are identical physical objects.
Only objects of the same object type can have this kind of association.
schema
A definition of a table in a relational database.
score
A numerical result calculated by counting the records that pass a rule divided by the total
number of records.
scorecard
A high level data quality view of a key data domain based on business data quality
objectives.
scripting language
Expression language used to write validation rules.
server instance
A database, data source, or service in a relational database management system.
source
An object that provides data that is copied or transformed to become part of the target
object.
source group
A set of related integrator sources.
source system
A software application from which SAP BusinessObjects Information Steward extracts and
organizes metadata into directory structures, enabling you to navigate and analyze the
metadata.
standard form
The standardized or normalized form of a variation, which is displayed after cleansing.
sub-category
Within a category, the organization system for grouping terms to denote a common
functionality.
synonym
Another name for an object in the same system. For example, a synonym for a relational
table exists in the same database as the table.
target schema
A set of tables. A project can only contain one target schema, and its name is always target
schema.
timeliness
The extent to which data is sufficiently up-to-date for the task at hand.
transfer rules
An SAP NetWeaver Business Warehouse object that determines how the data for a
DataSource is to be moved to the InfoSource. The uploaded data is transformed using
transfer rules.
transformation
An SAP NetWeaver Business Warehouse object that consists of functions for unloading,
loading, and formatting data between different data sources and data targets that use data
streams.
transformation name
The identity of the universe object, if the target is a measure, or the data flow name, if the
source data was taken from an Extract, Transform, and Load (ETL) system.
uniqueness
The extent to which the data for a set of columns is not repeated.
uniqueness profiling
A process that determines whether the exact piece of data is repeated within the same
column or differing columns.
usage scenario
An example that is typical of the kinds of tasks you’d like to perform with the software.
validation rules
A method that assesses the quality of data in the source system. These rules are bound
to one or more columns to derive a score.
variation
A value that has been assigned to an attribute.
web template
An SAP NetWeaver Business Warehouse object consisting of an HTML document that
determines the structure of a Web application.
workbook
An SAP NetWeaver Business Warehouse object consisting of a Microsoft Excel spreadsheet
with one or more embedded NetWeaver Business Warehouse queries.

