
http://bisharath.blogspot.com/2011/01/why-dtp-and-not-infopackagein-sap-bi.html

Business Intelligence

Purpose

The reporting, analysis, and interpretation of business data is of central importance to a company in guaranteeing its competitive edge, optimizing processes, and enabling it to react quickly and in line with the market. With Business Intelligence (BI), SAP NetWeaver provides data warehousing functionality, a business intelligence platform, and a suite of business intelligence tools with which an enterprise can attain these goals. Relevant business information from

productive SAP applications and all external data sources can be integrated, transformed, and consolidated in BI with the toolset provided. BI provides flexible reporting, analysis, and planning tools to support you in evaluating and interpreting data, as well as facilitating its distribution. Businesses are able to make well-founded decisions and determine target-oriented activities on the basis of this analysis.

Integration

The following figure shows where BI is positioned within SAP NetWeaver. In addition, the subareas that are covered by the BI documentation are listed. These are described in detail below.

Integration with Other SAP NetWeaver Components

BEx Information Broadcasting allows you to publish precalculated documents or online links containing business intelligence content to the portal. The Business Explorer portal role illustrates the various options that are available to you when working with content from BI in the portal. For more information, see Information Broadcasting. The BEx Broadcaster, the BEx Web Application Designer, the BEx Query Designer, KM Content, SAP Role Uploads, and the Portal Content Studio are used to integrate BI content into the portal. For more

information, see Integrating BI Content into the Portal. The documents and metadata created in BI (metadata documentation in particular) can be integrated using the repository manager in knowledge management. The BI Metadata Repository Manager is used within BEx Information Broadcasting. For more information, see BI Document Repository Manager and BI Metadata Repository Manager. You can use SAP NetWeaver Exchange Infrastructure (SAP NetWeaver XI) to send data from SAP and non-SAP sources to BI. In BI, the data is placed in the delta queue where it is available for further

integration and consolidation. Data transfer using SAP NetWeaver XI is SOAP-based. For more information, see Data Transfer Using SAP XI.

Integration with the BI Content Add-On

With BI Content, SAP delivers preconfigured role- and task-based information models and reporting scenarios for BI that are based on consistent metadata. BI Content provides selected roles within a company with the information they need to carry out their tasks. The information models delivered cover all business areas and integrate content from almost all SAP and selected external applications. For more information, see BI Content.

Features

Subareas of BI

Data Warehousing: Data warehousing in BI represents the integration, transformation, consolidation, cleanup, and storage of data. It also incorporates the extraction of data for analysis and interpretation. The data warehousing process includes data modeling, data extraction, and administration of the data warehouse management processes. The central tool for data warehousing tasks in BI is the Data Warehousing Workbench.

BI Platform: The business intelligence platform serves as the technological infrastructure and offers various analytical technologies and functions. These include the Analytics Engine, the Metadata Repository, Business Planning and Simulation, and special analysis processes such as data mining.

BI Suite: Business Explorer: The SAP NetWeaver Business Intelligence Suite, the Business Explorer (BEx), provides flexible reporting and analysis tools for strategic analyses, operational reporting, and decision-making support within a business. These tools include query, reporting, and analysis functions. As an employee with access authorization, you can evaluate past or current data on various levels of detail and from different perspectives.

Development Technologies: BI Java SDK: You use the BI Java SDK to create analytical applications with which you access both multidimensional (Online Analytical Processing, or OLAP) and tabular (relational) data. You can also edit and display this data. BI Java Connectors, a group of four JCA-enabled (J2EE Connector Architecture) resource adapters, implement the BI Java SDK's APIs.

Why DTP and not InfoPackage in SAP BI 7.0? (Difference between DTP and InfoPackage)

Posted by Sharath. Labels: BI, Difference between DTP and Infopackage, INFOPACKAGE, sap BI, Why DTP and not Infopackage in SAP BI

Summary

This document explains why the DTP replaced the InfoPackage (IP) in BI 7.0 for loading data into data targets and data marts. It highlights the main differences between the DTP and the IP, and the significance of the DTP over the IP. Author: Balamurugan A.J, from SDN.

Introduction

DTPs and IPs are tools used to load data from external systems into BI. In BW 3.5 and earlier versions, an IP is used to load data from the source system all the way to the final data target. In BI 7.0, the IP loads data only as far as the PSA, and the DTP loads it into the data target and data mart. This document explains why the DTP replaced the IP in BI 7.0. First, let us look at the basic concepts of the DTP and the IP.

DTP

Data Transfer Process

A data transfer process (DTP) loads data within BI from one object to another, applying transformations and filters. In short, a DTP determines how data is transferred between two persistent objects. It is used to load data from the PSA into a data target (an InfoCube, DataStore object, or InfoObject); it thus replaced the data mart interface and the InfoPackage for this purpose. DTP fundamentals: http://help.sap.com/saphelp_nw2004s/helpdata/en/42/f98e07cc483255e10000000a1553f7/content.htm
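Conceptually, a DTP moves records between two persistent objects while applying a transformation and a filter. The following Python sketch is purely illustrative (function and field names are hypothetical, not real SAP APIs); it shows the idea of the transfer described above, under the assumption that records are simple dictionaries:

```python
# Hypothetical sketch of what a DTP does conceptually: move records from one
# persistent object (e.g. the PSA) to another (e.g. an InfoCube), applying a
# transformation and a filter on the way. All names are illustrative only.

def run_dtp(source_records, transform, record_filter):
    """Transfer records from a source object to a target, DTP-style."""
    target = []
    for record in source_records:
        if record_filter(record):             # DTP filter, applied per record
            target.append(transform(record))  # transformation rules
    return target

psa = [
    {"customer": "C1", "region": "NY", "amount": 100},
    {"customer": "C2", "region": "WA", "amount": 250},
]

# Load only NY records, converting amounts (hypothetical conversion factor).
cube = run_dtp(
    psa,
    transform=lambda r: {**r, "amount": r["amount"] * 0.9},
    record_filter=lambda r: r["region"] == "NY",
)
print(cube)  # [{'customer': 'C1', 'region': 'NY', 'amount': 90.0}]
```

The point of the sketch is only the shape of the operation: source object in, filter and transformation applied, target object out.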

IP

InfoPackage

The InfoPackage is the entry point for SAP BI to request data from a source system. InfoPackages are tools for organizing data requests that are extracted from the source system and loaded into the BW system. In short, loading data into BW is accomplished using InfoPackages. IP basics: http://help.sap.com/saphelp_bw33/helpdata/en/44/420a3b7a3cf508e10000000a114084/frameset.htm

Reasons why the DTP is preferred over the IP

Loading

A DTP follows a one-to-one mechanism: there is one DTP per data target, whereas an IP loads into all data targets at once. The same DTP can run both full and delta loads for the same target, which is not possible with a single IP.

Delta Mechanism

A DTP has its own source-to-target delta handling mechanism. This makes the following possible: different data selections for one DataSource via different DTPs (for the same or different targets), which is not possible with an IP. No DTP requests appear in the Reconstruct tab, because each target has its own delta handling. There is no "Repair Full / Full Repair" concept in the DTP, because of its one-to-one delta mechanism. Deleting a data mart request deletes the request itself, not just its status, as is the case with an IP.

Filter

A DTP filters data based on a semantic key, which is not possible with an IP.

Filters in an IP filter the data before it enters BI, so the PSA contains only the filtered data. A DTP filter, by contrast, filters data only at the data target level, keeping the complete data set in the PSA.

Debugging

Breakpoints can be set at the data packet level at all stages (before extraction, before data transfer, and after data transfer), whereas no breakpoints exist for an IP.

Error Handling

Temporary data storage and the error stack improve error handling in the DTP,

which doesn't exist in the IP. (The temporary storage area contains all the data, whereas the error stack holds only the erroneous records. The temporary storage area can be switched on or off at each stage of the process.) Error handling for a DSO is possible only with a DTP.

Reloading

Reloading a bad request without deleting the request in the data target is possible only with a DTP, using the Manual Update option.

Performance

A DTP is faster, as DTP data loading is optimized through parallel processing. Data loading through a DTP has no

TRFCs/LUWs generated at run time, thus minimizing the loading time.

Migration

It is possible to migrate from an IP flow to a DTP flow if you are on BI 7.0 or above. Create a DTP and set the processing mode to "Serial Extraction, Immediate Parallel Processing" to switch from the IP to the DTP without data loss. See: Migrating IP flows to DTP; SAP Note 920644 - DataStore with emulated DataSource: InfoPackages/DTP; http://help.sap.com/saphelp_nw2004s/helpdata/en/43/03c035dc7821abe10000000a1553f6/content.htm

Doc SDN link: http://www.sdn.sap.com/irj/scn/index?rid=/library/uuid/f0ada4bf-e963-2d10-aa89-abb9eeae7723
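The delta behavior described in the article above (one DTP per data target, each with its own source-to-target delta pointer and its own selection) can be sketched in a few lines of Python. This is an illustrative model only, not SAP code; the class, attribute, and mode names are hypothetical:

```python
# Illustrative sketch of the DTP delta mechanism: each DTP keeps its own
# delta pointer into the source, so two DTPs can apply different selections
# to the same DataSource, and the same DTP can run both full and delta loads.

class Dtp:
    def __init__(self, target, selection=None):
        self.target = target                      # one DTP per data target
        self.selection = selection or (lambda rec: True)
        self.delta_pointer = 0                    # per-target delta handling
        self.loaded = []

    def load(self, source, mode="delta"):
        start = self.delta_pointer if mode == "delta" else 0  # "full" rereads all
        batch = [r for r in source[start:] if self.selection(r)]
        self.loaded.extend(batch)
        self.delta_pointer = len(source)          # remember what we have seen
        return batch

source = ["r1", "r2", "r3"]
dtp_a = Dtp("CUBE_A")
dtp_b = Dtp("CUBE_B", selection=lambda r: r != "r2")  # different selection, same source

dtp_a.load(source)           # first delta: r1, r2, r3
source.append("r4")
print(dtp_a.load(source))    # next delta: only r4
print(dtp_b.load(source))    # independent pointer and selection: r1, r3, r4
```

Because each `Dtp` instance carries its own pointer, deleting or reloading one target's requests cannot disturb another target's delta, which is the property the article credits the DTP with.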

InfoPackage

Definition

An InfoPackage (active object: ISIP / delivery object: SHIP) is a source-system-dependent object that has a GUID as its key. There is one object for each source system. You can find additional information about this BW object under Maintaining an InfoPackage.

Structure

Key Fields of an InfoPackage (ISIP/SHIP)

Version | Key Fields | Data Fields
A (active) / M (modified) / T (Transport) - ISIP | GUID | Source system
D (Delivery / Content) - SHIP | GUID | Source system, release

As the source system does not exist in the key, but is contained within the data part, the InfoPackage object is also delivered as a shadow table.

Example

The following example aims to show what you need to be aware of when delivering InfoPackages within partner Content. The Content development system is connected to source system Q1. There is exactly one InfoPackage with GUID1 for this source system. In the customer system, this InfoPackage is to be activated for several source systems with the same release status:

Source system X1, Release 4.0
Source system X2, Release 4.0

Accordingly, the InfoPackage is copied into the A version twice. The logic behind this is the same as for transfer rules (see Transfer Rules and DataSource). Conversion is carried out by RSSM_SHIP_PSEUDO_DVERSION_WRI. The copies contain GUID3 and GUID4.

When SAP delivers again, you need to know from which InfoPackage GUID3 and GUID4 are derived. This information is stored in table RSSHIPDVERS:

Table RSSHIPDVERS
LOGDPID_SH | LOGSYS | LOGDPID
GUID3 | X1 | GUID2
GUID4 | X2 | GUID2

See also:

Transport and Delivery of Source-System Dependent Objects

InfoPackage

The following table gives an overview of the InfoPackage properties that are relevant for transport, BI Content delivery, and BI Content activation.

Classification: Degree of dependency | Directly dependent
Type of dependency | Dependency in object key
Key technology | GUID

Active Object
Transport object | ISIP
Object key | Logical InfoPackage ID
Source system reference in data part | Field LOGSYS in primary table: reference to source system
Reference to shadow object | Table RSSHIPDVERS
Primary table | RSLDPIO
Import version | A
Object versions in primary table | A | Pseudo D

Shadow Object
Transport object for delivery | SHIP
Object key | Logical InfoPackage ID
BI Content version reference in data part | Fields SRCREL | CONTSRCTYPE | CONTSRCVERS in the primary table
Identification for shadow version | Separate tables
Primary table | RSLDPIOSH
Pseudo D version | Persistent through module RSSM_SHIP_PSEUDO_DVERSION_WRI
Maintenance | Creating InfoPackages

Example

The following example shows what you need to bear in mind when delivering InfoPackages. The content development system is connected to the Q1 source system. There is exactly one InfoPackage with GUID1 for this source system. In the customer system, this InfoPackage is to be activated for several source systems with the same release status:

Source system X1, release 4.0
Source system X2, release 4.0

Accordingly, the InfoPackage is copied twice to the A version. The logic is similar to the logic for transfer rules (more information: Source System-Dependent 3.x Objects). Conversion is carried out by the RSSM_SHIP_PSEUDO_DVERSION_WRI function module. The copies get GUID3 and GUID4.

If there is another delivery, you have to know from which InfoPackage GUID3 and GUID4 are derived. This information is stored in the RSSHIPDVERS table:

RSSHIPDVERS Table
LOGDPID_SH | LOGSYS | LOGDPID
GUID2 | X1 | GUID3
GUID2 | X2 | GUID4

Using the RSSM_SHIPDVERS_CLEANUP report, you can display and repair inconsistent InfoPackages if necessary. For more information, see SAP Note 844739.
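The RSSHIPDVERS bookkeeping above can be pictured as a simple lookup: for each source system, the table records which active-version GUID was derived from which delivered (shadow) GUID, so a later delivery can find the right copies to update. A minimal illustrative sketch in Python (not ABAP, and the helper name is hypothetical), using the values from the table in the newer documentation version:

```python
# Sketch of the RSSHIPDVERS lookup: map a delivered (shadow) InfoPackage
# GUID to the active copies created for each source system.

RSSHIPDVERS = [
    # (LOGDPID_SH, LOGSYS, LOGDPID): shadow GUID, source system, active GUID
    ("GUID2", "X1", "GUID3"),
    ("GUID2", "X2", "GUID4"),
]

def active_copies(shadow_guid):
    """Find all active InfoPackages derived from a delivered GUID."""
    return {logsys: logdpid
            for sh, logsys, logdpid in RSSHIPDVERS if sh == shadow_guid}

print(active_copies("GUID2"))  # {'X1': 'GUID3', 'X2': 'GUID4'}
```

On redelivery, the system consults exactly this mapping instead of creating fresh copies, which is why the table must stay consistent (hence the RSSM_SHIPDVERS_CLEANUP report).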

More Information

Object Dependencies and Consistency
Classification of Source System-Dependent Objects
Enhancements to the BI Versioning Concept

Maintaining InfoPackages

Use

With the help of an InfoPackage, you determine the selection conditions in BW for the data request from a source system. In this section, you select the contents of the InfoPackage so that it can be scheduled. If you do not enter any specific selections, all of the available data for the DataSource is requested from the source system. You only have to make a selection when dealing with hierarchies. If you have previously selected a delta process, you only get the affected delta for your selection. There is a maximum of one active delta selection for a combination of InfoSource, DataSource, and source system. If you change already-scheduled InfoPackages, your new selection is used in later background jobs, for example periodic runs. You have to create a new InfoPackage if you want to continue to use the existing one. If you access an external system with the data request, you must make the data available in special formats. You can find additional information about this in the sections Maintaining InfoSources (Flat Files) and Maintaining InfoSources (External Systems). InfoPackages have a connection to the BW transport system and document management.

Procedure

1. Highlight a source system in the InfoSource tree in the Administrator Workbench.
2. Choose Context menu (right mouse click) -> Create InfoPackage and enter any alphanumeric or numeric name for the InfoPackage. The dialog box that follows shows the source system from which you want to load data and the InfoSource into which it is loaded. You will see the DataSources that are assigned to the InfoSource to which data is to be loaded.
3. Select a DataSource for which you want to load data into BW. If the InfoPackage is scheduled, data is loaded into BW from the DataSource selected here.

Now go through the procedure for the relevant system:

Procedure for SAP Systems
Procedure for Flat Files
Procedure for External Systems with BAPI Functions

Maintaining InfoPackages

Use

You use an InfoPackage to determine the selection conditions in BI for a data request from a source system. In this step, you select the contents of the InfoPackage so that it can be scheduled. If you do not enter specific selections, all of the available data for the DataSource is requested from the source system. You only have to enter a selection if you are working with hierarchies. If you have selected a delta process, you only get the relevant delta for your selection. A maximum of one active delta selection exists for a combination of InfoSource, DataSource, and source system. If you change InfoPackages that have already been scheduled, your new selection is used for all subsequent background jobs, such as periodic runs. If you want to continue to use the existing InfoPackage, you have to create a new InfoPackage. If you are accessing a flat file or a non-SAP system with the data request, you have to make the data available in particular formats. For more information, see Maintaining InfoSources (Flat Files) and Maintaining InfoSources (External Systems). InfoPackages have a connection to the transport system and to the BI document management system (more information: Transporting BI Objects, Documents).

Procedure

1. In the InfoSource tree in the Data Warehousing Workbench, select a source system.
2. In the context menu of the source system, choose Create InfoPackage and enter any alphanumeric or numeric name for the InfoPackage. A dialog box appears that displays the InfoSource and source system that you are using in the data load. This dialog box only displays the DataSources that are assigned to the InfoSource into which you are loading data.
3. Select the DataSource that you want to use to load data into BI. If the InfoPackage is scheduled, data is loaded into BI from the DataSource you select here.

Perform the procedure for the appropriate system:

Procedure for SAP systems
Procedure for flat files
Procedure for non-SAP systems with BAPI functionality
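The rule stated above, that at most one active delta selection may exist per combination of InfoSource, DataSource, and source system, can be sketched as a small constraint check. This is an illustrative Python model only; the function and key names are hypothetical, not SAP APIs:

```python
# Sketch of the delta-selection rule: one active delta selection per
# (InfoSource, DataSource, source system) combination. To use a different
# selection, you create a new InfoPackage rather than change the delta.

active_delta = {}  # (infosource, datasource, source_system) -> selection

def schedule_delta(infosource, datasource, source_system, selection):
    key = (infosource, datasource, source_system)
    if key in active_delta and active_delta[key] != selection:
        raise ValueError(
            f"delta already initialized for {key}; "
            "changing the selection requires a new InfoPackage"
        )
    active_delta[key] = selection

schedule_delta("IS_CUST", "DS_CUST", "Q1", {"region": "NY"})

# A second, different delta selection for the same combination is rejected:
try:
    schedule_delta("IS_CUST", "DS_CUST", "Q1", {"region": "WA"})
except ValueError as e:
    print(e)
```

The same check also illustrates why a changed selection only affects subsequent background jobs: the stored selection for the combination is what every later run reads.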

developerWorks > Information Management > Technical library

Data integration with SAP NetWeaver Business Intelligence using IBM Information Server

Jeff J. Li (liji@us.ibm.com), Advisory Software Engineer, IBM

Summary: SAP NetWeaver Business Intelligence (SAP BI) is an integral part of the SAP NetWeaver 2004s application server. Designing extract, transform, and load (ETL) jobs to load data into and extract data from SAP BI is often a time-consuming task. This article provides a step-by-step guide on how to use IBM Information Server to load data into SAP BI. It also explains in detail the procedure used to leverage IBM Information Server to extract data from SAP BI.

Date: 21 Feb 2008
Level: Introductory
Also available in: Russian

Introduction

SAP BI provides an Enterprise Data Warehouse solution for SAP customers to consolidate and analyze their business data.

Enterprises can use SAP BI to gain insight into their business, react to market changes, and gain competitive advantages. In SAP NetWeaver 2004, SAP BI was called SAP Business Information Warehouse (SAP BW). Building an SAP BI data warehouse is a complex project. It includes various activities such as planning, data modeling, data sizing, ETL job design, and performance tuning. Designing the ETL jobs to load data into and extract data from SAP BI is often the most time-consuming task. IBM Information Server is a unified and comprehensive information integration platform. Businesses can use IBM Information Server to connect to various data sources, retrieve and process data contents, and deliver cleansed, high-quality information. In SAP BI projects, IBM Information Server can be leveraged as an efficient ETL tool to process large volumes of data and build the enterprise data warehouse. Appendix A explains the terminology used in this article. Appendix B lists the tools for creating the examples shown in this article.

Product prerequisites and installation

IBM Information Server includes many software products for data integration and analysis tasks. Those products include WebSphere DataStage, WebSphere QualityStage, WebSphere

Information Analyzer, WebSphere Federation Server, and other companion products. Depending on the specific project requirements, you can choose to install a subset of the products in the IBM Information Server. Figure 1. Software products needed for designing ETL jobs for SAP BI and SAP BW

Figure 1 shows the minimum set of IBM Information Server products

needed to design ETL jobs for the SAP BI or BW data warehouse.

WebSphere DataStage, which includes:
- DataStage Client
- DataStage Server
- DataStage Metadata Repository
- DataStage Domain Server

The WebSphere DataStage products can be installed separately on different hosts or installed on the same host.

WebSphere DataStage Pack for SAP BW (DataStage BW Pack): The DataStage BW Pack is a companion product of IBM Information Server. The pack was originally developed to support SAP BW and currently supports both SAP BW and SAP BI. The GUIs of the DataStage BW Pack are installed on the DataStage Client. The runtime part of the pack is installed on the DataStage Server.

SAP Remote Function Call (RFC) Library: The SAP RFC library is an external component to IBM Information Server. The DataStage BW Pack uses the SAP RFC interface to call SAP BI and SAP BW functions. The SAP RFC library is a prerequisite for using the DataStage BW Pack and must be installed on both the DataStage Client and the DataStage Server.

Architecture overview

The software components in Figure 1 play different roles in designing and executing the ETL jobs for SAP BI. The DataStage Client and DataStage BW Pack GUI components provide a friendly user interface to design ETL jobs and to set up the data operations to be performed on SAP BI systems. The DataStage Server and DataStage BW Pack Server components enable users to schedule and run the ETL jobs. The DataStage Domain Server manages user accounts and authorizes users to use different features of the IBM Information Server.

The DataStage Metadata Repository is a database for storing and sharing table, field, and object definitions. The DataStage BW Pack includes four major components:

BW Load Stage: Loads data from non-SAP data sources to an SAP BI system. BW Load Stage is an SAP-certified data-loading integration solution implemented using the SAP Staging BAPI interface.

BW Extract Stage: Extracts data from an SAP BI system. It is an SAP-certified data-extraction integration solution based on the SAP Open Hub Service interface.

BW RFC Server: Implements various functions that are invoked

by an SAP BI system. It accepts the SAP BI-initiated data-loading or data-extraction requests and triggers the DataStage jobs to execute the corresponding data operations.

BW RFC Manager: Manages the BW RFC Server processes. It creates one BW RFC Server process per source system. It also provides the functions to start or stop BW RFC Server processes.

A source system represents a logical or physical system that is external to an SAP BI system. A source system provides source data to an SAP BI system or accepts extracted data from an SAP BI system.

Load data into SAP BI DataStage jobs can be designed to retrieve, cleanse, and consolidate the data from non-SAP sources and load the data into SAP BI systems. For example, you can extract the customer data from your CRM applications and then look up the purchase orders for your customers in your purchase order applications. The consolidated purchase orders can be loaded into your SAP BI system for analysis. This section uses a simple ETL job to illustrate the steps necessary for loading data into an SAP BI system. Figure 2 shows the sample job. The job extracts customer data from an Oracle database table using an ODBC Stage. It passes the extracted

data to the BW Load Stage LoadDataToSAPBI, which loads the processed data into the CUSTOMER Characteristic in an SAP BI system. Table 1 shows the sample data in the Oracle database table. The CUSTOMER Characteristic is created in the SAP BI using the SAP Data Warehousing Workbench, which is shown in Figure 3 and Figure 4. Figure 2. DataStage job for loading data into SAP BI

Table 1. Sample data in the Oracle database table

ID | NAME | STREET | CITY | REG | POSTALID | COUNTRY | PHONE | FAX | TAXN
0000001 | IBM | New Orchard Road | Armonk | NY | 10504 | USA | 8004264968 | 866-7229226 | 00000
0000002 | Microsoft | One Microsoft Way | Redmond | WA | 98052 | USA | 8006427676 | 425-9367329 | 00000
0000003 | SAP America | 3999 West Chester Pike | Newtown Square | PA | 19073 | USA | 6106611000 | 404-9432950 | 00000

Figure 3. CUSTOMER Characteristic in SAP BI

Figure 4. CUSTOMER Characteristic in SAP BI Attribute tab

Figure 5. Data flow diagram of BI data load operation

Figure 5 illustrates the data flow diagram of a BI data load operation.

1. A source system is defined to represent one or more DataStage jobs loading data into the SAP BI. A data transfer structure describes the data available in the source system. A DataStage job loads the data from external data sources into an SAP BI Persistent Staging Area (PSA) staging table.
2. Transfer rules are defined to transfer data from the staging table into an InfoSource. An InfoSource is a collection of data fields treated as a single unit. The communication structure defines the data fields of the InfoSource.
3. Update rules are created to transform the data from the InfoSource to one or more BI data targets (InfoObjects, DataStore objects, or BI InfoCubes).

The BW Load Stage provides a Stage Editor, as shown in Figure 6. The Stage Editor contains several

tabs for setting various parameters of the BI load operation. The General tab creates or selects an RFC connection to the SAP BI. The Transfer Structure tab creates or selects a source system and an InfoSource. The Columns tab displays the column definitions of the data being sent to the SAP BI. The InfoPackage tab defines how and when the DataStage job loads data into the SAP BI. The Process Chain tab enables you to run the data load operation as a process within a BI process chain. A process chain provides the workflow function. It is used to design and schedule a series of

dependent data-processing processes. Figure 6. BW Load Stage Editor
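The three-step load flow described for Figure 5 (DataStage job fills the PSA, transfer rules populate the InfoSource, update rules fill the data targets) can be sketched as three small functions. This is an illustrative Python model with hypothetical field names, not SAP or DataStage code:

```python
# Minimal sketch of the BI load flow: PSA -> transfer rules -> InfoSource
# -> update rules -> data targets. All names are illustrative only.

def load_to_psa(source_rows):
    return list(source_rows)                       # step 1: PSA staging table

def apply_transfer_rules(psa_rows):
    # step 2: map transfer-structure fields to the communication structure
    return [{"CUSTOMER": r["id"], "NAME": r["name"].strip()} for r in psa_rows]

def apply_update_rules(infosource_rows, targets):
    # step 3: distribute the InfoSource data to each data target
    for target in targets.values():
        target.extend(infosource_rows)

targets = {"CUSTOMER_MASTER": []}
psa = load_to_psa([{"id": "0000001", "name": " IBM "}])
apply_update_rules(apply_transfer_rules(psa), targets)
print(targets["CUSTOMER_MASTER"])  # [{'CUSTOMER': '0000001', 'NAME': 'IBM'}]
```

The separation mirrors the architecture: the external tool is only responsible for step 1, while steps 2 and 3 are configured inside SAP BI.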

The BW Load Stage LoadDataToSAPBI in Figure 2 loads the data in Table 1 into the SAP BI. Multiple steps are involved in setting up the BW Load Stage. Those steps

are illustrated in Figure 7. They are described in detail in the following sections. Figure 7. Setting up the BW Load Stage LoadDataToSAPBI

BW connection setup

BI connection properties are set in the General tab, shown in Figure 6. Menu items are provided to create, select, or modify BI connections. Figure 8 shows the new connection dialog window. The new BI connection DEMOCONN is saved on the DataStage server, and can be reused to design other DataStage jobs to perform BI data operations. Figure 8. Connection property

dialog window

Source system setup The source system property is set in the Transfer Structure tab, shown in Figure 9. Menu items are provided to create, select, or view source systems. In SAP BI, a source system can be a flat file, an SAP system, a database system, a multidimensional data source, a Web service, or a staging

BAPI interface-based external application. BW Load Stage uses the staging BAPI APIs to exchange metadata with SAP BI and load data into SAP BI. Figure 9. Transfer Structure tab

The New menu item in Figure 9 creates a new source system. When

a new source system is created, BW RFC Manager starts a new BW RFC Server process. The RFC Server process waits for data-load requests from SAP BI and requests the corresponding DataStage job to start the data-load operation. The new source system is saved on the DataStage server for future reuse. Figure 10 shows that a new source system DEMODSSRC is created. The BW RFC Server process for the new source system is automatically created on the DataStage server and registered with the SAP BI as the program <hostname>.DEMODSSRC. As shown in Figure 11, the SAP transaction sm59 can be used to see and test the RFC destination for the new source system.

Figure 10. Create a source system

Figure 11. RFC destination

InfoSource setup An InfoSource is set on the Transfer Structure tab, shown in Figure 12. Menu items are provided on the tab to create, update, view, and search

SAP BI characteristics, key figures, and InfoSources. Figure 12. InfoSource menu items

The menu item Create Master InfoSource from Existing Characteristic creates an InfoSource based on an existing characteristic. Two subsequent user

actions are needed when this menu item is selected: 1. Select an existing Characteristic. BW Load Stage shows the BI characteristics matching the search condition and allows the selection of an existing characteristic. Figure 13 selects the CUSTOMER characteristic. Figure 13. Select an existing

characteristic

2. Specify the properties of the new BI InfoSource object, shown in Figure 14. Figure 14. Specify the

properties of new InfoSource

As shown in Figure 15, the BW Load Stage creates the specified InfoSource in the SAP BI. The Stage also selects the InfoSource on the Transfer Structure tab in Figure 16. Figure 15. New InfoSource

Figure 16. Select new InfoSource

DataStage table definition setup When an InfoSource is selected on the Transfer Structure tab, a DataStage table definition is created based on the transfer structure of the InfoSource. Figure 17 shows the table definition. Table 2 shows how

the SAP data types are mapped to the DataStage data types. The table definition can be validated and synchronized with the InfoSource fields using the Validate Columns and Synchronize Columns buttons. Figure 17. Columns tab

Table 2. Data type mapping table

SAP data type | DataStage data type
DATS | SQL DATE
CURR | SQL CHAR
TIMS | SQL TIME
FLTP | SQL FLOAT
CHAR (no more than 256 characters) | SQL CHAR
CHAR (more than 256 characters) | SQL VARCHAR
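The mapping in Table 2 can be expressed as a small lookup. The sketch below is illustrative only; the pairing follows the reconstructed table above (Table 2 is the authoritative source), and the function name is hypothetical:

```python
# Table 2 as a lookup: SAP data type -> DataStage data type. CHAR maps
# differently depending on the field length.

def datastage_type(sap_type, length=0):
    mapping = {"DATS": "SQL DATE", "CURR": "SQL CHAR",
               "TIMS": "SQL TIME", "FLTP": "SQL FLOAT"}
    if sap_type == "CHAR":
        return "SQL CHAR" if length <= 256 else "SQL VARCHAR"
    return mapping[sap_type]

print(datastage_type("DATS"))       # SQL DATE
print(datastage_type("CHAR", 300))  # SQL VARCHAR
```

This is the kind of rule the Validate Columns and Synchronize Columns buttons apply when checking the table definition against the InfoSource fields.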

InfoPackage setup The InfoPackage is set on the InfoPackage tab, shown in Figure 18. The InfoPackage is an entry point for SAP BI to request data from a source system. An InfoPackage defines when and how a DataStage job loads data into an SAP BI system. The InfoPackage tab creates and selects an InfoPackage. As shown in Figure 19, the tab also allows you to set the InfoPackage properties. Figure 18. InfoPackage tab

Figure 19. InfoPackage property

dialog window

The BW Load Stage supports three data load mechanisms:

Push mode: A DataStage job is started first. The DataStage job schedules the InfoPackage for the job to start the data loading operation.

Pull mode: An InfoPackage is scheduled first using the SAP Data Warehousing Workbench. When the SAP BI is ready to receive data, it notifies the RFC Server process. The RFC Server process launches the DataStage job to send data to the SAP BI.

File mode: A DataStage job runs first. The DataStage job saves the data for SAP BI to a temporary file. An InfoPackage is then scheduled to load the data in the file into SAP BI.

The InfoPackage third-party parameters are defined to support third-party integration tools like the DataStage BW Pack. As shown in Figure 20, the DataStage job BILoadJob is automatically set as a

third-party parameter for the new pull InfoPackage DEMO Pull InfoPacakge. Figure 20. InfoPackage third party parameters
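The pull-mode handshake, in which SAP BI passes the third-party parameter DSJob to the source system and the BW RFC Server process starts the named DataStage job, can be sketched as follows. The classes here are illustrative stand-ins, not real SAP or DataStage APIs:

```python
# Hypothetical sketch of the pull-mode handshake: SAP BI runs the
# InfoPackage, sends a load request with the third-party parameter DSJob to
# the source system, and the source system's RFC Server starts that job.

class RfcServer:
    """Stands in for the BW RFC Server process of one source system."""
    def __init__(self, source_system, job_runner):
        self.source_system = source_system
        self.job_runner = job_runner

    def on_load_request(self, third_party_params):
        job_name = third_party_params["DSJob"]   # e.g. "BILoadJob"
        return self.job_runner(job_name)

started = []
server = RfcServer("DEMODSSRC", job_runner=lambda name: started.append(name))

# SAP BI schedules the pull InfoPackage and notifies the source system:
server.on_load_request({"DSJob": "BILoadJob"})
print(started)  # ['BILoadJob']
```

The third-party parameter is what connects the SAP-side InfoPackage to the correct external job, so one source system can serve several differently named DataStage jobs.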

The use of third-party parameters in the BI data loading process is as follows:

1. SAP BI schedules and runs the InfoPackage DEMO Pull InfoPacakge.
2. When it is ready to receive data, the InfoPackage sends the loading request to the source system DEMODSSRC. It also passes the third-party parameter DSJob and its value BILoadJob to the source system.
3. The RFC Server process for the source system receives the request and starts the DataStage job BILoadJob to send data packages to the SAP BI.

Process Chain Setup

The process chain is set on the Process Chain tab. This step is optional. The BW Load Stage can run with or without a process chain. A process chain defines a sequence of dependent processes linked together. The execution of an

InfoPackage is one of the process types that SAP BI defines. As shown in Figure 21, the execution of the InfoPackage DEMO Pull InfoPacakge is added as a process in the process chain Demo Load Chain. Figure 22 selects the process chain Demo Load Chain. Figure 21. Run the data loading job as part of a process chain

Figure 22. Select process chain

Run data load operation The process chain Demo Load Chain must be scheduled to run your BI data load operation. The process chain can be started in two ways: Use the Data Warehousing Workbench. Figure 23 shows the

sequence of activities. Figure 23. Start the process chain using the SAP Data Warehousing Workbench

Use the DataStage job BILoadJob. Figure 24 shows the sequence of activities. Figure 24. Start the process chain using the DataStage job

In this example, the DataStage job is first started to invoke the process chain. Figure 25 shows that the DataStage job runs successfully. Figure 26 shows that the source data is loaded into the SAP BI. Figure 25. Run DataStage job

Figure 26. Data loaded into SAP BI

Extract data from SAP BI The BW Extract Stage extracts data from an SAP BI system. The extracted data can be fed into non-SAP applications. The BW Extract Stage is based on the SAP Open Hub Service, which defines a controlled and monitored data exporting process. This section uses a simple ETL job to illustrate the steps necessary to extract data from an SAP BI system.

Figure 27 shows the sample job. The job extracts the data from the SAP BI characteristic CUSTOMER, as shown in Figure 4. The job processes the extracted data using a DataStage Transformer Stage and then saves the processed results into a flat file. Figure 27. DataStage job for extracting data from the SAP BI

The BW Extract Stage has a Stage Editor, shown in Figure 28. The Stage Editor contains four tabs for setting up various properties for the BI data extraction operation.

Figure 28. BW Extract Stage GUI

The General tab creates or selects an RFC connection to an SAP BI. The Process Chain tab selects a source system and a process chain. The BI data extraction operation is run as a process within a process chain.

The Open Hub Destination tab selects an InfoSpoke or an Open Hub Destination. The Columns tab displays the column definitions for the extracted data from the SAP BI. The BW Extract Stage ExtractDataFromSAPBI, in Figure 27, extracts data from an SAP BI system. Multiple steps are involved in setting up the BW Extract Stage. Those steps are illustrated in Figure 29 and described in detail in the following sections.

Figure 29. Setting up the BW Extract Stage

ExtractDataFromSAPBI

BI Open Hub Destination and data transfer process setup The BW Extract Stage supports both BW InfoSpoke and BI Open Hub Destination for data extraction.

InfoSpoke is a central piece of the Open Hub Service in SAP BW. An InfoSpoke specifies three properties: An InfoProvider that provides the original data. An InfoProvider can be an InfoCube, a DataStore object, or an InfoObject. An Open Hub Destination that defines the targets to receive the extracted data. An Open Hub Destination can be a flat file, a database table, or an RFC destination. A transformation that converts the data from its original form to the destination form. An InfoSpoke must be created before it can be selected on the Open Hub Destination tab. BW Extract

Stage supports creating an InfoSpoke and using it for data extraction. In SAP BI, the Open Hub Destination has been integrated into the new BI data transfer process and is no longer tightly coupled with the InfoSpoke. As illustrated in Figure 30, a data transfer process transforms the data from an InfoProvider to an Open Hub Destination. When the data is ready in the Open Hub Destination, SAP BI notifies the DataStage RFC server process, which starts a DataStage job to extract the data from the Open Hub Destination. A process chain is created to control the whole data extraction process.
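The extraction flow just described can be sketched as a simulation: a data transfer process applies a transformation to InfoProvider data and fills the Open Hub Destination, after which the extract job reads the filled destination. The class and function names here are illustrative stand-ins, not SAP or DataStage APIs:

```python
# Sketch of the Open Hub extraction flow: DTP fills the destination,
# the notified DataStage job then reads it. Illustrative names only.
class OpenHubDestination:
    def __init__(self, name):
        self.name = name
        self.rows = []

def data_transfer_process(info_provider_rows, transform, destination):
    # Transform the source rows and fill the destination.
    destination.rows = [transform(row) for row in info_provider_rows]

def datastage_extract_job(destination):
    # The notified extract job reads the data waiting in the destination.
    return list(destination.rows)

dest = OpenHubDestination("DEMODEST")
source = [("c001", "acme corp"), ("c002", "globex")]
data_transfer_process(source, lambda r: (r[0].upper(), r[1].title()), dest)
extracted = datastage_extract_job(dest)
```

The separation matters: the DTP and the extract job never touch each other directly; the Open Hub Destination is the controlled hand-off point, which is what makes the export process monitorable.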

Figure 30. BI data extraction diagram

The Extract Stage ExtractDataFromSAPBI, in Figure 27, is set up based on the BI Open Hub Destination. The use of a traditional BW InfoSpoke is not discussed in this article. The following steps summarize how to create the BI artifacts for the stage:

1. Create and activate a new Open Hub Destination using the SAP transaction RSBO or the Data Warehousing Workbench GUI. Figure 31 shows the dialog window for creating a new Open Hub Destination DEMODEST. The attributes of the CUSTOMER characteristic are selected to create the field definitions of the new Open Hub Destination. As shown in Figure 32, the DEMODSSRC RFC destination is selected as the data receiver for the Open Hub Destination. Figure 31. Create new Open Hub Destination

Figure 32. Select RFC destination

2. Create and activate a new data transfer process to transform the

data from an InfoProvider to the Open Hub Destination. Figure 33 shows the dialog window for creating a new data transfer process. The data transfer process transfers the data from the Customer attributes to the Open Hub Destination DEMODEST. Figure 33. Create a new data transfer process

3. Create and activate a new process chain to invoke the new data transfer process. Figure 34 shows that the data transfer process CUSTOMER > DEMODEST is added as a process to the process chain CUSTCHAIN. Figure 34. Create and activate a process chain

Source system and process chain setup

The source system and process chain are set on the Process Chain tab. A source system is the destination for the extracted data. A process chain controls the data extraction process. In Figure 35, DEMODSSRC is selected as the source system. The process chain CUSTCHAIN is selected for the stage ExtractDataFromSAPBI.

Figure 35. Process Chain tab

Open Hub Destination setup An Open Hub Destination is set on the Open Hub Destination tab, as shown in Figure 37. Two user actions are required:

1. Select an Open Hub Destination. In Figure 36, the Open Hub Destination DEMODEST is selected for the stage ExtractDataFromSAPBI. The stage retrieves the DEMODEST definitions from BI and automatically populates the GUI controls shown in Figure 37. The table fields shown in Figure 37 are part of the DEMODEST definitions. Those fields are converted to a DataStage table definition that is displayed on the Columns tab. The Columns tab for the BW Extract Stage is the same as the Columns tab for the BW Load Stage shown in Figure 17.
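The field-to-column conversion can be imagined along these lines. The type names and the mapping table below are hypothetical assumptions, chosen only to show the shape of the conversion, not the stage's actual mapping rules:

```python
# Hypothetical mapping from Open Hub Destination field definitions
# (name, ABAP-style type, length) to DataStage-style column definitions.
# Both the type names and the mapping table are illustrative assumptions.
BI_TO_DS_TYPE = {
    "CHAR": "VarChar",
    "NUMC": "VarChar",
    "DEC": "Decimal",
    "INT4": "Integer",
}

def to_column_definition(field):
    name, bi_type, length = field
    return {
        "name": name,
        "sql_type": BI_TO_DS_TYPE.get(bi_type, "VarChar"),
        "length": length,
    }

fields = [("CUSTOMER", "CHAR", 10), ("AMOUNT", "DEC", 17)]
columns = [to_column_definition(f) for f in fields]
```

Whatever the real rules are, the effect visible on the Columns tab is the same: one DataStage column per destination field, with a type and length derived from the field definition.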

Figure 36. Select an Open Hub Destination

Figure 37. Open Hub Destination tab

2. Update third-party parameters for the Open Hub Destination. An Open Hub Destination supports third-party parameters in

the same way as an InfoPackage does. The button Update BW, in Figure 37, sets the third-party parameters of the selected Open Hub Destination. Figure 38 shows the result when the button is clicked. Figure 39 shows that the job name BIExtractJob, the process chain CUSTCHAIN, and the source system DEMODSSRC are set as the third-party parameters of the Open Hub Destination DEMODEST. The usage of the third-party parameters in the BW Extract Stage is similar to their usage in the BW Load Stage.

Figure 38. Update the third-party parameters

Figure 39. DEMODEST third-party parameters

Run data extraction operation

Similar to the data load operation, the process chain CUSTCHAIN must be scheduled to run the data extraction operation. It can be started either by the DataStage job BIExtractJob or by the SAP Data Warehousing Workbench. In this example, the DataStage job is started to invoke the process chain. Figure 40 shows that the DataStage job runs successfully and Figure 41 shows the data extracted from the SAP BI. Figure 40. Run the DataStage job

Figure 41. Data extracted from SAP BI

Conclusion This article demonstrated how to integrate SAP BI data with non-SAP BI data using IBM Information Server and the WebSphere DataStage SAP BW Pack. It explained the SAP BI data loading and extraction processes and the new SAP BI feature: the data transfer process. Two examples illustrated the step-by-step design processes. IBM Information Server provides leading technology and integration solutions to two other critical issues

in the SAP BI Data Warehouse environment: Data Quality: The data that builds a data warehouse often comes from various data sources. The structure of the legacy data is often not documented and the data quality is poor. The WebSphere Information Analyzer product analyzes your data and determines the data structure and quality. It helps you understand your data. The WebSphere QualityStage product standardizes and matches any type of information to create high quality data. Data Volume: There is often a huge amount of data that needs to be processed regularly for a data

warehouse environment. Sometimes the data volume grows beyond expectations. This issue needs to be addressed with a scalable ETL architecture. IBM Information Server leverages pipeline and partition technologies to support high data throughput. IBM Information Server can be deployed on symmetric multiprocessing (SMP) and massively parallel processing (MPP) computer systems to achieve maximum scalability. Work is currently underway on other SAP BI integration solutions, including direct data access and metadata management. The work focuses on developing a fast and efficient ETL solution to build

corporate transactional data stores with massive document-level data items. Acknowledgements I would like to thank Wendi Nusbickel, Nicolas Cominetti, and Susan Shay for their feedback and reviews of this article. Appendix A: Terminology

ETL: Extract, Transform, and Load
SAP BI: Business Intelligence
SAP BW: SAP Business Information Warehouse
GUI: Graphical User Interface
CRM: Customer Relationship Management
ODBC: Open Database Connectivity
DataStage job: A sequence of data operations performed by IBM Information Server
RFC: SAP term, Remote Function Call
PSA: SAP BI term, Persistent Staging Area
Staging BAPI: SAP BI term, an open interface for third-party ETL tools
Open Hub Service: SAP BI term, an SAP BW/BI data exporting mechanism
Source System: SAP BI term, a logical or physical system external to an SAP BI system
InfoObject: SAP BI term, a lowest-level information provider
DataStore Object: SAP BI term, a storage location for consolidated transaction and master data at document level
InfoCube: SAP BI term, several relational tables arranged in a star schema
InfoSource: SAP BI term, a quantity of information that logically belongs together
InfoPackage: SAP BI term, an entry point for requesting data from a source system
InfoSpoke: SAP BI term, an extraction object that exports data within the Open Hub Service
Process Chain: SAP BI term, a sequence of processes linked together
Transfer Structure: SAP BI term, a selection of data fields from a source system

Appendix B: Tools

Figure 2, 25, 27, 40: DataStage Designer
Figure 3, 4, 11, 15, 20, 21, 26, 31, 32, 33, 34, 39: SAP Data Warehousing Workbench
Figure 6, 8, 9, 10, 12, 13, 14, 16, 17, 18, 19, 22: BW Load Stage Editor
Figure 28, 35, 36, 37, 38: BW Extract Stage Editor
Figure 41: DataStage File Stage Editor

