
Calling RFC from BODS

Posted by Rahul More Aug 22, 2014



Introduction:-

In this scenario I demonstrate how to call a remote-enabled function module (RFC) from BODS.

1) Create an SAP Application datastore.
In this example I am using SAP_BI as the SAP Application datastore.
As I have created the function module in the BI system, I have created the datastore for that system.

2) Import RFC from SAP system.
In Local Object Library expand SAP datastore.
Right click on Functions & click "Import By Name".



Enter the name of the RFC to import & click on "Import".
Here I am using the ZBAPI_GET_EMPLOYEE_DETAILS as the RFC.


The RFC will be imported & can be seen in the Local Object Library.


Note :- This RFC takes an Employee ID as input & returns the employee details.
I have stored the employee IDs in a text file, so to read the text file I am using a File Format as the source.

3) Create a File Format for the flat (text) file.

This file format (here "Emp_Id_Format") has the list of employee IDs.



4) Create Job, Workflow, Dataflow as usual.

5) Drag the File Format into the dataflow & mark it as a Source.

6) Drag a Query transform into the dataflow as well & name it (here "Query_fcn_call").


7) Assign RFC call from Query

Double click on Query.



Right click on "Query_fcn_call" & click "New Function Call".



The "Select Function" window will open. Choose the appropriate function & click "Next".


In the next window, click the button to define an input parameter.



Select the file format that we have created earlier in "Input Parameter" window & press OK.



Select the column name from the input file format & press "OK".
Here the file format has only one column, named Id.



Click "Next" & select Output Parameters.



Select the required output parameters & click "Finish".
Here I am selecting all the fields.





Now the query editor for the query "Query_fcn_call" can be seen as follows.



8) Add another Query transform into the dataflow for mapping & name it (here "Query_Mapping").



9) Add a template table also.



10) Mapping.
Double click on query "Query_Mapping" & do the necessary mappings.



11) Save the Job, validate & execute.

12) During execution, each employee ID is taken as input to the RFC & the output of the RFC is stored in the table.
The output can be seen as follows after execution.
Here the employee IDs are taken from the File Format & given to the RFC as input.
The output of the RFC is given as input to the query "Query_Mapping", where it is mapped to the
target table fields.



Thanks,
Rahul S. More
(Technical Lead)
IGATE Global Solutions Pvt Ltd.



Demo on Real time job and configuration in Data service
Posted by Ravi Kashyap Aug 1, 2014
REAL TIME JOB DEMO
A real-time job is created in the Designer and then configured in the Administrator as a real-time
service associated with an Access Server in the Management Console.
This demo will briefly explain the Management Console settings.
We can execute the real-time job from any third-party tool; let us use SoapUI (a third-party tool)
to demonstrate our real-time job.
Below is the screenshot of the batch job used to create a sample table in the database (first
dataflow) and create the XML target file (second dataflow). The XML target file (created in the
second dataflow) can be used to create the XML MESSAGE SOURCE in the real-time job.

Below is the screenshot of the transformation logic of the dataflow DF_REAL_Data.

Below is the screenshot of the transformation logic of the dataflow DF_XML_STRUCTURE.

Below is the screenshot of the transformation logic of the Query transform "Query" used in DF_XML_STRUCTURE.







The below image shows the creation of the real-time job in Data Services.







FINALLY, RUN THE REAL-TIME JOB USING THE SOAPUI TOOL
1. Run the SoapUI tool.
2. Create the project and browse to the WSDL file.
3. Under the project, go to Real-time services, check the project name and send the request.
4. The Request window will open; enter the search string in it.
5. Finally the record will be returned.

Demo on Real time job
Posted by Ravi Kashyap Jul 29, 2014
REAL TIME JOB DEMO

A real-time job is created in the Designer and then configured in the Administrator as a real-time
service associated with an Access Server in the Management Console.
This demo will briefly explain the Management Console settings.
We can execute the real-time job from any third-party tool; let us use SoapUI (a third-party tool)
to demonstrate our real-time job.
Below is the screenshot of the batch job used to create a sample table in the database (first
dataflow) and create the XML target file (second dataflow). The XML target file (created in the
second dataflow) can be used to create the XML MESSAGE SOURCE in the real-time job.

Below is the screenshot of the transformation logic of the dataflow DF_REAL_Data.

Below is the screenshot of the transformation logic of the dataflow DF_XML_STRUCTURE.

Below is the screenshot of the transformation logic of the Query transform "Query" used in DF_XML_STRUCTURE.

Below is the screenshot of the second Query transform used in DF_XML_STRUCTURE, which nests the data: select the complete
Query from the Schema In pane and import it under the Query in the Schema Out pane.

Creation of the XML schema from the Local Object Library


Go to the second Query again and make the Query name the same as in the
XML schema (Query_nt_1).
Note: If we do not change the Query name it gives an ERROR.

In the below image the Query is renamed to the same name as displayed in the XML schema.

The below image shows the creation of the real-time job.






To test and validate the job:
In the demo, the end user passes the EMP_ID (1.000000) using the third-party tool, which triggers the real-time job. The job takes
the input as the XML MESSAGE SOURCE, obtains the other details from the database table based on the EMP_ID value, and returns them to
the end user in the XML MESSAGE TARGET.
Below is the output XML file.


FINALLY, RUN THE REAL-TIME JOB USING THE SOAPUI TOOL:
1. Run the SoapUI tool.
2. Create the project and browse to the WSDL file.
3. Under the project, go to Real-time services, check the project name and send the request.
4. The Request window will open; enter the search string in it.
5. Finally the record will be returned.

Query to get all the dependent objects and their traverse paths of a job
Posted by Sivaprasad Sudhir Jul 8, 2014
For a given job this query returns all the dependent objects and their traverse paths. (The job name should be given in
the outer WHERE clause in place of <<JOB_NAME>>.)


SELECT JOB_NAME
     , OBJECT
     , OBJECT_TYPE
     , PATH
FROM
(
  SELECT Other_Objects.DESCEN_OBJ OBJECT
       , Other_Objects.DESCEN_OBJ_USAGE OBJECT_TYPE
       , Connection_Path1.PATH || Other_Objects.DESCEN_OBJ || '( ' || Other_Objects.DESCEN_OBJ_USAGE || ' ) ' PATH
       , substr(Connection_Path1.PATH, instr(Connection_Path1.PATH, ' ->> ', 1)+5,
                instr(Connection_Path1.PATH, ' ->> ', 2)-(instr(Connection_Path1.PATH, ' ->> ', 1)+5)) JOB_NAME
  FROM
  (
    SELECT DISTINCT PARENT_OBJ
         , PARENT_OBJ_TYPE
         , SYS_CONNECT_BY_PATH(PARENT_OBJ,' ->> ') || ' ->> ' PATH
    FROM ALVW_PARENT_CHILD
    START WITH PARENT_OBJ_TYPE = 'Job'
    CONNECT BY PRIOR DESCEN_OBJ = PARENT_OBJ
  ) Connection_Path1,
  (
    SELECT PARENT_OBJ
         , PARENT_OBJ_TYPE
         , DESCEN_OBJ
         , DESCEN_OBJ_USAGE
    FROM ALVW_PARENT_CHILD
    WHERE PARENT_OBJ_TYPE = 'DataFlow'
      AND DESCEN_OBJ_TYPE = 'Table'
  ) Other_Objects
  WHERE Connection_Path1.PARENT_OBJ = Other_Objects.PARENT_OBJ
    AND Connection_Path1.PARENT_OBJ_TYPE = Other_Objects.PARENT_OBJ_TYPE
  UNION
  SELECT Connection_Path2.PARENT_OBJ OBJECT
       , Connection_Path2.PARENT_OBJ_TYPE OBJECT_TYPE
       , Connection_Path2.PATH PATH
       , substr(Connection_Path2.PATH, instr(Connection_Path2.PATH, ' ->> ', 1)+5,
                instr(Connection_Path2.PATH, ' ->> ', 2)-(instr(Connection_Path2.PATH, ' ->> ', 1)+5)) JOB_NAME
  FROM
  (
    SELECT DISTINCT PARENT_OBJ
         , PARENT_OBJ_TYPE
         , SYS_CONNECT_BY_PATH(PARENT_OBJ,' ->> ') || ' ->> ' PATH
    FROM ALVW_PARENT_CHILD
    START WITH PARENT_OBJ_TYPE = 'Job'
    CONNECT BY PRIOR DESCEN_OBJ = PARENT_OBJ
  ) Connection_Path2
)
WHERE JOB_NAME LIKE <<JOB_NAME>>

Jobs Traceability Matrix - Query in BODS
Posted by Sivaprasad Sudhir Jul 7, 2014
All the jobs and their associated component details can be retrieved by running a query against the metadata tables below.

Database: <Repository DB> <Repository login>

Ex: UBIBOR01 d2_14_loc
Tables: ALVW_PARENT_CHILD, AL_PARENT_CHILD, AL_LANG, AL_USAGE, etc.

This is a query which will list all the jobs and their traverse paths down to the source/target tables:




select Connection_Path.PATH || Other_Objects.DESCEN_OBJ || '( ' || Other_Objects.DESCEN_OBJ_USAGE || ' ) ' PATH
, substr(Connection_Path.PATH, 2 , instr(Connection_Path.PATH, ' ->> ', 2)-2) Job_Name
FROM
(
SELECT DISTINCT PARENT_OBJ
, PARENT_OBJ_TYPE
, SYS_CONNECT_BY_PATH(PARENT_OBJ,' ->> ')|| ' ->> ' PATH
FROM ALVW_PARENT_CHILD
START WITH PARENT_OBJ_TYPE = 'Job'
CONNECT BY PRIOR DESCEN_OBJ = PARENT_OBJ
) Connection_Path,
(
SELECT PARENT_OBJ
, PARENT_OBJ_TYPE
, DESCEN_OBJ
, DESCEN_OBJ_USAGE
FROM ALVW_PARENT_CHILD
WHERE PARENT_OBJ_TYPE = 'DataFlow'
and
DESCEN_OBJ_TYPE = 'Table'
)Other_Objects
WHERE
Connection_Path.PARENT_OBJ = Other_Objects.PARENT_OBJ
AND
Connection_Path.PARENT_OBJ_TYPE = Other_Objects.PARENT_OBJ_TYPE

DS Standard Recovery Mechanism
Posted by Samatha Mallarapu Jul 4, 2014
Introduction:
This document gives an overview of the standard recovery mechanism in Data Services.

Overview: Data Services provides one of the best built-in features to recover a job from a failed state. With recovery enabled, the job
restarts from the failed instance.

DS provides 2 types of recovery:
Recovery: By default, recovery works at the dataflow level, i.e. the job will always restart from the dataflow which raised the exception.
Recovery Unit: If you want to enable recovery for a set of actions, you can achieve this with the recovery unit option. Define all your
actions in a workflow and enable the recovery unit option under the workflow properties. In recovery mode this workflow will then run from
the beginning instead of from the failed point.

When recovery is enabled, the software stores results from the following types of steps:
Work flows
Batch data flows
Script statements
Custom functions (stateless type only)
SQL function
exec function
get_env function
rand function
sysdate function
systime function

Example:
This job loads data from a flat file to a temporary table. (I am loading the same data again to raise a primary key exception.)



Running the job:

To recover the job from the failed instance, the job should first be executed with recovery enabled. We can enable it under the execution
properties.

The below trace log shows that recovery is enabled for this job.


The job failed at the 3rd DF in the 1st WF. Now I am running the job in recovery mode.

The trace log shows that the job is running in recovery mode, using the recovery information from the previous run and starting from Data
Flow 3 where the exception was raised.

DS provides default recovery at the dataflow level.

Recovery Unit:
With default recovery, the job always restarts at the failed DF in the recovery run, irrespective of the dependent actions.

Example: Workflow WF_RECOVERY_UNIT has two dataflows loading data from a flat file. If either DF fails, then both
DFs have to run again.

To achieve this kind of requirement, we can define all the activities in a workflow and make it a recovery unit. When we run the job in
recovery mode, if any of the activities failed, the workflow starts from the beginning.

To make a workflow a recovery unit, check the recovery unit option under the workflow properties.

Once this option is selected, on the workspace diagram the black "x" and green arrow symbol indicate that the work flow is a
recovery unit.

Two Data Flows under WF_RECOVERY_UNIT



Running the job with recovery enabled, an exception is encountered at DF5.



Now running in recovery mode, the job uses the recovery information of the previous run. As per my requirement, the job should run all the
activities defined under the workflow WF_RECOVERY_UNIT instead of only the failed dataflow.

Now the job starts from the beginning of WF_RECOVERY_UNIT and all the activities defined inside the workflow run
from the beginning instead of starting from the failed DF (DF_RECOVERY_5).

Exceptions:
When you specify that a work flow or a data flow should only execute once, a job will never re-execute that work flow or data flow
after it completes successfully, except if that work flow or data flow is contained within a recovery unit work flow that re-executes
and has not completed successfully elsewhere outside the recovery unit.
It is recommended that you not mark a work flow or data flow as Execute only once when the work flow or a parent work flow is
a recovery unit.

How to improve performance while using auto correct load
Posted by Sivaprasad Sudhir Jun 27, 2014
Using the auto correct load option on a target table degrades the performance of BODS jobs, because it prevents a full push-down
operation from the source to the target when the source and target are in different datastores.

But the auto correct load option is unavoidable in scenarios where no duplicate rows may exist in the target, and it is very useful
for data recovery operations.


When we deal with large data volumes, how do we improve performance?

Using a Data_Transfer transform can improve the performance of the job. Let's see how it works. :-)

Merits:
The Data_Transfer transform can push operations down to the database server.
It enables a full push-down operation even if the source and target are in different datastores.
It can be used after query transforms with GROUP BY, DISTINCT or ORDER BY operations, which do not allow push-down.

The idea here is to improve performance by pushing the work down to the database level.

Add a Data_Transfer transform before the target to enable a full push-down from the source to the target. For a merge operation
there should not be any duplicates in the source data. Here the Data_Transfer transform pushes the data down to the database, and records
are updated or inserted into the target table as long as no duplicates are met in the source.
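Conceptually, the pushed-down auto correct load behaves like a database upsert. The exact SQL that Data Services generates depends on the target database and version; the following is only a hedged, generic sketch with illustrative table and column names:

-- Hedged sketch: an auto correct load pushed down to the database acts like a MERGE (upsert).
MERGE INTO TGT_CUSTOMER t
USING STG_CUSTOMER s
   ON (t.CUSTOMER_ID = s.CUSTOMER_ID)
WHEN MATCHED THEN
    UPDATE SET t.CUSTOMER_NAME = s.CUSTOMER_NAME
WHEN NOT MATCHED THEN
    INSERT (CUSTOMER_ID, CUSTOMER_NAME)
    VALUES (s.CUSTOMER_ID, s.CUSTOMER_NAME);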









How to set exact value for ROWS PER COMMIT in Target table
Posted by Sivaprasad Sudhir Jun 26, 2014
As we know, the default value for rows per commit is 1000, and the maximum value that can be set is 50000.
BODS recommends setting the rows per commit value between 500 and 2000 for best performance.

The ideal value for rows per commit depends on the number of columns in the target table.

Here is the formula for the same -

Rows per commit = max_IO_size (64K for most platforms) / row size

Row size = (# of columns) * (20 bytes (average column size)) * 1.3 (30% overhead)

E.g.: if the number of columns is 10, then row size = 10 * 20 * 1.3 = 260 bytes,
so rows per commit = 64K / 260 ≈ 250.
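The same calculation can be scripted if you want to document it inside a job; a minimal sketch in the Data Services scripting language (the global variable names are illustrative):

# Hedged sketch: derive a rows-per-commit value from the column count.
$G_Num_Columns = 10;
$G_Row_Size = $G_Num_Columns * 20 * 1.3;                # bytes per row, incl. 30% overhead
$G_Rows_Per_Commit = floor((64 * 1024) / $G_Row_Size);  # max IO size of 64K
print('Suggested rows per commit: ' || $G_Rows_Per_Commit);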

SAP DataServices 4.2 Transports Re-Deployment Issues
Posted by chiranjeevi poranki Jun 25, 2014
Data Load from SAP ECC to SAP HANA

This is a workaround to connect the SAP system as source and SAP HANA as target, establish the connections
using Data Services datastores,
and identify the issues that occur during the process.

Creating a Data store to connect to ECC system:

Right click in the data store tab
Data store name: Provide meaningful name
Data Store Type: Select SAP Application from the drop down
Database Server Name: Enter provided server name
User Name:
Password:





In the Advanced section,

Data transfer method: Select Shared directory (in my case)
Note: Select RFC, if RFC connection is established.
Working Directory on SAP Server: Provide the working directory path.
Application path to shared directory: Path on your local Directory.



Creating a Data store to connect to HANA system:
Data store name: Provide meaningful name
Data Store Type: Select Database from the drop down
Database Type: SAP HANA
Database Version: HANA1.X
Select the check box Use data source name (DSN)
Data source Name:
User Name:
Password:



After successfully creating both datastores, import the respective source and target tables.

Create a Job using the datastores: drag the source table, use a Query transform, map the required fields to the output schema, connect to
the target table, validate and execute.



After successful execution of the job, record count can be seen in the monitor log as shown below.




ISSUES:
Make sure to have read and write access to the working directory from the BODS system.
Working directory: E:\usr\sap\XYZ\ABCDEF\work
In case of any issues, follow up with the Basis team.
Make sure both BODS and ECC are in the same domain. The users can be added from one system to the other
if they are in the same domain.
For the current version, BODS 4.2, there was an issue with the transport files. For the same, SAP Note
1916294 was found.



The Basis team implemented the above note.
After implementing the above note, the below issue occurred when the job was executed.



For the above issue, the Basis team granted permission to the function
module /BODS/RFC_ABAP_INSTALL_AND_RUN

Make sure that the account has the following authorizations:
*S_DEVELOP
*S_BTCH_JOB
*S_RFC
*S_TABU_DIS
*S_TCODE

Put Together A Data Archiving Strategy And Execute It Before Embarking On SAP Upgrade
Posted by Avaali Solutions Jun 20, 2014
A significant amount is invested by organizations in an SAP upgrade project. However, few really know that data archiving before
embarking on an SAP upgrade yields significant benefits, not only from a cost standpoint but also due to the reduction in complexity
during an upgrade. This article not only describes why this is a best practice but also details what benefits accrue to
organizations as a result of data archiving before an SAP upgrade. Avaali is a specialist in the area of Enterprise Information
Management. Our consultants come with significant global experience implementing projects for the world's largest
corporations.

Archiving before Upgrade
It is recommended to undertake archiving before upgrading your SAP system in order to reduce the volume of transaction data
that is migrated to the new system. This results in shorter upgrade projects and therefore less upgrade effort and costs. More
importantly production downtime and the risks associated with the upgrade will be significantly reduced. Storage cost is another
important consideration: database size typically increases by 5% to 10% with each new SAP software release and by as much
as 30% if a Unicode conversion is required. Archiving reduces the overall database size, so typically no additional storage costs
are incurred when upgrading.

It is also important to ensure that data in the SAP system is cleaned before you embark on an upgrade. Most organizations tend
to accumulate messy and unwanted data such as old material codes, technical data and subsequent posting data. Cleaning your
data beforehand smoothens the upgrade process, ensures you only have what you need in the new version and helps reduce project
duration. Consider archiving or even purging if needed to achieve this. Make full use of the upgrade and enjoy a new, more
powerful and leaner system with enhanced functionality that can take your business to the next level.

Archiving also yields Long-term Cost Savings
By implementing SAP Data Archiving before your upgrade project you will also put in place a long term Archiving Strategy
and Policy that will help you generate on-going cost savings for your organization. In addition to moving data from the
production SAP database to less costly storage devices, archived data is also compressed by a factor of five relative to the space it
would take up in the production database. Compression dramatically reduces space consumption on the archive storage media
and based on average customer experience, can reduce hardware requirements by as much as 80% or 90%. In addition, backup
time, administration time and associated costs are cut in half. Storing data on less costly long-term storage media reduces total
cost of ownership while providing users with full, transparent access to archived information.

Functions - Data Services
Posted by Sujitha Grandhi May 22, 2014
This document describes briefly all available functions of Data Services.

SCD Type 1 Full Load With Error Handle - For Beginners
Posted by Venky D May 22, 2014
This example may help us understand the usage of SCD Type 1 and how to handle error messages.


Brief about Slowly Changing Dimensions: Slowly Changing Dimensions are dimensions that have data that changes over
time.
There are three methods of handling Slowly Changing Dimensions; here we are concentrating only on SCD Type 1.


Type 1- No history preservation - Natural consequence of normalization.

For an SCD Type 1 change, you find and update the appropriate attributes on a specific dimensional record. For example, to
update a record in the SALES_PERSON_DIMENSION table to show a change to an individual's SALES_PERSON_NAME field, you simply
update one record in the SALES_PERSON_DIMENSION table. This action would update or correct that record for all fact records
across time. In a dimensional model, facts have no meaning until you link them with their dimensions. If you change a dimensional
attribute without appropriately accounting for the time dimension, the change becomes global across all fact records.

This is the data before the change:

SALES_PERSON_KEY   SALES_PERSON_ID   NAME           SALES_TEAM
15                 00120             Doe, John B    Atlanta

This is the same table after the salesperson's name has been changed:

SALES_PERSON_KEY   SALES_PERSON_ID   NAME           SALES_TEAM
15                 00120             Smith, John B  Atlanta
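Seen from the database side, the Type 1 change is nothing more than an in-place update of the dimension row; a minimal SQL sketch based on the sample data above:

-- Hedged sketch: SCD Type 1 overwrites the attribute in place, keeping no history.
UPDATE SALES_PERSON_DIMENSION
   SET NAME = 'Smith, John B'
 WHERE SALES_PERSON_KEY = 15;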


However, suppose a salesperson transfers to a new sales team. Updating the salesperson's dimensional record would update all
previous facts so that the salesperson would appear to have always belonged to the new sales team. This may cause issues in terms
of reporting sales numbers for both teams. If you want to preserve an accurate history of who was on which sales team, Type 1 is
not appropriate.


Below is the step-by-step batch job creation using SCD Type 1 with error handling.


Create new job


Add Try and "Script" controls from the palette and drag them to the work area

Create a Global variable for SYSDATE



Add below script in the script section.

# SET TODAYS DATE
$SYSDATE = cast( sysdate( ), 'date');
print( 'Today\'s date:' || cast( $SYSDATE, 'varchar(10)' ) );


Add DataFlow.

Now double click on DF and add Source Table.

Add Query Transformation

Add a new column LOAD_DATE in Query_Extract.
Map the created global variable $SYSDATE. If we mapped sysdate() directly, the function would be called for every row, which may hit the performance.



Add another query transform for lookup table

Create new Function Call for Lookup table.






Required column added successfully via Lookup Table.

Add another Query transform. This query will decide whether the source record will be inserted or updated.

Now remove the primary key from the target fields.



Create new column to set FLAG to update or Insert.

Now write an ifthenelse function: if LKP_PROD_KEY is null, set FLAG to 'INS', otherwise to 'UPD'.

ifthenelse(Query_LOOKUP_PRODUCT_TIM.LKP_PROD_KEY is null, 'INS', 'UPD')



Now Create case Transform.




Create two rules on the FLAG field to route INS or UPD records.
Create Insert and Update queries to align the fields.
Change LKP_PROD_KEY to PROD_KEY and PROD_ID to SOURCE_PROD_ID for better understanding in the target table.
Now create a Key_Generation transform to generate the surrogate key.
Select the target dimension table with the surrogate key (PROD_KEY).
Set the target instance.



Add a Key_Generation transformation for the Query_Insert to generate values for the new key column.
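As a side note, the same surrogate-key logic is also available as the built-in key_generation() function, which can be used directly in a query mapping instead of the transform. A minimal sketch, assuming the dimension table sits in a datastore named DS_TARGET under owner dbo (names are illustrative):

# Hedged sketch: returns the next surrogate key based on the current maximum of PROD_KEY.
key_generation('DS_TARGET.dbo.PRODUCT_DIM', 'PROD_KEY', 1)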

And for Query_Update we need the surrogate key and other attributes. Use the Map_Operation transform to update records.

By default, Normal input rows are mapped to Normal output. To update records, map the Normal input row type to Update.



Update Surrogate key, Product key and other attributes.

Go back to insert target table --> Options --> Update Error Handling as below:


Go back to Job screen and create catch block



Select the required exceptions you want to catch, and create a script to display error messages.



Compose your message to print errors in the script_ErrorLogs as below.

print( 'Error Handling');
print( error_message() || ' at ' || cast( error_timestamp(), 'varchar(24)'));
raise_exception( 'Job Failed');

Now validate the script before proceeding further.

Now these messages will catch errors along with the job completion status.
Now create a script to print an error message if there are any database rejections:



# print ( ' DB Error Handling');
if( get_file_attribute( '[$$LOG_DIR]/VENKYBODS_TRG_dbo_Product_dim.txt', 'SIZE') > 0 )
raise_exception( 'Job Failed Check Rejection File');

note: VENKYBODS_TRG_dbo_Product_dim.txt is the file name which we mentioned in the target table error handling section.

Before execution, here is the source and target table data showing Last_Updated_Date.



Now Execute the job and we can see the Last_Updated_Dates.


Now try to generate an error to see that our error handling captures it in the error log.

Try to implement the same and let me know if you need any further explanation on this.

Thanks
Venky

Better Python Development for BODS: How and Why
Posted by Jake Bouma Apr 23, 2014
Not enough love: The Python User-Defined Transform




In my opinion, the python user-defined transform (UDT) included in Data Services (Data Quality -> UserDefined) bridges
several gaps in the functionality of Data Services. This little transform allows you to access records individually and perform any
manipulation of those records. This post has two aims: (1) to encourage readers to consider the Python transform the next time
things get tricky and (2) to give experienced developers an explanation on how to speed up their Python development in BODS.

Currently, if you want to apply some manipulation or transformation record by record you have two options:
1. Write a custom function in the BODS Scripting language and apply this function as a mapping in a query.
2. Insert a UDT and write some python code to manipulate each record.

How to choose? Well, I would be all for keeping things within Data Services, but the built-in scripting language is a bit dry of
functionality and doesn't give you direct access to records simply because it is not in a data flow. In favour of going the python
route are the ease and readability of the language, the richness of standard functionality and the ability to import any module that
you could need. Furthermore with Python data can be loaded into memory in lists, tuples or hash-table like dictionaries. This
enables cross-record comparisons, aggregations, remapping, transposes and any manipulation that you can imagine! I hope to
explain how useful this transform is in BODS and how nicely it beefs up the functionality.

For reference, the UDT is documented in chapter 11
of http://help.sap.com/businessobject/product_guides/sbods42/en/ds_42_reference_en.pdf
The best way to learn python is perhaps just to dive in, keeping a decent tutorial and reference close at hand. I won't recommend
a specific tutorial; rather google and find one that is on the correct level for your programming ability!

Making Python development easier
When developing I like to be able to code, run, check (repeat). Writing Python code in the Python Smart Editor of the UDT is
cumbersome and ugly if you are used to a richer editor. Though it is a good place to start with learning to use the Python in
BODS because of the "I/O Fields" and "Python API" tabs, clicking through to the editor every time you want to test will likely
drive you mad. So how about developing and testing your validation function or data structure transform on your local machine,
using your favourite editor or IDE (personally I choose Vim for Python)? The following two tips show how to achieve this.

Tip#1: Importing Python modules
Standard Python modules installed on the server can be imported as per usual using import. This allows the developer to
leverage datetime, string manipulation, file IO and various other useful built-in modules. Developers can also write their own
modules, with functions and classes as needed. Custom modules must be set up on the server, which isn't normally accessible to
Data Services Designers.

The alternative is to dynamically import custom modules given their path on the server using the imp module. Say you wrote a
custom module to process some records called mymodule.py containing a function myfunction. After placing this
module on the file server at an accessible location you can access its classes and functions in the following way

import imp
mymodule = imp.load_source('mymodule', '/path/to/mymodule.py')
mymodule.myfunction()

This enables encapsulation and code reuse. You can either edit the file directly on the server, or re-upload it with updates, using
your preferred editor. What I find particularly useful is that as a data analyst/scientist/consultant/guy (who knows these days) I
can build up an arsenal of useful classes and functions in a python module that I can reuse where needed.

Tip#2: Developing and testing from the comfort of your own environment
To do this you just need to write a module that will mimic the functionality of the BODS classes. I have written a module
"fakeBODS.py" that uses a csv file to mimic the data that comes into a data transform (see attached). Csv input was useful
because the transforms I was building were working mostly with flat files. The code may need to be adapted slightly as needed.

Declaring instances of these classes outside of BODS allows you to compile and run your BODS Python code on your local
machine. Below is an example of a wrapping function that I have used to run "RunValidations", a function that uses the
DataManager and Collection, outside of BODS. It uses the same flat file input and achieves the same result! This has sped up
my development time, and has allowed me to thoroughly test implementations of new requirements on a fast changing project.

def test_wrapper():
    import fakeBODS
    Collection = fakeBODS.FLDataCollection('csv_dump/tmeta.csv')
    DataManager = fakeBODS.FLDataManager()
    RunValidations(DataManager, Collection, 'validationFunctions.py', 'Lookups/')
Limitations of UDT
There are some disappointing limitations that I have come across that you should be aware of before setting off:
The size of an output column (as of BODS 4.1) is limited to 255 characters. Workaround can be done using flat files.
You can only access data passed as input fields to the transform. Variables for example have to be mapped to an input column
before the UDT if you want to use them in your code.
There is no built-in functionality to do lookups in tables or execute sql through datastore connections from the transform.

How a powerful coding language complements a rich ETL tool
Python code is so quick and powerful that I am starting to draw all my solutions out of Data Services into custom python
modules. It is faster, clearer for me to understand, and more adaptable. However, this is something to be careful of. SAP BODS
is a great ETL tool, and is a brilliant cockpit from which to direct your data flows because of its high-level features such as
authorizations, database connections and graphical job and workflow building. The combination of the two, in my opinion,
makes for an ideal ETL tool.

This is possibly best demonstrated by example. On a recent project (my first really) with the help of Python transforms and
modules that I wrote I was able to solve the following:
Dynamic table creation and loading
Executable metadata (functions contained in excel spreadsheets)
Complicated data quality analysis and reporting (made easy)
Reliable unicode character and formatting export from excel
Data Services 4.1 on the other hand was indispensable in solving the following requirements
Multi-user support with protected data (aliases for schemas)
Maintainable centralized processes in a central object library with limited access for certain users
A framework for users to build their own Jobs using centralized processes.
The two complemented each other brilliantly to reach a solid solution.

Going forward
With the rise of large amounts of unstructured data and the non-trivial data manipulations that come with it, I believe that every
Data analyst/scientist should have a go-to language in their back pocket. As a trained physicist with a background in C/C++
(ROOT) I found Python incredibly easy to master and put it forward as one to consider first.

I do not know what the plan is for this transform going forward into the Data Services Eclipse workbench, but hopefully the
merits of allowing a rich language to interact with your data inside of BODS are obvious enough to keep it around. I plan to
research this a bit more and follow up this post with another article.

about me...
This is my first post on SCN. I am new to SAP and have a fresh perspective of the products and look forward to contributing on
this topic if there is interest. When I get the chance I plan to blog about the use of Vim for a data analyst and the manipulation of
data structures using Python.

Substitution parameters in SAP DS
Posted by Mohammad Shahanshah Ansari Apr 13, 2014
What is a substitution parameter?

Substitution parameters are used to store constant values and are defined at the repository level.
Substitution parameters are accessible to all jobs in a repository.
Substitution parameters are useful when you want to export and run a job containing constant values in
a specific environment.

Scenario to use Substitution Parameters:

For instance, suppose you create multiple jobs in a repository and they reference a directory on your
local computer to read the source files. Instead of creating a global variable in each job to store
this path, you can use a substitution parameter instead. You can easily assign a new value for the
original, constant value in order to run the job in the new environment. After creating a
substitution parameter value for the directory in your environment, you can run the job in a
different environment and all the objects that reference the original directory will automatically
use the new value. This means that you only need to change the constant value (the original
directory name) in one place (the substitution parameter) and its value will automatically
propagate to all objects in the job when it runs in the new environment.

Key difference between substitution parameters and global variables:

You would use a global variable when you do not know the value prior to execution and it needs to be
calculated in the job.
You would use a substitution parameter for constants that do not change during execution. By using a
substitution parameter means you do not need to define a global variable in each job to parameterize a
constant value.

Global Variables                           Substitution Parameters
Defined at job level                       Defined at repository level
Cannot be shared across jobs               Available to all jobs in a repository
Data-type specific                         No data type (all strings)
Value can change during job execution      Fixed value set prior to execution of the job (constants)

How to define the Substitution Parameters?

Open the Substitution Parameter Editor from the Designer by selecting
Tools > Substitution Parameter Configurations....
You can either add another substitution parameter in an existing configuration or you may add a
new configuration by clicking the Create New Substitution Parameter Configuration icon in the toolbar.
The name prefix is two dollar signs $$ (global variables are prefixed with one dollar sign). When
adding new substitution parameters in the Substitution Parameter Editor, the editor automatically
adds the prefix.
The maximum length of a name is 64 characters.

In the following example, the substitution parameter $$SourceFilesPath has the value
D:/Data/Staging in the configuration named Dev_Subst_Param_Conf and the value
C:/data/staging in the Quality_Subst_Param_Conf configuration.



This substitution parameter can be used in more than one job in a repository. You can use
substitution parameters in all places where global variables are supported, like Query transform
WHERE clauses, scripts, mappings, the SQL transform, flat-file options, Address Cleanse transform
options etc. The below script will print the source files path defined above.

Print ('Source Files Path: [$$SourceFilesPath]');
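Along the same lines, a minimal sketch of using the parameter in a script to build a full file path (the file name and the global variable $G_SourceFile are illustrative):

# Hedged sketch: the substitution parameter is resolved at run time, so the same
# script works unchanged in the DEV and Quality environments.
$G_SourceFile = '[$$SourceFilesPath]/customers.csv';
print('Reading source file: ' || $G_SourceFile);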

Associating a substitution parameter configuration with a system configuration:

A system configuration groups together a set of datastore configurations and a substitution
parameter configuration. For example, you might create one system configuration for your DEV
environment and a different system configuration for Quality Environment. Depending on your
environment, both system configurations might point to the same substitution parameter
configuration or each system configuration might require a different substitution parameter
configuration. In below example, we are using different substitution parameter for DEV and
Quality Systems.

To associate a substitution parameter configuration with a new or existing system
configuration:

In the Designer, open the System Configuration Editor by selecting
Tools > System Configurations
You may refer this blog to create the system configuration.

The following example shows two system configurations, DEV and Quality. In this case, there
are substitution parameter configurations for each environment. Each substitution parameter
configuration defines where the data source files are located. Select the appropriate
substitution parameter configuration and datastore configurations for each system
configuration.

At job execution time, you can set the system configuration and the job will execute with the
values for the associated substitution parameter configuration.

Exporting and importing substitution parameters:

Substitution parameters are stored in a local repository along with their configured values. The
DS does not include substitution parameters as part of a regular export. Therefore, you need to
export substitution parameters and configurations to other repositories by exporting them to a
file and then importing the file to another repository.

Exporting substitution parameters
1. Right-click in the local object library and select Repository > Export Substitution Parameter Configurations.
2. Select the check box in the Export column for the substitution parameter configurations to export.
3. Save the file.
The software saves it as a text file with an .atl extension.

Importing substitution parameters
The substitution parameters must have first been exported to an ATL file.

1. In the Designer, right-click in the object library and select Repository > Import from file.
2. Browse to the file to import.
3. Click OK.

JMS Real-Time integration with SAP Data Services
Posted by Martin Bernhardt Apr 9, 2014
Purpose
This how-to guide shows how to integrate a Java Messaging Services (JMS) Provider with SAP Data Services. This is a common
Enterprise Application Integration scenario where a service is called asynchronously via request/response messages. SAP Data
Services' role here is to provide a simple Real-Time service. Configuration includes quite a few steps to get everything up and
running. This step-by-step configuration example covers all components that need to be touched including the JMS provider.


Overview
We want an external information resource (IR) - our JMS provider - to initiate a request by putting a request message into a
request queue. SAP Data Services is the JMS client that waits for request messages, executes a service and puts a correlated
response message into a response queue. We're using the pre-built JMS adapter in SAP Data Services 4.2 and use Active MQ as the
JMS provider. Since we focus on Real-Time integration we're not using an adapter datastore in this scenario. All incoming and
outgoing data is received/sent back via messages. We will configure a Real-Time Job, check the settings of the Job Server and
Access Server, configure a Real-Time Service, install Active MQ and configure the message queues, configure the JMS adapter
and its operation and finally send test messages from the Active MQ console.
Real-Time Job
For our service we're using a "Hello World" Real-Time Job named Job_TestConnectivity. For details, please refer to the SAP
Data Services 4.2 tutorial, Chapter 14. SAP Data Services comes with all the ATL, DTD and XML files in
<DS_LINK_DIR>/ConnectivityTest to create Job_TestConnectivity. The job reads an input message that has one input string



and returns an output message that has one output string with the first two words of the input string in reverse order:



Job Server
We need to make sure that one Job Server supports adapters. Using the Data Services Server Manager utility, we switch on "Support
adapter, message broker communication" and "Use SSL protocol for adapter, message broker communication". We associate the
Job Server with the repository that has the Real-Time Job Job_TestConnectivity. Finally we restart SAP Data Services by
clicking "Close and Restart", or we restart it later using the Control Panel => Administrative Tools => Services => SAP Data
Services (right mouse click) => Restart.

Access Server
We need to have an Access Server up and running. The Access Server will receive the input messages from the JMS adapter and
dispatch them to an instance of the Real-Time Service RS_TestConnectivity. In SAP Data Services Management Console choose
Administrator => Management => Access Server and check if an Access Server is configured and add one if necessary. By
default, the AccessServer uses port 4000.


Real-Time Service
We configure a Real-Time Service RS_TestConnectivity for our Real-Time Job Job_TestConnectivity. In SAP Data Services
Management Console navigate to Administrator => Real-Time => <hostname>:4000 => Real-Time Services => Real-Time
Service Configuration. Configure a new Real-Time Service RS_TestConnectivity and select Job_TestConnectivity with the
Browse-Button:


Add the JobServer as Service Provider and click Apply. Start the Real-Time Service via Administrator => Real-Time =>
<hostname>:4000 => Real-Time Services => Real-Time Service Status, and click "Start":

Active MQ - Installation
We could use any JMS provider but in this case we're using Active MQ since it can be quickly installed and configured.
Download and unzip Active MQ from http://activemq.apache.org/. In this scenario we use version 5.9.0 and we install it in
C:\local\ActiveMQ on the same machine as SAP Data Services. At the command line change to directory C:\local\ActiveMQ\bin
and execute activemq.bat:

Active MQ console
Now we have our JMS provider up and running and we can access the Active MQ console at http://<hostname>:8161/admin .
We're using admin / admin to log in.


The browser should now display the homepage of the Active MQ console:

We click on the Queues menu to add 3 queues named FailedQueue, RequestQueue and ResponseQueue:

Active MQ JMS client
The SAP Data Services JMS Adapter will access the JMS client provided by Active MQ to communicate with the JMS provider.
The JMS client is in activemq-all-5.9.0.jar. We will add this jar file to the ClassPath of the JMS adapter later. According to
the JNDI documentation of Active MQ we need to create a jndi.properties file and either add it to the ClassPath or put it into
activemq-all-5.9.0.jar. The jndi.properties file maps the JNDI names of the queues to their physical names. Create jndi.properties
as shown below. You can add it to activemq-all-5.9.0.jar e.g. by using WinZip.
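A minimal sketch of such a jndi.properties for this scenario is shown below (the queue JNDI names map to the physical queues created above; adjust the broker URL if your ActiveMQ runs elsewhere):

# Hedged sketch of jndi.properties for the ActiveMQ JNDI client.
java.naming.factory.initial = org.apache.activemq.jndi.ActiveMQInitialContextFactory
java.naming.provider.url = tcp://localhost:61616
# queue.<jndiName> = <physicalName>
queue.RequestQueue = RequestQueue
queue.ResponseQueue = ResponseQueue
queue.FailedQueue = FailedQueue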

JMS Adapter
Now we are ready to configure our JMS Adapter in SAP Data Services. In SAP Data Services Management Console, choose
Administrator => Adapter Instances => Adapter Configuration

Choose JMSAdapter

Enter the configuration information as shown below:

set the adapter name; here: MyJMSAdapter
set the Access Server hostname and port; here: localhost, 4000

Remove the default entry of the ClassPath and add the following files to the ClassPath. All necessary jar files - except the JMS
client jar file are located in <DS_LINK_DIR>\lib\ or <DS_LINK_DIR>\ext\lib\. Replace <DS_LINK_DIR> with the
respective directory of your installation.
<DS_LINK_DIR>\lib\acta_adapter_sdk.jar
<DS_LINK_DIR>\lib\acta_broker_client.jar
<DS_LINK_DIR>\lib\acta_jms_adapter.jar
<DS_LINK_DIR>\lib\acta_tool.jar
<DS_LINK_DIR>\ext\lib\ssljFIPS.jar
<DS_LINK_DIR>\ext\lib\cryptojFIPS.jar
<DS_LINK_DIR>\ext\lib\bcm.jar
<DS_LINK_DIR>\ext\lib\xercesImpl.jar
C:\local\ActiveMQ\activemq-all-5.9.0.jar (make sure it contains jndi.properties)
Note: The template file JMSadapter.xml that has the default ClassPath and all other default values and choices, is located in
<DS_COMMON_DIR>\adapters\config\templates. You might want to adjust this file to have other defaults when configuring a
new JMS adapter. Once an adapter is configured you need to change its configuration file located in
<DS_COMMON_DIR>\adapters\config. On Windows <DS_COMMON_DIR> is %ALLUSERSPROFILE%\SAP
BusinessObjects\Data Services by default.


JMS Adapter - JNDI configuration
We use the Java Naming and Directory Interface (JNDI) to configure the JMS adapter. So we chose:

Configuration Type: JNDI

Next we set the Active MQ JNDI Name Server URL:

Server URL: tcp://localhost:61616

For Active MQ we need to set the JNDI context factory to org.apache.activemq.jndi.ActiveMQInitialContextFactory(see
ActiveMQ documentation section JNDI support). By default this string is not offered in the drop down box in the JNDI
configuration section, so we need to edit <DS_COMMON_DIR>\adapters\config\templates\JMSAdapter.xml and add the string
to the pipe-delimited list in the jndiFactory entry.
Note: If MyJMSAdapter already exists, we need to edit <DS_COMMON_DIR>\adapters\config\MyJMSAdapter.xml instead.

<jndiFactory Choices="org.apache.activemq.jndi.ActiveMQInitialContextFactory| >

After navigating to Administrator => Adapter Instances => Adapter Instances => My JMSAdapter we choose the right string
from the drop-down-list and set:

Initial Naming Factory: org.apache.activemq.jndi.ActiveMQInitialContextFactory

Finally we set the Queue Connection Factory and Topic Connection Factory as described in the Active MQ documentation:

Queue Connection Factory: QueueConnectionFactory

Topic Connection Factory: TopicConnectionFactory


Click Apply to save all settings.

JMS Adapter - start
We are ready to check if the JMS adapter starts now. We still have to configure an operation for the adapter (see below), but
we want to check first if our configuration works fine. There are many reasons why the adapter doesn't start at first - e.g. missing
or wrong files in the ClassPath, typos in the JNDI configuration, etc. You will not find any entry in the error file and trace file in this
case - these files are for messages created by the adapter when it is up and running. To find the reason in case the adapter
doesn't start, switch on Trace Mode=True in the JMS Adapter Configuration and restart the JMS Adapter. Then check the Job
Server's log file in <DS_COMMON_DIR>/log/<jobserver>.log. Search for the java call the Job Server executes to launch the
JMS Adapter. Copy the whole command, execute it from the command line and try to fix the problem by adjusting the
command. If the JMS adapter starts properly the Adapter Instance Status will look like this:
JMS Adapter - operation configuration
Now we need to configure the JMS operation. We'll configure an operation of type Get: Request/Reply since we want our
adapter to dequeue requests from the RequestQueue, pass them to the Real-Time Service, wait for the response and enqueue the
response into the ResponseQueue. In DS Management Console navigate to Administrator => Adapter Instances =>
<jobserver>@<hostname>:<port> => Adapter Configuration => MyJMSAdapter => Operations and click Add. Select
Operation Type Get: Request/Reply and Request/Acknowledge using Queues and click Apply.


Set the operation details as displayed below. Since our Real-Time Service will respond to requests quickly, we reduce the polling
interval to 50 ms.

Click Apply to save the changes and restart the adapter. The adapter instance status should look like this:

JMS Adapter - test
Of course we want to see if the adapter works as expected. To do this we put a request message into the request queue and see
what happens. We open the Active MQ console again (URL http://<hostname>:8161/admin, Login: admin/admin) and select
Send. We create the request message as shown below:

After we have clicked Send the message is enqueued into RequestQueue, dequeued from there by the JMS adapter that passes
the request to the Real-Time Service and receives the response from it. The JMS adapter finally puts the response message into
ResponseQueue.
The Active MQ console should look like this after some seconds:
We have one message in the response queue. To display it we click on ResponseQueue and then on the message ID

and have a look at the response message. You should see the "World Hello" string in the message details.



My Template Table Became 'Permanent', and I Want to Change it Back
Posted by Carolyn Caster Apr 7, 2014
On our project, the innocent actions of a team member resulted in my template table being converted into a permanent one, and
since it was used in many data flows, I couldn't just right-click/delete it away. Instead, I searched for online help but found none,
so I experimented with the following steps, which worked for me:

1- Check out the permanent table. From the Central Object Library window, right-click, check out.
2- In Designer, in the datastore area of your Local Object Library, right-click, delete the permanent table you just checked
out.
3- Now, remove the same permanent table from your data flow in Designer, and replace it with a new template table that
you'll create with the exact same name and db (TRANS, STAGE, etc.) as the permanent table had. The name is very important; it
must be exactly the same. Exactly.
4- Save, check, test your data flow.
5- Check in the permanent table. From the Central Object Library window, right-click, check in. You should see the
permanent table disappear from the Tables list, with the table instead appearing in the Template Tables list.

How to use Pre-Load and Post-Load command in Data Services.
Posted by Ramesh Murugan Mar 28, 2014
In this article we will discuss how to use the Pre-Load and Post-Load commands in Data Services.

Business Requirement: We need to execute two programs before and after the transformation. The first program will create or
update a status to receive data from the source into the target system, and the second program will publish the post-transformation
data in the target system. These two programs need to execute before and after the transformation.

For this scenario, we can use the Pre-Load and Post-Load commands. Below are the details.

What are Pre-Load and Post-Load?

They specify SQL commands that the software executes before starting a load or after finishing a load. When a data flow is called, the
software opens all the objects (queries, transforms, sources, and targets) in the data flow. Next, the software executes the Pre-Load
SQL commands before processing any transform. The Post-Load commands are processed after the transform.

How to use them for our business requirement?

We can use both the Pre-Load and Post-Load commands to execute the programs before and after the transform; the steps below explain
this in detail.

Right-click on the target object in the dataflow and press Open.

The target object options will be shown as below.

Both the Pre Load Commands tab and the Post Load Commands tab contain a SQL Commands box and
a Value box. The SQL Commands box contains command lines. To edit/write a line, select the line in the
SQL Commands box. The text for the SQL command appears in the Value box. Edit the text in that box.


To add a new line, determine the desired position for the new line, select the existing line immediately before or after
the desired position, right-click, and choose Insert Before to insert a new line before the selected line, or choose
Insert After to insert a new line after the selected line. Finally, type the SQL command in the Value box. You can
include variables and parameters in pre-load or post-load SQL statements. Put the variables and parameters in either
brackets, braces, or quotes.
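For example, a minimal sketch of a Pre-Load SQL command that clears the current load date from a staging table before the load starts (the table and variable names are illustrative; in pre/post-load SQL, [ ] substitutes the variable value as-is while { } substitutes it in quotes):

DELETE FROM STG_SALES WHERE LOAD_DATE = {$G_LoadDate}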


To delete a line, select the line in the SQL Commands box, right click, and choose Delete.

Open the Post-Load tab and write the post-transformation command in the same way as for Pre-Load.

Save and execute. The job will execute Pre-Load, Transform and Post-Load in a sequence.


Data processing successfully completed as per Business requirement.

Note:
Because the software executes the SQL commands as a unit of transaction, you should not include transaction
commands in Pre-Load or Post-Load SQL statements.

Custom function to get database name from a datastore in DS
Posted by Mohammad Shahanshah Ansari Mar 27, 2014
Getting the database name from a datastore can be useful if you have configured more than one datastore
in your project. It helps avoid code changes while migrating objects from one environment to another
during code promotion.

Below is the step-by-step procedure to create a custom function that gets the database name from a datastore and
to call it in a batch job.

1) Go to your Local Object Library, choose Custom Functions, then right-click on Custom Function and
select New


2) Enter name of the function as CF_DATABASE_NAME


3) Enter the below line of code inside the editor.



Then declare an input parameter named $P_Datastore of type varchar(64). Then validate the function
and, if no error is found, save it.
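The function body itself is only shown in the screenshot; a minimal sketch of what it can contain, using the built-in db_database_name() function (verify the exact behaviour for your database type and DS version):

# Hedged sketch: return the database name configured for the datastore
# whose name is passed in as $P_Datastore.
Return db_database_name($P_Datastore);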


4) Create a global variable at the Job level in any test batch job you have created and name this global
variable $G_Database, of type varchar(64)


5) Call this function in one script of your batch job and use this Global variable wherever required in your
Job.


You can call this function as shown below in a script. You simply need to pass the name of your datastore in single
quotes.
$G_Database = CF_DATABASE_NAME('datastore_name');


Example of a practical use:


sql('CONV_DS', 'TRUNCATE TABLE ' || $G_Database || '.ADMIN.CONV_CUSTOMER_ADDRESS');


So the above code will read the database name at run time from the global variable and truncate the table before
the Job loads it. This code can be used in any environment when you promote your code; if the
database name were hard-coded instead, you would end up updating the code in every new environment.

Custom function in BODS to remove special characters from a string
Posted by Mohammad Shahanshah Ansari Mar 25, 2014
Below is step by step procedure to write a custom function in BODS to remove special characters in a string using ASCII values.

Step 1: Create a custom function in BODS and name it as 'CF_REMOVE_SPECIAL_CHARS'

Step 2: Use the below code in your function.


# This function is to remove special characters from the string.It only retains alphabets and numbers from the string.

$L_String =$P_Input_Field;
$L_String_Length =length( $L_String );
$L_Counter =1;
$L_String_final =null;

while($L_String_Length>0)
begin
$L_Char =substr( $L_String ,$L_Counter,1);

if((ascii($L_Char)>=48 and ascii($L_Char)<=57) or (ascii($L_Char)>=65 and ascii($L_Char)<=90) or (ascii($L_Char)>=97 and
ascii($L_Char)<=122))
begin
$L_String_final =$L_String_final||$L_Char;
$L_Counter =$L_Counter+1;
$L_String_Length =$L_String_Length-1;
end

else

begin
$L_Counter =$L_Counter+1;
$L_String_Length = $L_String_Length-1;
end
end

Return replace_substr( replace_substr( rtrim_blanks( rtrim_blanks( $L_String_final )),' ',' '),' ', ' ');


Your code in Editor would look like as under:


Step 3: Declare Parameters and local variables as shown in left pane of the above function editor.

$P_Input_Field - parameter type is input (data type varchar(255) )
$L_Char - datatype varchar(255)
$L_Counter - datatype int
$L_String - datatype varchar(255)
$L_String_final - datatype varchar(255)
$L_String_Length - datatype int

Note: Change the parameter return type to Varchar(255). By default return type is int.

Step 4: Save this function.

Step 5: Call this function while mapping any field in Query Editor where you want to remove special characters.

Ex: CF_REMOVE_SPECIAL_CHARS(Table1.INPUT_VAL)

The above function call removes all special characters from the INPUT_VAL field of Table1, and the output values will look like
the data shown below.
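For example, calling the function in a script (the input value here is made up for illustration):

# Hypothetical test call; the literal below is not from the original example.
print(CF_REMOVE_SPECIAL_CHARS('AB-12#Z@5!'));
# Expected to print: AB12Z5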


1316 Views 0 Comments Permalink Tags: custom_functions, string_functions, remove_special_characters, remove_special_chars

How to capture error log in a table in BODS
Posted by Mohammad Shahanshah Ansari Mar 19, 2014
I will walk you through a step-by-step procedure for capturing error messages if any dataflow fails in a
Job. I have taken a simple example with a few columns to demonstrate.


Step 1: Create a Job and name it as ERROR_LOG_JOB


Step 2: Declare the following four global variables at the Job level. Refer to the screenshot below for the names and data
types.

Step 3: Drag a Try block, a Dataflow and a Catch block into the work area and connect them as shown in the diagram below.
Inside the dataflow you can drag any existing table in your repository as a source and populate a few columns to a target
table. Make sure the target table is a permanent table. This is just for the demo.



Step 4: Open the Catch block, drag one script inside it and name it as shown in the diagram below.



Step 5: Open the script and write the code below inside it, as shown in the diagram.



The above script populates the global variables using some built-in BODS functions and calls
a custom function to log the errors into a permanent table. This function does not exist yet; we will
create it in later steps.
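Since the script appears above only as a screenshot, here is a minimal sketch of what it could contain. It assumes the four global variables from step 2 are named after the ERROR_LOG columns and that the custom function is called CF_ERROR_LOG; both names are assumptions, not taken from the screenshots.

# Capture details of the error that triggered the catch block.
$G_ERROR_NUMBER = error_number();
$G_ERROR_CONTEXT = error_context();
$G_ERROR_MESSAGE = error_message();
$G_ERROR_TIMESTAMP = to_char(sysdate(), 'YYYY.MM.DD HH24:MI:SS');

# Write the captured values to the ERROR_LOG table via the custom function
# created in steps 6 and 7.
CF_ERROR_LOG($G_ERROR_NUMBER, $G_ERROR_CONTEXT, $G_ERROR_MESSAGE, $G_ERROR_TIMESTAMP);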


Step 6: Go to the Custom Functions section in your repository, create a new custom function and name it as shown below.





Step 7: Click Next in the above dialog box and write the code below inside the function. You need to declare parameters
and local variables as shown in the editor below. Keep the data types of these parameters and local variables the same as
those of the global variables in step 2. Validate the function and save it.
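The function body is likewise shown only as a screenshot. A sketch of what it could look like, assuming input parameters named $P_ERROR_NUMBER, $P_ERROR_CONTEXT, $P_ERROR_MESSAGE and $P_ERROR_TIMESTAMP and the ETL_CTRL datastore referred to later in step 8 (these names are assumptions):

# {$...} substitutes the parameter value enclosed in single quotes.
sql('ETL_CTRL',
'INSERT INTO dbo.ERROR_LOG (ERROR_NUMBER, ERROR_CONTEXT, ERROR_MESSAGE, ERROR_TIMESTAMP) VALUES ({$P_ERROR_NUMBER}, {$P_ERROR_CONTEXT}, {$P_ERROR_MESSAGE}, {$P_ERROR_TIMESTAMP})');

Return 0;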





Step 8: Now your function is ready to use. Assuming that you have SQL Server as the database where you want to
capture these errors, create a table to store the information.

CREATE TABLE [dbo].[ERROR_LOG](
[SEQ_NO] [int] IDENTITY(1,1) NOT NULL,
[ERROR_NUMBER] [int] NULL,
[ERROR_CONTEXT] [varchar](512) NULL,
[ERROR_MESSAGE] [varchar](512) NULL,
[ERROR_TIMESTAMP] [VARCHAR] (512) NULL
)

You may change the datastore as per your requirement. I have used ETL_CTRL as the datastore in the above function,
which is connected to the SQL Server database where the above table is created.

Step 9: Just to make sure that the dataflow fails, we will force it to throw an error at run time. Inside your
dataflow use a permanent target table. Double click the target table and add one text line below the existing comment
under the Load Triggers tab. Refer to the screenshot below. This is one way to throw an error in a dataflow at run time.





Step 10: Now your Job is ready to execute. Save and execute your Job. You should get an error message in the monitor
log. Open the table in your database and check whether the error log information is populated. The error log will look as
shown below.



The ERROR_LOG table captures the same error message, as shown below.





Hope this helps. In case you face any issue, do let me know.

1695 Views 2 Comments Permalink Tags: error_log, error_messages, error_statistics

Advantage of Join Ranks in BODS
Posted by Mohammad Shahanshah Ansari Mar 18, 2014
What is Join Rank?

You can use join rank to control the order in which sources (tables or files) are joined in a dataflow. The
highest ranked source is accessed first to construct the join.

Best Practices for Join Ranks:
Define the join rank in the Query editor.
For an inner join between two tables, in the Query editor assign a higher join rank value to the larger table
and, if possible, cache the smaller table.

Default, Max and Min values in Join Rank:
The default value for join rank is 0, and the rank can be any non-negative number.
If you have tables T1, T2 and T3 with join ranks of 10, 20 and 30, then table T3 has the highest join
rank and will therefore act as the driving table.


Performance Improvement:


Controlling join order can often have a huge effect on the performance of producing the join result. Join
ordering is relevant only in cases where the Data Services engine performs the join. In cases where the
code is pushed down to the database, the database server determines how a join is performed.

Where Join Rank to be used?


When the code is not fully pushed down and the sources contain a huge number of records, join rank may be considered.
The Data Services Optimizer considers join rank and uses the source with the highest join rank as the left
source. Join rank is very useful in cases where the DS Optimizer is not able to work out the most efficient execution plan
automatically. The source with the higher join rank value drives the join.

You can print a trace message to the Monitor log file which allows you to see the order in which the Data Services
Optimizer performs the joins. This information may help you to identify ways to improve the performance. To add
the trace, select Optimized Data Flow in the Trace tab of the "Execution Properties" dialog.

This article will continue with a real-time example on join rank soon.
962 Views 7 Comments Permalink Tags: performance_optimization, join_rank

Some cool options in BODS
Posted by Mohammad Shahanshah Ansari Mar 16, 2014
I have found a couple of cool options in BODS and apply them in almost all the projects I work on. You
may also give them a try if you have not done so yet. Hope you like these. You can see all these options in the Designer.

Monitor Sample Rate:
Right Click the Job > Click on Properties> Then click on Execution Options


You can change the value of the monitor sample rate here, and every time you execute the Job it will take the
latest value set.

Setting this value to a higher number improves performance, and you no longer need to enter the value each
time you execute the Job. The frequency with which the Monitor log refreshes the statistics is
based on this monitor sample rate. With a higher monitor sample rate, Data Services collects more data
before calling the operating system to open the file, and performance improves. Increase the monitor sample
rate to reduce the number of calls to the operating system to write to the log file. The default value is
5 and the maximum value you can set is 64000.

Refer the below screen shot for reference.







Click on the Designer menu bar and select Tools > Options (see the diagram below). There are a couple of cool
options available here which can be used in your project. Note that if you change any option from here, it will apply
to the whole environment.





Once selected Go to:
Designer > General > View data sampling size (rows)
Refer to the screenshot below. You can increase this value if you want to see more
records while viewing data in BODS. The sample size can be controlled from here.



Designer > General > Perform complete validation before Job execution
Refer to the screenshot below. I prefer to set this here so that I do not need to worry about validating the Job
manually before executing it. If you are testing the Job and there is a chance of syntax errors,
I would recommend setting this beforehand. This will save some time. Check this option if you want
to enable it.





Designer > General > Show dialog when job is completed
Refer to the screenshot below. This is also one of the cool options available in the Designer. It opens a
dialog box when the Job completes, so you do not need to check the monitor log manually for each
Job. I love this option.



Designer > Graphics
Refer to the screenshot below. Using this option you can change the line type as you like. I personally
like Horizontal/Vertical, as all transforms look cleaner inside the dataflow. You can also change the
color scheme, background, etc.




Designer > Fonts
See the dialog box below. Using this option, you can change the Font Size.





Do feel free to add to this list if you have come across more cool stuff in BODS.

851 Views 0 Comments Permalink Tags: bods_options, bods_environment_settings

Quick Tips for Job Performance Optimization in BODS
Posted by Mohammad Shahanshah Ansari Mar 15, 2014
Ensure that most of the dataflows are optimized. Maximize the push-down operations to the database as
much as possible. You can check the optimized SQL using the option below inside a dataflow. The SQL should
start with an INSERT INTO ... SELECT statement.

Split complex logic in a single dataflow into multiple dataflows if possible. This is much easier to
maintain in the future, and most of the dataflows can then be pushed down.

If full pushdown is not possible in a dataflow, enable the bulk loader on the target table. Double click
the target table to enable the bulk loader as shown in the diagram below. The bulk loader is much faster than a
direct load.


Right click the datastore, select Edit, go to the Advanced option and then edit it. Change
Ifthenelse Support to Yes. Note that by default this is set to No in BODS. This will push down all the
decode and ifthenelse functions used in the Job.
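For example (the column names here are hypothetical, purely for illustration), mappings such as the following can then be converted to database CASE expressions and pushed down:

ifthenelse(SRC.ORDER_STATUS = 'C', 'Closed', 'Open')
decode(SRC.COUNTRY_CODE = 'IN', 'India', SRC.COUNTRY_CODE = 'US', 'United States', 'Other')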


Index Creation on Key Columns: If you are joining more than one table, ensure that the tables have
indexes created on the columns used in the where clause. This drastically improves performance. Define
primary keys while creating the target tables in DS. In most databases, indexes are created
automatically if you define the keys in your Query transforms. Therefore, define primary keys in the Query
transform itself when you first create the target table. This way you can avoid manual index creation on
a table.


Select Distinct: In BODS, Select Distinct is not always pushed down. It can be pushed down only if you
check the Select Distinct option in the query transform just before the target table. So if you need Select Distinct,
use it in the last query transform.


Order By and Group By are not always pushed down in BODS. They can be pushed down only if there is a
single Query transform in the dataflow.


Avoid data type conversions, as they prevent full pushdown. Validate the dataflow and ensure there are no
warnings.


Parallel Execution of Dataflows or Workflows: Ensure that workflows and dataflows do not execute in
sequence unnecessarily. Run them in parallel wherever possible.


Avoid parallel execution of Query transforms in a dataflow, as it prevents full pushdown. If the same set of
data is required from a source table, use another instance of the same table as a source.


Join Rank: Assign a higher join rank value to the larger table. Open the Query editor where the tables are
joined. In the diagram below, the second table has millions of records, so it has been assigned a higher join
rank (the highest number has the highest rank). This improves performance.


Database links and linked datastores: Create database links if you are using more than one database for
source and target tables (multiple datastores), or if you are using different database servers. You can refer
to my other article on how to create the DB link. Click URL


Use of Joins in place of Lookup Functions: Use the lookup table as a source table and set it as an outer join in
the dataflow instead of using lookup functions. This technique has an advantage over lookup functions, as it
pushes the execution of the join down to the underlying database. It also makes the dataflow much easier to
maintain.
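As a sketch (the datastore, table and column names below are invented for illustration), a lookup done with lookup_ext() in a mapping looks like this and runs in the Data Services engine rather than in the database:

lookup_ext([DS_LOOKUP.DBO.CUSTOMER_MASTER,'PRE_LOAD_CACHE','MAX'],
[CUSTOMER_NAME],[NULL],[CUSTOMER_ID,'=',Query.CUSTOMER_ID])

The alternative described above is to add CUSTOMER_MASTER as a second source table, define an outer join on CUSTOMER_ID in the Query transform, and map CUSTOMER_NAME directly; that join can be pushed down to the database.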


Hope this will be useful.
1292 Views 7 Comments Permalink Tags: performance_optimization, job_optimization, optimization_tips

How to Create System Configuration in Data Services
Posted by Mohammad Shahanshah Ansari Mar 14, 2014
Why do we need a system configuration in the first place? Well, the advantage of having a system configuration is that you can
use it for the lifetime of a project. In general, all projects have multiple environments to load data into as the project progresses
over time. Examples are DEV, Quality and Production environments.

There are two ways to execute your Jobs in multiple environments:
Edit the datastore configuration manually to execute Jobs in a different environment, defaulting it to the latest environment.
Create the system configuration once and select the appropriate environment while executing the Job from the Execution
Properties window. We are going to discuss this option in this blog.

Followings are the steps to create system configuration in Data Services.

Prerequisite to set up the system configuration:
You need to have at least two configurations ready in any of your datastores pointing to two different databases. For example,
one for staged data and another for target data. This can be done easily by editing the datastore. Right click the datastore and
select Edit.

Step 1: Execute any existing Job to check whether your repository already has a system configuration created.
The dialog box below will appear once you execute a Job. Do not click the OK button to execute; this is just to check the
execution properties.

If you look at the below dialog box, there is no system configuration to select.


Step 2:
Cancel the above Job execution, click on the Tools menu as shown below and select System Configurations.


Step 3: You can see the below dialog box now. Click on the icon (red circle) as shown in the below dialog box to Create New
Configuration. This dialog box will show all the data stores available in your repository.

Step 4: Once you click the above button, it shows the dialog box below with default configuration details for all datastores. Now
you can rename the system configuration (by default it is System_Config_1, System_Config_2, etc.).

Select an appropriate configuration name against each datastore for your system configuration. I have taken the DEV and History DBs
as an example. Note that these configurations should already be available in your datastores.

See in the dialog box below how it is selected. You can create more than one configuration (say, one for DEV and another for
History).

Once done, click the OK Button. Now your system configuration is ready to use.


Step 5: Now execute any of the existing Jobs again. You can see System Configuration added to the 'Execution
Properties' window, which was not available before. From the drop-down list you can select the appropriate environment in which to execute
your Job.


Do let me know if you find it useful. Feel free to revert in case you face any issue while configuring. Hope this helps.
1064 Views 12 Comments Permalink Tags: eim, data_services, sap_data_services, system_configuration, system_configuration_wizard, sap_data_integrator, system_config

Invoke Webservices Using SAP Data Services 4.2
Posted by Lijo Joseph Mar 14, 2014
Sometimes it is necessary to load data into a system that is accessed through web services.

For example, for a requirement where the downstream system is the Oracle File Based Loader, which demands a file from the storage
server as an input, web services will be the preference for most users, since they can handle multiple files from a single
zip/archive file and load into many tables.

We would like to help you understand the simple steps to invoke web services through Data Services.

Steps involved.

1. Create a datastore for web services.
Provide the link to the web service and its credentials as in the sample below.



2. Import the function to WS data store


A web service usually comprises many functions. The function needed for a particular requirement has to be
imported into the datastore created for web services, under its Functions segment.




3. Create the Job

Create a job which will take care of the following things




The input details required for the web service can be declared as global variables, and the customized columns can be prepared as per
the requirement. The columns below are the required columns for the given sample.




Create a nested /unnested column structure which is equivalent to the web services input data structure.





In order to get the column structure of the web service, follow the steps below.

Right click on the output schema -> New Function Call -> select the web services datastore -> select
the web service function you need to invoke from the list.






Drag and drop, or key in, the input schema name into the text box in the Input Parameter Definition pop-up.

The success or failure of the function call can be verified using the return codes of the web service
function. Depending on the error handling design, you can divert the results to error tables.




Default return code for a successful web service call is 0.
652 Views 5 Comments Permalink Tags: eim, webservices, data_services, bods, sap_data_services

Use Match Transform for Data De-duplication
Posted by Ananda Theerthan Mar 13, 2014
Many a time we have to find potential duplicates in the data and correct them so that correct and harmonized data can be
transferred to the target system.
During the ETL process we might have to find and remove duplicate records to avoid data redundancy in the target system.

Data Services has two powerful transforms that can be used for many scenarios: the Match and Associate transforms under Data
Quality.

These two transforms in combination can do a lot of data quality analysis and take the required actions. In this part we will just see how
to use the Match transform to identify duplicates in address data and eliminate them.

In the next tutorial, we shall see how to post correct data back from the duplicate record onto the original driver record.
The sample process that I used is demonstrated in below video.
381 Views 0 Comments Permalink Tags: sap, business_intelligence_(businessobjects), data_integration_and_quality_management, data_services, data_quality, bods, bodi, data_integration, data_services_4.1, data_services_workbench, sap_businessobjects_data_services_workbench, bods_concepts

Transfer data to SAP system using RFC from SAP Data Services
Posted by Ananda Theerthan Mar 13, 2014
This is just a sample to demonstrate data transfer to SAP systems using RFC from Data Services.

To serve the purpose of this blog, I am going to transfer data to an SAP BW system from Data Services.

Sometimes we may need to load some lookup or reference data into an SAP BW system from external sources.
Instead of creating a data source, this method directly pushes data into the database table using RFC.
Below, I will explain the steps that I used to test the sample.

1) Create a transparent table in SE11.

2) Create a function module in SE37 with import and export parameters.


3) The source code for the FM goes below.
FUNCTION ZBODS_DATE.
*"----------------------------------------------------------------------
*"*"Local Interface:
*"  IMPORTING
*"     VALUE(I_DATE) TYPE CHAR10
*"     VALUE(I_FLAG) TYPE CHAR10
*"  EXPORTING
*"     VALUE(E_STATUS) TYPE CHAR2
*"----------------------------------------------------------------------

  data: wa type zlk_date.

  if not I_DATE is INITIAL.
    clear wa.
    CALL FUNCTION 'CONVERT_DATE_TO_INTERNAL'
      EXPORTING
        DATE_EXTERNAL            = i_date
*       ACCEPT_INITIAL_DATE      =
      IMPORTING
        DATE_INTERNAL            = wa-l_date
*     EXCEPTIONS
*       DATE_EXTERNAL_IS_INVALID = 1
*       OTHERS                   = 2
      .
    IF SY-SUBRC <> 0.
*     Implement suitable error handling here
    ENDIF.

    wa-flag = i_flag.
    insert zlk_date from wa.
    if sy-subrc ne 0.
      update zlk_date from wa.
    endif.

    e_status = 'S'.
  endif.

ENDFUNCTION.


4) Remember to set the attribute of the FM to RFC enabled, otherwise it will not be accessible from Data Services.

5) Make sure both the custom table and function module are activated in the system.
6) Log in to DS Designer and create a new datastore of type "SAP Application" using the required details.
7) In the Object Library, you will see an option for Functions. Right click on it and choose "Import By Name". Provide the
function module name you just created in the BW system.

8) Now, build the job with source data, a query transform and an output table to store the result of function call.

9) Open the query transform editor, do not add any columns, right click and choose "New Function Call".
10) The imported function will be available in the list of available objects. Now, just choose the required function and provide the
input parameters.

11) Note that for some reason, Data Services doesn't recognize the DATS data type from SAP. Instead, you have to use it as CHAR
and do the conversion later.

Hence, I am using the to_char function to do the conversion to character format.
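A possible mapping for the I_DATE input parameter (the source column name here is invented for illustration) would be:

to_char(SRC.POSTING_DATE, 'DD.MM.YYYY')

This passes the date as a 10-character external date string, which CONVERT_DATE_TO_INTERNAL then converts inside the function module.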

12) Now, save the Job and Execute. Once completed, check the newly created table in BW system to see the transferred data.


As this is just a sample, an RFC-enabled function module can be designed appropriately to transfer data to any SAP system. The
procedure is similar for BAPIs and IDocs. You just need to provide the required parameters in the correct format and it works.
1022 Views 0 Comments Permalink Tags: abap, enterprise_data_warehousing/business_warehouse, data_services, business_intelligence, bods, data_migration, data_integration, data_services_4.1, bods_concepts

How to create Alphanumeric Group Counters in DS
Posted by Mohammad Shahanshah Ansari Mar 13, 2014
You might have come across a scenario where you need to create more than 99 group counters for a particular group. Take the
instance of the Task List object in PM. The Task List object has three important tabs: Header, Operation and Operation
Maint. Package. The Header has a field called group counter, which has a maximum length of two digits in SAP, meaning it can't exceed
99. So if your group counter is less than or equal to 99, a two-digit number is ideal to use as the group counter. But this may not
always be the case. What if you have more than 99 group counters for a group? In that case we have no option left but
to generate an alphanumeric group counter.

How to generate Alphanumeric Group Counters:

There could be many ways of generating alphanumeric group counters. In this post I will illustrate one of the
simplest and easiest ways of doing it. Since two characters is the limit for the Task List group counter, let's create a two-character
alphanumeric group counter to cope with the current requirement of more than 99 group counters. We can take combinations of two
letters. We have 26 letters in English, so the number of combinations we can generate for group counters is 26*26 = 676. I am
assuming that your group counter won't go beyond 676. The SAP recommendation is a maximum of 99 group counters for each group.

Steps to create the Group Counters:

1. Create the header, operation and Maint. Package in the task list as usual, but for the group counter, instead of generating a
two-digit number, generate three-digit numbers using the gen_row_num_by_group(Group_Name) function. Group
counters are generated for each group separately.

2. Create a lookup table (permanent table) to map numeric group counters to their alphanumeric group counters. Your lookup
will have alphanumeric group counters like AA, AB, AC, AD, ..., AZ, BA, BB, BC, ..., BZ, CA, CB, ..., and so on. This
lookup table shall contain all the possible combinations, which are 676 in total.

3. In the dataflow for Header, Operation and Maint. Package, add a query transform at the end that uses the lookup_ext() function.
This function maps the three-digit group counters to their alphanumeric group counters.

Your lookup function for the group counter field in the query transform will look like this:

lookup_ext([DS_SEC_GEN.DBO.MAP_GROUP_COUNTER,'PRE_LOAD_CACHE','MAX'],
[CHAR_GROUP_COUNTER],[NULL],[GROUP_COUNTER,'=',Query_6_1_1_1.Group_Counter]) SET
("run_as_separate_process"='no', "output_cols_info"='')
360 Views 0 Comments Permalink

Dynamic File Splitting using SAP Data Services
Posted by Ananda Theerthan Mar 12, 2014
There might be a requirement where you need to split bulk table data into several files for loading further into other systems.
Instead of creating a single file and executing the job several times, in Data Services we can automate this process and split the files
dynamically based on the number of files and records required.

So let's assume we have a table T1 that contains 10 million records and we are required to split it into chunks of 10,000 records each.

Overview of dynamic file splitting process

1) Add a new column to the table and populate it with sequential numbers. This will be used to identify the chunks of records.
2) Create a script to declare and initialize variables for the file count (e.g. 50), record count (e.g. 10000), etc.
3) Create a WHILE loop to run as many times as the number of files required.
4) Create a new DF inside the loop to split the records and push them to the file format.
5) Create a post-processing script to increment the variable values. A sketch of these scripts is shown below.
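A minimal sketch of the scripts and the filter, assuming variable names $G_FileNo, $G_MaxFiles and $G_ChunkSize and a sequence column ROW_SEQ added in step 1 (all of these names are assumptions for illustration):

# Initialization script (step 2)
$G_FileNo = 1;
$G_ChunkSize = 10000;
$G_MaxFiles = 1000;   # 10,000,000 rows / 10,000 rows per file

# WHILE loop condition (step 3): $G_FileNo <= $G_MaxFiles

# WHERE clause of the Query transform inside the loop (step 4):
# T1.ROW_SEQ > ($G_FileNo - 1) * $G_ChunkSize and T1.ROW_SEQ <= $G_FileNo * $G_ChunkSize
# File name of the target file format, so that each iteration writes a new file:
# 'T1_CHUNK_' || $G_FileNo || '.txt'

# Post-processing script (step 5)
$G_FileNo = $G_FileNo + 1;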

Sample working demo of this process is shown in this video.
655 Views 2 Comments Permalink Tags: data_integration_and_quality_management, data, data_services, bods, bodi, data_services_4.1, data_services_workbench, sap_data_services, sap_businessobjects_data_services_workbench, bods_concepts

How to create a Database links in Data Services using SQL
Server
Posted by Mohammad Shahanshah Ansari Mar 12, 2014
Sometimes you need to use multiple databases in a project, where source tables may be stored in one database and target tables
in another. The drawback of using two different databases in BODS is that you cannot perform a full pushdown
operation in a dataflow, which may slow down the job execution and create performance issues. To overcome this we can create a
database link and achieve a full pushdown operation. Here is a step-by-step procedure to create a database link in BODS using SQL
Server on your local machine.
Prerequisites to create a database link:
1. You should have two different datastores created in your local repository which are connected to two different databases in SQL
Server (e.g. the local server).
Note: You may have these databases on a single server or two different servers. It is up to you.
2. These two different databases shall exist in your local SQL Server.
How to create a database link:
Step 1: Create two databases named DB_Source and DB_Target in your local SQL Server.
SQL Server code to create the databases (execute this in your query browser):
CREATE Database DB_Source;

CREATE Database DB_Target;
Step 2: Create a datastore in your local repository named DS_Source and connect it to the DB_Source database. Create another
datastore named DS_Target and connect it to the DB_Target database.
Now, I want to link the DS_Target datastore with the DS_Source datastore so that they behave as a single datastore in Data Services.
Use below details in screenshot to create your Datastores:

a) Create DS_Source Datastore as shown under



b) Create DS_Target Datastore as shown under





Before we go to the third step, let's create a Job and see what happens without a database link when we use tables from
these datastores in a dataflow. Will it perform a full pushdown?

Step 3:
Follow the below screen shot to create your Project, Job and Dataflow in Designer.


Now go to your SQL Server database, open a query browser and use the SQL code below to create a table with some data in the
DB_Source database.
a)
--Create a sample table in SQL Server
Create table EMP_Details(EmpID int identity, Name nvarchar(255));
--Insert some sample records (EmpID is an identity column, so only Name is supplied)
Insert into EMP_Details (Name) values ('Mohd Shahanshah Ansari');
Insert into EMP_Details (Name) values ('Kailash Singh');
Insert into EMP_Details (Name) values ('John');
b) Once the table is created, import EMP_Details into your DS_Source datastore.

c) Drag the table from the datastore into your dataflow and use it as the source table. Use a query transform, then drag a template table
and fill in the details as shown in the screenshot below. This creates a target table in the DS_Target datastore.






Once the target table is created, your dataflow will look as shown below.







d) Map the columns in Q_Map transform as under.


Now you have the source table coming from one database (DB_Source) and the target table stored in another database
(DB_Target). Let's see whether the dataflow performs a full pushdown or not.


How to see whether full pushdown is happening or not?

Go to the Validation tab in your Designer and select the Display Optimized SQL option. Below is the screenshot for the same.





Below window will pop up once you select above option.






If the optimized SQL code starts with a SELECT clause, that means a full pushdown is NOT being performed. For a full
pushdown, the SQL query has to start with an INSERT command.

Step 4:
How to Create a Linked Server in SQL Server

Now go to SQL Server Database and Create a linked Server as shown in the screen below.



Fill the details as shown in the screen below for General Tab

Now, go to Security tab choose the option as shown in below dialog box.


Click the OK button. Your linked server is created successfully.

Step 5:
Now It is time to create a datastore link and then see what optimized SQL it will generate.

Go to advance mode of your DS_Target datastore property and Click on Linked Datastore and choose DS_Source Datastore
from the list and then click OK Button.






Below dialog box will appear. Choose Datastore as DS_Source and click Ok.




Then click on the browse button as shown below.




Then select the option as shown in the dialog box below and click the OK button.



Now you have successfully established a database link between two datastores i.e. between DS_Source and DS_Target.



Now Save the BODS Job and check the Optimized SQL from Validation Tab as done earlier.

Go to the dataflow and see what code is generated in Optimized SQL.



Below optimized code will be shown.





You can see that the SQL now has an INSERT command, which means a full pushdown is happening for your dataflow.
This is the way we can create a database link for SQL Server in DS, use more than one database in a Job and still perform
full pushdown operations.
