
Scheduling BODS Jobs Sequentially and Conditionally

Introduction:
This article provides various solutions for scheduling multiple BODS batch Jobs sequentially and conditionally. BODS does not contain an inbuilt mechanism to chain multiple Jobs within one parent Job; the default way of working is to chain multiple workflows within a Job. However, a workflow cannot be executed on its own and needs a Job to execute it, and in various scenarios there is a need to sequence Jobs and to run them conditionally. The approaches provided below can be used where chaining multiple workflows within a Job is not enough and Jobs have to be chained/sequenced. The advantages of using these approaches are:

1. There is no need for a third-party scheduling tool. Various features within BODS can be combined to create a Job which acts as a parent Job and can be scheduled to trigger Jobs one after the other. The parent Job acts as a sequencer of Jobs.
2. Scheduling each and every Job can be avoided; only the parent Jobs need to be scheduled.
3. Using the web services approach, Global Variables can be passed via an XML file to the Jobs in a simplified manner.
4. Using the web services approach, the developer only needs access to a folder that the Job Server can access (to place the XML files) and does not require access to the Job Server itself.
5. It avoids loading a single Job with too many workflows.
6. Time-based scheduling (for example, scheduling Jobs at 10-minute intervals) can be avoided, hence there will not be any overlap if the preceding Job takes more than 10 minutes.
7. As the child Jobs and the parent Job each have their own trace logs, it is easier to troubleshoot in case of any issues.
8. At any point, the child Jobs can also be run independently in the production environment; this would not be possible if the entire Job logic were put into a workflow.

Scheduling BODS Jobs Sequentially:

If the requirement is simply to sequence the Jobs so that they are executed one after the other, irrespective of whether the preceding Job completes successfully or terminates with an error, then one of the approaches below can be used. Note that in the examples provided below it is assumed that the Jobs do not have any Global Variables. The approach for chaining/sequencing Jobs with Global Variables is explained later in the article.

Sequencing using Script:

Consider two simple Jobs, Job1 and Job2, that are to be executed in sequence, where Job2 has no business dependency on Job1. The requirement is to execute only one Job at a time, i.e. Job1 can run first and then Job2, or the other way round, but no two Jobs should run at the same time. This restriction could exist for various reasons, such as efficient utilization of the Job Server, or because both Jobs use the same temp tables.

Steps to Sequence Jobs using Script:

1. Export the Jobs as .bat files (Windows) using the Export Execution Command option in the Management Console.

2. Check the availability of the Job1.bat and Job2.bat files on the Job Server.
3. Create a new parent Job (call it Schedule_Jobs) with just one Script object.
4. In the Script, call Job1 and Job2 one after the other using the exec function, as given below:

Print('Trigger Job1');
Print(exec('C:\Program Files (x86)\Business Objects\BusinessObjects Data Services\log\Job1.bat','',8));
Print('Trigger Job2');
Print(exec('C:\Program Files (x86)\Business Objects\BusinessObjects Data Services\log\Job2.bat','',8));

When the Schedule_Jobs parent Job is run, it triggers Job1 and then, after Job1 finishes (successful completion or termination), it triggers Job2. The parent Job can now be scheduled in the Management Console to run at a scheduled time, and it will trigger both Job1 and Job2 in sequence as required. Note that if Job1 hangs for some reason, Schedule_Jobs will wait until Job1 comes out of the hung state and returns control to it. In this way any number of Jobs can be sequenced.
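Because both Jobs are triggered regardless of outcome, the exec() return value is only printed above. As a minimal sketch (not part of the original steps), the return string could also be captured into a local variable such as $L_Return (which would have to be declared in the parent Job) and inspected. This assumes that with flag 8 exec() waits for the command and returns its return code at the start of the returned string, followed by the command's output, and that the exported .bat returns a non-zero code when the job fails:

$L_Return = exec('C:\Program Files (x86)\Business Objects\BusinessObjects Data Services\log\Job1.bat','',8);
Print('Job1 launcher returned: [$L_Return]');
# Assumption: the return string starts with the launcher's return code, '0' meaning success
if (substr($L_Return, 1, 1) != '0')
begin
    Print('Job1 batch file reported a failure');
end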

Sequencing using Webservices:

If the same two Jobs (Job1 and Job2) have to be executed in sequence using web services, the approach below can be used.

1. Publish both Job1 and Job2 as web services from the Management Console.
2. Pick up the web service URL using the View WSDL option; the link will be of the form http://<hostname>:28080/DataServices/servlet/webservices?ver=2.1&wsdlxml

3. In the Designer, create a new datastore with the datastore type set to WebService, and provide the web service URL fetched from the View WSDL option.

4. Import the published Jobs as functions into this datastore.

5. Create a simple parent Job (call it Simple_Schedule) to trigger Job1 and Job2.

6. In the Call_Job1 query object, call Job1 as shown in the diagrams below. As no inputs are required for Job1, the DI_ROW_ID from the Row_Generation transform (or NULL) can be passed to Job1.

7. Similarly, call Job2 in the Call_Job2 query object.

When the Simple_Schedule parent Job is run, it triggers Job1 and then, after Job1 finishes (successful completion or termination), it triggers Job2. The parent Job can now be scheduled in the Management Console to run at a scheduled time, and it will trigger both Job1 and Job2 in sequence as required. Note that if Job1 hangs for some reason, the parent Job will wait until Job1 comes out of the hung state and returns control to it. In this way any number of Jobs can be sequenced.

Scheduling BODS Jobs Conditionally:

In most cases Jobs are dependent on other Jobs, and some Jobs should only run after all the Jobs they depend on have run successfully. In these scenarios Jobs should be scheduled to run conditionally.

Conditional Execution using Script:

Let's consider that Job2 should be triggered after successful completion (not termination) of Job1, and that Job2 should not be triggered if Job1 fails. The Job status can be obtained from the repository table/view ALVW_HISTORY. The status of the latest instance of the Job1 run should be checked, and based on that Job2 should be triggered. To do this:

1. Create the repository database/schema as a new datastore (call it BODS_REPO).
2. Import the ALVW_HISTORY view from the datastore.
3. Create a new parent Job Conditionally_Schedule_Using_Script with just one Script object.
4. Create two variables, $JobStatus and $MaxTimestamp, in the parent Job.
5. Between the exec functions, place the status-check code as given below:

Print('Trigger Job1');
Print(exec('C:\Program Files (x86)\Business Objects\BusinessObjects Data Services\log\Job1.bat','',8));

# Remain idle for 2 secs so that the Job status is stable (status moves from S to D for a successful Job and E for an error)
Sleep(2000);

# Pick up the latest job start time
$MaxTimestamp = sql('BODS_REPO', 'SELECT MAX(START_TIME) FROM DataServices.alvw_history WHERE SERVICE=\'Job1\';');
PRINT($MaxTimestamp);

# Check the latest status of the preceding job
$JobStatus = sql('BODS_REPO', 'SELECT STATUS FROM DataServices.alvw_history WHERE SERVICE=\'Job1\' AND START_TIME=\'[$MaxTimestamp]\';');
PRINT($JobStatus);

if ($JobStatus = 'E')
begin
    PRINT('First Job Failed');
    raise_exception('First Job Failed');
end
else
begin
    print('First Job Success, Second Job will be Triggered');
end

Print('Trigger Job2');
Print(exec('C:\Program Files (x86)\Business Objects\BusinessObjects Data Services\log\Job2.bat','',8));

Using the above code in the Script, when the parent Job is run it triggers Job1, and only if Job1 has completed successfully does it trigger Job2. If Job1 fails, the parent Job is terminated using the raise_exception function. This approach can be used to conditionally schedule any number of Jobs.
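When longer chains are built this way, the same status check is repeated for every preceding Job. As a small variation (a sketch, not part of the original article), the check can be driven by a variable holding the Job name, so only the assignment changes between checks; $JobName is an additional variable that would have to be declared in the parent Job:

# Reusable status check: set $JobName before this block for each preceding Job
$JobName = 'Job1';
$MaxTimestamp = sql('BODS_REPO', 'SELECT MAX(START_TIME) FROM DataServices.alvw_history WHERE SERVICE=\'[$JobName]\';');
$JobStatus = sql('BODS_REPO', 'SELECT STATUS FROM DataServices.alvw_history WHERE SERVICE=\'[$JobName]\' AND START_TIME=\'[$MaxTimestamp]\';');
if ($JobStatus = 'E')
begin
    raise_exception('Preceding job [$JobName] failed');
end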

Conditional Execution using Webservices:

To conditionally execute a Job (published as a web service) based on the status of the preceding Job (also published as a web service), the same concept used in Conditional Execution using Script can be applied: call Job1, check the status of Job1, and then trigger Job2 only if Job1 was successful.

1. Create a parent Job with two dataflows and a Script in between the dataflows.
2. Use the first dataflow to call the first Job (refer to the section above for details on calling a Job as a web service within another Job).
3. Use the second dataflow to call the second Job.
4. Use the Script to check the status of the first Job.

The Script will have the code below to check the status:

# Wait for 2 seconds
sleep(2000);

# Pick up the latest job start time
$MaxTimestamp = sql('BODS_REPO', 'SELECT MAX(START_TIME) FROM DataServices.alvw_history WHERE SERVICE=\'Job1\';');
PRINT($MaxTimestamp);

# Check the latest status of the preceding job
$JobStatus = sql('BODS_REPO', 'SELECT STATUS FROM DataServices.alvw_history WHERE SERVICE=\'Job1\' AND START_TIME=\'[$MaxTimestamp]\';');
PRINT($JobStatus);

if ($JobStatus = 'E')
begin
    PRINT('First Job Failed');
    raise_exception('First Job Failed');
end
else
begin
    print('First Job Success, Second Job will be Triggered');
end

Using the above code in the Script, when the parent Job is run it triggers Job1, and only if Job1 has completed successfully does it trigger Job2. This approach can be used to conditionally schedule any number of Jobs that are published as web services.

Conditional Execution using Webservices: Jobs with Global Variables

When Jobs have Global Variables for which values need to be passed at trigger time, this must be handled differently, because when a Job is called as a web service it expects the Global Variables to be mapped. The idea is to pass either null values (for a scheduled run) or actual values (for a manual trigger) using an XML file as input. Let's assume that the first Job has two Global Variables, $GV1Path and $GV2Filename, that the second Job does not have any Global Variables, and that the requirement is to trigger Job2 immediately after successful completion of Job1.

1. As in the parent Job above, create a parent Job with two dataflows and a Script in between the dataflows.

2. Use the first dataflow to call the first Job (refer to the sections above for details on calling a Job as a web service within another Job), but instead of using a Row_Generation object, use an XML input file as the source.

The XSD for the input XML file is given below. If there are more Global Variables in the Job, then elements GV3, GV4 and so on should be added to the schema.

<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="FIRSTJOB">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="GLOBALVARIABLES">
          <xs:complexType>
            <xs:sequence>
              <xs:element type="xs:string" name="GV1"/>
              <xs:element type="xs:string" name="GV2"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

The input XML file used is given below:

<FIRSTJOB>
  <GLOBALVARIABLES>
    <GV1>testpath1</GV1>
    <GV2>testfilename</GV2>
  </GLOBALVARIABLES>
</FIRSTJOB>

3. In the "WebService function Call" in "call_FirstJob" Query object, map the Global Variables as shown below

4. Use the second dataflow to call the second Job. As this Job does not contain Global Variables, a Row_Generation object is enough (as in the previous section).
5. Use the Script object to check the status of the first Job.

Using the above approach, when the parent Job is run it triggers the first Job, passing the Global Variables present in the input XML file, and only if the first Job has completed successfully does it trigger the second Job. This approach can be used to conditionally schedule any number of Jobs that are published as web services. For every Job that has Global Variables, an XSD and an XML file should be created. The Global Variables passed from the XML file to the web service appear to work only when the parameters are passed in the right order, hence it is good practice to name the Global Variables with a consistent convention such as $GV1<name>, $GV2<name> and so on.

Data Services sequential and conditional batch job scheduling & launching

Posted by Scott Broadway in Data Services and Data Quality on Jan 31, 2013 10:43:05 PM

I really appreciate the quality of Anoop Kumar's recent article "Scheduling BODS Jobs Sequentially and Conditionally". And the technical accuracy is high -- yes, you can accomplish what you are trying to do with the techniques discussed in the article. Love the visuals, too.

However. I cannot really recommend this kind of solution. Data Services is not an enterprise scheduling or orchestration tool. This approach suffers a bit from Maslow's law of the instrument: "if the only tool you have is a hammer...treat everything as if it were a nail." Yes, I love Data Services and Data Services is capable of doing all of these things. Is it the best tool for this job? Not exactly. And this question is answered in the first paragraph that mentions chaining workflows. Data Services already gives you the capability to encapsulate, chain together, and provide conditional execution of workflows. If jobs only contain one dataflow each, why are you calling them jobs and why do you want to execute these jobs together as a unit? Data Services is a programming language like other programming languages, and some discretion needs to be taken for encapsulation and reusability.

I do really like the use of web services for batch job launching. It is a fantastic feature that is underutilized by DS customers. Instead, I see so many folks struggling to maintain tens and sometimes hundreds of batch scripts. This is great for providing plenty of billable work for the administration team, but it isn't very good for simplifying the DS landscape. The web services approach here will work and seems elegant, but the section about "sequencing using web services" does not sequence the jobs at all. It just sequences the launching. Batch jobs launched as web services are asynchronous... you call the SOAP function to launch the job, and the web service provider replies back with whether the job was launched successfully. This does not provide any indication of whether the job has completed yet. You must keep a copy of the job's runID (provided to you as a reply when you launch the job successfully) and use the runID to check back with the DS web service function Get_BatchJob_Status (see section 3.3.3.3 in the DS 4.1 Integrator's Guide). [Note: scheduling and orchestration tools are great for programming this kind of logic.]
Notice how it would be very hard to get true dependent web services scheduling in DS, since you would have to implement this kind of design inside of a batch job:

1. Have a dataflow that launches Job1 and returns the runID to the parent object as a variable.
2. Pass the runID variable to a looping workflow.
3. In the looping workflow, pass the runID to a dataflow that checks to see if Job1 is completed successfully.
4. When completed successfully, exit the loop.
5. Have a dataflow that launches Job2 and returns the runID to the parent object as a variable.
6. Pass the runID variable to another looping workflow.
7. In the looping workflow, pass the runID to a dataflow that checks to see if Job2 is completed successfully.
8. When completed successfully, exit the loop.
9. Build your own custom logic into both of those looping workflows to run a raise_exception() if the runID of the job crashes with an error.
10. Encapsulate the whole thing with Try/Catch to send email notification if an exception is raised.
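For illustration only, the polling part of that design might look roughly like the script below inside one of the looping workflows. This is a sketch, not code from either article: it assumes the Get_BatchJob_Status operation has been imported from the DS web service datastore and wrapped in a hypothetical custom function FN_GetJobStatus($P_RunID) that returns the job's status as a string, and the status literals used here are placeholders (the actual values are documented in the Integrator's Guide):

# $RunID was returned by the dataflow that launched Job1
$Status = FN_GetJobStatus($RunID);
while ($Status = 'RUNNING')        # placeholder status value
begin
    sleep(10000);                  # poll every 10 seconds
    $Status = FN_GetJobStatus($RunID);
end
if ($Status != 'SUCCEEDED')        # placeholder status value
begin
    raise_exception('Job1 (runID [$RunID]) did not complete successfully');
end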

This convoluted design is functionally IDENTICAL to the following and does not rely on web services:

1. Encapsulate the logic for Job1 inside of Workflow1.
2. Encapsulate the logic for Job2 inside of Workflow2.
3. Put Workflow1 and Workflow2 inside JobA.
4. Use Try/Catch to catch errors and send emails.

I'm also hesitant to recommend a highly customized DS job launching solution because of supportability. When you encapsulate your ETL job launching and orchestration in an ETL job, it's not very supportable by the consultants and administrators who will inherit this highly custom solution. This is why you invest in a tool like Tidal, Control-M, Maestro, Tivoli, Redwood, etc., so that the scheduling tool encapsulates your scheduling and monitoring and notification logic. Put the job execution logic into your batch jobs, and keep the two domains separate (and separately documentable). If you come to me with a scheduling/launching problem with your DS-based highly customized job launching solution, I'm going to tell you to reproduce the problem without the customized job launching solution. If you can't reproduce the problem in a normal fashion with out-of-the-box DS scheduling and launching, you own responsibility for investigating the problem yourself. And this increases the cost to you of owning and operating DS.

If you really want to get fancy with conditional execution of workflows inside of a job, that is pretty easy to do. Set up substitution parameters to control whether you want to run Workflow1, Workflow2, Workflow3, etc. [Don't use Global Variables. You really need to stop using Global Variables so much... your doctor called me and we had a nice chat. Please read this twice and call me in the morning.] Ok, so you have multiple substitution parameters. Now, set up multiple substitution parameter configurations with $$Workflow1=TRUE, $$Workflow2=TRUE, $$Workflow3=TRUE, or $$Workflow1=TRUE, $$Workflow2=FALSE, $$Workflow3=FALSE, etc. Put these substitution parameter configurations into multiple system configurations, e.g. RunAllWorkflows or RunWorkflows12. In your job, use Conditional blocks to evaluate whether $$Workflow1=TRUE -- if so, run Workflow1; else continue with the rest of the job. Then on to another Conditional that evaluates $$Workflow2... etc. Depending on which workflows you want to execute, just call the job with a different system configuration.

Yes, you can include the System Configuration name when you call a batch job via command line or via a web service call. For web services, you just need to enable Job Attributes in the Management Console -> Administrator -> Web Services (see section 3.1.1.1 step 9 in the DS 4.1 Integrator's Guide) and specify the System Configuration name inside of the element: <job_system_profile>RunAllWorkflows</job_system_profile>. For command line launching, use the al_engine flag: -KspRunAllWorkflows

Yes, you can override your own substitution parameters at runtime. For web services, enable Job Attributes and specify the overrides inside of the tags:

<substitutionParameters>
  <parameter name="$$Workflow1">TRUE</parameter>
  <parameter name="$$Workflow2">FALSE</parameter>
</substitutionParameters>

For command line launching, use the al_engine flag: -CSV"$$Workflow1=TRUE;$$Workflow2=FALSE" (put a list of substitution parameters in quotes, and separate them with semicolons).
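Putting the two command-line flags above together, a launch command could end up looking roughly like the following. This is only an illustration: the leading "..." stands for the connection and job flags that the exported execution command already contains, and RunAllWorkflows and the $$Workflow* parameters are just the example names used above:

al_engine ... -KspRunAllWorkflows -CSV"$$Workflow1=TRUE;$$Workflow2=FALSE"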


How to schedule a Data Services job with the BOE scheduler.



Log in to the Data Services Management Console.


Click on Administrator -> Management -> CMS Connection, enter the connection information for the CMS on which Data Services is installed, and enter the user account credentials for executing the program (this is the OS user used to access Data Services).

Create a Data Services batch job schedule in the DS Management Console.


Click on Administrator -> Batch -> <repository name> -> Batch Job Configuration -> Add Schedule.

Choose BOE Scheduler and input the required information for a schedule.

Log in to the CMC and check the files generated by the schedule.


Two files are generated under the Data Services folder: one is a Program file and the other is a file with the job execution parameters. The schedule history can be checked from the CMC.

Right-click on the BOESchedule Program object and choose Properties. The logon user shown here is taken from the Data Services Management Console CMS connection (the user account credentials for executing the program).

Go to the Data Services repository and check the schedule information.


Select the content from the AL_SCHED_INFO table; the job schedule information can be seen there. The column SCHEDULED_IN tells the scheduler type.
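As a minimal illustration, assuming direct SQL access to the repository schema, the check can be as simple as:

-- run against the repository database; SCHEDULED_IN indicates which scheduler owns the schedule
SELECT * FROM AL_SCHED_INFO;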
