Table of Contents
1 Background
2 Detailed ETL Procedures
3 Informatica Standards
4 Build and Unit Test Activities
Appendix A: Step-by-Step Application of Code Template to Core Processes
Appendix B: Accessing CommonLE Logs
Appendix C: Implementing Record-Level Exception Logging into Core Processes
Appendix D: Implementing Record-Level Audit Logging into Core Processes
1 Background
1.1 Purpose
This document has been created to provide a more detailed understanding of the ETL patterns and the usage of Informatica as it relates to Project OneUP. This document should be leveraged during the technical design and build phases of the development effort. This document is NOT static. As architecture patterns evolve and new best practices are introduced and implemented, the pages that follow will be updated to reflect these changes.
1. Audit log is triggered to denote middleware will be receiving data.
2. Source data is extracted via the specific source extract strategy defined for the interface.
   a. Source data is pulled directly from the source.
   b. Data is staged within the middleware database to support multiple requirements for the source data.
3. Data is transformed via the ETL tool into the target-specific format(s).
This interface pattern does not require use of the middleware database. The middleware database (labeled Batch Data Store) in Step 2 is utilized to accomplish any one of the following requirements of the business process:
- Multiple passes through each received data set (for example, if source data is sent only once and multiple mappings will require this information, it is best to store the data within a database so that one process can receive the data and multiple processes can load it)
- Audit trail for logging purposes
- SOX compliance requirements
- Error handling
2.2 Informatica Error Logging and Exception Handling
2.2.1 Informatica Standard Task Level Error Logging
When logging audit and exception data to CommonLE, either task-level or row-level error logging can be used. Task-level logging is required for all interfaces to track the failure or success of every interface session within a workflow. The standard implementation is outlined in the Appendix for Audit Log and Error Messaging (CommonLE).
Informatica Architecture CommonLE Integration Design
- Aggregator
- Application Source Qualifier
- Custom, configured as an active transformation (it has been assumed that SAP custom transformations fall into this category as well)
- Joiner
- MQ Source Qualifier
- Normalizer (VSAM or pipeline)
- Rank
- Sorter
- Source Qualifier
- XML Source Qualifier
- Mapplet, if it contains any of the above active transformations
By default, the PowerCenter Server logs all transformation errors to the session log file and all rejected target records to the reject (bad) file. When row error logging is enabled, this information is instead written to the error log database or flat file structures. If the architecture landscape requires that all errors reside in both the error logging structures and the standard session log and reject/bad file, the configuration should also enable Verbose Data Tracing. All of this additional logging may negatively impact the performance of sessions and workflows executed on the PowerCenter server, because data is processed row by row instead of a block of records at a time.
3 Informatica Standards
3.1 Workflow Development
For each business object, it is possible that multiple workflows exist to perform the full spectrum of interface activities from legacy to SAP. A workflow is defined as a set of sessions and other tasks (commands calling shell scripts, decision and control points, e-mail notifications, etc.) organized in sequential and/or parallel processing streams. Each workflow will execute a mapping or series of mappings that extract source data and load it into target systems. Working with the AI team, each Solution Integration Design will need to be modularized into workflows that perform the required predefined business functions. As a result, the interface programs built for a particular business object within the Solution Integration Design documentation could span multiple workflows and thus multiple technical design documents (as each technical design is at the workflow level).
Author: Developer Name
Date: 01/01/2005
Description: This mapping performs the core functionality for the XYZ interface.
================
Revision History:
================
1.0 01/01/2005 - Initial development
In addition to this comment, each of the transformations within a mapping should also have a brief explanation defining its functionality within the mapping.
Target Definition: [table_name] or [flat_file_name]_ACTION
Source Qualifier: sq_[source_name], sqo_[source_name]
Sequence Generator
Stored Procedure
External Procedure
Advanced External Procedure
Joiner: jnr_SourceTable/FileName1_SourceTable/FileName2. Used to join disparate source types, for example Oracle to flat file.
Normalizer: nrm_[RelevantDescriptor]. Used to create multiple records from the one record being processed. For example: nrm_Create_Error_Messages
Rank: rnk_[RelevantDescriptor]. For example: rnk_RelevantDescriptionOfTheProcessionBeingDone
Mapplet: mplt_[RelevantDescriptor]. For example: mplt_RelevantDescriptionOfTheProcessionBeingDone
Sorter Transformation: srt_[RelevantDescriptor]. For example: srt_RelevantDescriptionOfTheProcessionBeingDone
Transaction Control: tc_[RelevantDescriptor]. For example: tc_RelevantDescriptionOfControl
Union: un_[RelevantDescriptor]. For example: un_RelevantDescriptionOfUnion
XML Parser: xmp_[RelevantDescriptor]. For example: xmp_RelevantDescriptionOfXMLParser
XML Generator: xmg_[RelevantDescriptor]. For example: xmg_RelevantDescriptionOfGenerator
Custom Transformation: ct_[RelevantDescriptor]. For example: ct_RelevantDescriptionOfCustomTransformation
IDoc Interpreter: int_[RelevantDescriptor]. For example: int_idoc_RelevantDescriptionOfCustomTransformation
* Wherever possible, transformations should include the $PMRootDir/<release>/Temp and $PMRootDir/<release>/Cache directories. Such transformations include but are not limited to:
Transformation Name    Directory
- Sorter
- Joiner
- Aggregator
- Lookup
- Rank
Session: s_m_<RICEF_TYPE>_<PROCESS_AREA>_<SOURCE>_<TARGET>_<Optional Information>

Workflow: wf_<RICEF_TYPE>_<PROCESS_AREA>_<SPECIFIC_DESCRIPTOR or BUSINESS_OBJECT>_<SRC>_<TGT>_<Optional Information> (e.g., wf_INTFC_ISCP_INVENT_INFO_BW_I2)

Worklets: Wklt_description
Worklets are objects that represent a set of workflow tasks that allow you to reuse a set of workflow logic in several workflows.

Reusable Session: rs_description
This is a session that may be shared among several workflows and may execute while another instance of the same session is running.

Cntrl Task: Cntrl_description
You can use the Control task to stop, abort, or fail the top-level workflow or the parent workflow based on an input link condition.

Event Task: Evnt_description
The Event-Raise task represents a user-defined event. When the Informatica Server executes the Event-Raise task, the Event-Raise task triggers the event. Use the Event-Raise task with the Event-Wait task to define events. The Event-Wait task waits for an event to occur. Once the event triggers, the Informatica Server continues executing the rest of the workflow.

Decision Task: Dcsn_description
The Decision task allows you to enter a condition that determines the execution of the workflow, similar to a link condition.

Command Task: Cmd_description
The Command task allows you to specify one or more shell commands to run during the workflow. For example, you can specify shell commands in the Command task to delete reject files, copy a file, or archive target files.

Email Task: eml_description
The Workflow Manager provides an Email task that allows you to send email during a workflow. You can create reusable Email tasks in the Task Developer for any type of email, or you can create non-reusable Email tasks in the Workflow and Worklet Designer.

Assignment Task: asmt_description
The Assignment task allows you to assign a value to a user-defined workflow variable.

Timer Task: tm_description
The Timer task allows you to specify the period of time to
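The session and workflow naming standards above can be sketched as small helper functions. This is only an illustration: the function names are hypothetical and not part of the project standard, and the component values come from the workflow example in the standard itself.

```python
def _join(prefix, parts):
    """Join the prefix and all non-empty name components with underscores."""
    return "_".join([prefix] + [p for p in parts if p])


def session_name(ricef_type, process_area, source, target, optional=""):
    """Build s_m_<RICEF_TYPE>_<PROCESS_AREA>_<SOURCE>_<TARGET>_<Optional Information>."""
    return _join("s_m", [ricef_type, process_area, source, target, optional])


def workflow_name(ricef_type, process_area, descriptor, src, tgt, optional=""):
    """Build wf_<RICEF_TYPE>_<PROCESS_AREA>_<DESCRIPTOR>_<SRC>_<TGT>_<Optional Information>."""
    return _join("wf", [ricef_type, process_area, descriptor, src, tgt, optional])


# Reproduces the workflow example given in the standard.
print(workflow_name("INTFC", "ISCP", "INVENT_INFO", "BW", "I2"))
```

Generating names this way, rather than typing them by hand, is one way to keep sessions and workflows aligned with the convention.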
Input: Used in lookup and expression transformations to denote ports that are used within the transformation and do not carry forward.
Lookup: Used in expression transformations for unconnected lookups.
Return: Return values are found in lookup transformations and are typically the column from the source object being referenced in the lookup code.
QTG2
INF Dev - phgp0233:
/etlapps/dev/81/qtg2/SrcFiles/
/etlapps/dev/81/qtg2/TgtFiles/
INF QA - phgp0232:
/etlapps/fit/81/qtg2/SrcFiles/
/etlapps/fit/81/qtg2/TgtFiles/
The STATUS field can consist of the following values. Depending on the interface, not all STATUS codes will be used.
- N (New): flag indicating that the record has been successfully inserted into the staging DB.
- P (Processing): flag indicating that the middleware application is processing the record.
- C (Complete): flag indicating that the middleware application has successfully processed the record.
- F (Failed): flag indicating that the middleware application has failed to process the record. (Assumption: depending on interface business rules, failed records will remain in the staging table until successfully processed.)
Type    VARCHAR2    DATE    VARCHAR2    VARCHAR2
Null    No          No      No          No
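The STATUS codes listed above can be modeled as a small state machine. The sketch below is illustrative only: the standard defines the four codes but does not spell out which transitions are legal, so the ALLOWED table is an assumption, as are the names Status and advance.

```python
from enum import Enum


class Status(Enum):
    NEW = "N"
    PROCESSING = "P"
    COMPLETE = "C"
    FAILED = "F"


# Assumed transitions; the source lists the codes but not the permitted moves.
ALLOWED = {
    Status.NEW: {Status.PROCESSING},
    Status.PROCESSING: {Status.COMPLETE, Status.FAILED},
    # Failed records remain staged until they are successfully reprocessed.
    Status.FAILED: {Status.PROCESSING},
    Status.COMPLETE: set(),
}


def advance(current, new):
    """Return the new status, or raise ValueError if the move is not in ALLOWED."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {new.value}")
    return new
```

An interface that does not use every code would simply drop the unused entries from the transition table.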
Table naming standards for a source system loading data into middleware staging are:
<Process Area>_SRC_<Src\Tgt>_<Business_Object>_<Table_Name>
Example: The ItemSiteMaster table for the ISCP process area, business object Transportation Lanes from BW, would be as follows:
ISCP_SRC_BW_TRNLANES_ITEMSITEMASTER
The same applies when the middleware application needs to load data into middleware staging before sending it to the target system:
<Process Area>_TGT_<Src\Tgt>_<Business_Object>_<Table_Name>
Example: The ItemSiteMaster table for the ISCP process area, business object Transportation Lanes to I2RP, would be as follows:
ISCP_TGT_I2RP_TRNLANES_ITEMSITEMASTER
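As a minimal sketch of the staging table naming standard, the helper below assembles a table name from its components. The function name and argument names are hypothetical; the two checks at the end reproduce the ISCP examples from the standard.

```python
def staging_table_name(process_area, direction, system, business_object, table_name):
    """Build <Process Area>_<SRC or TGT>_<system>_<Business_Object>_<Table_Name> in upper case."""
    if direction not in ("SRC", "TGT"):
        raise ValueError("direction must be 'SRC' or 'TGT'")
    parts = [process_area, direction, system, business_object, table_name]
    return "_".join(part.upper() for part in parts)


# Source-side and target-side examples from the standard.
print(staging_table_name("ISCP", "SRC", "BW", "TRNLANES", "ItemSiteMaster"))
print(staging_table_name("ISCP", "TGT", "I2RP", "TRNLANES", "ItemSiteMaster"))
```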
###############################################################################
##
## Used to start Project OneUP Informatica Workflow
##
###############################################################################
//schedapps/p1up/start_workflow.sh US_CORP_1UP_QTG1_INTFC wf_INTFC_QTG1_SHARED_IDOC_LISTENER -wait
The yellow highlighted section of the script provides the proper initialization of the environment variables for the start_workflow.sh script. User name, password, and Informatica port number are set within the env_p1up_batch.sh script. The core functionality of these scripts is highlighted in grey. There are two versions of this line, start_workflow.sh and stop_workflow.sh. In nearly all situations, start_workflow.sh is used with a wait condition. The only Informatica component that uses stop_workflow.sh is the IDoc Listener, which is started without a wait condition. There are three parameters that are supplied to the start_workflow.sh and stop_workflow.sh scripts: folder name (highlighted in blue text), workflow name (highlighted in green text), and the wait condition (red text). The wait condition should be used by most interfaces, as this allows the workflow to complete before a return code is sent to Control-M. This is important because the return code communicates success or failure to Control-M, and Control-M uses this return code to dictate execution of subsequent jobs in the group.
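The three-parameter call described above (folder name, workflow name, wait condition) can be sketched from a wrapper's point of view. This is a hypothetical helper, not part of the project scripts: it assumes the scripts live in /schedapps/p1up and accept exactly the arguments shown in the sample script, and the names build_command and run_workflow are invented for illustration.

```python
import subprocess

SCRIPT_DIR = "/schedapps/p1up"  # directory named in the text


def build_command(action, folder, workflow, wait=True):
    """Return the argument list for start_workflow.sh or stop_workflow.sh."""
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop'")
    cmd = [f"{SCRIPT_DIR}/{action}_workflow.sh", folder, workflow]
    if wait:
        # With -wait the script returns only after the workflow completes,
        # so the exit code reflects the workflow's success or failure.
        cmd.append("-wait")
    return cmd


def run_workflow(folder, workflow, wait=True):
    """Run the start script and return its exit code, as a scheduler would see it."""
    return subprocess.run(build_command("start", folder, workflow, wait)).returncode
```

Keeping command construction separate from execution makes it possible to check the arguments without access to the Informatica server.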
There will be a script implemented for each interface. The script name should conform to the following standard: p1up_qtqg2_<interface name>. The parameter values for each script will be interface specific. To manually start the Informatica workflow without Control-M, run the start_workflow.sh for that particular interface from the /schedapps/p1up directory.
DO NOT DIRECTLY COPY THESE MAPPINGS INTO YOUR DEVELOPMENT FOLDER. Shortcuts are required so that each developer is referencing the latest version of the code. If the mapping changes within the Shared folder, those changes will be propagated into the developer's folder as well. Changes may impact the developer's session and its ability to execute, but this type of error should not be difficult to resolve with either a validation of the session or a slight configuration change.
Screenshot 7.1.1.a This demonstrates the creation of a SHORTCUT into a developer folder. Notice the shortcut icon on each mapping that was added.
4.) Navigate to the developer's folder that is currently open and Paste using the Edit menu.
Screenshot 7.1.2.b
5.) Step 4 opens a new window called Copy Wizard. The Copy Wizard is designed to help resolve any conflicts Workflow Manager detects when copying sessions or workflows from one folder to another. The wizard should determine that there is a conflict with regard to the session/mapping associations. For each mapping/session combination, you will need to select the mapping shortcut you previously created. Screenshot 6.1.2.d demonstrates the resolution of the conflict.
Screenshot 7.1.2.c Copy Wizard
6.) Click Next>> and Finish to complete this wizard.
7.) You should now have created copies of those sessions. Rename each of the sessions you copied to align with the interface you are building. The following is the naming convention you should follow for each reusable session:
s_m_INTFC_[interface acronym]_AUDIT_LOG_BEGIN
s_m_INTFC_[interface acronym]_AUDIT_LOG_END_SUCCESS
s_m_INTFC_[interface acronym]_AUDIT_LOG_END_FAILURE
s_m_INTFC_[interface acronym]_ERROR_MESSAGING
8.) Lastly, each of these sessions will require parameter file entries within the following text files on the Unix servers:
//etlapps/[phase]/71/qtg1/Scripts/US_CORP_1UP_QTG1_INTFC_begin_audit_parms.txt
//etlapps/[phase]/71/qtg1/Scripts/US_CORP_1UP_QTG1_INTFC_end_audit_parms.txt
//etlapps/[phase]/71/qtg1/Scripts/US_CORP_1UP_QTG1_INTFC_error_parms.txt
9.) Refer to Section 6.1.3 for sample entries into the parameter files.
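The renaming in step 7 and the parameter files in step 8 follow fixed patterns, which the sketch below expands for one interface. It is a hedged illustration: the function names are hypothetical, "XYZ" is a placeholder interface acronym, and "dev" is used as a sample value for [phase].

```python
SESSION_SUFFIXES = [
    "AUDIT_LOG_BEGIN",
    "AUDIT_LOG_END_SUCCESS",
    "AUDIT_LOG_END_FAILURE",
    "ERROR_MESSAGING",
]

PARAM_FILE_KINDS = ["begin_audit", "end_audit", "error"]


def reusable_session_names(interface_acronym):
    """Expand s_m_INTFC_[interface acronym]_<suffix> for each reusable session."""
    return [f"s_m_INTFC_{interface_acronym}_{suffix}" for suffix in SESSION_SUFFIXES]


def parameter_file_paths(phase):
    """Expand the three parameter file paths listed in step 8 for a given phase."""
    base = f"//etlapps/{phase}/71/qtg1/Scripts/US_CORP_1UP_QTG1_INTFC"
    return [f"{base}_{kind}_parms.txt" for kind in PARAM_FILE_KINDS]


for name in reusable_session_names("XYZ"):  # "XYZ" is a placeholder acronym
    print(name)
for path in parameter_file_paths("dev"):    # "dev" is a sample phase value
    print(path)
```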
$$INTERFACE_NAME
DEFAULT_INTERFACE_NAM
$$SERVICE_NAME
DEFAULT_BUSINESS_OBJECT
DEFAULT_TARGET_SYSTEM
0
$$WORKFLOW_NAME
DEFAULT_WORKFLOW_NAME
$$NEXT_SESSION
DEFAULT_NEXT_SESSION
$$AUDIT_STATUS
DEFAULT_AUDIT_STATUS
$$PREVIOUS_SESSION
DEFAULT_PREVIOUS_SESSION
Below are samples from each of the parameter files. Screenshot 7.1.3.a US_CORP_1UP_QTG1_INTFC_begin_audit_parms.txt
4.4 Build Completion and Next Steps
4.4.1 String / Assembly Testing
For string and assembly testing, all code will need to be moved into the project-specific string/assembly test folder (QTG1_INTFC). Shortcuts for the shared mappings already exist in these folders. Therefore, the development lead will only be responsible for migrating the sessions and workflows into the project folder. The development lead will need to re-point each session to use the mapping shortcuts already created within the project folder. In addition, the parameter files must be changed to reflect the new folder in which all code resides. These modifications should complete the migration into the project folders.
Create a session using the mapping m_P1UP_SHARED_AUDIT_LOG_BEGIN. To save time, create a copy of the session s_m_P1UP_SHARED_AUDIT_LOG_BEGIN_SAMPLE from folder SHARED_US_CORP_1UP.
Rename the session to comply with the following standards for interfaces.
i) s_m_INTFC_[interface_abbreviation]_AUDIT_LOG_BEGIN
4) Double-click the session and click on the Properties tab. Change the session log file name to your_session_name.log.
5) Click on the Properties tab of your session. Use the following value for the parameter file setting: $PMRootDir/ai/Scripts/US_CORP_1UP_AI_INTFC_begin_audit_parms.txt.
6) Click on the Mapping tab. For the target entitled shortcut_to_INFA_INTERFACE_LOG, change the reject file name to your_session_name.bad.
7) Log into the Unix command line for the Informatica server. Modify the parameter file for begin audit logs located in the //etlapps/dev/71/qtg1/Scripts directory. The file name will be US_CORP_1UP_AI_INTFC_begin_audit_parms.txt. To add the applicable data, copy and paste the following 8 lines into the parameter file and replace the parameter values with the values that pertain to your session.
[US_CORP_1UP_QTG1_INTFC.s_m_P1UP_SHARED_AUDIT_LOG_BEGIN_SAMPLE]
$$INTERFACE_NAME=SAMPLE_INTERFACE_NAME
$$APPLICATION_ID=1UP_QTG1_INF_DEV
$$SERVICE_NAME=12345 (Note: This is actually the caliber ID)
$$TRANSACTION_DOMAIN=BUSINESS_OBJECT_NAME
$$APPLICATION_DOMAIN=TARGET_APPLICATION
$$NEXT_SESSION=s_m_INTFC_NEXT_SESSION
$$WORKFLOW_NAME=wf_P1UP_SHARED_INTERFACE_SAMPLE
Please refer to Section 7.1.3 for mapping parameters and parameter files.
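The eight-line parameter file entry above (one section header plus seven parameters) can be generated rather than copied by hand. The following is a minimal sketch: render_section is a hypothetical helper, and the session name s_m_INTFC_XYZ_AUDIT_LOG_BEGIN uses a placeholder interface abbreviation.

```python
def render_section(folder, session, params):
    """Render a [folder.session] header followed by one $$NAME=value line per parameter."""
    lines = [f"[{folder}.{session}]"]
    lines += [f"$${name}={value}" for name, value in params.items()]
    return "\n".join(lines)


# Sample values taken from the entry in the text; replace them per interface.
sample = render_section(
    "US_CORP_1UP_QTG1_INTFC",
    "s_m_INTFC_XYZ_AUDIT_LOG_BEGIN",  # placeholder session name
    {
        "INTERFACE_NAME": "SAMPLE_INTERFACE_NAME",
        "APPLICATION_ID": "1UP_QTG1_INF_DEV",
        "SERVICE_NAME": "12345",
        "TRANSACTION_DOMAIN": "BUSINESS_OBJECT_NAME",
        "APPLICATION_DOMAIN": "TARGET_APPLICATION",
        "NEXT_SESSION": "s_m_INTFC_NEXT_SESSION",
        "WORKFLOW_NAME": "wf_P1UP_SHARED_INTERFACE_SAMPLE",
    },
)
print(sample)
```

Because the section header must match the folder and session name exactly, generating it from the same values used to name the session avoids a common source of mismatches.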
8) Create a session using the mapping m_P1UP_SHARED_AUDIT_LOG_END_FAILURE. To save time, you can copy session s_m_P1UP_SHARED_AUDIT_LOG_END_FAILURE_SAMPLE from folder SHARED_US_CORP_1UP.
9) Rename the session to comply with the following standards for interfaces.
i) s_m_INTFC_[interface_abbreviation]_AUDIT_LOG_END_FAILURE
10) Double-click the session and click on the Properties tab. Change the session log file name to your_session_name.log.
11) Click on the Properties tab of your session. Use the following value for the parameter file setting: $PMRootDir/ai/Scripts/US_CORP_1UP_AI_INTFC_end_audit_parms.txt.
12) Create a session using the mapping m_P1UP_SHARED_AUDIT_LOG_END_SUCCESS. To save time, you can copy session s_m_P1UP_SHARED_AUDIT_LOG_END_SUCCESS_SAMPLE from folder SHARED_US_CORP_1UP.
13) Rename the session to comply with the following standards for interfaces.
i) s_m_INTFC_[interface_abbreviation]_AUDIT_LOG_END_SUCCESS
19) Double-click the session and click on the Properties tab. Change the session log file name to your_session_name.log.
20) Click on the Properties tab of your session. Use the following value for the parameter file setting: $PMRootDir/ai/Scripts/US_CORP_1UP_AI_INTFC_error_parms.txt.
21) Log into the Unix command line for the Informatica server. Modify the parameter file for exception logs located in the //etlapps/dev/71/qtg1/Scripts directory. The file name will be US_CORP_1UP_AI_INTFC_error_parms.txt. To add the applicable data, copy and paste the following 8 lines into the parameter file and replace the parameter values with the values that pertain to your session.
[US_CORP_1UP_QTG1_INTFC.s_m_P1UP_SHARED_ERROR_MESSAGING_SAMPLE]
$$INTERFACE_NAME=SAMPLE_INTERFACE_NAME
$$APPLICATION_ID=1UP_QTG1_INF_DEV
$$SERVICE_NAME=12345 (Note: This is actually the caliber ID)
$$TRANSACTION_DOMAIN=BUSINESS_OBJECT_NAME
$$APPLICATION_DOMAIN=TARGET_APPLICATION
$$SEVERITY_CODE=3 (NOTE: This will be dependent upon the SID definition for the interface)
$$WORKFLOW_NAME=wf_P1UP_SHARED_INTERFACE_SAMPLE
Please refer to Section 7.1.3 for mapping parameters and parameter files.
22) Create a session using the mapping m_P1UP_SHARED_INTFC_ERR_LOG_MESSAGING. To save time, create a copy of the session s_m_P1UP_SHARED_INTFC_ERR_LOG_MESSAGING_SAMPLE from folder SHARED_US_CORP_1UP.
23) Rename the session to comply with the following standards for interfaces.
24) Double-click the session and click on the Properties tab. Change the session log file name to your_session_name.log.
25) Click on the Mapping tab. For the target entitled INFA_INTERFACE_ERR_LOG1, change the reject file name to your_session_name.bad.
26) Click on the Properties tab of your session. Use the following value for the parameter file setting: $PMRootDir/ai/Scripts/US_CORP_1UP_AI_INTFC_error_parms.txt. The same error parameter file will be leveraged throughout the record-level exception handling components. Copy the lines used for the summary exception messaging session and reference this new session. Keep these entries close together in case a change is required.
27) Create a session using the mapping m_P1UP_SHARED_INTFC_AUDIT_LOG_MESSAGING. To save time, create a copy of the session s_m_P1UP_SHARED_INTFC_AUDIT_LOG_MESSAGING_SAMPLE from folder SHARED_US_CORP_1UP.
28) Rename the session to comply with the following standards for interfaces.
i) s_m_INTFC_[interface_abbreviation]_INTFC_AUDIT_LOG_MESSAGING
29) Double-click the session and click on the Properties tab. Change the session log file name to your_session_name.log.
30) Click on the Mapping tab. For the target entitled INFA_INTERFACE_AUDIT_LOG, change the reject file name to your_session_name.bad.
31) Click on the Properties tab of your session. Use the following value for the parameter file setting: $PMRootDir/ai/Scripts/US_CORP_1UP_AI_INTFC_begin_audit_parms.txt. The same audit begin parameters will be leveraged throughout the record-level audit logging components for this session. Copy the lines used for the begin audit messaging session and reference this new session. Keep these entries close together in case a change is required.
32) Within the core processing sessions, add the following entries to the workflow parameter file located at: $PMRootDir/ai/Scripts/US_CORP_1UP_AI_INTFC_workflow_parms.txt.
[US_CORP_1UP_AI_INTFC.s_m_P1UP_SHARED_INTFC_AUDIT_LOG_MESSAGING_SAMPLE]
$$INTERFACE_NAME=SAMPLE_INTERFACE_NAME
Shortcut_to_mplt_Process_Audit_Logs.$$AUDIT_LOGGING_SWITCH=ON
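Before running a workflow, it can be useful to confirm that the workflow parameter file actually contains the entries from the last step. The sketch below parses parameter-file text into sections and looks up the audit logging switch. It is an illustration written for this document's sample entries only, not a full parser for the Informatica parameter file format, and the function names are hypothetical.

```python
def parse_sections(text):
    """Parse parameter-file text into {section_header: {name: value}}."""
    sections, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            # Tolerate stray spaces inside the brackets.
            current = sections.setdefault(line[1:-1].strip(), {})
        elif "=" in line and current is not None:
            name, value = line.split("=", 1)
            current[name.strip()] = value.strip()
    return sections


def audit_switch_on(sections, header):
    """Return True if the mapplet-level audit logging switch is ON in the section."""
    key = "Shortcut_to_mplt_Process_Audit_Logs.$$AUDIT_LOGGING_SWITCH"
    return sections.get(header, {}).get(key) == "ON"


sample_text = """\
[US_CORP_1UP_AI_INTFC.s_m_P1UP_SHARED_INTFC_AUDIT_LOG_MESSAGING_SAMPLE]
$$INTERFACE_NAME=SAMPLE_INTERFACE_NAME
Shortcut_to_mplt_Process_Audit_Logs.$$AUDIT_LOGGING_SWITCH=ON
"""
parsed = parse_sections(sample_text)
print(audit_switch_on(parsed, "US_CORP_1UP_AI_INTFC.s_m_P1UP_SHARED_INTFC_AUDIT_LOG_MESSAGING_SAMPLE"))
```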
4) Click View Logs and choose Application. You may use the other fields to narrow the search. Click the Submit button.
6) The details of that specific log will be displayed at the bottom of the page.
1)
Port Description

This is the name of the interface currently being executed. This parameter should be consistent across all of the parameter files for a given interface.

in_MAPPING_NAME: This parameter should be local to the mapping itself and have the full name of the mapping being executed.
in_TRANSFORMATION_NAME: This value will be defined as a constant within a transformation in the mapping and will correspond to the name of the transformation where the exception occurred.
in_TRANSFORMATION_TYPE: This value will be the transformation type for the location of the exception. Constant defined within the mapplet-calling mapping.
in_TRANS_INPUT_DATA: An expression should be used to concatenate the input values for a given failed transformation. This is most useful/vital for lookup procedures. Concatenated value defined within the mapplet-calling mapping.
in_ERR_CODE
in_ERR_BUSINESS_ID
in_ERR_TIMESTAMP
Error Codes
This section will contain a table of all of the acceptable error messages to be logged into the INFA_INTERFACE_ERR_LOG table. Emphasis must be placed upon using the proper messages when logging to this table.
Example Usage

When lkp_PAYMENT_TERMS returns -1, log this error along with the incoming data values for the transformation.

DATA_VALIDATION_ERROR
Example usage: When exp_CHECK_DEBIT_CREDIT_MATCH detects a difference between AMT_DEBIT and AMT_CREDIT, route this information to the exception mapplet.

COMPUTATION_ERROR
Description: This error message value should be used when computation errors are detected within expressions, aggregators, etc.
Example usage: When in_denominator = 0, route the record to the exception mapplet with a divide by zero error using this message value.

DATA_CONVERSION_ERROR
Description: This value will be used when conversions or substitutions are used within expressions and no possible matches are found.
Example usage: When in_oldValue is not in (1, 2, 3, 4, 5), mark this as an error.

RECORD_COUNT_ERROR
Description: This error message is reserved for source/target record count analysis.
Example usage: When the number of source records does not equal the number of target records, log this value.

TARGET_LOAD_ERROR
Description: This error message is reserved for target load errors.
Example usage: When target load conditions are not met, this error should be sent to the CommonLE to identify the record as not
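The example usages for the error codes can be expressed as simple checks. The sketch below is illustrative Python rather than Informatica expression logic: the function names and the order in which checks are applied are assumptions, and only the error codes and conditions come from the text.

```python
from enum import Enum


class ErrCode(Enum):
    DATA_VALIDATION_ERROR = "DATA_VALIDATION_ERROR"
    COMPUTATION_ERROR = "COMPUTATION_ERROR"
    DATA_CONVERSION_ERROR = "DATA_CONVERSION_ERROR"
    RECORD_COUNT_ERROR = "RECORD_COUNT_ERROR"
    TARGET_LOAD_ERROR = "TARGET_LOAD_ERROR"


def check_record(amt_debit, amt_credit, denominator, old_value):
    """Return the first applicable error code for a record, or None if it passes."""
    if amt_debit != amt_credit:
        return ErrCode.DATA_VALIDATION_ERROR   # debit/credit mismatch
    if denominator == 0:
        return ErrCode.COMPUTATION_ERROR       # divide by zero
    if old_value not in (1, 2, 3, 4, 5):
        return ErrCode.DATA_CONVERSION_ERROR   # no substitution match
    return None


def check_counts(source_count, target_count):
    """Return RECORD_COUNT_ERROR when source and target record counts differ."""
    return ErrCode.RECORD_COUNT_ERROR if source_count != target_count else None
```

Restricting logged values to a fixed enumeration mirrors the requirement that only the acceptable messages be written to INFA_INTERFACE_ERR_LOG.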
The outputs of this transformation will link directly to the target table, INFA_INTERFACE_AUDIT_LOG. Using the AutoLink feature of Informatica, the output ports of the mapplet transformation will automatically link to the target table's columns. During session creation, assign SAPEAI as the connection for this target table.