
Presentation On

Ab Initio Custom Component

By
Vineet Agrawal
v.agrawal@tcs.com

Vineet Agarwal

Page 1

4/18/2007

Contents

Overview

Syntax of MPC file

Writing A Program Specification File

Steps To Create New Component

Detailed Descriptions of Attributes & Variables

Overview
Program specification files provide the Co>Operating System
with the information it needs to run a program or shell script.
Program specification files should have the .mpc extension.
All program specification files must start with the <mpcfile> line.
The <mpcfile> line is followed by a series of attribute: value
lines that describe the attributes of the program.

Syntax
A program specification file uses the following syntax:
<mpcfile>
[ label: label ]
[ author: author-name ]
[ version: version-number ]
[ comment: comment ]
image: path
[ exit: code ]
[ port: type direction name [ordering] [location]
[fan-preference] [min-flows] [max-flows] [record-format] ]
[ environment: env-variable value ]
[ argument: literal value1 [value2 ...] ]
[ argument: flow portname ]
[ argument: partition ]
[ argument: depth ]
[ argument: file filename ]
[ argument: expression file filename ]
[ argument: expression string ]
[ argument: transform file filename ]
[ argument: transform string ]
[ parameter: placement name type default-value
description [restrictions] ]
[ metadata type: dest-portname = {source-portname | value} ]

Writing a program specification file


Must include the <mpcfile> line and the image line; all
other lines are optional.
Can use as many port, argument, parameter, and metadata
type lines as you need to describe your particular program.
Comments in a program specification file are denoted by the
following:
Lines that start with #
From // to the end of the line
From /* to */
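
The rules above (two required lines, three comment forms) can be checked mechanically. The following Python sketch is only an illustration: the function names and the SAMPLE text are hypothetical, and it does not treat quoted strings specially, so a // inside a quoted value would be cut as well.

```python
import re


def strip_comments(text: str) -> str:
    """Remove the three comment forms described above.

    Assumption: quoted strings are not special-cased in this sketch.
    """
    text = re.sub(r"/\*.*?\*/", "", text, flags=re.DOTALL)  # from /* to */
    text = re.sub(r"//[^\n]*", "", text)                    # from // to end of line
    # Lines that start with # (leading whitespace is tolerated here)
    kept = [ln for ln in text.splitlines() if not ln.lstrip().startswith("#")]
    return "\n".join(kept)


def check_required_lines(text: str) -> list[str]:
    """Return a list of problems; an empty list means both required lines exist."""
    lines = [ln.strip() for ln in strip_comments(text).splitlines() if ln.strip()]
    problems = []
    if not lines or lines[0] != "<mpcfile>":
        problems.append("file must start with <mpcfile>")
    if not any(ln.startswith("image:") for ln in lines):
        problems.append("missing image: line")
    return problems


# Hypothetical specification text used only to exercise the checks.
SAMPLE = """<mpcfile>
# written by hand
label: "My Component"   // shown in the GDE
image: "my-program"     /* required */
"""

print(check_required_lines(SAMPLE))
```
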
To write a program specification file in the Component
Organizer:
1. In the GDE, open the Component Organizer.
2. Right-click My Components.
3. From the shortcut menu, choose New > Program.
A New Component icon appears under My Components.
4. Do one of the following:
Enter a name in the New Component icon.
Press Enter to accept New Component as the name.
5. Right-click the New Component icon.
A pop-up menu appears.
6. From the pop-up menu, choose Edit As Text.
The Edit Program Component window opens.
7. Write your own program specification file by editing the
template in the window.

Steps to Create New Component


Step 1

Step 2

Step 3

Detailed Descriptions of Attributes & Variables
Label
Syntax: label: label
This line specifies the name you want to appear for the
component in the Component Organizer and on the component
icon in the GDE.
Example:
label: This Is My Own Component

Mpname
mpname: "mp component name"

Author
Syntax: author: "Your Name"
This line specifies the author's name that appears on the
Properties Description tab of the Properties dialog box in the
GDE.
Example: author: Vineet Agrawal

Version
Syntax: version: version-number
This line specifies the version number that appears on the
Properties Description tab of the Properties dialog box in the
GDE.
For built-in components, use Built-in <lowVer>:<highVer>
Examples: version: "1.0"

Comment
Syntax: comment: comment
This line specifies the comment that appears in the GDE in two
places:
On the Properties Description tab of the Properties dialog box
In the Description Panel of the Component Organizer when
you select the component
Examples: comment: "Your Description Here"

Image
Syntax: image: "path"
This line specifies the path of the executable program or script
for this component.
Specify an absolute or relative path. A relative path is relative to
the current search path specified in the PATH configuration
variable.
Supply one image line in the program specification file. This is
the only line other than the <mpcfile> line that you must have in
the program specification file.
When the Co>Operating System runs the program, it uses the
same path on all processing nodes. If the Co>Operating System
does not find the program on a processing node by following
that path, it copies the program from the control node to a cache
directory on the processing node, runs it, and saves it for future
use.
Examples: image: "name-of-executable"
NOTE: Enclose path in double quotes.

Exit
Syntax: exit: code
This line specifies the integer exit code that indicates a
successful termination. Use a specific integer or one of the
following:
any
negative
non_positive
positive
non_negative
even
odd

The default is 0.
Examples: exit: negative
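
As a rough illustration of how an exit line could be interpreted, the Python sketch below maps each keyword to a test on the integer exit code. The function name is hypothetical, and the meaning given to each keyword (including any, read from the keyword list above) is an assumption based on its name, not a statement of how the Co>Operating System evaluates it.

```python
def is_success(exit_code: int, spec: str = "0") -> bool:
    """Decide whether exit_code counts as success under an exit: setting.

    Assumption: keyword meanings are inferred from their names.
    """
    predicates = {
        "any": lambda c: True,
        "negative": lambda c: c < 0,
        "non_positive": lambda c: c <= 0,
        "positive": lambda c: c > 0,
        "non_negative": lambda c: c >= 0,
        "even": lambda c: c % 2 == 0,
        "odd": lambda c: c % 2 != 0,
    }
    if spec in predicates:
        return predicates[spec](exit_code)
    # Otherwise the setting is a specific integer; the default is 0.
    return exit_code == int(spec)


print(is_success(-1, "negative"))
```
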

Port
Port lines define the ports that the component supports.
Syntax: port: type direction name [ordering] [location]
[fan-preference] [min-flows] [max-flows] [record-format]
Ports in Ab Initio components represent the input/output
behaviour of the program.

1. type
The type variable specifies one of four types of port:
std - Invokes the program with standard input, standard
output, or standard error already connected.
To use a std port, the program must read sequentially from
its standard input flow, or write sequentially to its standard
output or standard error flow, without seeking.
The .mpc file needs no additional lines for a std port.
file - Passes a path to the program.
To use a file port, the program must accept a path argument
on the command line, and either read the file at that path without
modifying it (for an in port) or write a file with that path (for an
out port).
For an in port, the program may open the file any number of
times, and it may seek in the file.
For an out port, the program must open the file at least once, or
must create it in some other way.
The .mpc file should contain a line of the following form for
each file port:
argument: flow portname

npipe - Creates a named pipe or file and passes its path to your program.
In order to use an npipe port, your program must accept a path
argument on the command line and perform sequential I/O on it.
For an in port, your program must open the file for reading
exactly once and then read sequentially to the end without
seeking or writing in the file.
For an out port, your program must open the file for writing
exactly once, and then write sequentially without seeking or
reading in the file.
Ports of the npipe type allow the graph to run faster, due to the
pipeline parallelism they support.
The .mpc file should contain a line of the following form for
each npipe port:
argument: flow portname
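
A program written for npipe ports might look like the following Python sketch. It assumes the .mpc file contains "argument: flow in" followed by "argument: flow out", so the two paths arrive as the first and second command-line arguments; the function names and the pass-through behaviour are hypothetical. Each path is opened exactly once and is read or written strictly in sequence, as the rules above require.

```python
import sys


def copy_records(in_path: str, out_path: str, chunk_size: int = 65536) -> int:
    """Open each path exactly once and stream bytes from input to output."""
    total = 0
    with open(in_path, "rb") as src, open(out_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)  # sequential read, never seek()
            if not chunk:
                break
            dst.write(chunk)              # sequential write, never seek()
            total += len(chunk)
    return total


def main(argv: list[str]) -> int:
    # Assumed argument order: argv[1] is the in port path, argv[2] the out port path.
    if len(argv) != 3:
        print("usage: prog IN_PATH OUT_PATH", file=sys.stderr)
        return 2
    copy_records(argv[1], argv[2])
    return 0


if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

Because the code never seeks, the same program also works when the paths refer to ordinary files.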

soc - The SOC interface is used by Ab Initio built-in components.

2. direction
Specifies the direction of the flow for a port.
direction   Port
in          input port or stdin
out         output port or stdout
err         stderr

3. name
Specifies the name of the port. For example, the input port is
usually named in.

4. location
Specifies on which side of the component icon you want the
port to appear in the GDE.
Choose one of the following:
top bottom
left right

5. ordering
Controls whether or not the GDE may reorder multiple flows
connected to a port:
If you specify ordered, the GDE does not reorder multiple
flows connected to the port.
If you do not specify ordered, the GDE reorders port
connections for readability. This is the default.
Specify ordered when the order of the attached flows matters
(e.g. the input to CONCAT).

6. fan-preference
Specifies the type of flow the GDE creates when you connect a
flow to the port.
The fan-preference option works as follows:
If you specify multiple, the GDE creates fan-in flows, fan-out
flows, or all-to-all flows where appropriate.
If you do not specify multiple, the GDE creates straight flows.
7. min-flows and max-flows
Specify the minimum and maximum number of flow partitions
you can connect to a port.
- Use 0 for min-flows (mintaps) to indicate that the port need not be
connected.
- Use a big number like 99999 for max-flows (maxtaps) to indicate no
limit.
- The defaults are 1 and infinity if not given.
These values are used to guess flow patterns.

8. record-format
By default, a port propagates its metadata from the port to
which it connects.
The record-format value (metadata) is a quoted metadata default:
"=string", "&remotefilepath", or "Llocalfilepath".
Examples:
port: soc in in ordered 1 99999
port: soc out out 1 1
port: soc out log 0 1
port: npipe in indat ordered left multiple 1 1 "=string('\n')"

Environment
Syntax: environment: "env-variable value"
NOTE: Enclose env-variable value in double quotes.
This line lets you direct the Co>Operating System to set the
environment variable named by env-variable to value in the runtime environment of each partition of the component.
Examples: environment: "BB_DQ_COUNT 10"

Parameter
Used by the GDE to construct "mp custom" commands in the
job script.
The "mp custom" line will contain the parameters in the
specified order.
Syntax: parameter: <form> <name> <type> "<default>"
"<doc>" [reqspec] [visibility] ["<condition>"] [codegen]
form is how the parameter looks on the mp line in the script, either
positional or keyword
name is the parameter name (as seen on the parameters property
page)
type is one of: expression transform integer float string
expression file dataset metadata transform layout date mode
protection bool special choice literal infile outfile.
default is quoted default value (for choice parameters, a list
of the choices, first is default)
doc is quoted documentation string (keep it short)

reqspec is optional or required (default is required for
positional, optional for keyword)
visibility is visible, cond_visible, or hidden (default is
cond_visible)
condition specifies when to display the parameter in the
Parameters Properties page (when visibility is cond_visible),
and when to generate code on the mp line for the parameter
(when codegen is cond_codegen).

codegen is codegen, cond_codegen, or nocodegen
The default codegen value depends on visibility:
visible->codegen
cond_visible->cond_codegen
hidden->nocodegen
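
The defaulting rule above amounts to a small lookup. The Python sketch below is only an illustration of that rule; the function and constant names are hypothetical and are not part of the GDE.

```python
from typing import Optional

# Default codegen setting for each visibility value, as listed above.
DEFAULT_CODEGEN = {
    "visible": "codegen",
    "cond_visible": "cond_codegen",
    "hidden": "nocodegen",
}


def resolve_codegen(visibility: str = "cond_visible",
                    codegen: Optional[str] = None) -> str:
    """Return the explicit codegen value, or the default implied by visibility."""
    if codegen is not None:
        return codegen
    return DEFAULT_CODEGEN[visibility]


print(resolve_codegen("hidden"))
```
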

EXAMPLES:
parameter: positional Key collator "" "Key specifier on which to aggregate"
parameter: positional EncodingMode choice "-ascii -binary" "add -ascii (default) or -binary to mp cmd" required
parameter: positional PickOne choice "\"\" c1 c2" "add c1 or c2 or neither (default)" optional
parameter: keyword Force bool "True" "Adds -force to mp line if set TRUE"

Argument
Argument lines indicate the ordered arguments to the
executable.
The argument lines specify the arguments that the Co>Operating
System passes to the program. The Co>Operating System passes the
arguments to the program in the same order the argument lines
occur in the .mpc file.
Syntax:
You can write nine different types of argument lines:
argument: literal
argument: flow
argument: partition
argument: depth
argument: file
argument: expression file
argument: expression
argument: transform file
argument: transform

argument: literal
Syntax: argument: literal "value1" ["value2" ...]
passes value to your program as is.

argument: flow
Syntax: argument: flow portname
passes the path of the named pipe or file attached to the port
named by portname.
If you attach multiple flows to a port and then use the name of
that port for portname, a flow argument will pass the paths of
the named pipes or files represented by those flows to your
program as a series of arguments, one path per argument.

argument: partition
Syntax: argument: partition
passes the number of the partition on which its copy of the
custom component is running. It passes this partition number on
the command line, as a number between 0 and the number of
partitions minus 1.

argument: depth
Syntax: argument: depth
passes the number of partitions on which the custom component
is running to program.
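
On the program side, the partition and depth values simply arrive as command-line arguments. The Python sketch below assumes the .mpc file lists "argument: partition" and then "argument: depth", so they are the first and second arguments; the function name and the message format are hypothetical.

```python
def describe_partition(argv: list[str]) -> str:
    """Read the partition number and depth passed on the command line."""
    # Assumed argument order: argv[1] is the partition, argv[2] is the depth.
    partition = int(argv[1])
    depth = int(argv[2])
    # The partition number lies between 0 and the number of partitions minus 1.
    if not 0 <= partition < depth:
        raise ValueError(f"partition {partition} is outside 0..{depth - 1}")
    return f"partition {partition + 1} of {depth}"


print(describe_partition(["prog", "0", "4"]))
```
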
argument: file
Syntax: argument: file filename
passes filename to program.
If the Co>Operating System does not find the file named by
filename on the processing node where a particular partition of
the component is running, the Co>Operating System copies it to
a cache directory on that node, passes the name of the cached
copy to the program, and saves the copy for future use.
argument: expression file
Syntax: argument: expression file filename
passes filename to program in the same way that a file argument
does; use the expression file argument for files that contain
DML expressions.

argument: expression
Syntax: argument: expression string
passes string to your program as is; use the expression argument
for a string that denotes a DML expression.
argument: transform file
Syntax: argument: transform file filename
A transform file argument passes filename to your program in
the same way a file argument does; use the transform file
argument for files that contain a DML transform function.
argument: transform
Syntax: argument: transform string
A transform argument passes string to your program as is; string
represents a DML transform function.

Metadata Type
Syntax: metadata type: <portname> = <rule>
metadata type: dest-portname = {source-portname | value}
These lines tell the GDE how to set the record format for the
component's ports.
Choose one of the following:
dest-portname = source-portname - The GDE propagates
(copies) the record format from the source port to the
destination port
dest-portname = value - the GDE sets the record format to
value
The value above represents a record format in double quotes,
with any embedded double quotes backslashed. The following is
an example:
metadata type: log = "record string(\"|\") node, timestamp,
component, subcomponent, event_type; string(\"|\\n\")
event_text; end"
EXAMPLES:
metadata type: out = in
metadata type: in = out
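
The quoting rule for a record format value can be sketched in Python as follows. The helper names are hypothetical. Doubling existing backslashes is an assumption inferred from the log example above, where \n inside the record format appears as \\n; the source text itself only mentions backslashing the embedded double quotes.

```python
def quote_record_format(record_format: str) -> str:
    """Wrap a record format in double quotes, escaping embedded characters."""
    # Assumption: backslashes are doubled first, then double quotes are backslashed.
    escaped = record_format.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def metadata_type_line(dest_port: str, record_format: str) -> str:
    """Build a 'metadata type:' line that sets a port's record format to a value."""
    return f"metadata type: {dest_port} = {quote_record_format(record_format)}"


# A shortened, hypothetical record format used only for illustration.
print(metadata_type_line("log", r'record string("|") node; string("|\n") event_text; end'))
```
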

Sample mpc file of Sort Component

Filter By Expression
Description Panel

Properties Description Tab

Sample mpc file of Component Filter By Expression


<mpcfile>
label: "Filter by Expression"
mpname: "select-transform"
image: "unitool"
author: "Ab Initio Software"
version: "Built-in"
comment: "Filters data records according to a specified DML expression."
port: soc in in
port: soc out out 1 1
port: soc out deselect right 0 1
port: soc out reject bottom 0 1
port: soc out error bottom 0 1
port: soc out log bottom 0 1
parameter: positional select_expr valid_expression "" "Filter expression"
parameter: implicit "reject-threshold" choice "0 Abort\177on\177first\177reject Never\177abort Use\177limit/ramp" "When to abort if input records are rejected" optional visible ""
parameter: implicit limit integer "0" "Maximum rejected records before failure" optional visible "param reject-threshold Use*"
parameter: implicit ramp float "0.0" "Rate of rejected records" optional visible "param reject-threshold Use*"
parameter: local keyword limit_keyword integer "0" "Maximum rejected records before failure" optional visible "value reject-threshold Never* 0 value reject-threshold Abort* 0 sameas limit"
parameter: local keyword ramp_keyword float "0.0" "Rate of rejected records" optional visible "value reject-threshold Never* 99.0 value reject-threshold Abort* 0.0 sameas ramp"
parameter: local implicit "keyword-map" string "limit_keyword limit ramp_keyword ramp" ""
parameter: implicit logging bool "False" "Log internal events"
parameter: local keyword log string "" "Special log parameter" "param logging True log_concat"
parameter: implicit log_input choice "0 \177 1 10 100 1000 10000 100000" "Frequency of input records to log" optional visible "param logging True"
parameter: implicit log_output choice "0 \177 1 10 100 1000 10000 100000" "Frequency of output records to log" optional visible "param logging True"
parameter: implicit log_reject choice "0 \177 1 10 100 1000 10000 100000" "Frequency of reject records to log" optional visible "param logging True"
argument: literal "select"
argument: literal $1 /* name */
argument: literal $2 /* select_expr */
argument: literal $3 /* limit */
argument: literal $4 /* ramp */
argument: literal $5 /* log */
metadata type: out = in
metadata type: in = out
metadata type: deselect = in
metadata type: in = deselect
metadata type: deselect = out
metadata type: out = deselect
metadata type: reject = in
metadata type: reject = out
metadata type: reject = deselect
metadata type: error = "string('\n')"
metadata type: log = "record string(\"|\") node, timestamp, component, subcomponent, event_type; string(\"|\\n\") event_text; end"
parameter: implicit "condition" string "" "" optional ""
parameter: implicit "conditionInputPort" string "in" "" optional ""
parameter: implicit "conditionOutputPort" string "out" "" optional ""
parameter: implicit "condition-interpretation" choice "0 Replace\177with\177flow Remove\177completely" "" optional ""

Thank You
