
How to use the "SUBSTR" function in a mapping.

Explanation :
Returns a portion of a string. SUBSTR counts all characters, including blanks, starting at the
beginning of the string.

Syntax

SUBSTR( string , start [, length ] )

Example

SUBSTR( IN_PHONE, 1, 3 )
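
The counting rule above can be illustrated outside the mapping. The following is a minimal Python sketch, not the Informatica implementation: the helper name substr is hypothetical, and it only models the simple case of a start position of 1 or more with an optional positive length.

```python
def substr(string, start, length=None):
    """Return part of `string`, counting positions from 1 and including blanks."""
    # Assumption: only start >= 1 and a positive length are modelled here.
    if start < 1:
        raise ValueError("this sketch only handles start >= 1")
    if length is not None and length < 1:
        raise ValueError("this sketch only handles a positive length")
    begin = start - 1  # convert the 1-based position to a 0-based index
    if length is None:
        return string[begin:]  # no length: take everything to the end
    return string[begin:begin + length]


if __name__ == "__main__":
    print(substr("809-555-3915", 1, 3))
```

Called with a phone number such as "809-555-3915" (an invented sample value), start 1 and length 3, the sketch returns the first three characters.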

 
Design a mapping that generates a sequence of numbers using the SETVARIABLE function in an
Expression transformation (without using a Sequence Generator).

Mapping  : Design a mapping that generates a sequence of numbers without using a Sequence
           Generator.
Solution : Source   : Flatfile
           Target   : Relational
           Database : Oracle
Note     : Uses the SETMAXVARIABLE() function and mapping variables.

XML file : m_sequence_variablefunction
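
The idea behind this mapping can be sketched in Python. This is only an illustration under stated assumptions: the class MappingVariable and the function number_rows are hypothetical stand-ins for a persistent mapping variable and an Expression transformation, and the sketch assumes the variable keeps the highest value it has been given so that the next run continues from there.

```python
class MappingVariable:
    """Hypothetical stand-in for a persistent mapping variable."""

    def __init__(self, initial=0):
        self.value = initial

    def set_max(self, candidate):
        # Keep the larger of the stored value and the candidate, then return it.
        self.value = max(self.value, candidate)
        return self.value


def number_rows(rows, seq_var):
    """Attach the next sequence number to each row, continuing from seq_var."""
    numbered = []
    counter = seq_var.value  # start from the value saved by the previous run
    for row in rows:
        counter += 1
        seq_var.set_max(counter)  # remember the highest number issued so far
        numbered.append((counter, row))
    return numbered


if __name__ == "__main__":
    seq = MappingVariable()
    print(number_rows(["a", "b", "c"], seq))
    print(number_rows(["d"], seq))  # a second run continues the sequence
```

Because the variable outlives a single call, a second run picks up where the first one stopped, which is the behaviour a Sequence Generator would otherwise provide.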
   
 
Design a mapping to move the first half of the data to one target and the second half to another
target. For example, if you have 20 records in the source, the first 10 go to one target and the
other 10 to the second target; if the source has an odd number of records, the first n/2 + 1
(with n/2 rounded down) go to one target and the rest to the second target.

 
Mapping  : First half to one target and second half to the other target.
Solution : Source   : Flatfile
           Target   : Relational
           Database : Oracle
Tip      : Use a stored procedure to count the records.

XML file : m_firsthalf_secondhalf
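
The routing rule can be sketched in Python. This is a minimal illustration, not the mapping itself: the function split_halves is hypothetical, and len() stands in for the record count that the tip above obtains from a stored procedure.

```python
def split_halves(rows):
    """Send the first half of the rows to one list and the rest to another."""
    total = len(rows)  # stands in for the count returned by a stored procedure
    first_count = (total + 1) // 2  # odd totals give the extra row to the first target
    first, second = [], []
    for position, row in enumerate(rows, start=1):
        if position <= first_count:
            first.append(row)
        else:
            second.append(row)
    return first, second


if __name__ == "__main__":
    first, second = split_halves(list(range(1, 21)))
    print(len(first), len(second))
```

Each row is compared against the threshold by its running position, which mirrors how a row counter and a router condition would divide the data.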

REPOSITORY ADMIN CONSOLE

Actions

* Create a local or global repository.
* Start repositories.
* Back up the repository.
* Move a copy of the repository to a different server.
* Disable the repository.
* Export connection information.
* Notify users: a notification message can be sent to all users connected to the repository.
* Propagate.
* Register repositories.
* Restore the repository.
* Upgrade the repository.

Actions

* Create reusable tasks, worklets and workflows.
* Schedule workflows.
* Configure tasks.

Workflow

A workflow is a set of instructions that describes how and when to run tasks related to extracting,
transforming, and loading data.

Worklets

A worklet is an object that represents a set of tasks.

When to create Worklets?


Create a worklet when you want to reuse a set of workflow logic in several workflows. Use the
Worklet Designer to create and edit worklets.

Where to use Worklets?


You can run worklets inside a workflow. The workflow that contains the worklet is called the
parent workflow. You can also nest a worklet in another worklet.

WORKFLOW MONITOR

You can monitor workflows and tasks in the Workflow Monitor. View details about a workflow
or task in Gantt Chart view or Task view.

Actions
You can run, stop, abort, and resume workflows from the Workflow Monitor.
You can also view the log file and performance data.

Slowly Changing Dimension

* A dimension that changes slowly over time.

Slowly Changing Dimension Mapping | Type | Description
SCD Type 1 | Slowly Changing Dimension | Inserts new dimensions. Overwrites existing dimensions with changed dimensions. (Shows current data.)
SCD Type 2 / Version Data | Slowly Changing Dimension | Inserts new and changed dimensions. Creates a version number and increments the primary key to track changes.
SCD Type 2 / Flag Current | Slowly Changing Dimension | Inserts new and changed dimensions. Flags the current version and increments the primary key to track changes.
SCD Type 2 / Date Range | Slowly Changing Dimension | Inserts new and changed dimensions. Creates an effective date range to track changes.
SCD Type 3 | Slowly Changing Dimension | Inserts new dimensions. Updates changed values in existing dimensions. Optionally uses the load date to track changes.
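
The difference between overwriting (Type 1) and keeping history (Type 2 / Flag Current) can be sketched in Python. This is an illustration under assumptions, not the mapping logic a wizard would generate: the function names and the column names pk, current, cust_id and city are hypothetical, and rows are plain dictionaries.

```python
def apply_type1(dimension, incoming, key):
    """SCD Type 1: insert new rows and overwrite changed ones (current data only)."""
    by_key = {row[key]: dict(row) for row in dimension}
    for row in incoming:
        by_key[row[key]] = dict(row)  # new key inserts, existing key overwrites
    return list(by_key.values())


def apply_type2_flag(dimension, incoming, key):
    """SCD Type 2 / Flag Current: keep old versions and flag the latest one."""
    result = [dict(row) for row in dimension]
    next_pk = max((row["pk"] for row in result), default=0) + 1
    for row in incoming:
        current = next(
            (r for r in result if r[key] == row[key] and r["current"] == 1), None
        )
        if current is not None:
            if all(current.get(k) == v for k, v in row.items()):
                continue  # nothing changed, keep the existing version
            current["current"] = 0  # retire the old version but keep it as history
        result.append({**row, "pk": next_pk, "current": 1})
        next_pk += 1  # increment the primary key for the next inserted version
    return result


if __name__ == "__main__":
    dim = [{"pk": 1, "cust_id": "C1", "city": "Pune", "current": 1}]
    new = [{"cust_id": "C1", "city": "Delhi"}, {"cust_id": "C2", "city": "Goa"}]
    for row in apply_type2_flag(dim, new, "cust_id"):
        print(row)
```

With Type 1 the old city is lost, while the Type 2 sketch keeps the earlier row with its flag set to 0 next to the new current row.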

OLTP | OLAP
On Line Transaction Processing | On Line Analytical Processing
Continuously updates data | Read-only data
Tables are in normalized form | Partially normalized / denormalized tables
Single record access | Multiple records for analysis purposes
Holds current data | Holds current and historical data
Records are maintained using a primary key field | Records are based on a surrogate key field
Tables or records can be deleted | Records cannot be deleted
Complex data model | Simplified data model

DATAMART | DATA WAREHOUSE
A scaled-down version of the Data Warehouse that addresses only one subject, like Sales Department, HR Department etc. | It is a database management system that facilitates on-line analytical processing by allowing the data to be viewed in different dimensions or perspectives to provide business intelligence.
One fact table with multiple dimension tables. | More than one fact table and multiple dimension tables.
[Sales Department] [HR Department] [Manufacturing Department] | [Sales Department, HR Department, Manufacturing Department]
Small organizations prefer DATAMART | Bigger organizations prefer DATA WAREHOUSE
 
DIMENSION TABLE | FACT TABLE
It provides the context / descriptive information for the measurements of a fact table. | It provides the measurements of an enterprise.
Structure of Dimension: surrogate key, one or more other fields that compose the natural key (nk), and a set of attributes. | A measurement is the amount determined by observation.
The size of a Dimension Table is smaller than that of a Fact Table. | Structure of Fact Table: foreign key (fk), Degenerated Dimension and Measurements.
A schema contains more Dimension Tables than Fact Tables. | The size of a Fact Table is larger than that of a Dimension Table.
The surrogate key is used to prevent primary key (pk) violations (store historical data). | A schema contains fewer Fact Tables than Dimension Tables.
Provides entry points to data. | A composite of Degenerate Dimension fields acts as the primary key.
Values of fields are in numeric and text representation. | Values of the fields are always in numeric or integer form.

DATA MINING VS WEB MINING

DATA MINING | WEB MINING
Data mining involves using techniques to find underlying structure and relationships in large amounts of data. | Web mining involves the analysis of the Web server logs of a Web site.
Data mining products tend to fall into five categories: neural networks, knowledge discovery, data visualization, fuzzy query analysis and case-based reasoning. | The Web server logs contain the entire collection of requests made by a potential or current customer through their browser and the responses by the Web server.

FACT TABLE VS DIMENSION TABLE

FACT TABLE | DIMENSION TABLE
A fact table in a data warehouse describes the transaction data. It contains characteristics and key figures. | A table in a data warehouse whose entries describe data in a fact table. Dimension tables contain the data from which dimensions are created. A dimension table is a collection of hierarchies and categories along which the user can drill down and drill up. It contains only the textual attributes.
In a data model schema, fewer fact tables are observed. | In a data model schema, more dimension tables are observed.

RDBMS SCHEMA VS DWH SCHEMA

RDBMS SCHEMA | DWH SCHEMA
* Used for OLTP systems | * Used for OLAP systems
* Traditional and old schema | * New generation schema
* Normalized | * Denormalized
* Difficult to understand and navigate | * Easy to understand and navigate
* Cannot solve extract and complex problems | * Extract and complex problems can be easily solved
* Poorly modelled | * Very good model

How to find the number of success, rejected and bad records in the same mapping.

* First, separate the data using an Expression transformation, which flags each row as 1 or 0.
The condition is as follows:
* IIF(NOT IS_DATE(HIREDATE,'DD-MON-YY') OR ISNULL(EMPNO) OR
ISNULL(NAME) OR ISNULL(HIREDATE) OR ISNULL(SEX), 1, 0)
* FLAG = 1 is treated as invalid data and FLAG = 0 as valid data. The data is routed to the next
transformation using a Router transformation with two user-defined groups: FLAG = 1 for invalid
data and FLAG = 0 for valid data.
* FLAG = 1 data is forwarded to an Expression transformation that has one variable port and two
output ports: one for incrementing and the other for flagging the row ...
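
The flag-and-route steps described so far can be sketched in Python. This is a minimal illustration and not the mapping itself: the helper names is_date, flag_row and count_rows are hypothetical, the strptime pattern "%d-%b-%y" is assumed to correspond to 'DD-MON-YY', and month names are assumed to be English.

```python
from datetime import datetime

REQUIRED = ("EMPNO", "NAME", "HIREDATE", "SEX")


def is_date(value, fmt="%d-%b-%y"):
    """Return True when value parses with the assumed DD-MON-YY pattern."""
    try:
        datetime.strptime(value, fmt)
        return True
    except (TypeError, ValueError):
        return False


def flag_row(row):
    """Return 1 for an invalid row and 0 for a valid one, like the IIF condition."""
    missing = any(row.get(col) is None for col in REQUIRED)
    return 1 if missing or not is_date(row["HIREDATE"]) else 0


def count_rows(rows):
    """Route rows by flag and count each group, as the two router groups would."""
    counts = {"valid": 0, "invalid": 0}
    for row in rows:
        counts["invalid" if flag_row(row) else "valid"] += 1
    return counts


if __name__ == "__main__":
    sample = [
        {"EMPNO": 1, "NAME": "A", "HIREDATE": "17-DEC-80", "SEX": "M"},
        {"EMPNO": 2, "NAME": "B", "HIREDATE": "1980/12/17", "SEX": "F"},
        {"EMPNO": 3, "NAME": None, "HIREDATE": "02-APR-81", "SEX": "M"},
    ]
    print(count_rows(sample))
```

The sample rows are invented for illustration; the counts per group play the role of the incrementing port in the downstream Expression transformation.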
