
TALEND Open Studio for Data

Integration
What is Talend
Talend is an open source software vendor that
provides software and services for:
Data Integration
Data Management
Enterprise Application Integration
Big Data
Talend provides its products in two categories:
Open Software
Enterprise Software





TALEND ENTERPRISE
for
DATA INTEGRATION
Open & Enterprise Comparison
Talend Enterprise Data
Integration

The Talend Enterprise suite contains the following
components:
An application server (Apache Tomcat server +
CommandLine) that hosts Talend Administration
Center (WAR file).
A database server storing the administration
metadata of Talend Administration Center (by default,
an embedded H2 database is used).
An SVN server for project metadata.
Execution servers (JobServers) or Talend Runtime
execution containers (based on Apache Karaf) to
deploy and execute processes.
A Studio API to carry out technical processes.

Overview of Talend Enterprise
In Detail
To work with Talend Enterprise for Data
Integration, the following must be
installed:
For Talend Administration Center (web
application):
1. Install a Tomcat or JBoss application server.
2. Place the Web Archive (WAR) file
provided by Talend in the webapps folder
of the Tomcat server.
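The WAR placement step above can be sketched in Java as a plain file copy. This is only an illustration under assumptions: the class name `WarDeploy` and its parameters are hypothetical, and it relies on Tomcat's usual behaviour of picking up WAR files found in its `webapps` folder.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Illustrative only: hypothetical helper, not part of any Talend distribution.
class WarDeploy {

    // Copy the WAR into <tomcatHome>/webapps and return the deployed path.
    static Path deploy(Path war, Path tomcatHome) throws IOException {
        Path webapps = tomcatHome.resolve("webapps");
        // Create the folder if it is missing (a real Tomcat install already has it).
        Files.createDirectories(webapps);
        // Overwrite an older copy of the same WAR if one is present.
        return Files.copy(war, webapps.resolve(war.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Copying by hand through the file manager achieves the same result; the sketch only makes the target location explicit.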
For SVN Server:
1. Install the VisualSVN Server on a machine and
define a repository, which is used as a
centralized repository for storing all project
metadata.
2. Link the SVN server to the Talend Administration
Center to manage the project metadata from the web
browser.
For CommandLine:
1. Install the CommandLine software on the system
that hosts the Talend Administration Center web
application.
2. CommandLine is used to deploy and execute the
selected Jobs on the JobServers or Talend Runtime
containers.
For JobServers:
1. Select the systems that should act as
execution servers. Then install the JobServer
application on them to deploy and execute the Jobs
created in Talend Studio.
2. Link each JobServer to the Talend Administration
Center.
For Talend Runtime Containers:
1. Select the systems that should act as
execution servers. Then install the Talend
Runtime container application on them to deploy
and execute the Jobs created in Talend Studio.
2. Link each Talend Runtime container to the
Talend Administration Center.





TALEND OPEN STUDIO
for
DATA INTEGRATION
Talend Open Studio (TOS)
features
It is an easy-to-use, Eclipse-based graphical
environment
It is a code generator, which generates code in
Java/Perl
Metadata-driven design and execution
Real-time debugging
Robust execution
Talend Products
Talend provides Talend Open Studio (TOS) for:
Big Data
BPM
Data Integration
Data Quality
ESB
Master Data Management
Talend Open Studio for Data
Integration
TOS for DI provides solutions for both ETL for
Analytics and ETL for Operational Integration.
ETL for Analytics
ETL for Operational
Integration
TOS for DI Installation
TOS for DI can be downloaded free of charge from the
Talend website.
A zip file, TOS_DI-Win32-r118616-V5.5.1.zip, can
be downloaded from the website and should be
unzipped.
This zip file is common to all operating systems. It contains
binary files that run on each of them.
Java must be installed and all required
environment variables must be set before
installing TOS for DI.
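A quick way to confirm the Java prerequisite is to print the JVM version and check an environment variable. The sketch below is illustrative: the class `JavaEnvCheck` is hypothetical, and `JAVA_HOME` is assumed to be one of the required variables, since the slide does not name them.

```java
import java.util.Map;

// Hypothetical pre-installation check; not part of any Talend distribution.
class JavaEnvCheck {

    // Returns true when the given environment map defines a non-empty JAVA_HOME.
    // JAVA_HOME is an assumption; the source does not list the required variables.
    static boolean hasJavaHome(Map<String, String> env) {
        String home = env.get("JAVA_HOME");
        return home != null && !home.trim().isEmpty();
    }

    public static void main(String[] args) {
        // Version of the JVM that is running this check.
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("JAVA_HOME set: " + hasJavaHome(System.getenv()));
    }
}
```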

Important Concepts in Talend
Studio
Repository: storage location
Project: structured collection of technical items
Workspace: directory
Job: graphical design
Component: preconfigured connector
Item: fundamental technical unit in a project
TOS for DI- Welcome screen
TOS for DI screen
Business Model Node
A Business Model is a non-technical view of a
business workflow need.
Business Models allow data integration project
stakeholders to represent their needs graphically,
regardless of the technical implementation of the
requirements.
IT operations staff can go through the Business
Models and convert them into technical
code by creating Jobs.
Job Design Node
A Job Design is the runnable layer of a business
model.
A Job Design translates business needs into
code, routines and programs
It is a graphical design of one or more
components connected together that allows you
to set up and run dataflow management
processes.
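The idea of a Job as components connected in sequence can be sketched in plain Java. This is not the code Talend Studio generates; `MiniJob` and its methods are hypothetical names used only to show rows flowing from one component to the next.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Illustrative only: hypothetical names, not the code Talend Studio generates.
class MiniJob {

    // Each "component" transforms the list of rows it receives.
    private final List<Function<List<String>, List<String>>> components = new ArrayList<>();

    // Connect another component to the end of the flow.
    MiniJob then(Function<List<String>, List<String>> component) {
        components.add(component);
        return this;
    }

    // Run the job: pass the rows through every component in order.
    List<String> run(List<String> input) {
        List<String> rows = input;
        for (Function<List<String>, List<String>> c : components) {
            rows = c.apply(rows);
        }
        return rows;
    }

    // Example component: drop blank rows.
    static List<String> dropBlank(List<String> rows) {
        List<String> out = new ArrayList<>();
        for (String r : rows) {
            if (!r.trim().isEmpty()) {
                out.add(r);
            }
        }
        return out;
    }

    // Example component: upper-case every row.
    static List<String> upperCase(List<String> rows) {
        List<String> out = new ArrayList<>();
        for (String r : rows) {
            out.add(r.toUpperCase());
        }
        return out;
    }
}
```

A job would then be assembled as `new MiniJob().then(MiniJob::dropBlank).then(MiniJob::upperCase).run(rows)`, which mirrors dragging two components onto the design canvas and linking them.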
Components in TOS for DI

Talend provides a rich set of Components for
building Jobs in DI.
A Component is a unit that provides a
particular functionality.
Talend provides more than 400
Components covering a wide range of
activities.
All these Components are available under Palette
panel in Talend Studio.
Metadata Node
Certain database connections or specific files
used when creating data integration Jobs may be
needed by several Jobs.
To avoid defining the same properties
over and over again, those connection
properties can be created once and stored
in the Metadata node in the Repository tree
view.
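The benefit of defining a connection once can be sketched in Java as a single shared definition that several jobs read. Everything here is hypothetical: the class `SharedConnection`, the property names and the placeholder values are assumptions for illustration, not Talend's metadata format.

```java
import java.util.Properties;

// Illustrative only: a hypothetical stand-in for a connection stored once and reused.
class SharedConnection {

    // Defined once, in one place (all values are placeholders).
    static Properties define() {
        Properties p = new Properties();
        p.setProperty("host", "db.example.com");
        p.setProperty("port", "3306");
        p.setProperty("database", "sales");
        return p;
    }

    // Any job can build its JDBC-style URL from the same shared definition.
    static String url(Properties p) {
        return "jdbc:mysql://" + p.getProperty("host") + ":" + p.getProperty("port")
                + "/" + p.getProperty("database");
    }
}
```

If the host changes, only `define()` needs to be edited, which is the same advantage the Metadata node gives over retyping connection details in every Job.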
Connections
A Job or subjob is made up of a group of
components logically linked to one another via
connections.
4 types of connections:
Row connection: Main, Lookup, Reject, Output
Iterate connection: used for looping
Trigger connection: subjob level and component
level
Link connection: handles metadata
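To make the Main and Reject row connections more concrete, here is a minimal Java sketch in which rows that parse cleanly go down one flow and rows that fail go down another. The class `RowRouter` is hypothetical and is not generated Talend code.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: hypothetical names, not generated Talend code.
class RowRouter {
    final List<Integer> main = new ArrayList<>();   // rows that parsed cleanly
    final List<String> reject = new ArrayList<>();  // rows that failed, kept as raw text

    // Try to parse each raw value and send it down the Main or the Reject flow.
    void route(List<String> rawRows) {
        for (String raw : rawRows) {
            try {
                main.add(Integer.parseInt(raw.trim()));
            } catch (NumberFormatException e) {
                reject.add(raw);
            }
        }
    }
}
```

Keeping rejected rows in their own flow means they can be logged or corrected later instead of stopping the whole run.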
SQL Templates Node
Talend Studio lets you use
system SQL templates, since many query
structures are standardized with common
approaches.
Several standardized SQL
templates are available, including Generic, Hive, MySQL,
Oracle, and Teradata.
Remaining Nodes
Context Node: A context is characterized by
parameters. These context-specific variables
can be accessed from the component-specific
properties in the Component view.
Code Node: This node holds all predefined
routines, which are written in Java.
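The two nodes above can be illustrated together with a small Java sketch: one parameter set per context, and a routine-style static method that uses it. The class `ContextDemo`, the `outputDir` parameter, the context names and the paths are all hypothetical and are not shipped with Talend.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: hypothetical names, not classes shipped with Talend.
class ContextDemo {

    // One parameter set per context (environment); values are placeholders.
    static Map<String, String> context(String name) {
        Map<String, String> ctx = new HashMap<>();
        if ("prod".equals(name)) {
            ctx.put("outputDir", "/data/prod/out");
        } else {
            // Any other name falls back to the development settings.
            ctx.put("outputDir", "/tmp/dev/out");
        }
        return ctx;
    }

    // Routine-style helper: a static Java method that a job could call.
    static String outputPath(Map<String, String> ctx, String fileName) {
        return ctx.get("outputDir") + "/" + fileName;
    }
}
```

The same job logic runs unchanged in both environments; only the selected context decides where the file is written.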






Thank You
