
Enterprise Application Integration

(Middleware)
Gustavo Alonso
Computer Science Department
Swiss Federal Institute of Technology (ETHZ)
alonso@inf.ethz.ch
http://www.iks.inf.ethz.ch/

Course Administration

Lecture: Tuesdays 13:15 - 15:00 (HRS building)


Discussion and exercises: Thursdays 11:10 - 11:55 (HRS building)
Getting in touch with us:
Gustavo Alonso
HRS G 08
alonso@inf
632 7306
Web site http://www.iks.inf.ethz.ch/
Course material:
Book (recommended)
Script of the lecture (available from the web)
Practical exercises:
Web services (designing, building and programming a composite Web
service)
Exercise is mandatory

Gustavo Alonso, ETH Zürich.

Course Goals
The course aims at introducing and discussing in depth several important topics
related to distributed and parallel information systems in general and enterprise
application integration in particular. In many ways, the course explores the
synergy between information and communication systems and how this synergy
can be best exploited.
The course is more practical than theoretical. The objective is to give a clear
overview of the problems and their nature, how they can be solved, and how these
solutions are implemented in practice. While we will spend some time
understanding the theoretical underpinnings of the ideas discussed, the emphasis
will be on how these ideas can be implemented in practice. An important part of
the course will be devoted to how technology has evolved and the reason why
existing systems are the way they are.
You will have the opportunity to program a relatively large information system.
This exercise is an integral part of the course.
The discussions and presentations form an integral part of the course. If you take
the time to learn from them, you will get much more out of this course. Take
advantage of the opportunity!


Motivation for this course

The architecture of the information systems we use is becoming increasingly
complex.
The access methods, the capabilities, the goals, and the available technology
are continuously changing. What can we learn that will remain valuable in the
years to come?
One example: 70-90 % of software costs are maintenance costs.
Using the right abstractions helps! Databases used as services remove
about 40 % of the code of commercial applications.
Another example: software reuse is truly efficient and makes economic
sense at a large granularity. How can we build systems that can be
tailored to the user's needs and yet are applicable in a wide range of areas
and environments?

Communications
Today's systems are no longer isolated. Communications play a key role in
their use. New access methods also change the nature of the problems.

Demand
The demands on the existing systems keep growing: centralized solutions are
not always feasible, cooperation among systems is a must.

Components
System integration is the most challenging aspect of the IT world.
Programming today is to combine already existing heterogeneous tools.


[Figure: multi-tier architecture, from clients down to resources]
CLIENT TIER: web and wap browsers, specialized clients (Java, Notes), SMS ...
Client to access: HTML, SOAP, XML
ACCESS TIER: web servers, J2EE, CGI, Java Servlets API
Access to application: MOM, HTML, IIOP, RMI-IIOP, SOAP, XML
APP TIER: TP-Monitors, stored procedures, programs, scripts, beans
Application to integration: MOM, IIOP, RMI-IIOP, XML
INTEGRATION TIER: system federations, filters, object monitors, MOM
Integration to resource: ODBC, JDBC, RPC, MOM, IIOP, RMI-IIOP
RESOURCE TIER: databases, multi-tier systems, backends, mainframes
Elements shown in the diagram: wap client, java client, web client, business
objects, api, wrappers, db.


Understanding the layers

[Figure: the three layers and their time scales]
Presentation logic
Application logic
Resource manager

1-2 years: Clients and external interface (presentation, access channels)
2-5 years: Application (systems logic)
~10 years: Data management systems (operational and strategic data)

A client is any user or program that wants to perform an operation over the
system. To support a client, the system needs to have a presentation layer
through which the user can submit operations and obtain a result.
The application logic establishes what operations can be performed over the
system and how they take place. It takes care of enforcing the business rules
and establishing the business processes. The application logic can be
expressed and implemented in many different ways: constraints, business
processes, server with encoded logic ...
The resource manager deals with the organization (storage, indexing, and
retrieval) of the data necessary to support the application logic. This is
typically a database but it can also be a text retrieval system or any other
data management system providing querying capabilities and persistence.
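
The separation into the three layers described above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not part of the lecture: the class names (ResourceManager, ApplicationLogic, Presentation), the toy account example, and the text command format are all hypothetical, and an in-memory dictionary stands in for a real database.

```python
class ResourceManager:
    """Resource manager: storage and retrieval of data.

    A dict stands in for a database (assumption for this sketch).
    """

    def __init__(self):
        self._balances = {}

    def read(self, account):
        return self._balances.get(account, 0)

    def write(self, account, balance):
        self._balances[account] = balance


class ApplicationLogic:
    """Application logic: which operations exist and which rules they obey."""

    def __init__(self, resource_manager):
        self._rm = resource_manager

    def deposit(self, account, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        balance = self._rm.read(account) + amount
        self._rm.write(account, balance)
        return balance

    def withdraw(self, account, amount):
        balance = self._rm.read(account)
        # Business rule (hypothetical): no overdrafts.
        if amount <= 0 or amount > balance:
            raise ValueError("invalid amount or insufficient funds")
        self._rm.write(account, balance - amount)
        return balance - amount


class Presentation:
    """Presentation layer: accepts client requests and formats results."""

    def __init__(self, application):
        self._operations = {
            "deposit": application.deposit,
            "withdraw": application.withdraw,
        }

    def submit(self, command):
        try:
            operation, account, amount = command.split()
            balance = self._operations[operation](account, int(amount))
            return f"OK balance={balance}"
        except (KeyError, ValueError) as error:
            return f"ERROR {error}"


if __name__ == "__main__":
    ui = Presentation(ApplicationLogic(ResourceManager()))
    print(ui.submit("deposit alice 100"))
    print(ui.submit("withdraw alice 30"))
    print(ui.submit("withdraw alice 500"))
```

Each layer only talks to the one directly below it, so the presentation could be replaced (for example by a web front end) without touching the business rules or the storage.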

Scale-up versus Scale-out


Scale-up is based on using a bigger computer as the load increases. This
requires the use of parallel computers (SMP) with more and more processors.
Scale-out is based on using more computers as the load increases instead of
using a bigger computer.
Both are usually combined. Scale-out can be applied at any level of the
scale-up.
[Figure: scale-up versus scale-out]
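
The difference between the two strategies can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the Node and ScaleOutCluster classes, the capacity figures (read them as requests per second), and the round-robin dispatch policy are hypothetical and not taken from the lecture. It also assumes that capacity adds up linearly, which real systems rarely achieve.

```python
class Node:
    """A single computer with a nominal capacity (hypothetical unit: requests/s)."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.handled = 0


def scale_up(node, factor):
    """Scale-up: replace the node with a bigger one (idealized)."""
    return Node(node.name, node.capacity * factor)


class ScaleOutCluster:
    """Scale-out: spread the load over several nodes using round-robin."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self._next = 0

    def add_node(self, node):
        self.nodes.append(node)

    def capacity(self):
        # Idealized: assumes capacity is purely additive.
        return sum(node.capacity for node in self.nodes)

    def dispatch(self, request):
        # The request itself is ignored; only the routing is modelled.
        node = self.nodes[self._next % len(self.nodes)]
        self._next += 1
        node.handled += 1
        return node.name


if __name__ == "__main__":
    bigger = scale_up(Node("smp", 100), 4)
    print("scale-up capacity:", bigger.capacity)

    cluster = ScaleOutCluster([Node("n1", 100), Node("n2", 100)])
    cluster.add_node(Node("n3", 100))
    print("scale-out capacity:", cluster.capacity())
    print([cluster.dispatch(i) for i in range(6)])
```

Combining both, as the slide suggests, would simply mean building the cluster out of scaled-up nodes.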

     
             


A modern e-commerce platform


[Figure: server farm of an e-commerce platform. Components shown: Cache
Server; ASP/SSL Farm A and Farm B; ASP File Servers; Basket/Ad/Surplus;
Receipt/Fulfillment; SQL Product Servers (Games/Music, Comp/Soft, Books,
Videos, Music); Search Servers; Monitor and cache.]

[Figure: architecture of HEDC (www.hedc.ethz.ch)]
PRESENTATION LAYER: web browser (thin client, HTTP); StreamCorder (Java
client with local DB, HTTP); HEDC web server (Apache)
APPLICATION LAYER:
Processing logic (PL): server manager, directory services, front end (HTTP,
RMI), IDL servers with temporary storage space
Data management (DM): archive manager, reference manager, data filters
RESOURCE MANAGEMENT LAYER: DBMS 1 (Oracle) and DBMS 2 (Oracle) with their DB
space; less relevant data; images and raw data; network file system; tape
archive

[Figure: supply chain example]
Participants: Customer 1, Customer 2, Retailer, Manufacturer 1,
Manufacturer 2, Supplier 1, Supplier 2.
Requests shown in the diagram:
Get products #23 and #45
Buy products #23, #45 and part #101
Build product #3, according to specs.
Get parts #A1, #B42, #H2, #R2
Order parts #A1, #H2, #G7, #G11, #B42
Get parts #G7, #G11, #ES-01, #R2
Order parts #R2, #101, #ES-01, #G7, #G11

Astronomy
Exploration of the
solar system

Scientific Method?
Our ability to produce data exceeds our capacity to explain how the data
was produced.


Planets outside the solar system


Planets or program bugs?


The Grid


Course philosophy

Addressing the increasing need for connectivity and the ever-growing demand,
and facing the challenge of component-based software design, requires solving
a number of data management issues. By learning to identify the problems and
being aware of the state of the art and of possible solutions, both
theoretical and practical, a system designer will be in a much better
position to deal with evolving technology.

[Figure: Design Problem, System Design, Technical Solutions]

The future of distributed IS


Why distributed information systems?
Computer environments: Distributed, heterogeneous, autonomous nodes linked
by a network (intranet, internet). Emphasis on communication.
Technology advances: On computing power (powerful clients), on networks
(reliability, speed; ATM, ISDN).
Application demands: Larger and larger applications. Decentralized
corporations. Need for autonomy.
New environments and business models: WWW, distributed service providers,
Java, CORBA, Workflow Management.
Basic services: A great deal of work is being invested in producing the type
of standards and reusable software needed to make this a reality.

Distributed IS applications:
Emphasis on interoperability: combine your data with that of the rest of the
world.
Emphasis on distribution: Intranet, Internet are here to stay. Huge demand
for this functionality:
Lotus Notes (applications built on replicated databases).
WWW+Java+persistence (distributed service providers).
TP-Monitors (OLTP, OLAP, transactional processing).
Queuing Systems (applications on top of reliable, asynchronous
communications).
CORBA (applications on top of a TP-Monitor-like object-oriented system).
Workflow and more ...


The distributed systems dilemma


Theoretical advantages of distributed systems:
Locality of reference: With the proper data placement, most accesses should
be to local data, which improves response time and throughput.
Scalability/Processing capacity: With better hardware available, the overall
processing power should be a function of the number of nodes in the system
(see parallelism). If more power is needed, add more nodes.
Availability/Fault tolerance: A distributed system should be able to provide
services even when part of the system is down (unlike centralized systems).
This is important for large installations and mission-critical applications
(24x7 computing).
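
The theoretical argument for availability and capacity can be made concrete with a short calculation. This is a back-of-the-envelope sketch, not a model from the lecture: the function names are hypothetical, and it assumes that nodes fail independently and that capacity grows linearly with the number of nodes. Both assumptions are idealizations that real systems rarely satisfy.

```python
HOURS_PER_YEAR = 24 * 365


def availability(node_availability, replicas):
    """Probability that at least one of `replicas` nodes is up.

    Assumes independent failures (an idealization).
    """
    if replicas < 1:
        raise ValueError("need at least one replica")
    return 1.0 - (1.0 - node_availability) ** replicas


def expected_downtime_hours(system_availability):
    """Expected hours per year during which the service is unavailable."""
    return (1.0 - system_availability) * HOURS_PER_YEAR


def ideal_capacity(per_node_capacity, nodes):
    """Idealized capacity: perfectly linear in the number of nodes."""
    return per_node_capacity * nodes


if __name__ == "__main__":
    for replicas in (1, 2, 3):
        a = availability(0.99, replicas)
        print(replicas, "replica(s): availability", round(a, 6),
              "downtime h/year", round(expected_downtime_hours(a), 3))
```

With a per-node availability of 99 %, a single node is expected to be down about 87.6 hours per year, while two independent replicas would bring this to under one hour. Coordination overhead, correlated failures, and network problems are what make the practical numbers much less favorable.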


In theory, a distributed system is faster (better response time and
throughput), bigger (more capacity), and more reliable (built-in
redundancy). But, in practice, this is not true.
Centralized (mainframe based): the old-fashioned approach. Most of the
valuable data is still in mainframes, although it is only 1 % of all
existing data (mainframes are still a good business).
Client/Server (a variation of the centralized version): a first approach to
distribution. Made too many promises and now it is suffering from its lack
of success. Servers are not mainframes and quickly become a bottleneck.
Applications move towards distribution, and find there is no support for it.


Concrete goals for the course

Provide a basic understanding of the problems associated with distributed
environments (many of the ideas we will discuss apply in many areas, not just
typical commercial applications).
Provide the conceptual tools required to understand commercial products (basic
idea behind a product, what its weaknesses are, how to solve them).
Understanding how technology has evolved and why products are the way they
are is the key to understanding what might happen in the future.
Develop the skills and know-how necessary to participate in an enterprise
application integration effort: motivation, vocabulary, systems, some
programming experience.
Gain sufficient awareness of the state of the art (some of the problems covered
in the course are very hard and many people have worked on them for years.
Know what has been done so far and how it can be used).
... and have fun in the process!
