Dr. N. P. Singh Professor Management Development Institute Mehrauli Road, Sukhrali Gurgaon -122001 E-mail: knpsingh@mdi.ac.in
Contents
Definitions
Definition: The process of integrating multiple applications that were independently developed, may use incompatible technology, and remain independently managed. By this definition, EAI would include:
Data Propagation
It is the set of processes by which an organization centralizes and optimizes application integration through bulk data movement. The underlying technologies are EAI technologies, and their targets are the applications themselves.
Messaging How?
IBM WebSphere MQ; Microsoft BizTalk; TIBCO; webMethods; SeeBeyond (now owned by Sun); Vitria; and others
Java Message Service (JMS)
Microsoft Message Queuing (MSMQ) and/or the messaging libraries in Microsoft .NET
Web services standards that support asynchronous Web services, using:
WS-ReliableMessaging
Sun's Java API for XML Messaging (JAXM)
Microsoft's Web Services Enhancements (WSE)
Selection of vendor
The three short-listed vendors were IBM, TIBCO and Microsoft. Several criteria were identified as critical for S-Tel's requirements, and each vendor was evaluated against them to ensure that the final selection would not be flawed by inconsistent data. At the end of this process, S-Tel chose IBM's MQSeries because it had the highest overall score.
Basic Architecture
EAI Vendors
Vendor: Product
Vitria Technology: BusinessWare
Tibco Software: ActiveEnterprise
SeeBeyond Technology: e*Xchange eBusiness Integration Suite
CrossWorlds Software: CrossWorlds
webMethods: webMethods Enterprise
IBM: WebSphere Business Integration
Iona: Orbix E2A Web Services Integration Platform
Britannia Airways estimates savings of 975,000 annually through a mobile application that enables over 2,000 crew members to access key enterprise systems before a flight, including email, flight crew rosters, health and safety information and a duty free point-of-sale application. Mobile middleware played a critical role in both solutions.
Application-level command and control
Application-level information propagation
Conduct of inter-business transactional agreements and their enforcement
EAI B2B
EAI typically deals with the integration of applications and data sources within an enterprise to solve a local problem. B2B application integration is the integration of systems between organizations to support any business requirement, such as sharing information with trading partners to support a supply chain or collaborating on a product design
The technology and approaches applied to both types of solutions are similar. For example, both may employ middleware, such as message brokers, to exchange information between various systems. Both may have similar approaches to systems integration. Which solution comes first?
Drivers EAI
Companies have addressed the integration problem by focusing on one single application provider, but success has been limited. Reasons for failure include:
Software Drivers: The inability of software providers to deliver 100% of the needs of a business organization has led organizations to combine packages with legacy applications, which has created a heterogeneous IS architecture.
Financial Drivers: 40 percent of the IT budget is spent on integration. It was $85 billion in 1998. The cost of integrating and maintaining packaged software is very high.
Drivers: Continued
Internal Drivers: The support for fast reorganization of business processes and the integration of legacy systems that were developed internally.
External Drivers: Integration of business partners such as customers and suppliers at both data and process level. This necessitates time- and cost-efficient integration.
One process is supported by one application and one database. This model avoids the problems emerging from redundant data storage and asynchronous data exchange between separated applications.
Several identical processes in different business units are supported by several identical applications that run on different computers and rely on logically separated databases. Example: Application Link Enabling (ALE) from SAP, which provides a mechanism for the coordination of master and transaction data in a physically distributed SAP environment.
Heterogeneous
Several different processes in different business units are supported by several different applications. The applications are built on divergent data models, which means that they provide different semantics for the data to be interchanged.
Integration
A common language
Example: EDI
Integration levels
Company A and Company B are matched level by level:
Pragmatics: Process-level integration (workflow etc.)
Semantics: Object-level integration (EAN, D&B etc.)
Syntax: Data-level integration (EDIFACT, ANSI etc.)
Communication: Communication services standards (ISO/OSI layers 4 to 7)
[Table: Integration Backbone. Columns: Distributed Architecture, B2B integration, Event-driven Architecture, EAI Architecture, Enterprise Integration Architecture, Integration infrastructure. Cell values are F or NF.]
EAI Systems
Traditionally known as middleware. These systems are known for data-level integration; they do not provide any functionality at the object level or for process integration. EAI systems fill this gap.
Interface Services: Connectors and adaptors ease the burden of programming by providing pre-built interfaces.
Transformation Services: These ease the development burden of encoding message formats and routing messages based on their contents.
Continued
Process Management Services: Process management services gather messages and execute multiple transformations by ensuring that the flow of information between a set of resources follows the flow defined in an established business process
Continued
Development Services: These help programmers develop and adapt adaptors for integrating custom-built systems.
Run-Time Services: These ensure performance, scalability, availability and reliability for all the applications that are integrated over an EAI system.
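The transformation services described above (reshaping message formats and routing messages based on their contents) can be illustrated with a minimal sketch. Everything named here is hypothetical and chosen only for illustration: the `Message` fields, the `billing` and `shipping` targets, and the payload keys. It is not the API of any EAI product.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Message:
    msg_type: str   # hypothetical header field that the router inspects
    payload: dict


def to_billing(msg: Message) -> dict:
    """Transformation: reshape an order into a (hypothetical) billing format."""
    return {
        "customer": msg.payload["customer_id"],
        "amount_cents": msg.payload["qty"] * msg.payload["unit_price_cents"],
    }


def to_shipping(msg: Message) -> dict:
    """Transformation: reshape the same order into a (hypothetical) shipping format."""
    return {"ship_to": msg.payload["address"], "items": msg.payload["qty"]}


class ContentBasedRouter:
    """Routes each message to every target registered for its type."""

    def __init__(self) -> None:
        self._routes: Dict[str, List[Tuple[str, Callable[[Message], dict]]]] = {}

    def register(self, msg_type: str, target: str,
                 transform: Callable[[Message], dict]) -> None:
        self._routes.setdefault(msg_type, []).append((target, transform))

    def route(self, msg: Message) -> List[Tuple[str, dict]]:
        """Return (target, transformed payload) pairs; empty if no route matches."""
        return [(target, transform(msg))
                for target, transform in self._routes.get(msg.msg_type, [])]


router = ContentBasedRouter()
router.register("order", "billing", to_billing)
router.register("order", "shipping", to_shipping)

order = Message("order", {"customer_id": "C42", "qty": 4,
                          "unit_price_cents": 250, "address": "Dock 7"})
for target, body in router.route(order):
    print(target, body)
```

The source application emits one order format, and each target receives the shape it expects. Neither side needs to know about the other, which is the burden the pre-built services are meant to remove.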
CASE STUDY
The Robert Bosch Group is an international company with 190,000 employees in 132 countries and annual revenue higher than US$25 billion.
EAI Laws
1) The whole is greater than the sum of its parts.
2) There is no end-state.
3) There are no universal standards.
4) Information adapts to meet local needs.
5) All details are relevant.
EAI Principles
Align EAI plans with business strategy
Consolidate first, integrate second
Use a process-driven approach to develop end-to-end solutions
Establish clear lines of ownership and accountability
Enforce an EAI architecture
Mandate integration requirements for new applications
Develop a common representation of data and process
Test early and often
Re-factor interfaces constantly so they never become legacy
Evolve business processes through experimentation
The first choice, whenever feasible, should be to consolidate disparate systems that perform similar functions. Try to remove duplication.
Integration dimensions can include data, process, platform, network, organization, location, employees, customers, and products among others. With so many variables, the potential solutions are infinite.
data-centric approach WWW Business Event Model is the common integration element driving all others.
Two critical aspects of ownership and accountability are 1) Program management for the initial deployment of an integrated solution, and 2) Ongoing management of the shared integration infrastructure. In addition to C-level executive (CIO, CFO, CPO, even the CEO.
Applications should be designed with the assumption that they will be part of a larger end-to-end process.
Use simulation games at the exploratory stage to predict outcomes.
Use integrator simulators.
Write software components with built-in testing features.
A stable interface is key to sustainability, but there is no stability in applications, protocols, etc.
EAI
EAI enables data propagation and business process execution across numerous distinct networked applications as if they were a single global application. It is a distributed transactional approach, and its focus is to support operational business functions such as taking an order, generating an invoice, and shipping a product.
Top vendors: product and average price of software and deployment
Companies: Vitria Technology, Tibco Software, SeeBeyond Technology, CrossWorlds Software, webMethods
Products: BusinessWare, ActiveEnterprise, webMethods Enterprise
Prices: $500,000 to $700,000; starting at $100,000; $700,000
Business Intelligence
Messaging
Adapters
1. Be viewed with prestige and in a positive way
2. Bring together all aspects of the company to do what is nearly impossible: team building
3. Analyze all controls
4. Document rules and processes in the automation
5. Document the data lineage from reports
6. Have access to and check historical data to see what the people and automated systems are catching and what they are missing
7. Effectively and proactively handle external audits to minimize penalties for infractions
Step One: Identify Critical Information End-items
Step Two: Trace Data Lineage Back to Origins
Step Three: Determine the Meaning and Validate the Quality of the Original Data
Step Four: Validate Application Processes, Business Rules and Related Controls and Verify Automation Security
Step Five: Follow Data Lineage Forward to Validate Mappings, Transformations and Data Quality
Step Six: Verify Security at Data Consumption Points
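Steps Two and Five (tracing lineage backward to origins and forward to consumption points) can be sketched as two graph traversals. This is a minimal illustration under stated assumptions: the `LINEAGE` map and every item name in it are hypothetical, and a real lineage inventory would come from ETL and EAI metadata rather than a hand-written dictionary.

```python
from typing import Dict, List, Set

# Hypothetical lineage map: each data item lists the items it is derived from.
LINEAGE: Dict[str, List[str]] = {
    "management_report": ["data_mart"],
    "data_mart": ["etl_orders", "etl_edi"],
    "etl_orders": ["order_entry_db"],
    "etl_edi": ["edi_messages"],
    "order_entry_db": [],
    "edi_messages": [],
}


def trace_origins(item: str, lineage: Dict[str, List[str]]) -> Set[str]:
    """Step Two: walk upstream until reaching items with no sources."""
    origins: Set[str] = set()
    seen: Set[str] = set()
    stack = [item]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        upstream = lineage.get(node, [])
        if upstream:
            stack.extend(upstream)
        else:
            origins.add(node)
    return origins


def trace_consumers(origin: str, lineage: Dict[str, List[str]]) -> Set[str]:
    """Step Five: walk downstream to every item derived from the origin."""
    downstream: Dict[str, List[str]] = {}
    for item, sources in lineage.items():
        for source in sources:
            downstream.setdefault(source, []).append(item)
    reached: Set[str] = set()
    stack = [origin]
    while stack:
        node = stack.pop()
        for child in downstream.get(node, []):
            if child not in reached:
                reached.add(child)
                stack.append(child)
    return reached


print(sorted(trace_origins("management_report", LINEAGE)))
print(sorted(trace_consumers("edi_messages", LINEAGE)))
```

The backward walk answers "where did this report figure come from?", and the forward walk lists every mapping and consumption point that needs validating when an origin changes.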
[Figure series: data flows among Data Marts, Management Reports, Website, EAI, EDI messages, and Green Screen systems via ETL and Query links, with Authorized and Unauthorized labels in the final panel.]
Each distribution center or warehouse services hundreds of stores (>1200 total stores).
Each distribution center is moving thousands of cartons (i.e. boxes) around the warehouse each day:
Receiving them from trucks through dock doors.
Moving them with forklifts to storage areas in the warehouse.
Conveying them to break-down areas for distribution to stores.
Conveying them down belts to storage areas or outbound trucks.
Moving them onto trucks that depart the warehouse.
Some reads are manual and some are automated. Generating literally hundreds of events per second per warehouse. More reads from more points in the warehouse. Potentially adding store reads to the event list.
The System
Part I The retail chain wanted all the data on events regarding the movement of cartons sent to HQ
Providing them with unparalleled real time information on inventory levels and product status. Providing more accurate information for merchandise analyst and productivity monitoring for warehouse managers.
Part II (Not germane to the discussion today) Providing a Java Web application to nearly 10,000 users to access the data company wide.
Averaging 400 messages a second incoming at HQ.
Peak around 1,300 messages a second incoming at HQ.
Data around an event: ~200 bytes/msg.
24x7x52 (31,449,600 seconds a year, for those not counting) = ~4-7 GB a day.
During Christmas time things were worse, much worse.
The organization wants to double its current size by 2010!
Oh yeah, did I mention RFID was coming?
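The volume figures above can be checked with a few lines of arithmetic. This sketch uses only the slide's numbers (400 and 1,300 messages per second, about 200 bytes per message). It assumes decimal gigabytes (10^9 bytes) and ignores envelope or protocol overhead, so real traffic would be larger.

```python
BYTES_PER_MESSAGE = 200            # from the slide: ~200 bytes/msg
SECONDS_PER_DAY = 24 * 60 * 60     # 86,400
SECONDS_PER_YEAR = 24 * 7 * 52 * 3600  # the slide's 24x7x52 figure


def daily_volume_gb(messages_per_second: int,
                    bytes_per_message: int = BYTES_PER_MESSAGE) -> float:
    """Raw payload volume per day in decimal gigabytes (10**9 bytes)."""
    return messages_per_second * bytes_per_message * SECONDS_PER_DAY / 1e9


average_gb = daily_volume_gb(400)    # sustained average rate
peak_gb = daily_volume_gb(1300)      # if the peak rate lasted a whole day

print(f"seconds per year: {SECONDS_PER_YEAR:,}")
print(f"average: {average_gb:.3f} GB/day")
print(f"sustained peak: {peak_gb:.3f} GB/day")
```

At the average rate this gives about 6.9 GB a day, the upper end of the slide's 4-7 GB range. The peak rate sustained for a full day would come to roughly 22.5 GB, which gives a sense of how the Christmas load scales.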
Challenge
Design and implement a system to get the data from the warehouses to HQ
In near real time, to support the reporting needs.
Use whatever makes sense (to some degree; more later).
With a good-size team (20-25 people in various roles).
The Solution
Put SeeBeyond at all the endpoints (warehouses and HQ). All data would move through SeeBeyond. SeeBeyond is Java based (also a company technology direction). Write routing/minor processing code in Java in SeeBeyond.
Oracle was already at the warehouses. Obtain a honking big Oracle DB at HQ. Use Oracle stored procedures for heavy lifting (data processing, report data preparation).
Solution Diagram
Example: a "move this carton there" message arrives, but have we gotten the "receive carton" message yet? We have received a carton, but do we have the reference data for the product yet?
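One common way to cope with this kind of ordering problem is to park a message until its prerequisite has arrived. The sketch below illustrates that idea only; it is not the design used in the project described here. The event fields (`type`, `carton_id`) and the class name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


class CartonEventProcessor:
    """Parks events for a carton until its 'receive' event has been seen."""

    def __init__(self) -> None:
        self.received: Set[str] = set()
        self.parked: Dict[str, List[dict]] = defaultdict(list)
        self.processed: List[Tuple[str, str]] = []   # (event type, carton id)

    def handle(self, event: dict) -> None:
        carton = event["carton_id"]
        if event["type"] == "receive":
            self.received.add(carton)
            self.processed.append(("receive", carton))
            # Release anything that arrived early, in arrival order.
            for early in self.parked.pop(carton, []):
                self.processed.append((early["type"], carton))
        elif carton in self.received:
            self.processed.append((event["type"], carton))
        else:
            # Prerequisite not seen yet: hold the event instead of failing.
            self.parked[carton].append(event)


processor = CartonEventProcessor()
processor.handle({"type": "move", "carton_id": "C1"})      # arrives too early
processor.handle({"type": "receive", "carton_id": "C1"})   # prerequisite
print(processor.processed)
```

A production version would also need a timeout for parked events, so that a "receive" message that never arrives does not hold its dependents forever.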
As an architect, I was not aware how different an EAI messaging system is.
Asynchronous-everywhere nature.
Had no patterns to follow (no, I had not read the Hohpe/Woolf EAI book).
Did not have an awareness of the vendor landscape.
Was easily talked into solutions by others.
Had only implemented smaller EAI solutions.
Internally: lots of support but no experience.
Contractors: lots of desire, but little implementation experience at this scale/level of effort.
Understand your options: all the three-letter Es (EAI, ETL, EII, EDR, etc.).
Read EAI patterns.
Know the products (WBI, Vitria, Tibco, WebMethods, SeeBeyond, etc.).
Experienced with systems matching the size of your app.
Find people with product expertise.
Find people with design/pattern expertise.
EAI Patterns
Enterprise Integration Patterns: Hohpe/Woolf
Next Generation Application Integration: Linthicum
IT Architectures and Middleware: Britton
EAI Patterns
As the GoF (Gang of Four) pointed out for software in general, there are common behaviors in software systems.
They are powerful tools for communicating behavior. They represent naturally occurring processes. They are generally repetitive in nature and lend themselves to reuse.
Each of the message components also has several patterns that represent common behaviors in a messaging system and encourage reuse.
You are going to be using a significant amount of pipe. Have you considered failover/load balancing? (comm lines around warehouses get cut on occasion)
Terabytes of data to be stored and processed: where will it go? Consider backup/recovery systems. Database logs/archiving.
Support staffs will be lost at turnover How many of your support shops really know
Can you expect them to be able to operate, maintain and support component based messaging systems?
Do they know what a message server or bus is? Across a very distributed environment? Have them help you design the monitoring tools and alert systems. Work together to develop proactive systems checks and troubleshooting procedures.
For example, finding experienced testers for asynchronous messaging systems is difficult.
They usually need intricate knowledge of the messaging subsystem monitors and admin capabilities.
EAI products/solutions are many. EAI standards are few. The EAI/ETL/EII marketplace is tumultuous:
Sun has purchased SeeBeyond.
IBM bought Ascential.
Everyone is calling their product an ESB (example on next page).
Some they know about Others they do not
There is a reason MQ has been around a long time.
Where possible, consider tried, true and already deployed platforms.
But again, do the math and see if they can support the extra load.
In-house support is probably better equipped (more in a bit).
Consider multiple/alternate technologies for parts of your solution.
ETL is great for certain parts of a large solution.
Examine features in your DB/app servers.
There is a reason why products like Oracle are expensive (technologies like Oracle Replication; more in a bit).
How about those Message Beans in the app server?
Using multiple technologies does, however, create more issues of timing.
Reference Data
In many applications, you need reference data on both ends of the messaging systems.
You can build a replicating message engine to treat this like other message data (not recommended). Referential integrity becomes a real problem. Consider issues of message timing (PR becomes the 51st state but messages with PR references start to arrive before the new state data does)
ETL tools - if reference data changes only happen at certain times. Technologies like Oracle Replication for real time (it can operate over a WAN).
Interoperability
We used Java, but even when you use Java, how is it being applied?
Java running inside of proprietary components (like SeeBeyond eWays) does not make you portable.
Write component code that can be used by or incorporated by proprietary systems. Under the covers, is the vendor using
Let the bus focus on delivering the goods.
Scalability problems
Monitoring problems
Possibly interoperability problems (especially when using proprietary technology/components)
Flexible, easy to get at (and change), interoperable (if possible), and containing reusable business logic (if possible)
We didn't do enough math up front. We didn't plan for failure/growth. The messages moved slower than anticipated. The message processing took more time than expected. The amount of data was larger than expected.
Work with the business analysts to figure out how many messages need to be moved.
Make volume estimates part of the non-functional requirements gathering process. Check that against the existing databases if possible.
Design the messages and calculate the size of the overall message (XML and all). Calculate the rate and add up the total volume.
Can the messaging system handle that (on both ends)? Can the consuming database handle that? Can the hardware and network handle that? What happens if something/anything goes down for an hour? What happens if you go down for a day? What happens if you have unexpected growth?
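The outage questions above can be turned into numbers before any hardware is bought. This sketch uses the 400 messages per second and 200 bytes per message quoted earlier in the deck. The processing capacity of 1,300 messages per second used in the drain-time example is purely an assumption for illustration.

```python
def backlog_after_outage(rate_per_s: int, outage_s: int,
                         bytes_per_msg: int = 200) -> tuple:
    """Messages and bytes that pile up while a consumer is down."""
    messages = rate_per_s * outage_s
    return messages, messages * bytes_per_msg


def drain_time_s(backlog_msgs: int, incoming_rate: int,
                 processing_rate: int) -> float:
    """Seconds needed to clear a backlog while new messages keep arriving."""
    spare = processing_rate - incoming_rate
    if spare <= 0:
        raise ValueError("no spare capacity: the backlog will never drain")
    return backlog_msgs / spare


hour_msgs, hour_bytes = backlog_after_outage(400, 60 * 60)
day_msgs, day_bytes = backlog_after_outage(400, 24 * 60 * 60)

print(f"1 hour down: {hour_msgs:,} messages, {hour_bytes / 1e6:.0f} MB queued")
print(f"1 day down:  {day_msgs:,} messages, {day_bytes / 1e9:.3f} GB queued")
# Assumed capacity of 1,300 msg/s against 400 msg/s still arriving.
print(f"drain after 1 hour: {drain_time_s(hour_msgs, 400, 1300):.0f} s")
```

The `ValueError` branch is the important case: if processing capacity is no higher than the arrival rate, a backlog from even a short outage never clears.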
Anticipate failure
Versus Web application
Unplanned system issues
Planned outages
Load balancing and failover were both afterthoughts.
Load balancing and failover must be accommodated.
Like security, you need a multi-layered approach:
Hardware (like BIG-IP)
Redundant message bus/message servers
Processing components
Database
EAI system throttling
How are you going to kick over to the failover systems (and return to regular systems)?
Throttling
Throttling limits ("throttles") the number of requests a component will respond to within a specified period of time.
It is used in messaging systems to ensure that no one part of the system is driven beyond its capacity or beyond the point at which it performs efficiently.
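A minimal sketch of such a throttle, assuming a sliding-window policy (one of several possible policies; token buckets are another). The class name and parameters are illustrative and do not correspond to any product's API. The clock is injectable so the behavior can be checked without sleeping.

```python
import time
from collections import deque
from typing import Callable, Deque


class Throttle:
    """Allows at most max_requests within any window of period_s seconds."""

    def __init__(self, max_requests: int, period_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_requests = max_requests
        self._period_s = period_s
        self._clock = clock
        self._stamps: Deque[float] = deque()   # times of accepted requests

    def allow(self) -> bool:
        now = self._clock()
        # Forget accepted requests that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self._period_s:
            self._stamps.popleft()
        if len(self._stamps) < self._max_requests:
            self._stamps.append(now)
            return True
        return False   # caller should queue, delay, or reject the request


fake_now = [0.0]                       # controllable clock for the demo
throttle = Throttle(max_requests=2, period_s=1.0, clock=lambda: fake_now[0])
first_three = [throttle.allow() for _ in range(3)]
fake_now[0] = 1.0                      # one full period later
after_wait = throttle.allow()
print(first_three, after_wait)
```

In a messaging system the rejected branch usually means "leave the message on the queue" rather than "drop it", so the throttle shifts load in time instead of losing data.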
Throttle points. Potentially lots of messages, especially if the WAN goes down.
[Figure: message flow from 25 distribution systems (message poll and decoration, queue, message bundler) across the WAN, through a congestion point, to the receiving queue, Validation & Request Processing (VRP), DB ready queues and the DB, with a dead letter queue and retry logic for problem messages.]
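The retry logic and dead letter queue named in the diagram can be sketched as follows. This is a generic illustration of the pattern, not the project's implementation. The function name, the `max_attempts` default, and the in-memory queues are assumptions; a real system would use the messaging product's own queues.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


def process_with_retry(messages: List[str],
                       handler: Callable[[str], None],
                       max_attempts: int = 3) -> Tuple[List[str], List[str]]:
    """Process messages; retry failures, then dead-letter them.

    Returns (processed, dead_letters).
    """
    queue: Deque[Tuple[str, int]] = deque((m, 1) for m in messages)
    processed: List[str] = []
    dead_letters: List[str] = []
    while queue:
        msg, attempt = queue.popleft()
        try:
            handler(msg)
            processed.append(msg)
        except Exception:
            if attempt >= max_attempts:
                dead_letters.append(msg)          # give up; keep for inspection
            else:
                queue.append((msg, attempt + 1))  # retry behind other messages
    return processed, dead_letters


calls = {}


def demo_handler(msg: str) -> None:
    """Hypothetical handler: 'bad' always fails, 'flaky' fails once."""
    calls[msg] = calls.get(msg, 0) + 1
    if msg == "bad" or (msg == "flaky" and calls[msg] == 1):
        raise RuntimeError(f"cannot process {msg}")


done, dead = process_with_retry(["ok", "flaky", "bad"], demo_handler)
print(done, dead)
```

Requeueing a failed message behind the others keeps one problem message from blocking the flow, and the dead letter list preserves it for later troubleshooting instead of silently dropping it.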
A place for queued messages to sit if something goes down.
Space in the DB or space in the message channels, or both.
Consider the time lags for getting additional hardware bought, installed, and up and running.
You are going to want to keep log files around for a while. Some problems take time to manifest to a point of awareness.
Devise an automated archive/clean-up for logs. No, not all EAI systems provide log clean-up utilities.
[Figure: application landscape linking Order Entry, Financials, Logistics, an E-Commerce Portal, a Bank, a Parcel Service and the Customer through A2A, B2B and B2C integration.]
Just the old concept of timeliness with an additional twist.
Refers to the real-time (zero) or near-real-time (low) transfer of information between applications.
Zero data latency tends to imply transactional solutions, while low data latency or near-real time allows for assured delivery (messaging) options.
Low data latency within the enterprise is no longer enough; it must be enabled across the enterprise perimeter.
Adding EAI-oriented solutions to the problem of data availability does not reduce the modeling effort required; it intensifies it.
Each application's view of the data must be defined and understood.
Additionally, the dependencies involved in sharing and transferring information must be supported.
Manual or straight-through flows
Batch or immediate, individual transfers
One business process
Multiple steps
One-way, asynchronous interactions (loosely coupled)
Systems are physically and logically independent
Potential long transactions

Batch or immediate transfers
Multiple processes
Multiple steps
One-way, asynchronous interactions (loosely coupled)
Systems are physically and logically independent

Immediate interaction
One business process
One step
Two-way, synchronous interactions (tightly coupled)
Systems are physically and logically dependent
Usually a client-server (request-reply) interaction