Sunteți pe pagina 1din 46

Deploying ODI 11g in the Enterprise

Mark Rittman, Technical Director, Rittman Mead BIWA Summit 2013, San Francisco, January 2013
T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

About the Speaker

Mark Rittman, Co-Founder of Rittman Mead Oracle ACE Director, specialising in Oracle BI&DW 14 Years Experience with Oracle Technology Regular columnist for Oracle Magazine Author of two Oracle Press Oracle BI books Oracle Business Intelligence Developers Guide Oracle Exalytics Revealed Writer for Rittman Mead Blog : http://www.rittmanmead.com/blog Email : mark.rittman@rittmanmead.com Twitter : @markrittman

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

About Rittman Mead

Oracle BI and DW platinum partner World leading specialist partner for technical excellence, solutions delivery and innovation in Oracle BI Approximately 50 consultants worldwide All expert in Oracle BI and DW Offices in US (Atlanta), Europe, Australia and India Skills in broad range of supporting Oracle tools:
OBIEE OBIA ODIEE Essbase, Oracle OLAP GoldenGate Exadata Endeca

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Oracle Data Integrator 11g

Oracles strategic data integration tool, originally came via the Sunopsis acquisition (2006) Java architecture, part of the wider Oracle Data Integration Suite, and Oracle Fusion Middleware Heterogenous database and source/target support Long-term successor to OWB, most common ETL tool on new projects now Commonly used alongside OBIEE, Essbase
and Oracle RDBMS for BI/DW projects

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Oracle Data Integrator 11g Key Features

Same philosophy as OWB and Oracle RDBMS DB as the ETL engine Declarative design - separates logic from implementation
Business rules define what goes where, and using which transformation rules Technical implementation defines how data is moved Built for SOA environments Support for Web Services, EII etc Supports batch, event-based and real-time integration Extensible through Knowledge Modules Change Data Capture Slowly Changing Dimensions Bulk load Java client application with server elements (agents)

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Part of the Wider Oracle Data Integration Suite

Oracle Data Integrator for large-scale data integration across heterogenous sources and targets Oracle GoldenGate for heterogeneous data replication and changed data capture Oracle Enterprise Data Quality for data profiling and cleansing Oracle Data Services Integrator
for SOA message-based data federation

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Part of Oracle Fusion Middleware 11g

Oracle complete set of middleware servers and technologies Based around Java, SOA, Oracle WebLogic Server and non-Java technologies Foundation for Oracles applications and platforms such as Oracle Fusion Applications

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Deploying ODI within an Enterprise

As ODI becomes more mainstream, and data integration more mission-critical, ODI needed to evolve Data warehousing and BI projects dont just access (Oracle) relational sources and targets any more Data quality requires more thought than just ad-hoc corrections and filtering ODI needs to participate in modern software development techniques such as continuous integration Its no longer acceptable for ODI jobs to fail, and be unavailable all day or weekend The stakes are raised - can ODI deliver?

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Loading More than a Data Warehouse, Accessing More than Oracle RDBMS

Most of us know ODI through its ability to load Oracle data warehouses Data typically sourced from Oracle databases, files, maybe the odd non-Oracle RBDMS source Enterprises now work with many and varied data sources and applications, such as
Multidimensional servers such as Oracle Essbase, and associated EPM apps XML sources, and JMS queues SOA environments, using messaging and service buses, typically in real-time More recently - big data sources such as Hadoop clusters, NoSQL databases

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Working with Essbase data, and Hyperion Planning

ODI11g is the strategic, long-term DI tool for Essbase and associated EPM applications IKMs and LKMs for loading, and extracting from, Essbase databases and EPM metadata stores Data models for Essbase databases represented as tables, columns, the same as with other data sourcres Data loads via rules files, Essbase / Planning / HFM APIs However ... not really Essbase native, learning curve for admins sources of ODI + Essbase/EPM Suite information: Good http://john-goodwin.blogspot.co.uk
Cameron Lackpour OOW2012 Presentation Slay the Bad Data in Essbase with ODI http://tinyurl.com/lackpour-odi Rittman Mead Blog

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Support for SOA Environments, and Messaging

ODI has technology adapters and features for many SOA, queue and messaging-type technologies JMS Queue, JMS Topic (plain message or XML), SOAP messages via Web Services etc Main role for ODI in SOA enviroments is bulk-data movement, invoked by web service calls Regular inter-service messaging for low volume, switching to ODI for high-volume Web services provided by runtime agents
Start, monitor, stop and restart scenarios Start, monitor, stop and restart load pans Public introspection web service List contexts List scenarios Requires deployment in Java EE container Call from BPEL or any other standard process

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Example: ODI 11g for Bulk Data Handling in an Orders Process

1. Large file arrives, detected by BPEL file 2. Execution starts (BPEL / ESB) - and a step for transforming a large document payload occurs 3. Pass XML payload, by reference, to ODI 4. ODI loads payload

5. 6. 7. 8.

ODI transforms payload ODI sends payload wherever instructed ODI notifies BPEL / ESB that job has completed Core BPEL / ESB processing completes

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Oracle Data Services Integrator - A Data Federation Alternative to ODI in SOA

ODI is essentially a batch-orientated DI tool, though batches can be micro-batches (and event-driven) ODI moves and transforms data, loading it into a central, integrated location In some cases though, you may wish to take a different approach
Data federation vs. integration - read and transform data in-place Data read and integrated on-demand, as a service Approach could be preferable for many reasons Security rules dont allow data to be replicated Consumer Development is dynamic, sources frequently added or changed ODSI Data volumes dont warrant a full ETL solution Data format is inherently nested and does not easily map onto relational model
Source A Source B Consumer Consumer ODSI
Consolidated Hub Consolidated Hub

ODI/OGG

ODI/OGG Source A Source B Source C

Source A

Source B

Federation only

Consolidation only

Hybrid Consolidation + Federation

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

ODI, ODSI, Golden Gate and OEDQ in a SOA Environment

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Big Data, Hadoop and Unstructured Data Sources

Big data is the hot topic in BI, DW and Analytics circles The ability to harness vast datasets, at a highly-granular level, by harnessing massively-parallel computing Crunching loosely-structured and modelled datasets using simple algorithms: Map (project) + Reduce (agg) Largely based around open-source projects, non-relational technologies
Apache Hadoop MapReduce Hadoop Distributed File System Apache Hive, Sqoop, HBase etc Emerging commercial vendors Cloudera Hortonworks etc Can be used standalone, or linked to an enterprise DW/BI architecture

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

ODI as Part of Oracles Big Data Strategy

ODI is the data integration tool for extracting data from Hadoop/MapReduce, and loading into Oracle Big Data Appliance, Oracle Exadata and Oracle Exalytics Oracle Application Adaptor for Hadoop provides required data adapters
Load data into Hadoop from local filesystem, or HDFS (Hadoop clustered FS) Read data from Hadoop/MapReduce using Apache Hive (JDBC) and HiveQL, load into Oracle RDBMS using Oracle Loader for Hadoop Supported by Oracles Engineered Systems Exadata Exalytics Big Data Appliance (w/Cloudera Hadoop Distrib)

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

How ODI Accesses Hadoop and MapReduce

ODI accesses data in Hadoop clusters through Apache Hive


Metadata and query layer over MapReduce Provides SQL-like language (HiveQL) and a metadata store (data dictionary) Provides a means to define tables, into which file data is loaded, and then queried via MapReduce Accessed via Hive JDBC driver (separate Hadoop install required on ODI server, for client libs) Additional access through Oracle Direct Connector for HDFS and Oracle Loader for Hadoop
Hadoop Cluster
MapReduce

Hive Server
HiveQL

Oracle RDBMS ODI 11g


Direct-path loads using Oracle Loader for Hadoop, transformation logic in MapReduce

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Running a MapReduce / Hive Job in ODI

Data is extracted and loaded using regular interfaces LKMs and IKMs generate HiveQL queries Functionally identical to RDBMS access/loading

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Oracle Loader for Hadoop

Oracle technology for accessing Hadoop data, and loading it into an Oracle database Pushes data transformation, heavy lifting to the Hadoop cluster, using MapReduce Direct-path loads into Oracle Database, partitioned and non-partitioned Online and offline loads Key technology for fast load of
Hadoop results into Oracle DB

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Profiling Data, and Managing Data Quality Issues

ODI has built-in capabilities for defining data rules, data firewalls Static controls, Flow controls, constraints etc But what if you dont know what issues your data actually has? What if you need to profile, deduplicate, merge or otherwise manage your data? This is almost a topic in itself...

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Oracle Enterprise Data Quality

Data profiling, auditing and cleansing based on the industry-leading Datanomic platform Integration with Oracle Data Integrator for a complex data management solution

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Oracle EDQ Features Relevant to ODI 11g

Ability to profile data from many sources (file, RDBMS, JDNI, XML, MS Office) Create data quality cases, track and assign to owner Cleanse, transform, parse and match incoming data via a palette of operators (processors) Batch or real-time operation All-Java architecture, thin-client and runs in WebLogic Server Replaces previous Trillium-based OEM solution
(but extra-cost option, as was Trillium solution)

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

ODI 11g Integration with Oracle EDQ

Limited integration at present, but Datanomic only just acquired Can run in same WLS domain, environment EDQ result schema can be on same DB as ODI staging area EDQ processes can be executed from ODI package or load plan
using EDQ Open Tool

Connection details to EDQ Server, and details of job

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Participation in Large-Scale Enterprise Projects, and DevOps

As ODI and data integration becomes more integral to enterprises, expectations rise ODI project elements, and executable code, needs to go into source control Build systems need to be able to include ODI functionality in their releases Development Operations (DevOps) systems need to be able to spin-up ODI environments automatically Ideas such as continuous integration and smoke testing can also apply to ODI projects ODI topologies need to be flexible enough to
deal with DEV/PROD network & responsibility separations

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Typical ODI Repository Topology : DEV, TEST and PROD

Typical enterprise customers deploy all non-PROD environments on their own network, isolated from the main production systems This stops you having a single master repository for all ODI work repositories Good practice is to have all non-DEV environments use
execution work repositories Only allows load plans and scenarios to be imported Can only run existing code, not alter or change code Challenge is how you deploy code without DEV assistance Requires command-line tools Requires scripting Requires an API?

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Accessing ODI 11g Admin Features from the Command-Line

ODIs admin functions are available through ODI Tools

Run from the command-line, from an ODI procedure, or other methods Scriptable using the startcmd.bat|sh utility Run from the agent home directory, connects to master and work repositories The key to automating the deployment and administration of ODI projects and environments

cd c:\oracle\product\11.1.1\Oracle_ODI_2\oracledi\agent\bin startcmd.bat OdiImportObject -FILE_NAME=c:\Test_Build_Files\ SCEN_LOAD_PROD_DIM_Version_001.xml -WORK_REP_NAME=PROD_EXECREP -IMPORT_MODE=INSERT_UPDATE

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Continuous Integration and Smoke Testing using ODI

For complex, multi-developer projects, continuous integration is a good practice Continously taking shipped code and testing it in a smoke test environment
Identifies changes that break the build early Use a suite of regression tests that run the code with optimal coverage, end-to-end ETL runs Gives you confidence that a release shipped into test will actually compile, deploy and pass functional tests Enables more agile development, through having a robust build and regression testing process that welcomes change

Security Topology Versioning

DEV/TEST Master Repository

Models Projects Execution Execution Execution

DEV Development Work Repository

CI / SMOKE TEST Execution Work Repository

TEST Execution Work Repository

Regression Test #1 Regression Test #2 Regression Test #n

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Using Jenkins and OdiImportObject Tool for Continuous Integration

Jenkins is an open-source build automation and continuous integration tool Supports a range of build tools including Ant, Maven, Subversion, Git etc Use to detect new ODI export files in a given directory, and then
automatically deploy them to the CI / Smoke-Test environment Or monitor a source-control system for new check-ins Deploy ODI code through ODI Tools (OdiImportScen, OdiImportObject)
Security Topology

Jenkins CI Server with scheduler

Versioning

PROD Master Repository

startcmd.bat OdiImportObject -FILE_NAME = %1.xml ...

Execution

PROD Execution Work Repository

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Steps to Set up a Continuous Integration Environment using Jenkins

Download Jenkins from http://jenkins-ci.org Set up a new build job, optionally integrate with SVN etc Run ODI tools through Execute a Batch File function Or take it further using Maven, Ant etc Run the build process manually, to a schedule, or on check-in of new code to the source control system Report on stability of build, see last failure, reason for fail

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

The ODI SDK

For other automation tasks, the ODI SDK can be used to perform all functions available in ODI Studio Java-based API analogous to OMB+ within Warehouse Builder Script the creation of repositories & interfaces, updating of models, registering of data sources and topologies etc Used either within Java applications (compiled),
or interpreted using Groovy (editor now shipped with ODI)
import oracle.odi.domain.project.OdiProject; import oracle.odi.core.persistence.transaction.support. DefaultTransactionDefinition; txnDef = new DefaultTransactionDefinition(); tm = odiInstance.getTransactionManager() txnStatus = tm.getTransaction(txnDef) project = new OdiProject("Project For Demo", "PROJECT_DEMO") odiInstance.getTransactionalEntityManager().persist(project) tm.commit(txnStatus)

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Making ODI ETL Processes Resilient and Highly-Available

ODI routines when deployed in the enterprise, need to be resilient, fail gracefully, be restartable They are often considered mission critical You need to code defensively, and anticipate #fail
.. or this. Make your ETL routines like this...

Not like this...

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Why do ODI ETL and Data Integration Jobs Fail?

ODI ETL processes typically fail for one of two main reasons

Reason #1 : An error in your code, unexpected data, run out of disk space etc - the process fails Reason #2 : An agent crashes, ODI repositories goes down etc - the infrastructure fails Most modern databases (Oracle 11g+ etc) have capabilities to recover from DB process issues Can we make use of these within ODI packages, KMs etc?

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Enabling ETL Resumption : Resumable Space Allocation and ODI

Oracle Database 9i+ has provided resumable space allocation

When enabled, suspends INSERT operations when out of disk space, rather than fail load Datafiles can then be extended, or new ones added Can be incorporated into ODI KM to enable more load operations to complete

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Resumable Space Allocation in Action

Insert process becomes suspended, ODI Operator shows step as still running Suspended operation can be detected using DBA_RESUMABLE, USER_RESUMABLE Once more disk space added, step will resume, operation can complete

select name from dba_resumable; NAME ------------------------------LOAD Sales Fact Table alter database datafile 'sales_ts.dbf' resize 200m; database datafile 'SALES_TS.DBF' altered.

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Making ETL and Data Integration Processes Restartable

Recoverability is another enterprise ETL requirement - graceful failure and ability to restart process Can be as simple as re-running the job, but some failures may be catastrophic - how to you unwind? Oracle RDBMS has several flashback technologies that can help
Flashback database, to a given SCN or restore point Flashback table, etc Example : An ETL process performs an UPDATE, then and INSERT - if the INSERT fails, the UPDATE stays present. Can we use FLASHBACK TABLE to restore the table back to original state, so the process can be restarted safely?

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Using Load Plans to Aid Restartability

Alternative to packages for sequencing interfaces and other steps Helps organize an optimal execution schedule for a batch Advanced sequencing capabilities
Parallel or Serial, Conditional branching Exception handling Complements Scenarios and Packages, does not replace them Exception handling feature could be very useful in restart / graceful failure scenarios Run ODI procedure, package, to correct errors Run commands to roll-back/flashback the database or tables Lets use one for our example...

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Defining a Load Plan Exception to Handle Catastrophic ETL Failures : I

Flashback table requires an SCN (System Change Number) to flashback-to Record the current SCN before performing thre integration in a project variable
Requires SELECT privilege on V$_DATABASE

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Defining a Load Plan Exception to Handle Catastrophic ETL Failures : II

Load plan will define an exception, to be raised if the final INSERT operation fails Exception will call an ODI Procedure that runs the FLASHBACK TABLE command, using the saved SCN

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Defining a Load Plan Exception to Handle Catastrophic ETL Failures : III

Now, when the INSERT step fails due to an error, the UPDATE
is rolled-back as well through the FLASHBACK TABLE feature Table restored to state at original recorded SCN

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Agent and ODI Infrastructure Failure

Enterprises typically deploy ODI using standalone agents, in a parent/child load-balancing configuration Repository database has regular backups, or ideally uses DataGuard / log-shipping Scheduled jobs assigned to the parent, master runtime agent Jobs then delegated to the child agents, that then do the work based on load factor, availability But what if the parent agent goes down?
What about the schedule?

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Using OPMN To Manage, and Restart, Standalone Agents

OPMN (Oracle Process Manager and Notification Server) can be installed to manage standalone agents Not part of the base install or license, but you probably have it somewhere Standalone agents then run, stopped, restarted and monitored using OPMN server Ensures that failed agents are restarted, including
the parent agent for load balancing

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Deploying Agents within WebLogic Server - New with ODI 11g

Runtime agents can now be deployed in WebLogic Server managed servers (requires WebLogic Server license) Benefit from WebLogic clustering, Enterprise Manager (+ODI Console), more resilient JVM Better for high-availability protects the scheduler how?

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

How JEE Agents, WebLogic and Coherence Protect Against Agent Failure

Hardware load balancer provides the load-balancing Agents are all equal - one elects to be the scheduler on cluster start, another takes over if that one crashes Oracle Coherence cache grid holds details of the schedule, available to all nodes in the cluster WebLogic Server clustering restarts failed managed servers, and Java processes (JEE runtime agents) However ... more complex setup, extra license cost, and

may not be necessary if external scheduler used instead Still benefits from running agents in production JVM though And you get Enterprise Manager, ODI Console etc

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Further Reading - ODI11g in the Enterprise series on Rittman Mead Blog

Five-part series on the Rittman Mead Blog: ODI 11g in the Enterprise
Part 1: Beyond Data Warehouse Table Loading Part 2 : Data Integration using Essbase, Messaging, and Big Data Sources and Targets Part 3: Data Quality and Data Profiling using Oracle EDQ Part 4: Build Automation and Devops using the ODI SDK, Groovy and ODI Tools Part 5: ETL Resilience and High-Availability

http://www.rittmanmead.com/2012/12/

odi11g-in-the-enterprise-part-1-beyond-datawarehouse-table-loading/

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Thank You for Attending!


Thank you for attending this presentation, and more information can be found at http://www.rittmanmead.com Contact us at info@rittmanmead.com or mark.rittman@rittmanmead.com Look out for our book, Oracle Business Intelligence Developers Guide out now! Follow-us on Twitter (@rittmanmead) or Facebook (facebook.com/rittmanmead)

T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

Deploying OBIEE 11g in the Enterprise


Mark Rittman, Technical Director, Rittman Mead UKOUG Conference & Exhibition, Birmingham December 2012
T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com

Saturday, 5 January 13

S-ar putea să vă placă și