
Oracle BI EE Implementation on Netezza

Prepared by SureShot Strategies, Inc.

The goal of this paper is to give insight into the Netezza architecture and our implementation experience, to help strategize an Oracle BI EE reporting deployment. This paper does not cover the ETL side of Netezza, that is, data load, update, and delete performance. Contact SureShot Strategies for further detail and to learn more about our implementation of the OBIEE-Netezza technical architecture, performance benchmarking, and deployment methodology.

Introduction
Netezza uses a data warehouse appliance architecture and is preconfigured in sizes ranging from 1.0 TB to 100 TB. It combines an SMP node with up to 896 single-CPU SPUs (snippet processing units) configured in an MPP arrangement; the overall architecture is referred to as AMPP, for asymmetric massively parallel processing. The SPUs are connected by Gigabit Ethernet, which serves as the interconnect. There are 112 SPUs in a rack, and each fully populated rack contains 4.5 TB. The DBMS is a derivative of Postgres, the open source DBMS, but has been significantly altered to take advantage of the performance of the architecture. Inside are Hitachi drives, 2- or 4-way HP/Intel host CPUs, and the Red Hat Linux operating system.

The architecture is shared-nothing. The I/O module is placed adjacent to the CPU, and the disk is directly attached to the SPU processing module. More importantly, logic is added to the CPU with a Field Programmable Gate Array (FPGA) that performs record selection and projection, processes that other systems usually perform much later in the query cycle. The FPGA and CPU are physically connected to the disk drive. The SMP host performs the final aggregation and any merge sort required.

All tables are striped across all SPUs and no indexes are necessary; indexes are one of the traditional options that Netezza does not provide. All queries are highly parallel table scans. Netezza does provide highly automated zone map and materialized view functionality for fast processing of short and/or tactical queries.
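The division of labor described above can be illustrated with a small simulation. This is a minimal sketch, not Netezza code: the sample rows, the function names (`stripe`, `spu_scan`, `host_aggregate`) and the choice of four SPUs are all assumptions made for illustration. Each simulated SPU scans only its own slice and applies selection and projection, and the host then performs the final aggregation.

```python
from collections import defaultdict

NUM_SPUS = 4  # hypothetical count for illustration; real systems have far more


def stripe(rows, num_spus):
    """Spread rows across SPUs so that each SPU owns one slice of the table."""
    slices = [[] for _ in range(num_spus)]
    for i, row in enumerate(rows):
        slices[i % num_spus].append(row)
    return slices


def spu_scan(slice_rows, predicate, columns):
    """SPU-side work: scan the local slice, keep matching rows (selection)
    and return only the requested columns (projection)."""
    return [tuple(row[c] for c in columns) for row in slice_rows if predicate(row)]


def host_aggregate(partials):
    """Host-side work: merge the per-SPU results and compute the final sums."""
    totals = defaultdict(int)
    for partial in partials:
        for key, value in partial:
            totals[key] += value
    return dict(totals)


# Hypothetical fact rows, invented for this example.
sales = [
    {"region": "east", "year": 2007, "revenue": 100},
    {"region": "east", "year": 2008, "revenue": 150},
    {"region": "west", "year": 2008, "revenue": 200},
    {"region": "west", "year": 2007, "revenue": 50},
    {"region": "east", "year": 2008, "revenue": 25},
    {"region": "west", "year": 2008, "revenue": 75},
]

# Roughly: SELECT region, SUM(revenue) FROM sales WHERE year = 2008 GROUP BY region
slices = stripe(sales, NUM_SPUS)
partials = [
    spu_scan(s, lambda r: r["year"] == 2008, ("region", "revenue")) for s in slices
]
result = host_aggregate(partials)
print(result)
```

Because every slice is scanned independently, adding SPUs shrinks the slice each one has to read, which is the intuition behind the parallel table scans described above.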

NPS Hardware
A Netezza system consists of multiple hardware and software components working together to provide performance, reliability and the asymmetric massively parallel processing of the NPS architecture. The key hardware components within an NPS include the following:
NPS Host
Snippet Processing Units
Snippet Processing Arrays

NPS Host
The NPS host, located within the NPS rack, controls and coordinates the activity of the NPS. It performs query optimization; controls table and database operations; consolidates and returns query results; and monitors the NPS system components to detect and report problems. The host is a highly redundant, highly available, HP server with dual power supplies, error correcting memory, a disk channel controller, and redundant disks (RAID 5).

Snippet Processing Units


The Snippet Processing Unit (SPU) is the basic unit of processing and storage in the NPS. Each SPU is essentially a standalone microcomputer, with a CPU, logic processors, memory, and disk storage. The SPU is an intelligent disk storage device: it has logic to quickly search the portion of the data saved on its disk and to return only the matching results. An NPS system has many SPUs: for example, there are up to 56 in the NPS model 10050 and up to 896 in the NPS model 10800. User database tables are distributed across all of the SPUs to allow for parallel query processing. Each SPU is responsible for managing a portion of your database and tables (called a primary partition), as well as for maintaining a copy of another SPU's primary partition (called a mirror partition). If an SPU fails, the mirror partition is used to create a new primary partition on a standby SPU within the system, which then takes the place of the failed SPU.
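The primary/mirror arrangement and the failover step can be sketched as follows. This is an illustrative model only: the paper does not say which SPU holds which mirror, so placing each mirror on the next SPU in the list is an assumption, as are the function names and the decision to have the standby also take over the mirror that the failed SPU was holding.

```python
def build_layout(spu_ids):
    """Give each SPU its own primary partition plus a mirror of the
    previous SPU's primary (mirror placement is an assumption)."""
    layout = {}
    for i, spu in enumerate(spu_ids):
        neighbor = spu_ids[i - 1]  # index -1 wraps around to the last SPU
        layout[spu] = {"primary": f"P{spu}", "mirror": f"P{neighbor}"}
    return layout


def fail_over(layout, failed, standby):
    """Replace a failed SPU with the standby, rebuilding the lost primary
    from the surviving mirror copy held by another SPU."""
    lost = layout[failed]
    # Find the SPU that still holds a mirror of the lost primary partition.
    source = next(
        spu
        for spu, parts in layout.items()
        if spu != failed and parts["mirror"] == lost["primary"]
    )
    new_layout = {spu: dict(parts) for spu, parts in layout.items() if spu != failed}
    new_layout[standby] = {"primary": lost["primary"], "mirror": lost["mirror"]}
    return new_layout, source


layout = build_layout([1, 2, 3])
new_layout, source = fail_over(layout, failed=2, standby=4)
print("rebuilt from mirror on SPU", source)
print(new_layout)
```

After the failover every primary partition is still present exactly once, so queries continue to see the whole table.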

Snippet Processing Arrays


Snippet Processing Arrays (SPAs) are racks within the NPS system that contain up to 14 SPUs and have the power supplies, fans, and communication fabric that allow the SPUs to communicate with each other and with the NPS host. NPS systems contain at least two SPAs. If you add SPAs to the system to increase the number of SPUs, the SPAs are added in pairs. For each pair of SPAs, one of the 28 SPUs takes the role of a hot standby, ready to take the place of a failed SPU within the system.

Model 5200 is a single rack system containing 28 SPUs housed in 2 SPAs.
Model 10050 is a single rack system containing 56 SPUs housed in 4 SPAs.
Model 10100 is a single rack system containing 112 SPUs housed in 8 SPAs.
Model 10200 is a two rack, high availability system containing 224 SPUs housed in 16 SPAs.
Model 10400 is a four rack, high availability system containing 448 SPUs housed in 32 SPAs.
Model 10600 is a six rack, high availability system containing 672 SPUs housed in 48 SPAs.
Model 10800 is an eight rack, high availability system containing 896 SPUs housed in 64 SPAs.
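The SPU counts follow directly from the rules stated in this section: 14 SPUs per SPA, SPAs added in pairs, and one hot standby per pair. A minimal sketch of that arithmetic, assuming fully populated SPAs (the function name `spu_counts` is made up for this example):

```python
SPUS_PER_SPA = 14  # maximum SPUs per SPA, as stated in this section


def spu_counts(num_spas):
    """Return total, hot-standby and active SPU counts for fully populated SPAs."""
    if num_spas < 2 or num_spas % 2 != 0:
        raise ValueError("SPAs come in pairs, with a minimum of two")
    total = num_spas * SPUS_PER_SPA
    hot_standby = num_spas // 2  # one standby SPU per pair of SPAs
    return {"total": total, "hot_standby": hot_standby, "active": total - hot_standby}


single_rack = spu_counts(8)   # 8 SPAs, the single rack with 112 SPUs
eight_racks = spu_counts(64)  # 64 SPAs, the eight rack system
print(single_rack)
print(eight_racks)
```

The totals agree with the figures quoted for the 8-SPA and 64-SPA systems (112 and 896 SPUs); the active counts are simply those totals minus one standby per SPA pair.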

OBIEE Implementation Using Netezza


Typically, the following architecture is supported for running the OBIEE platform and applications on a Netezza data warehouse:

The following section provides detail of Oracle Business Intelligence Enterprise Edition (OBIEE) components and various performance techniques and tools that can be used with OBIEE.

OBIEE includes the following software components:


Required Components of Oracle Business Intelligence

Oracle Business Intelligence Server
The OBI Server is accessed using logical SQL against a unified semantic layer. The logical SQL is translated into Netezza-specific physical SQL by an advanced navigation and rewrite process driven by the semantic layer. To enable access to extended physical database capabilities, a set of new functions is available as part of the BI Server SQL API. These functions allow source-specific features and functions to be embedded into logical SQL and passed through to the physical database.

Oracle Business Intelligence Presentation Services
Oracle Business Intelligence Presentation Services Plug-in
Oracle Business Intelligence Scheduler
Oracle Business Intelligence Administration Tool
Oracle Business Intelligence Cluster Controller (Optional)
Oracle Business Intelligence Client
Oracle Business Intelligence Netezza ODBC Driver
Oracle Business Intelligence Catalog Manager
Oracle Business Intelligence Job Manager
Oracle Business Intelligence Publisher

Optional Components of Oracle Business Intelligence

Oracle BI Publisher Desktop: A Windows-based design tool that allows you to create layouts for Oracle BI Publisher.
Oracle BI Open Intelligence Interface: Oracle BI ODBC interface only. This is identical to the Oracle BI ODBC interface installed through the main installer, but has a smaller footprint.
Oracle BI Office Plug-In: A Windows application under the Oracle BI Presentation Services. It requires a separate installer.
Oracle BI Briefing Book Reader: A Windows application that provides a way to save static and linked dashboard content for review offline.

Third-Party Installations Required for Installing Oracle Business Intelligence

Java SDK 1.5.0 or later: Java must be installed on the same machine on which you are installing Oracle Business Intelligence.
Oracle Application Server 10.1.3.1.0 or later, or another supported web server for Oracle BI Presentation Services: Use Tomcat, WebSphere, IIS or OAS. The following Oracle Application Server components are required: Oracle HTTP Server, Oracle Containers for J2EE (OC4J), and Oracle Process Manager and Notification Server.
Netezza ODBC: Database connectivity software that Oracle BI servers use to connect to the database.

OBIEE Architecture

Data Warehouse Query Performance


OBIEE uses Netezza ODBC connectivity to read data from the data warehouse or Operational Data Source (ODS). Unlike traditional databases, Netezza does not provide runtime, instance-level configuration parameters for query optimization. Instead, the overall database design lays the foundation for SQL optimization: Netezza's built-in architecture, pre-planning of each table's distribution key, the physical ordering of rows, and the use of the OBIEE EVALUATE_* series of functions in the Business Layer all contribute to query performance.

Netezza is optimized to efficiently handle star-schema
The Netezza architecture is designed to handle typical star schema RDBMS data models. It creates and uses zone maps to leverage numeric surrogate keys and speed up data retrieval. Therefore, design your DW model carefully in line with dimensional modeling, and avoid excessive snowflaking where possible. In summary, the combination of the Netezza architecture and an OBIEE metadata design that follows dimensional modeling offers better performance.

Netezza Built-in Architecture
Netezza does have a quasi-index called zone maps, but they are maintained automatically and are very granular. Zone maps track only the min/max values of integer and date columns per 3-megabyte extent on disk. As long as the data has some ordering on disk (date being a very good example, since data is usually loaded by date), the zone map feature works very well.

Distribution Specification
Each table in a Netezza RDBMS database has only one distribution key, which consists of one to four columns. To create an explicit distribution key, the Netezza SQL syntax is:
usage: create table <tablename> [ ( <column> [, ... ] ) ] as <select_clause> [ distribute on [hash] ( <column> [, ... ] ) ]

The phrase distribute on specifies the distribution key; the word hash is optional. To create a round-robin distribution key, the Netezza SQL syntax is:
usage: create table <tablename> (col1 int, col2 int, col3 int) distribute on random;
The phrase distribute on random specifies round-robin distribution. To create a table without specifying a distribution key, the Netezza SQL syntax is:
usage: create table <tablename> (col1 int, col2 int, col3 int);

Netezza pushes the CPU down to the disk level. Each disk is connected to a single CPU, and thus Netezza is able to process data as fast as the disk can read it.
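The difference between the hash and round-robin options in the Distribution Specification above can be sketched with a short simulation. This is not Netezza's implementation: `zlib.crc32` stands in for whatever hash function the appliance actually uses, and the table, column names and SPU count are invented for illustration.

```python
import zlib
from collections import Counter
from itertools import cycle

NUM_SPUS = 4  # hypothetical SPU count for illustration


def distribute_hash(rows, key_columns, num_spus):
    """Place each row on an SPU chosen from a hash of its distribution key,
    so rows with equal key values always land on the same SPU."""
    placement = []
    for row in rows:
        key = "|".join(str(row[c]) for c in key_columns).encode()
        placement.append(zlib.crc32(key) % num_spus)  # stand-in hash (assumption)
    return placement


def distribute_random(rows, num_spus):
    """Round-robin: hand rows to SPUs in turn, regardless of their content."""
    spus = cycle(range(num_spus))
    return [next(spus) for _ in rows]


# Hypothetical fact rows: 20 rows spread over 5 customers.
rows = [{"customer_id": i % 5, "amount": i} for i in range(20)]

hash_placement = distribute_hash(rows, ["customer_id"], NUM_SPUS)
rr_placement = distribute_random(rows, NUM_SPUS)

# With hash distribution, each customer_id maps to exactly one SPU.
spus_per_customer = {}
for row, spu in zip(rows, hash_placement):
    spus_per_customer.setdefault(row["customer_id"], set()).add(spu)

print("hash:", Counter(hash_placement))
print("round-robin:", Counter(rr_placement))
```

Round-robin always spreads rows evenly, while hash distribution keeps rows with the same key together, which matters when tables are joined on that key; how evenly a hash key spreads the data depends on the values in the chosen columns.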

In a traditional database, CPUs are connected via a SAN switch to a set of disks. Throughput off disk is limited by the number of Fibre Channel links connected to the SAN, and scaling that throughput up is very expensive and quite limited.

Evaluate Function
The Oracle BI Server supports calling functions defined within the database directly, either from the Answers interface or from a logical column (in the Logical Table Source) within the metadata repository. One can leverage Netezza analytical SQL to push processing down and optimize SQL at the database level. Here is a summary of the functions available in the current OBIEE release:

EVALUATE
This function is intended for scalar and analytic calculations.
Example: SELECT e.lastname, sales.revenue, EVALUATE('dense_rank() over(order by %1 )', sales.revenue) FROM sales s, employee e;

EVALUATE_AGGR
This function is intended for aggregate functions with a group by clause.
Example: SELECT year.year, sales.qtysold, EVALUATE_AGGR('sum(%1)', sales.quantity) FROM SnowFlakeSales;

EVALUATE_PREDICATE
This function is intended for functions with a return type of boolean.
Example: SELECT year, Sales as DOUBLE, CAST(EVALUATE('OLAP_EXPRESSION(%1,''LAG(units_cube_sales, 1, time, time LEVELREL time_levelrel)'')', OLAP_CALC) AS DOUBLE) FROM "Global".Time, "Global"."Facts - sales" WHERE EVALUATE_PREDICATE('OLAP_CONDITION(%1, ''LIMIT time KEEP ''''1'''', ''''2'''', ''''3'''', ''''4'''' '') =1', OLAP_CALC) order by year;
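In the EVALUATE examples above, the markers %1, %2 and so on stand for the arguments that follow the quoted database expression. A minimal sketch of that positional substitution, assuming the arguments have already been resolved to physical column names (the helper `expand_evaluate` is hypothetical and is not part of OBIEE; the BI Server's real rewrite also maps logical columns to physical ones):

```python
import re


def expand_evaluate(db_expression, *args):
    """Replace %1, %2, ... in a pass-through expression with the given arguments."""

    def substitute(match):
        index = int(match.group(1))
        if not 1 <= index <= len(args):
            raise ValueError(f"no argument supplied for %{index}")
        return args[index - 1]

    # \d+ matches the whole number, so %10 is not mistaken for %1 followed by 0.
    return re.sub(r"%(\d+)", substitute, db_expression)


rank_sql = expand_evaluate("dense_rank() over(order by %1 )", "sales.revenue")
sum_sql = expand_evaluate("sum(%1)", "sales.quantity")
print(rank_sql)
print(sum_sql)
```

The expanded fragment is what gets embedded in the physical SQL, which is why the quoted expression must be valid syntax for the target database.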

Conclusion
To summarize, an OBIEE implementation on Netezza benefits data warehouse reporting deployments compared with traditional RDBMS platforms. Netezza's scalable hardware architecture, which combines an SMP node with up to 896 single-CPU SPUs (snippet processing units) in an MPP arrangement, optimizes data storage and retrieval performance. However, to weigh the costs and benefits of a Netezza-OBIEE deployment, an assessment of the database hardware architecture and sizing is strongly recommended. The decision to buy the Netezza hardware and DW database solution, rather than expanding existing systems or buying other database technologies, should be based on the overall DW deployment strategy over a 3-5 year time frame.

Based on the benchmarks, SQL performance scales proportionally with the number of SPUs before it flattens out. Leveraging Netezza analytic functions in OBIEE further improves report data retrieval performance. Netezza's built-in multi-SPU architecture parses and executes SQL and provides better performance for terabyte-scale data warehouses. It is important to note, however, that the DW must be modeled using a dimensional star schema and that fact tables must be created with proper surrogate and distribution keys.
