A Speak-Tech Training
Steven Feuerstein
steven@stevenfeuerstein.com www.StevenFeuerstein.com
Bulk Processing with BULK COLLECT and FORALL
Table Functions, including Pipelined TFs
NOCOPY
Optimizing Datatypes
PL/SQL Obsession
http://www.ToadWorld.com/SF
Download and use any of my scripts (examples, performance scripts, reusable code) from the same location: the demo.zip file.
filename_from_demo_zip.sql
You have my permission to use all these materials to do internal trainings and build your own applications.
But remember: they are not production ready. You must test them and modify them to fit your needs.
Copyright 2011 Feuerstein and Associates
www.plsqlchannel.com
27+ hours of detailed video training on Oracle PL/SQL
www.stevenfeuerstein.com
Monthly PL/SQL newsletter
www.toadworld.com/SF
Quest Software-sponsored portal for PL/SQL developers
Oracle-BASE.com
Tim Hall's incredibly useful set of resources and scripts
Instead, I will focus on changes you can make in the way you write PL/SQL that will impact performance.
Make sure that you are familiar with the most critical optimization features.
Apply these proactively.
Manage memory
Most optimizations involve a tradeoff: less CPU, more memory.
Two profilers:
DBMS_PROFILER: line-by-line performance
DBMS_HPROF: hierarchical profiler, rollup to program units
cachedPLSQL.sql
DBMS_PROFILER
BEGIN
   DBMS_OUTPUT.PUT_LINE (
      DBMS_PROFILER.START_PROFILER (
         'my_application ' || TO_CHAR (SYSDATE, 'YYYYMMDD HH24:MI:SS')));
   run_your_application;
   DBMS_PROFILER.STOP_PROFILER;
END;
Requires EXECUTE privilege on DBMS_PROFILER.
Must create tables: $RDBMS_ADMIN/proftab.sql
Run queries (or use your IDE) to view results.
This profiler also provides raw data for code coverage analysis.
profrep.sql dbms_profiler_example.sql
11g
DBMS_HPROF
DECLARE
   l_runid NUMBER;
BEGIN
   ... => 'HPROF_DIR' ...
Requires EXECUTE privilege on DBMS_HPROF.
Must create tables: $RDBMS_ADMIN/dbmshptab.sql
Run queries against tables or call plshprof to generate HTML reports.
dbms_hprof_example.sql
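The DBMS_HPROF slide code survives only as a fragment, so here is a minimal sketch of a profiling run, not the contents of dbms_hprof_example.sql. It assumes the HPROF_DIR directory object from the slide exists and is writable; the trace file name is hypothetical, and the NULL statement stands in for the program you want to profile.

```sql
DECLARE
   l_runid NUMBER;
BEGIN
   -- 'my_application.trc' is a hypothetical file name.
   DBMS_HPROF.start_profiling (location => 'HPROF_DIR',
                               filename => 'my_application.trc');

   NULL;   -- call the program to be profiled here

   DBMS_HPROF.stop_profiling;

   -- Load the raw trace into the tables created by dbmshptab.sql.
   l_runid := DBMS_HPROF.analyze (location => 'HPROF_DIR',
                                  filename => 'my_application.trc');
   DBMS_OUTPUT.put_line ('Run ID = ' || l_runid);
END;
/
```

The run ID returned by the analyze step is the key to use when querying the profiler tables.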
Oracle offers DBMS_UTILITY.GET_TIME and GET_CPU_TIME (10g) to compute elapsed time down to the hundredth of a second.
Can also use SYSTIMESTAMP
DECLARE
   l_start_time PLS_INTEGER;
BEGIN
   l_start_time := DBMS_UTILITY.get_time;
   -- Do stuff here...
   DBMS_OUTPUT.put_line (DBMS_UTILITY.get_time - l_start_time);
END;
memory_error.sql
As you work with more advanced features, like collections and FORALL, you will need to pay attention to memory, and make adjustments. First, let's review how Oracle manages memory at run-time.
[Figure: run-time memory layout. Labels: Library cache (holding "Update emp Set sal=..."), Large Pool, program units calc_totals, show_emps, and upd_salaries; Session 1 and Session 2, each with its own session memory: UGA (User Global Area) and PGA (Process Global Area).]
The Process Global Area contains session-specific data that is released when the current server call terminates.
Local data
The User Global Area contains session-specific data that persists across server call boundaries
Package-level data
top_pga.sql
Oracle keeps track of and shows the PGA and UGA consumption for a session in the v_$sesstat dynamic view. With the correct privileges, PL/SQL developers can analyze their code's memory usage.
BEGIN plsql_memory.start_analysis; run_my_application; plsql_memory.show_memory_usage; END;
It is up to developers and DBAs to determine how much PGA memory can be used per connection. Then developers must make the necessary changes in their code to conform to that limit.
loop_invariants*.sql
Automatic, transparent optimization of code.
Compile-time warnings framework to help you improve the quality of your code.
Conditional compilation: you decide what code should be compiled/ignored!
Stick with the default, unless you have a clear need for an exception.
Some examples:
Unless otherwise specified, the operands of an expression operator may be evaluated in any order. Operands of a commutative operator may be commuted. The actual arguments of a call or a SQL statement may be evaluated in any order (including default actual arguments).
Some Examples
... A + B ...
...
... A + B ...

for i in 1 .. 10 loop
   A := B + C;
   ...
end loop;

A := B + C;
for i in 1 .. 10 loop
   ...
end loop;

FOR rec in (SELECT ...)
LOOP
   ... do stuff
END LOOP;
10g_optimize_cfl.sql
The PL/SQL runtime engine will always execute your subprograms, even if the optimizer detects that the results of that subprogram call are "not needed."
Exception: DETERMINISTIC functions in 11g
You cannot rely on a specific order of evaluation of arguments in a subprogram call or even when package initialization takes place.
The compiler will even avoid initialization of a package if it is not needed (using a TYPE, for example).
and then:
ALTER PROCEDURE bigproc COMPILE REUSE SETTINGS;
11g
You can also keep the optimization level at 2 and request inlining explicitly for specific subprogram invocations with a new INLINE pragma. Inlining applies to the following statements:
Assignment, CALL, conditional, CASE, CONTINUE-WHEN, EXECUTE IMMEDIATE, EXIT-WHEN, LOOP, RETURN
You can also request inlining for all executions of the subprogram by placing the PRAGMA before the declaration of the subprogram. Inlining, like NOCOPY, is a request.
Under some circumstances, inlining will not take place.
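As a minimal sketch of the explicit request described above (not the contents of 11g_inline*.sql), the block below asks the compiler to inline one call to a small local function. The function add_one and the loop are hypothetical; the pragma applies to the statement that immediately follows it.

```sql
DECLARE
   l_total PLS_INTEGER := 0;

   -- Hypothetical local function, small enough to be a plausible inlining candidate.
   FUNCTION add_one (n IN PLS_INTEGER)
      RETURN PLS_INTEGER
   IS
   BEGIN
      RETURN n + 1;
   END add_one;
BEGIN
   FOR indx IN 1 .. 1000
   LOOP
      -- Request (not force) inlining of add_one in the next statement.
      PRAGMA INLINE (add_one, 'YES');
      l_total := add_one (l_total);
   END LOOP;

   DBMS_OUTPUT.put_line (l_total);
END;
/
```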
11g_inline*.sql
DETERMINISTIC Functions
A function is deterministic if the value it returns is determined completely by its inputs (IN arguments).
In other words, no side effects.
Don't lie! Oracle will not reject your use of the keyword, even if it isn't true.
deterministic.sql deterministic_in_plsql.sql
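A minimal sketch of a function that can honestly carry the DETERMINISTIC keyword; the function name and parameters are hypothetical and are not taken from deterministic.sql. The return value depends only on the two IN arguments: no queries, no package state.

```sql
CREATE OR REPLACE FUNCTION full_name (
   first_in IN VARCHAR2,
   last_in  IN VARCHAR2)
   RETURN VARCHAR2
   DETERMINISTIC
IS
BEGIN
   -- Same inputs always produce the same output, with no side effects.
   RETURN last_in || ', ' || first_in;
END full_name;
/
```

By contrast, a function that queries a table or reads SYSDATE would be "lying" if it used the keyword.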
PGA-Based Caching
When you declare variables at the package level, their state persists in your session.
A PGA-based cache, specific to each session.
And if you declare a collection at the package level, you can cache multiple rows of data.
Not a reliable technique for Web-based (usually stateless) applications.
Let's start with a trivial example: USER
thisuser*.*
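One way the trivial USER example might look, as a sketch under assumptions rather than the actual thisuser code: the package name thisuser_cache and function current_name are hypothetical. The value of USER is captured once per session in a package-level variable and served from memory afterwards.

```sql
CREATE OR REPLACE PACKAGE thisuser_cache
IS
   FUNCTION current_name RETURN VARCHAR2;
END thisuser_cache;
/

CREATE OR REPLACE PACKAGE BODY thisuser_cache
IS
   -- Package-level variable: its value persists for the life of the session.
   g_user VARCHAR2 (128);

   FUNCTION current_name RETURN VARCHAR2
   IS
   BEGIN
      RETURN g_user;
   END current_name;
BEGIN
   -- Initialization section: runs once per session, on first reference.
   g_user := USER;
END thisuser_cache;
/
```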
associative_array_example.sql nested_table_example.sql
[Figure: PGA caching data flow between Application, Function, PGA, and Database / SGA. The application requests data through the function; on subsequent accesses the data is found in the cache and the database is not needed.]
emplu.pkg / emplu.tst
Very difficult to update the cache once the data source is changed, especially by other sessions.
Possible to use DBMS_ALERT.
syscache.pkg
11g
You can and should use the function result cache to retrieve data from any table that is queried more frequently than it is updated.
11g
11g_frc_demo.sql
11g
Add RESULT_CACHE keyword to header of function in both specification and body. RELIES_ON clause is deprecated in 11.2. Oracle will automatically determine all tables on which the function relies. RELIES_ON is then ignored.
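A minimal sketch of a result-cached lookup function. It assumes an employees table with employee_id and last_name columns (as in Oracle's HR sample schema); the function name is hypothetical. Because this is a standalone function there is only one header; in a package the keyword would appear in both specification and body, as noted above.

```sql
CREATE OR REPLACE FUNCTION last_name_for (
   employee_id_in IN employees.employee_id%TYPE)
   RETURN employees.last_name%TYPE
   RESULT_CACHE
IS
   l_name employees.last_name%TYPE;
BEGIN
   -- Executed only when the result for this input is not already cached.
   SELECT last_name
     INTO l_name
     FROM employees
    WHERE employee_id = employee_id_in;

   RETURN l_name;
END last_name_for;
/
```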
11g
11g_emplu*.*
11g
Caching is not performed for complex types: records with CLOBs, collections, etc. The cache is not related to SQL statements in your function.
It only keeps track of the input values and the RETURN clause data.
11g_frc_demo.sql
11g
11g_frc_vpd.sql 11g_frc_vpd2.sql
11g
show_frc_dependencies.sp
11g
A change to a table could invalidate just a subset of the results in the cache.
It's not all or nothing - when your function's different logic paths could "hit" different tables.
11g_frc_dependencies.sql 11g_frc_dependencies2.sql
Conclusions - Caching
Oracle offers several different ways you can build upon its own caching.
DETERMINISTIC for functions in SQL.
PGA caching is very fast, but cannot be used in most situations.
The function result cache is the simplest, most widely applicable, and biggest-impact technique.
Get ready for it now by hiding queries inside functions.
11g_emplu.pkg
string_tracker0.pkg
This means that you can now index on string values! (and concatenated indexes and...)
But the type of data returned by FIRST, LAST, NEXT and PRIOR methods is VARCHAR2.
The longer the string values, the more time it takes Oracle to "hash" or convert that string to the integer that is actually used as the index value.
Relatively small strings, say under 100 characters, do not incur too large a penalty.
assoc_array*.sql assoc_array_perf.tst
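A minimal sketch of a string-indexed associative array and of the VARCHAR2 values returned by FIRST and NEXT. The type, variable names, keys, and counts are all made up for illustration.

```sql
DECLARE
   TYPE counts_t IS TABLE OF PLS_INTEGER
      INDEX BY VARCHAR2 (100);

   l_counts counts_t;
   l_key    VARCHAR2 (100);
BEGIN
   -- The string itself is the index value.
   l_counts ('pears') := 3;
   l_counts ('apples') := 5;

   -- FIRST and NEXT return VARCHAR2 index values, in sort order of the strings.
   l_key := l_counts.FIRST;

   WHILE l_key IS NOT NULL
   LOOP
      DBMS_OUTPUT.put_line (l_key || ' => ' || l_counts (l_key));
      l_key := l_counts.NEXT (l_key);
   END LOOP;
END;
/
```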
string_tracker1.*
string_index.sql genaa.sql
If a set has no order, then it has no index, so it must be manipulated as a set. In Oracle Database 10g, Oracle added MULTISET set operators to manipulate the contents of nested tables (only).
Use in both PL/SQL blocks and SQL statements.
Will perform much faster than writing your own algorithm to do the same thing.
This is an enormous advantage over writing a program to compare the contents of two collections.
DECLARE
   TYPE clientele IS TABLE OF VARCHAR2 (64);

   group1 clientele := clientele ('Customer 1', 'Customer 2');
   group2 clientele := clientele ('Customer 1', 'Customer 3');
   group3 clientele := clientele ('Customer 3', 'Customer 1');
BEGIN
   IF group1 = group2
   THEN
      DBMS_OUTPUT.put_line ('Group 1 = Group 2');
   ELSE
      DBMS_OUTPUT.put_line ('Group 1 != Group 2');
   END IF;
END;
10g_compare.sql 10g_compare_nulls.sql 10g_compare_old.sql
authors.pkg 10g_set.sql
in_clause.* 10g_member_of.sql
authors.pkg 10g_submultiset.sql
The resulting collection is either empty or sequentially filled from index value 1.
You do not need to initialize or extend first.
authors.pkg 10g_union.sql
The resulting collection is either empty or sequentially filled from index value 1.
You do not need to initialize or extend first.
10g_intersect.sql
Take away the elements of one nested table from another.
Use EXCEPT (not MINUS!) to take all elements in one nested table out of another.
Duplicates are preserved unless you include the DISTINCT modifier. The ALL modifier is the default.
The resulting collection is either empty or sequentially filled from index value 1.
You do not need to initialize or extend first.
10g_except.sql
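The union, intersect, and except operations described on the preceding slides can be sketched in one block. This reuses the clientele type from the comparison example; the show procedure and l_result variable are hypothetical helpers, and this is not the code in the referenced scripts.

```sql
DECLARE
   TYPE clientele IS TABLE OF VARCHAR2 (64);

   group1   clientele := clientele ('Customer 1', 'Customer 2');
   group2   clientele := clientele ('Customer 1', 'Customer 3');
   l_result clientele;   -- never initialized or extended by hand

   PROCEDURE show (title_in IN VARCHAR2, list_in IN clientele)
   IS
   BEGIN
      DBMS_OUTPUT.put_line (title_in || ': ' || list_in.COUNT || ' element(s)');

      -- Safe to loop from 1 to COUNT: the result is filled sequentially.
      FOR indx IN 1 .. list_in.COUNT
      LOOP
         DBMS_OUTPUT.put_line ('   ' || list_in (indx));
      END LOOP;
   END show;
BEGIN
   l_result := group1 MULTISET UNION DISTINCT group2;
   show ('UNION DISTINCT', l_result);

   l_result := group1 MULTISET INTERSECT group2;
   show ('INTERSECT', l_result);

   l_result := group1 MULTISET EXCEPT group2;
   show ('EXCEPT', l_result);
END;
/
```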
GTTs still require interaction with the SGA. So collections will still be faster, but they will use more memory.
GTTs consume SGA memory.
global_temp_tab_vs_coll.sql
Row-by-row = Slow-by-slow?
Many PL/SQL blocks execute the same SQL statement repeatedly, with different bind values.
Retrieves data one row at a time.
Performs same DML operation for each row retrieved.
The SQL engine does a lot to optimize performance, but this row-by-row processing is inherently slow.
But, but...aren't SQL and PL/SQL supposed to be very tightly integrated? Let's take a look "under the covers."
Only one difference: BEFORE and AFTER statement-level triggers fire only once per FORALL INSERT statement, not once for each INSERT statement passed to the SQL engine from the FORALL statement.
statement_trigger_and_forall.sql
Retrieve multiple rows into a collection with a single fetch (context switch to the SQL engine). Deposit the multiple rows of data into one or more collections.
NO_DATA_FOUND is not raised when no rows are fetched; instead, the collection is empty. The "INTO" collections are filled sequentially from index value 1.
There are no "gaps" between 1 and the index value returned by the COUNT method.
Only integer-indexed collections may be used. No need to initialize or extend nested tables and varrays. Done automatically by Oracle.
bulkcoll.sql bulkcollect.tst
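A minimal sketch of an implicit BULK COLLECT query, not the contents of the referenced scripts. It assumes an employees table with a last_name column (as in Oracle's HR sample schema); the type and variable names are hypothetical.

```sql
DECLARE
   TYPE employees_t IS TABLE OF employees%ROWTYPE;

   l_employees employees_t;   -- no initialization or EXTEND needed
BEGIN
   -- One fetch (one context switch) retrieves every row.
   SELECT *
     BULK COLLECT INTO l_employees
     FROM employees;

   -- Filled densely from 1; if no rows matched, COUNT is simply 0.
   FOR indx IN 1 .. l_employees.COUNT
   LOOP
      DBMS_OUTPUT.put_line (l_employees (indx).last_name);
   END LOOP;
END;
/
```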
But what if I need to fetch and process millions of rows? This approach could consume unacceptable amounts of PGA memory.
If you do not know in advance how many rows you might retrieve, you should:
1. Declare an explicit cursor. 2. Fetch BULK COLLECT with the LIMIT clause.
COLLECT operation.
Definitely the preferred approach in production applications with large or varying datasets.
With very large volumes of data and small numbers of batch processes, however, a larger LIMIT could help.
You will need to break the habit of checking %NOTFOUND right after the fetch.
You might skip processing some of your data.
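The explicit cursor plus LIMIT pattern, and the exit test that avoids skipping data, can be sketched as follows. The employees table, the cursor name, and the limit of 100 are assumptions chosen for illustration.

```sql
DECLARE
   CURSOR employees_cur
   IS
      SELECT * FROM employees;

   TYPE employees_t IS TABLE OF employees_cur%ROWTYPE;

   l_employees employees_t;
BEGIN
   OPEN employees_cur;

   LOOP
      FETCH employees_cur BULK COLLECT INTO l_employees LIMIT 100;

      -- Exit on an empty collection, not on %NOTFOUND right after the fetch:
      -- the final fetch can return fewer than 100 rows AND set %NOTFOUND.
      EXIT WHEN l_employees.COUNT = 0;

      FOR indx IN 1 .. l_employees.COUNT
      LOOP
         DBMS_OUTPUT.put_line (l_employees (indx).last_name);
      END LOOP;
   END LOOP;

   CLOSE employees_cur;
END;
/
```

Each pass holds at most 100 rows in PGA memory, whatever the size of the table.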
Explicit BULK COLLECTs will usually run a little faster than cursor FOR loops optimized to BULK COLLECT.
10g_optimize_cfl.sql
FORALL Agenda
Introduction to FORALL
Using the SQL%BULK_ROWCOUNT
Referencing fields of collections of records
Using FORALL with sparsely-filled collections
Handling errors raised during execution of FORALL
Convert loops that contain inserts, updates, deletes or merges to FORALL statements. Header looks identical to a numeric FOR loop.
Implicitly declared integer iterator.
At least one "bind array" that uses this iterator as its index value.
You can also use a different header "style" with INDICES OF and VALUES OF (covered later).
forall_timing.sql forall_examples.sql
More on FORALL
Use any type of collection with FORALL.
Only one DML statement is allowed per FORALL. Each FORALL is its own "extended" DML statement.
The collection must be indexed by integer.
The bind array must be sequentially filled, unless you use the INDICES OF or VALUES OF clause.
Indexes cannot be expressions.
forall_restrictions.sql
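A minimal FORALL sketch, not taken from the referenced scripts. It assumes an employees table with employee_id and salary columns; the IDs are made up, and the ROLLBACK is there only because this is a demonstration.

```sql
DECLARE
   TYPE ids_t IS TABLE OF employees.employee_id%TYPE;

   l_ids ids_t := ids_t (100, 101, 102);   -- made-up employee IDs
BEGIN
   -- One DML statement; l_ids is the bind array, indexed by the iterator.
   FORALL indx IN 1 .. l_ids.COUNT
      UPDATE employees
         SET salary = salary * 1.1
       WHERE employee_id = l_ids (indx);

   DBMS_OUTPUT.put_line ('Total rows updated: ' || SQL%ROWCOUNT);

   ROLLBACK;   -- sketch only: undo the demonstration update
END;
/
```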
Use the SQL%BULK_ROWCOUNT cursor attribute to determine how many rows are modified by each statement.
A "pseudo-collection" of integers; no methods are defined for this element.
bulk_rowcount.sql
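A short sketch of reading SQL%BULK_ROWCOUNT after a FORALL, under the same assumptions as before (an employees table with department_id and salary columns; made-up department IDs). The Nth element reports the rows modified by the Nth statement.

```sql
DECLARE
   TYPE ids_t IS TABLE OF employees.department_id%TYPE;

   l_dept_ids ids_t := ids_t (10, 20, 30);   -- made-up department IDs
BEGIN
   FORALL indx IN 1 .. l_dept_ids.COUNT
      UPDATE employees
         SET salary = salary * 1.1
       WHERE department_id = l_dept_ids (indx);

   -- No COUNT method here, so iterate over the bind array's own range.
   FOR indx IN 1 .. l_dept_ids.COUNT
   LOOP
      DBMS_OUTPUT.put_line (
            'Department '
         || l_dept_ids (indx)
         || ': '
         || SQL%BULK_ROWCOUNT (indx)
         || ' row(s) updated');
   END LOOP;

   ROLLBACK;   -- sketch only
END;
/
```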
What if you want the FORALL processing to continue, even if an error occurs in one of the statements? Just add the SAVE EXCEPTIONS clause!
The SAVE EXCEPTIONS clause tells Oracle to save exception information and continue processing all of the DML statements. When the FORALL statement completes, if at least one exception occurred, Oracle then raises ORA-24381. You then check the contents of SQL%BULK_EXCEPTIONS.
It's a pseudo-collection, because it only supports a single method: COUNT. So you iterate from 1 to SQL%BULK_EXCEPTIONS.COUNT to get information about each error. Unfortunately, it does not store the error message.
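The SAVE EXCEPTIONS flow can be sketched as below. The employees table and the IDs are assumptions; bulk_errors is a hypothetical name bound to ORA-24381. Since the error message is not stored, the sketch rebuilds it by passing the negated error code to SQLERRM.

```sql
DECLARE
   TYPE ids_t IS TABLE OF employees.employee_id%TYPE;

   l_ids       ids_t := ids_t (100, 101, 102);   -- made-up employee IDs
   bulk_errors EXCEPTION;
   PRAGMA EXCEPTION_INIT (bulk_errors, -24381);
BEGIN
   FORALL indx IN 1 .. l_ids.COUNT SAVE EXCEPTIONS
      UPDATE employees
         SET salary = salary * 1.1
       WHERE employee_id = l_ids (indx);

   ROLLBACK;   -- sketch only
EXCEPTION
   WHEN bulk_errors
   THEN
      FOR indx IN 1 .. SQL%BULK_EXCEPTIONS.COUNT
      LOOP
         -- ERROR_INDEX: which statement failed. ERROR_CODE: positive error number.
         DBMS_OUTPUT.put_line (
               'Statement #'
            || SQL%BULK_EXCEPTIONS (indx).ERROR_INDEX
            || ' failed: '
            || SQLERRM (-1 * SQL%BULK_EXCEPTIONS (indx).ERROR_CODE));
      END LOOP;

      ROLLBACK;   -- sketch only
END;
/
```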
cfl_to_bulk_0.sql cfl_to_bulk_5.sql
You trade off increased complexity of code for dramatically faster execution.
But remember that Oracle will automatically optimize cursor FOR loops to BULK COLLECT efficiency. No need to convert unless the loop contains DML or you want to maximally optimize your code.
Table Functions
A table function is a function that you can call in the FROM clause of a query, and have it be treated as if it were a relational table. Table functions allow you to perform arbitrarily complex transformations of data and then make that data available through a query: "just" rows and columns!
After all, not everything can be done in SQL.
The function header and the way it is called must be SQL-compatible: all parameters use SQL types; no named notation allowed until 11g.
In some cases (streaming and pipelined functions), the IN parameter must be a cursor variable -- a query result set.
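A minimal sketch of a non-pipelined table function that returns a collection of scalars; it is not the contents of tabfunc_scalar.sql. The type names_nt and function numbered_names are hypothetical. The collection type is created at the schema level so that it is a SQL type, and the call uses positional notation.

```sql
CREATE OR REPLACE TYPE names_nt IS TABLE OF VARCHAR2 (100);
/

CREATE OR REPLACE FUNCTION numbered_names (
   base_name_in IN VARCHAR2,
   count_in     IN INTEGER)
   RETURN names_nt
IS
   l_names names_nt := names_nt ();
BEGIN
   l_names.EXTEND (count_in);

   FOR indx IN 1 .. count_in
   LOOP
      l_names (indx) := base_name_in || ' ' || indx;
   END LOOP;

   -- The whole collection is built in memory before anything is returned.
   RETURN l_names;
END numbered_names;
/

SELECT COLUMN_VALUE AS name
  FROM TABLE (numbered_names ('Name', 3));
```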
tabfunc_scalar.sql
tabfunc_streaming.sql
BEGIN
   INSERT INTO tickertable
      SELECT *
        FROM TABLE (stockpivot (CURSOR (SELECT * FROM stocktable)));
END;
/
tabfunc_streaming.sql
Pipelined functions allow you to return data iteratively, asynchronous to termination of the function.
As data is produced within the function, it is passed back to the calling process/query.
And pipelined functions use less PGA memory than non-pipelined functions!
RETURN...nothing at all!
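A minimal sketch of a pipelined function, with hypothetical names (numbers_nt, doubled_values). Rows are sent back one at a time with PIPE ROW, and the final RETURN carries no data.

```sql
CREATE OR REPLACE TYPE numbers_nt IS TABLE OF NUMBER;
/

CREATE OR REPLACE FUNCTION doubled_values (count_in IN INTEGER)
   RETURN numbers_nt
   PIPELINED
IS
BEGIN
   FOR indx IN 1 .. count_in
   LOOP
      -- Each row is available to the calling query as soon as it is piped;
      -- no collection is built up inside the function.
      PIPE ROW (indx * 2);
   END LOOP;

   RETURN;   -- returns control only, no data
END doubled_values;
/

SELECT COLUMN_VALUE
  FROM TABLE (doubled_values (5));
```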
Optimizing Datatypes
Not all datatypes are created equal.
When performing intensive integer computations, consider PLS_INTEGER instead of INTEGER.
When performing computations with floating point values, consider the BINARY_DOUBLE or BINARY_FLOAT types.
Remember, for most situations, the impact will not be noticeable.
integer_compare.sql binary_types.sql
When changing a large number of rows, you might want to continue past errors, to get as much work done as possible.
LOG ERRORS allows you to do this even when a DML statement fails.
All previous changes from that statement are rolled back. No other rows are processed. An error is passed out to the calling block (turns into a PL/SQL exception). No rollback on completed DML in that session.
errors_and_dml.sql
But you will first need to create an error log table with DBMS_ERRLOG.
The log table contains five standard error log info columns and then a column for each VARCHAR2-compatible column in the DML table.
dbms_errlog.sql
UPDATE employees SET salary = salary_in LOG ERRORS REJECT LIMIT 100;
Specify the limit of errors after which you want the DML statement to stop or UNLIMITED to allow it to run its course. Then...make sure to check the error log table after you run your DML statement!
Oracle will not raise an exception when the DML statement ends: a big difference from SAVE EXCEPTIONS.
Error reporting is often obscure: "Table or view does not exist." It's up to you to grant the necessary privileges on the error log table. If the DML table is modified from another schema, that schema must be able to write to the log table as well. Use the DBMS_ERRLOG helper package to get around many of these issues.
dbms_errlog.sql
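The LOG ERRORS workflow described above can be sketched end to end; this is an illustration under assumptions, not the contents of dbms_errlog.sql. It assumes an employees table you own, relies on the default log table name ERR$_EMPLOYEES, and uses a made-up tag and salary multiplier.

```sql
BEGIN
   -- Creates the error log table, named ERR$_EMPLOYEES by default.
   DBMS_ERRLOG.create_error_log (dml_table_name => 'EMPLOYEES');
END;
/

UPDATE employees
   SET salary = salary * 1000
   LOG ERRORS INTO err$_employees ('salary raise demo')
   REJECT LIMIT UNLIMITED;

-- No exception is raised for logged rows, so always check the log afterwards.
SELECT ora_err_number$, ora_err_mesg$, ora_err_tag$
  FROM err$_employees;
```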
dbms_errlog_helper.sql dbms_errlog_helper_demo.sql
Make sure that you check and manage any error logs created by your code.
RETURNING Clause
Use the RETURNING INTO clause to retrieve information about rows modified by the DML statement.
Avoid a separate SELECT.
When more than one row is modified, use RETURNING BULK COLLECT INTO. You cannot return an entire row into a record.
returning.tst method_2_returning.sql
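A minimal sketch of RETURNING with BULK COLLECT for a multi-row update, not taken from the referenced scripts. It assumes an employees table with salary and department_id columns; the department ID is made up.

```sql
DECLARE
   TYPE salaries_t IS TABLE OF employees.salary%TYPE;

   l_salaries salaries_t;
BEGIN
   UPDATE employees
      SET salary = salary * 1.1
    WHERE department_id = 50   -- made-up department ID
   RETURNING salary
        BULK COLLECT INTO l_salaries;

   -- The new salaries are available without a separate SELECT.
   DBMS_OUTPUT.put_line ('Rows changed: ' || l_salaries.COUNT);

   ROLLBACK;   -- sketch only
END;
/
```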
Confusion lingers from ancient "guru" advice. Implicit cursors (aka, SELECT INTO) will almost always be the most efficient way to fetch a single row of data.
Even though it must perform a second fetch to check for TOO_MANY_ROWS, Oracle has optimized it to run faster than explicit cursors.
single_row_fetch.sql
11g
DBMS_PARALLEL_EXECUTE (11.2)
Incrementally update data in a large table in parallel:
1. Group sets of rows in the table into smaller chunks.
2. Apply the desired UPDATE statement to the chunks in parallel, committing each time you have finished processing a chunk.
Improves performance, reduces rollback space consumption, and reduces the number of row locks held.
11g
DBMS_PARALLEL_EXECUTE - continued
Define chunks of data to be processed in parallel.
Specify those chunks by ROWID, a SQL query, or a numeric column.
Especially helpful in data warehousing environments.
See May/June 2010 Oracle Magazine issue for more thorough introduction.
11g_parallel_execute.sql
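The two steps (chunk, then apply the UPDATE per chunk) might look like the following with ROWID chunking. This is a sketch under assumptions, not the contents of 11g_parallel_execute.sql: the task name, chunk size, and parallel level are made up, the employees table is assumed to be in your own schema, and running the task requires the privilege to create scheduler jobs.

```sql
DECLARE
   c_task CONSTANT VARCHAR2 (30) := 'raise_salaries';   -- hypothetical task name
BEGIN
   DBMS_PARALLEL_EXECUTE.create_task (task_name => c_task);

   -- Step 1: split the table into chunks of roughly 1000 rows, by ROWID.
   DBMS_PARALLEL_EXECUTE.create_chunks_by_rowid (task_name   => c_task,
                                                 table_owner => USER,
                                                 table_name  => 'EMPLOYEES',
                                                 by_row      => TRUE,
                                                 chunk_size  => 1000);

   -- Step 2: run the UPDATE once per chunk, committing after each chunk.
   -- :start_id and :end_id are filled in with each chunk's ROWID range.
   DBMS_PARALLEL_EXECUTE.run_task (
      task_name      => c_task,
      sql_stmt       => 'UPDATE employees SET salary = salary * 1.1 '
                        || 'WHERE ROWID BETWEEN :start_id AND :end_id',
      language_flag  => DBMS_SQL.native,
      parallel_level => 4);

   DBMS_PARALLEL_EXECUTE.drop_task (task_name => c_task);
END;
/
```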
11g
Native Compilation
11g
11g_native_sequence.sql
11g
11g_continue.sql local_modules_with_continue.sql
11g
PL/Scope
A compiler-driven tool that collects information about identifiers and stores it in data dictionary views. Use PL/Scope to answer questions like:
Where is a variable assigned a value in a program?
What variables are declared inside a given program?
Which programs call another program (that is, you can get down to a subprogram in a package)?
Find the type of a variable from its declaration.
PL/Scope must be enabled; it is off by default.
When your program is compiled, information about all identifiers is written to the ALL_IDENTIFIERS view.
You then query the contents of the view to get information about your code.
Check the ALL_PLSQL_OBJECT_SETTINGS view for the PL/Scope setting of a particular program unit.
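A minimal sketch of that sequence: enable PL/Scope for the session, compile a program unit, then query ALL_IDENTIFIERS. The procedure plscope_demo is hypothetical.

```sql
ALTER SESSION SET plscope_settings = 'IDENTIFIERS:ALL';

CREATE OR REPLACE PROCEDURE plscope_demo
IS
   l_count PLS_INTEGER;
BEGIN
   l_count := 1;
END plscope_demo;
/

-- Shows, for example, where l_count is declared and where it is assigned.
SELECT name, type, usage, line, col
  FROM all_identifiers
 WHERE owner = USER
   AND object_name = 'PLSCOPE_DEMO'
 ORDER BY line, col;
```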
USAGE
The way the identifier is used (DECLARATION, ASSIGNMENT, etc.)
SIGNATURE
Unique value for an identifier. Especially helpful when distinguishing between overloadings of a subprogram or "connecting" subprogram declarations in package with definition in package body.
11g
plscope_helper_setup.sql plscope_helper.pkg
11g
PL/Scope Summary
PL/Scope gives you a level of visibility into your code that was never before possible. The ALL_IDENTIFIERS view is not straightforward. Use the helper package to get you started. Hopefully we will see PL/Scope interfaces built into the most popular IDEs.
11g
Interoperability
Convert DBMS_SQL cursor to cursor variable Convert cursor variable to DBMS_SQL cursor
Improved security
Random generation of DBMS_SQL cursor handles Denial of access/use of DBMS_SQL with invalid cursor or change of effective user.
11g
exec_ddl_from_file.sql exec_ddl_from_file_11g.sql
Interoperability
DBMS_SQL.TO_REFCURSOR
Cursor handle to cursor variable Useful when you need DBMS_SQL to bind and execute, but easier to fetch through cursor variable.
DBMS_SQL.TO_CURSOR_NUMBER
Cursor variable to cursor handle Binding is static but SELECT list is dynamic
11g
DBMS_SQL.TO_REFCURSOR
Converts a SQL cursor number to a weak cursor variable, which you can use in native dynamic SQL statements. Before passing a SQL cursor number to the DBMS_SQL.TO_REFCURSOR function, you must OPEN, PARSE, and EXECUTE it (otherwise an error occurs). After you convert a SQL cursor number to a REF CURSOR variable, DBMS_SQL operations can access it only as the REF CURSOR variable, not as the SQL cursor number.
Using the DBMS_SQL.IS_OPEN function to see if a converted SQL cursor number is still open causes an error.
11g_to_refcursor.sql
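A minimal sketch of the conversion rules above, not the contents of 11g_to_refcursor.sql. DBMS_SQL does the parse, bind, and execute; the fetch then goes through a cursor variable. The query, the employees table, and the department ID are assumptions.

```sql
DECLARE
   l_cursor_id INTEGER := DBMS_SQL.open_cursor;
   l_ignore    INTEGER;
   l_refcur    SYS_REFCURSOR;
   l_name      VARCHAR2 (100);
BEGIN
   DBMS_SQL.parse (
      l_cursor_id,
      'SELECT last_name FROM employees WHERE department_id = :dept',
      DBMS_SQL.native);
   DBMS_SQL.bind_variable (l_cursor_id, ':dept', 50);   -- made-up department ID
   l_ignore := DBMS_SQL.execute (l_cursor_id);

   -- The cursor is open, parsed, and executed, so it can be converted.
   -- From here on, use only the cursor variable, not l_cursor_id.
   l_refcur := DBMS_SQL.to_refcursor (l_cursor_id);

   LOOP
      FETCH l_refcur INTO l_name;
      EXIT WHEN l_refcur%NOTFOUND;
      DBMS_OUTPUT.put_line (l_name);
   END LOOP;

   CLOSE l_refcur;
END;
/
```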
11g
DBMS_SQL.TO_CURSOR_NUMBER
Converts a REF CURSOR variable (either strong or weak) to a SQL cursor number, which you can pass to DBMS_SQL subprograms. Before passing a REF CURSOR variable to the DBMS_SQL.TO_CURSOR_NUMBER function, you must OPEN it. After you convert a REF CURSOR variable to a SQL cursor number, native dynamic SQL operations cannot access it.
11g_to_cursorid.sql
11g
Improved Security
Cursor handles generated by the DBMS_SQL.OPEN_CURSOR function are now random and not sequential. Pass an invalid cursor handle to many DBMS_SQL programs and DBMS_SQL is then disabled.
Have to reconnect.
11g
11g_gen_invoc.sql
So if any change is made to a referenced object, the status of every dependent object is set to INVALID, even if the change doesn't affect the dependent object.
Impact of change:
You can minimize invalidation of program units.
You cannot obtain this fine-grained dependency information through any data dictionary views yet.
11g_fgd*.sql