
Unit 1

Module 1: Processing and Complexity of SQL sentences
Phases in the processing of an SQL sentence
Every time Oracle is requested to execute an SQL sentence, it internally performs a series of
actions, which are mainly grouped into 3 blocks: Parse (analysis), Execute and Fetch (data
collection).
PARSE
- Existence of the sentence in the cache, syntactic and semantic check: it is checked whether the SQL sentence already exists in the cache, that it is written correctly (syntax) and that it is semantically correct (correct structure, fields and permissions).
- Transformation: some queries (e.g. those that include views or subqueries) have to be transformed into the most suitable equivalent sentence. Some sentences can be replaced with others that are more optimal or simpler.
- RBO (Rule-based Optimizer): it decides on the plan in accordance with fixed rules.
- CBO (Cost-based Optimizer): it decides on the plan with respect to the statistics of the objects involved:
  - Calculation of the cost of access to the objects and their cardinality (estimation of the number of rows involved).
  - Costs with different join orders: different execution plans are evaluated by changing the join order between the tables (2 by 2), choosing the least costly order.
  - Creation of the structures necessary for the execution (cursors, etc.).

EXECUTE

The memory necessary for the bind variables is allocated, they are assigned values, and the sentence is executed in accordance with the plan selected during the parse.

FETCH

The queried data blocks are fetched and, if applicable, orderings are performed.

Unit 1
Module 1: Processing and Complexity of SQL
sentences
G1. Complexity of SQL sentences
It is not considered suitable to use extremely long and complex sentences, as they are more
costly to maintain and more difficult for people to understand. Moreover, these complex
sentences usually result in unsuitable execution plans.
It is advisable to break these sentences down into various SQL calls even if this means more
code in our applications. In the long run they are maintained better and give better response
times.

Unit 1
Module 2: The Optimizer
What is the optimizer?
Every time a sentence is executed against the database, one of the things that Oracle must
decide is the optimal execution plan (determine how to access each object, and in which order
the joins to the various tables involved are performed).
The optimizer is responsible for doing this. Currently there are two types of optimizer for Oracle:
Rule-based: the criteria for deciding on the plan are in accordance with a series of fixed rules,
regardless of the volume or distribution of data (for example, the existence of indexes and types
of indexes). Maintenance of the rule-based optimizer was discontinued in Oracle version 8 and
it will no longer be supported from version 10.
Cost-based: Oracle takes into account the volume and distribution of data in accordance with
previously gathered statistics. Depending on the volume, it decides on the join orders or the use
or not of indexes, for example.
It is clear that for the cost-based optimizer it is essential to have reliable statistics, so they must
be periodically updated.

Unit 1
Module 2: The Optimizer
Which optimizer shall I use?
Of the two optimizers (rule-based and cost-based) it is always advisable to use the cost-based
optimizer. (Maintenance of the rule-based optimizer was discontinued in Oracle version 8 and it
will no longer be supported from version 10.)
The use of one type of optimizer can be proposed at three levels:

Session:   ALTER SESSION SET OPTIMIZER_GOAL=<value>
           (From Oracle 9i: ALTER SESSION SET OPTIMIZER_MODE=<value>)
Database:  initialization parameter optimizer_mode=<value>
Sentence:  using hints: /*+ RULE */, /*+ FIRST_ROWS */, /*+ ALL_ROWS */
Possible values:

RULE        Rule-based.
FIRST_ROWS  Cost-based, optimizing return of the first rows.
            From Oracle 9i: FIRST_ROWS, FIRST_ROWS_[1|10|100|1000].
ALL_ROWS    Cost-based, optimizing return of all the rows.
CHOOSE      Oracle chooses; if there are statistics on the objects, it tries cost-based (ALL_ROWS).
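As a brief sketch of how these values are applied at session and sentence level (the emp table is only an example; the exact parameter name depends on the Oracle version, as noted above):

    -- Up to Oracle 8i (OPTIMIZER_GOAL); from Oracle 9i use OPTIMIZER_MODE
    ALTER SESSION SET OPTIMIZER_GOAL = CHOOSE;
    ALTER SESSION SET OPTIMIZER_MODE = FIRST_ROWS_100;

    -- At sentence level, with a hint
    SELECT /*+ ALL_ROWS */ * FROM emp;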

The use of the following Oracle characteristics will force the use of the cost-based optimizer
rather than the rule-based optimizer:
Tables with a PARALLEL degree defined

Domain indexes (interMedia)

Partitioning

Parallel Create Table As Select

IOT (Index Organized Tables)

Function-based indexes

Reverse key indexes

Query Rewrite activated

Any hint, other than RULE

Unit 1
Module 2: The Optimizer
Histograms
Histograms are more complete statistics that sometimes must be used to provide the cost-based optimizer with more information.
The normal statistics that are executed with ANALYZE presuppose a uniform distribution of the
data. For example, in a table we have a field called status. The ANALYZE will obtain the total
number of records of the table (for example, 2,000,000) and the number of different values that
the status field contains (for example 2 values: PENDING and COMPLETED). Accordingly, with
Oracle statistics, when we perform a select on this table, asking only about the status field, it will
understand that in total we will be looking for 1,000,000 records (total records / # different
values).
This uniform distribution assumption made by Oracle can cause many problems if it does not
reflect the real distribution of the information. If in my table 90% of the records have the value
"COMPLETED", it would be good for Oracle to know this, so that when asked about
the "PENDING" status it knows that far fewer records will be returned.
Accordingly, for fields in which there is a major deviation from the uniform distribution of data, it
is advisable to create histograms, which, in short, are responsible for determining the number of
existing records for the various values of the fields.
It must be taken into account that histograms are also Oracle statistics, so they must be
periodically updated. The creation of histograms slows down the creation of statistics, so they
must be used with moderation and only when really necessary (non-uniform distribution).
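As an illustrative sketch (the orders table and the status field are hypothetical names), a histogram could be gathered either with ANALYZE or with DBMS_STATS:

    ANALYZE TABLE orders COMPUTE STATISTICS FOR COLUMNS status SIZE 10;

    BEGIN
      DBMS_STATS.GATHER_TABLE_STATS(
        ownname    => USER,
        tabname    => 'ORDERS',
        method_opt => 'FOR COLUMNS status SIZE 10');
    END;
    /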

Unit 1
Module 3: Execution Plan
Execution plan: how to consult it and interpret it
For the optimization of SQL sentences it is essential to understand the access plan (execution
plan) that Oracle uses in the execution of the sentence.
There are various methods to understand the execution plan:
1. From SQL*Plus executing EXPLAIN PLAN FOR.
2. From SQL*Plus executing SET AUTOTRACE.
3. Getting the trace of the session and using tkprof.
Execution plan with EXPLAIN PLAN
A table called PLAN_TABLE must exist in the user (schema) executing the sentence; this is where the
execution plan is stored. If this table does not exist in your user, and given that the structure of
PLAN_TABLE varies between the different versions of Oracle, ask the Database Administration
team to generate the table in your user.
From SQL*Plus, execute for example:
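A minimal sketch (the query over the SCOTT demo tables is only an example):

    EXPLAIN PLAN FOR
      SELECT e.ename, d.dname
      FROM   emp e, dept d
      WHERE  e.deptno = d.deptno;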

The execution plan will be stored in the PLAN_TABLE.


All versions of Oracle:
You can query it by performing a select. You can use the following select to visually interpret the
execution plan a little better:
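One commonly used query of this kind is sketched below (the select used in the original course material may differ slightly):

    SELECT LPAD(' ', 2 * LEVEL) || operation || ' ' || options || ' ' || object_name AS execution_plan
    FROM   plan_table
    START WITH id = 0
    CONNECT BY PRIOR id = parent_id;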

The result shows the execution plan:

From Oracle 9i:


The execution plan can be obtained using this sentence:
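A minimal form, using the DBMS_XPLAN package available from Oracle 9i Release 2, would be:

    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);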

The result shows the execution plan:

Execution plan with SET AUTOTRACE


A table called PLAN_TABLE must exist in the user (schema) executing the sentence; this is where the
execution plan is stored. If this table does not exist in your user, and given that the structure of
PLAN_TABLE varies between the different versions of Oracle, ask the Database Administration
team to generate the table in your user.
From SQL*Plus, execute for example:
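A sketch (the query is only an example; SET AUTOTRACE ON would additionally return the data and execution statistics):

    SET AUTOTRACE TRACEONLY EXPLAIN
    SELECT e.ename, d.dname
    FROM   emp e, dept d
    WHERE  e.deptno = d.deptno;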

It will directly obtain the execution plan in the following format:

Execution plan with TKPROF


Oracle enables the registration in a trace file of all the sentences that are executed in a session,
and based on this file, using tkprof, you can query the execution plans. An example is provided
below. To find out in which directory, for the database you are using, the trace files are created,
contact the Database Administration team.

With the sentence ALTER SESSION SET SQL_TRACE=TRUE|FALSE you can activate or deactivate
dumping to the trace file. Locate the trace file you have generated (tec01_ora_6857.trc):

Obtain the execution plans in file tec01_ora_6857.tkp, using tkprof:


The format of the tkprof command is:

tkprof <input file> <output file> sys=no explain=<username>/<password>


where the username and password correspond to the user who executed the SQL sentences to
be obtained from the execution plan.
Output file tec01_ora_6857.tkp contains the required information:

Unit 1
Module 3: Execution Plan
Example of the interpretation of a plan
To interpret the execution plan obtained by any of these three methods, it is necessary to read
from inside to outside to understand how access to the data is achieved. This execution plan
shows:
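The sentence and plan analysed here are not reproduced in this extract; a sketch consistent with the steps described below, using the classic SCOTT demo tables EMP, DEPT and SALGRADE, would be:

    SELECT e.ename, e.job, e.sal, d.dname
    FROM   emp e, dept d
    WHERE  e.deptno = d.deptno
    AND    NOT EXISTS (SELECT 1 FROM salgrade s
                       WHERE e.sal BETWEEN s.losal AND s.hisal);

     0  SELECT STATEMENT
     1    FILTER
     2      NESTED LOOPS
     3        TABLE ACCESS (FULL) OF 'EMP'
     4        TABLE ACCESS (BY INDEX ROWID) OF 'DEPT'
     5          INDEX (UNIQUE SCAN) OF 'PK_DEPT' (UNIQUE)
     6      TABLE ACCESS (FULL) OF 'SALGRADE'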

Step 3 (FULL of emp) is executed, returning record by record to 2 (NESTED LOOP).


For each record returned by 3, steps 5 and 4 are executed (a unique scan of the PK_DEPT index,
followed by access to that record in the DEPT table by rowid).
Each row fetched by 3 is combined with the corresponding row from 4-5 by the NESTED LOOP
(the join), which is step 2.
Step 6 (FULL of the SALGRADE table) is executed.
Then step 1 (FILTER), which actually implements the NOT EXISTS of the select, is executed.
Accordingly, each record that 2 returns, and that is not in 6, will be included in the result.
It can be graphically represented as follows:

The execution order of this tree is read from bottom to top and left to right. If you have any
doubts when interpreting execution plans, you can consult the Database Administration team.

Unit 1
Module 3: Execution Plan
Types of Join
When in a SELECT sentence more than one table is indicated in the FROM clause, Oracle,
when it obtains the data, must join the data of each table, in accordance with the conditions of
the WHERE. This union is called a join.
Oracle, regardless of the number of tables in the FROM, always performs table joins two by two,
applying the results of joining two tables to the next table and so on until it has gone through all
the tables of the FROM.
There are currently three different types of join:
1. Nested Loop
2. Sort-merge Join
3. Hash join
The behaviour of the optimizer can be changed so that it chooses another join method, using
the hints USE_NL, USE_MERGE, USE_HASH.
Nested Loop Join
The optimizer chooses a table as the master or outer. For each record of the outer Oracle
searches in the inner for all the records that link to the records of the outer. In the example
below the dept table is the master, and for each record that is found in dept, matches are
searched for in emp.
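A sketch of what such a sentence and plan typically look like (the DEPT/EMP demo tables are assumed, as is the index name on emp.deptno; the actual plan depends on statistics and indexes):

    SELECT /*+ USE_NL(e) */ d.dname, e.ename
    FROM   dept d, emp e
    WHERE  d.deptno = e.deptno;

    NESTED LOOPS
      TABLE ACCESS (FULL) OF 'DEPT'              -- outer (master) table
      TABLE ACCESS (BY INDEX ROWID) OF 'EMP'     -- inner table, probed once per DEPT row
        INDEX (RANGE SCAN) OF 'EMP_DEPTNO_I'     -- hypothetical index on emp.deptno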

Sort-merge Join
A sort-merge join can only be performed with an equijoin (=). Oracle orders the two data sources (if
they are not already ordered) by the columns of the equijoin, then merges the two ordered sources,
returning the rows that match on the join fields.
In the next example, Oracle first goes through the entire DEPT table using the index PK_DEPT.
As the data is already ordered by the index, it is not necessary to order the DEPT data. Then it
goes through the entire EMP table and orders the data by the deptno field (SORT by the field of
the equijoin).

Once the two groups of data have been ordered, Oracle joins one to the other (MERGE JOIN).
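A sketch of the sentence and plan described (the hint and the demo tables are assumptions):

    SELECT /*+ USE_MERGE(e) */ d.dname, e.ename
    FROM   dept d, emp e
    WHERE  d.deptno = e.deptno;

    MERGE JOIN
      TABLE ACCESS (BY INDEX ROWID) OF 'DEPT'
        INDEX (FULL SCAN) OF 'PK_DEPT'   -- DEPT already ordered via the index
      SORT (JOIN)
        TABLE ACCESS (FULL) OF 'EMP'     -- EMP ordered by deptno before the merge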

Hash Join
It can only be used with equijoin (=), and using the cost-based optimizer. Given the following
two data groups:
S={1,1,1,3,3,4,4,4,4,5,8,8,8,8,10}
B={0,0,1,1,1,1,2,2,2,2,2,2,3,8,9,9,9,10,10,11}
The process is as follows:
1. The smaller data source (S) is selected and fully read to build a hash table. If this hash table
does not fit in memory, it is built by partitions or pieces (applying an Oracle internal
hash function), which are stored on disk (fan-out). In addition to creating the hash table (with
the maximum number of partitions that fit in memory), a bit vector is created
with the values existing in the join field.
Bit vector of S: {1,3,4,5,8,10}
2. The other table (B) starts to be read.
2.1. Bit-vector filtering: it is checked whether the value of the join field exists in the bit vector
created above. If it is not in that vector, the row is directly rejected.
The following rows of B are therefore directly discarded: {0,0,2,2,2,2,2,2,9,9,9,11}
2.2. The hash function is applied to the join field of B. If that partition is in the memory, the joined
row is directly returned.
2.3. If that hash partition is not in the memory, it is written in a temporary segment in the form of
a partition or piece as was done with table S. Accordingly, sets of partitions, which have the
records to be joined, will be obtained.
This process is repeated until the entire data source of B is read.
3. Comparison of partitions. Partitions will now be collected two by two, one from each data
source (S and B). The Hash table of the smaller partition is created in the memory and
compared in the memory with the other partition, returning the rows that meet the join.
Step 3 is repeated until all the partitions created have been processed.
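The internal hashing steps above are not visible in SQL; what can be influenced is the choice of the method itself, for example with the USE_HASH hint (a sketch over the demo tables):

    SELECT /*+ USE_HASH(e) */ d.dname, e.ename
    FROM   dept d, emp e
    WHERE  d.deptno = e.deptno;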

Unit 1
Module 3: Execution Plan
Which join to use?
If the rows to be returned by the join are not many (fewer than 10,000 rows approx.):
Oracle tends to use Nested Loop.
If the rows to be returned by the join are many (more than 10,000 rows approx.):
It is advisable to use the Hash Join. (Provided that the cost-based optimizer is used. If not, there
is no other option but to use the sort-merge join.)
The sort-merge join is not advisable (for many rows), due to the ordering costs, so it is
preferable to use the hash join with cost-based optimizer.
Actually, the choice Oracle makes is much more complex and does not depend exclusively on
the number of records; it takes into account the costs of the I/O operations, the sort and
hash memory areas, the existing statistics, the data model, the available indexes, etc. Here only
an overview of the choice of one plan or another is provided, although it is the optimizer that has
more information to decide on the optimal join plan (and even so it does not always make the
best decision).

Unit 2
Module 1: In the SELECT
COUNT(*) versus COUNT(1)
In versions prior to 8i, the use of count(*), count(1) or count(c1) could give different response
times depending on whether indexes were used or not.
From Oracle 8i, the use of count(*) or count(1) is equivalent, as both can use index accesses
(fast full index scan), obtaining the same response times.

Unit 2
Module 1: In the SELECT
Order of the fields in the SELECT
The order of the fields in the SELECT clause does not affect the performance of the sentences
at all.

Unit 2
Module 1: In the SELECT
S1. Fields necessary in the select
It is advisable to avoid the use of SELECT *, restricting the fields in the select list to those that
are really necessary. This reduces the volume of information accessed and
transported from the server to the client. It also enables access by index alone, instead of
index->table access (when there is an index that covers the requested fields).
If it is not necessary to use all the fields of the table, not using the SELECT * can also be
beneficial to give more clarity when reading SQL code and avoid errors in the cursors when
assigning values to variables.

Unit 3
Module 1: In the FROM
Number of tables in the FROM
The cost-based optimizer must choose the order in which the joins between the tables are
performed (joins are always performed in pairs of tables). This choice is made during the
PARSE of the sentence, trying all possible join combinations and choosing the one with the
least cost.
When the number of tables in the FROM (number of joins) is very high, the optimizer decides
not to test all the possible combinations as it would take a long time, only testing some of the
combinations, so there is a greater probability of not choosing the best execution plan.
When the total number of tables in the FROM is more than eight, Oracle can no longer
perform all the necessary checks and non-optimal plans are chosen.
It is advisable to try to avoid putting more than 8 tables in the FROM, performing the query in
various separate SELECTs. If it is completely impossible, as a last resort, the "ORDERED" hint
could be used, ordering the tables in the FROM in accordance with the order most suitable for
performing the JOINs.

Unit 3
Module 1: In the FROM
Order of the tables in the FROM
For the cost-based optimizer (which is the one that should be used), the order of the tables in
the FROM does not affect the execution plans nor the performance of the sentence.
The only exception is when the "ORDERED" hint is used.

Unit 3
Module 1: In the FROM
Distributed sentences
Oracle enables, transparently for the user, queries of objects that are in remote databases.
However, this transparency is not maintained in the execution plans or in the performance, so
special care must be taken with sentences over distributed databases.
How distributed sentences are processed
1. Remote SQL: when all the tables in the FROM belong to remote tables in a single database,
the sentence is sent as is to the remote database, and the execution plan is obtained and
executed as if it were in local. The resulting data must go from the remote database to the local
database.
2. Distributed SQL: when in the FROM there are local and remote tables, or remote tables in
various databases, Oracle must break down the sentence, to execute the part that corresponds
to each database and also to execute the local part in the local database.
The local database "becomes the master" (if not specified to the contrary with a Hint), so it is the
database that receives all the data and is responsible for the joins, groupings and orderings.
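As a hypothetical sketch (the database link name remotedb and the tables are assumptions), a distributed sentence and the hint that moves the "master" role to the remote site could look like this:

    SELECT l.dname, r.ename
    FROM   dept l, emp@remotedb r
    WHERE  l.deptno = r.deptno;

    -- Asking the remote site to drive the execution instead of the local database:
    SELECT /*+ DRIVING_SITE(r) */ l.dname, r.ename
    FROM   dept l, emp@remotedb r
    WHERE  l.deptno = r.deptno;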
Recommendations for distributed sentences

Always use the cost-based optimizer.

Although distributed access is transparent for the application, it is important to know that
the execution plans are strongly modified, and it is a good idea to analyse how the
division of the query is performed in the various databases.

It is not true that when moving from a non-distributed database to a distributed one we
should not worry. Rather the opposite is true, all mixed plans (local/remote) must be
reviewed.

Special care must be taken with the data volume to be "moved" over the network, as all
the data is fetched to the "master" server.

This data volume is not the final volume the query returns, but the volume required to
perform the joins. (Although the select returns a single record, it can make thousands of
records move to make the joins.)

Take into account the topology of the network, knowing its bandwidth and its availability.
In accordance with these parameters, evaluate the data movement and network
crashes.

In some cases, the use of views in remote databases helps to improve response times,
due to the simplification of the processes to "break down" the sentences, obtaining in
some cases better plans.

Hints in remote tables


The only hints that work over remote tables are the definition hints of the join order and the type
of join to be performed. Access method and parallelism hints, etc. do not work over remote
tables.

Unit 4
Module 1: In the Data Model
Excess of indexes
Indexes can allow us to improve the time taken to access information stored in tables. However,
an excess of indexes worsens the INSERT, UPDATE and DELETE operations over the table, as
in addition to storing the information in the table, the index must be updated.
This worsening of the times depends to a great extent on the degradation of the index, the
data volume, the quantity and frequency of the modifications to the table, the size of the
fields to be indexed, etc. Even so, it is advisable to pay attention to tables that
have more than six indexes, reconsidering the design if truly necessary.

Unit 4
Module 1: In the Data Model
Redundant indexes
An index is considered to be redundant and should be eliminated when there is another index
that has the same columns on the left. The following indexes would be redundant:
Redundant index (contained in the one on the right)    Index that should remain
c1                                                     c1, c2
c1, c2                                                 c1, c2, c3

Unit 4
Module 1: In the Data Model
Indexes and Foreign Keys
If in the "parent" table update or delete operations are to be performed, an index must be
defined in the "child" table by the same fields of the foreign key, for two reasons:
1. Performance, improving, when performing update or delete in the "parent", the search in the
child that assures that the update or delete can be performed.
2. By blocks between the "parent" and "child" tables.
Operation performed on the "parent" table: DELETE or UPDATE.

Without an index on the FK fields of the "child" table, locks held on the "child" table:
- Up to Oracle 8i: the entire table is locked, to avoid inconsistent changes. No insert, update or delete can be performed on the child table.
- From Oracle 9i: only the records affected by the change in the parent table are locked, and the fields of the foreign key cannot be changed.

With an index on the FK fields of the "child" table, locks held on the "child" table:
- Versions 7, 8, 8i and 9i: only the affected records.
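A minimal sketch, assuming the usual DEPT (parent) and EMP (child) demo tables and hypothetical constraint/index names:

    ALTER TABLE emp ADD CONSTRAINT emp_dept_fk
      FOREIGN KEY (deptno) REFERENCES dept (deptno);

    CREATE INDEX emp_deptno_fk_i ON emp (deptno);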

Unit 4
Module 1: In the Data Model
Use of composite indexes
A composite index is an index that is formed by various fields. It is important to remember that
for the composite index to be used, at least the first field of the index must appear in
the WHERE.
For example, I have an index formed by fields c1, c2 and c3.

In the WHERE I ask for:    Could I use the index?
c1, c2, c3                 Yes
c1, c2                     Yes
c1                         Yes
c2                         No
c3                         No
c2, c3                     No
c1, c3                     Yes (only the c1 column of the index is used)
Exceptions:
1. Index Fast Full Scan: when there is an index that has all the fields specified in the
SELECT, so instead of performing a Full Table Scan, it performs a complete scan of the
index (Index Fast Full Scan).

2. An ORDER BY is requested and the fields belong to an index that, moreover, does not
allow null.

3. From Oracle 9i, there is the option of using the index by means of an "INDEX SKIP
SCAN", which in some cases can improve performance with respect to "FULL TABLE
SCAN". In any case, the use of "INDEX SKIP SCAN" is not considered to be very
effective. It is advisable to continue defining the right order in the fields of the index (a
new feature of Oracle 9i).

Unit 4
Module 1: In the Data Model
Effective indexes
Indexes are important to make data searches quicker. It is important to always create the most
effective indexes possible, in accordance with the following rules:
1. They must be created in accordance with the fields used in the WHERE.

2. They will be useful when a small set of data is searched for, within the total volume of
the table. (If a lot of data is searched for, a "Full Scan" of the table may be more
effective.)

3. It is advisable not to abuse the number of indexes over tables. ("Excess of indexes".)

4. It must not be redundant with respect to another existing index. ("Redundant indexes".)

5. For composite indexes, the fields with the greatest selectivity must be attempted to be
used as the first fields of the index. The selectivity of a field (and therefore of the index)
is measured in accordance with the repetition or not of this field within the table. For
example, in a table of employees, the selectivity of the sex, town and ID fields will be as
follows:
Field    Selectivity
ID       Very selective
town     Less selective
sex      Not very selective
Take into account "Use of composite indexes".
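For instance, if searches combine town and sex, a sketch of a composite index ordered by decreasing selectivity (the names are hypothetical) would be:

    CREATE INDEX emp_town_sex_i ON employees (town, sex);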

Unit 4
Module 1: In the Data Model
Obsolete data types
The following data types are obsolete for Oracle (updated for version 9.0.2):

Obsolete data type              Must be replaced with     Made obsolete in version   Support of the obsolete data type
LONG (char up to 2 Gb)          CLOB (char up to 4 Gb)    8.0                        9.0.2: supported for compatibility; it could disappear in future versions.
RAW (binary up to 2000 bytes)   BLOB (binary up to 4 Gb)  8.0                        9.0.2: supported for compatibility; it could disappear in future versions.
LONG RAW (binary up to 2 Gb)    BLOB (binary up to 4 Gb)  8.0                        9.0.2: supported for compatibility; it could disappear in future versions.

For data types that are still supported by compatibility, their migration to the new data types is
very important to assure the operation of applications in future versions of Oracle and to be able
to use the improvements that the new data types provide.

Unit 4
Module 1: In the Data Model
My index is not used
Some of the reasons why my index is not used are covered below.
1. Does the index exist?

2. Are the statistics up to date?

3. At least the first field of the index must appear in the WHERE (Rule "M4. Use of
composite indexes"). Exceptions:
- Index Fast Full Scan: when there is an index that has all the fields specified in the
SELECT, so instead of performing a Full Table Scan, it performs a complete scan of the
index (Index Fast Full Scan).
- An ORDER BY is requested and the fields belong to an index that, moreover, does not
allow nulls.
- From Oracle 9i, there is the option of using the index by means of an "INDEX SKIP
SCAN", which in some cases can improve performance with respect to "FULL TABLE
SCAN". In any case, the use of "INDEX SKIP SCAN" is not considered to be very
effective. It is advisable to continue defining the right order in the fields of the index (a
new feature of Oracle 9i).
4. Are the fields of the index used for the join? If so, the use of the index depends on:
- The type of JOIN that is performed. (NESTED LOOP is the only one that enables the
use of an index.)
- The order in which the JOINs are performed. Specifically, with NESTED LOOP,
depending on which table is the incoming one and which is the outgoing, the index is
used or not.
5. Is a function being applied over the indexed field? (Rule "W4. Comparers and indexes")
A function such as SUBSTR(field1,1,2) prevents the index from being used, unless
a function-based index is used (the index is created applying the SUBSTR
function); see the sketch after this list.

6. Is an implicit conversion of types being performed? (Rule "W5. Implicit conversions")


Implicit conversions can disable the indexes. Do not use implicit conversions.

7. It uses an index, but not the one I want.


The use of one index compared to another is chosen by the cost-based optimizer: it
chooses the one with the lesser cost. It is extremely difficult for two indexes to have the
same cost (same size, same number of leaf blocks, depth, blocks, etc.). If this happens, the
cost-based optimizer chooses in alphabetical order. There is always the option of using hints
to ensure the use of the index that we want. (Rule "H1. Considerations about Hints")

8. My index is not good.


It may be that my index is not very good (not very restrictive). (Rule "M5. Effective
indexes")
It may be that the data is not uniformly distributed, and as there are no histograms, the
optimizer draws the wrong conclusions. (See "Overview of SQL-> The Optimizer ->
Histograms".)

9. Are you asking about null values?


Null values are not stored in the index (for composite indexes if all the values are null,
they are not stored in the index). An index cannot be used when asking about null
values (Rule "W4. Comparers over indexed fields.")

10. A remote table is being used.


The use of remote tables (distributed queries) is complicated for the optimizer. We will
publish recommendations for distributed queries at a future date.

11. Is Parallel Query being used?


Make sure that Parallel Query is not being used and check whether the tables have a
degree of parallelism activated. The use of parallel query causes a tendency to perform
"Full Table Scan in parallel" rather than access by indexes.

12. Are bind variables being used?


In some circumstances the use of bind variables causes indexes not to be used, as the
optimizer does not know the value of the data used and performs fixed pre-calculations
that might be erroneous.
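Regarding point 5 above, a sketch of a function-based index that allows the SUBSTR condition to use an index (the names are hypothetical; the cost-based optimizer, up-to-date statistics and the query-rewrite privilege/settings required by your Oracle version are assumed):

    CREATE INDEX emp_ename_substr_i ON emp (SUBSTR(ename, 1, 2));

    SELECT ename FROM emp WHERE SUBSTR(ename, 1, 2) = 'MA';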

Unit 4
Module 1: In the Data Model
Locks in bitmap indexes
Bitmap indexes are a special type of Oracle index, which might be more beneficial than the
B*Trees, when the number of different values that the field can have is very small.
These indexes, for each one of the different values, will internally store a 1 or a 0 per record,
depending on whether this record has this value or not. They might occupy much more space,
but they might be much quicker.
However, they have a significant disadvantage for environments in which modifications or
insertions are made concurrently by various users.
Any DML (Data Manipulation Language) operation, such as insert, update or delete, will take
an exclusive lock on "part" of the index, which might impede concurrent modifications of that table,
provided it affects the fields that form part of the bitmap index.
Normally, they are used in a very beneficial manner in decision-making environments (data
warehouse) and in a not very beneficial manner in online transactional environments (OLTP).
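A minimal sketch of such an index on a low-cardinality field (the table and column names are hypothetical):

    CREATE BITMAP INDEX orders_status_bix ON orders (status);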

Unit 5
Module 1: In the
WHERE

Use of Bind Variables


The execution of an SQL sentence requires some preliminary "compilation" steps (parse) that
analyse its syntax, check the existence of objects, verify the permissions and decide its
execution plan.
Since version 7 of Oracle, a cache of SQL sentences has been included, housed within the
Shared Pool, which is responsible for storing all the latest SQL sentences that have been
executed in the database. The purpose of this cache is to improve response times when
executing an SQL sentence. If this sentence is in the cache, the preliminary checking and
execution plan steps can be avoided, moving directly on to executing the SQL sentence, based
on the information stored in the cache.
Hard Parse
When a sentence is not in the Shared Pool cache, it must carry out a full or hard parse, which
means obtaining memory for the sentence, checking the syntax, the objects, the data types, and
obtaining its execution plan. This process is a complex process, requiring significant use of the
CPU.
Soft Parse
If the sentence is already in the Shared Pool and is "shareable", Oracle only needs to perform a
"soft parse", which is much less costly.
Shareable sentences in the Shared Pool.
Two sentences are considered to be shareable in the Shared Pool, enabling quicker execution
using the information of the cache, when the following conditions are met:
- They are written exactly the same on the ASCII level (including letters, blank spaces, upper
case letters, etc.).
- The objects referenced in both sentences are the same (same user).
- If bind variables are used, they are of the same type and size for both sentences.

- The optimizer set for the two sessions that execute the two sentences must be exactly the
same (rule, first_rows, all_rows).
- The configuration of the NLS (National Language Support) must be the same for the two
sessions that execute the two sentences.
What are bind variables?
It consists of replacing the values used in the searches of the WHERE conditions with variables.
For example the following sentences:
SELECT name, surname, address FROM employees WHERE dept_id = 20;
SELECT name, surname, address FROM employees WHERE dept_id = 30;
They express in the WHERE the literals 20 and 30 as department identifiers (dept_id). These
two sentences on the ASCII level are not the same, which means they are not "shareable".
When executing them it will be compulsory to carry out a hard parse, if they were previously not
already in the cache.
However, if this sentence is changed to:
SELECT name, surname, address FROM employees WHERE dept_id = :bindA;
Every time this sentence is executed, it can be shared, increasing the probability of performing a
soft parse, regardless of the value assigned to the variable ":bindA".
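For example, from SQL*Plus the bind variable could be declared and used like this (a sketch; the employees table is the one used in the text above):

    VARIABLE bindA NUMBER
    EXECUTE :bindA := 20

    SELECT name, surname, address FROM employees WHERE dept_id = :bindA;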
Advantages of bind variables and shareable sentences
They favour the use of the cache of sentences in the Shared Pool, reducing the consumption of
memory and CPU in the database. Especially useful in very common sentences.
Disadvantages of bind variables
In some cases when performing the parse of the sentence and obtaining the execution plan, not
knowing the value that we are asking for (the variable) makes the execution plans worse. These
cases are:
- Using histograms. The histogram helps the optimizer to know the distribution of data for a
specific value. Using bind variables, hiding the value sought, prevents the use of histograms.
- Partitioning. In partitioned tables or indexes, the search values define the partitions to be
used (going through all the partitions or only those that are necessary). When using bind
variables, the optimizer does not know the values I am looking for, so it cannot restrict the
number of partitions to be used, tending therefore to perform processes that go through all the
partitions, with the consequent degradation of the searches.
- Pseudo-dynamic sentences. It is a typical example of sentences used on free-search
screens, which are increasingly used in applications. Accordingly, these screens allow the user
to freely search in a significant number of different fields. However, the SQL sentence that the
application generates is fixed and does not adapt to what the user is searching for at all times.
This practice is not at all advisable because of the complexity of the sentences executed, the

quantity of LIKE conditions generated and the impossibility of optimizing searches in


accordance with fixed rules. Our advice:
- Do not use these fixed sentences and make them dynamically depending on the search
criteria.
- In addition to being dynamic, try to divide the search on various screens grouping and limiting
the search conditions (for example, if you search by name, force the entry of the surname; if you
search by address, force the entry of the city or post code, etc.), forcing the use of compulsory
fields in the search. (These compulsory fields will be candidates for inclusion in the indexes of
the database.)
- Finally, the least advisable recommendation: the use of pseudo-dynamic sentences. These
sentences usually use a binary field that serves as a flag to know if I am asking for a specific
field or not. An example:
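A sketch of what such a sentence might look like (the customers table and its columns are hypothetical, following the variable names described below):

    SELECT customer_name, surname1, surname2
    FROM   customers
    WHERE  surname1 = :vsurname1
    AND    (:flagsurname2 = 0 OR surname2      = :vsurname2)
    AND    (:flagname     = 0 OR customer_name = :vname);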

This sentence uses the bind variables "vsurname1", "vsurname2" and "vname" correctly to
identify the search values by surnames and name.
The bind variables "flagsurname2" and "flagname" are used to determine whether we are
searching or not by second surname and by name. If these flags have a value of 0, it means
they do not search by this field, and thanks to the OR they impede the search in the rest of the
condition (surname2, customer_name). If they have a value of 1, they search by the condition
after the OR.
These flags partially improve the performance of these sentences, provided that the flags
themselves are not passed as bind variables (only the search values are). If we search by name of
the customer, for example, the sentence should be executed as follows:
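A sketch, with the flags written as literals (searching by first surname and by customer name, but not by second surname; the table and columns are the same hypothetical ones as above):

    SELECT customer_name, surname1, surname2
    FROM   customers
    WHERE  surname1 = :vsurname1
    AND    (0 = 0 OR surname2      = :vsurname2)   -- flagsurname2 = 0: not filtering by surname2
    AND    (1 = 0 OR customer_name = :vname);      -- flagname = 1: the condition after the OR applies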

- Queries by ranges. In some queries that ask for ranges (>, <, between), or when LIKE is
used, for example, the use of bind variables can result in worse plans. This is because the
optimizer does not know the exact value of the variable, and it cannot calculate how big the

range to be queried is. The optimizer assumes fixed estimated rules and it can result in worse
plans.

As a general rule, it is advisable to use bind variables due to their significant impact on the
performance of the database. If partitioning, histograms or pseudo-dynamic sentences are
used, bind variables should be used only in fields not affected by the partitioning, the histogram
or the flag, respectively. If in doubt, you can consult the Database Administration team.

Unit 5
Module 1: In the WHERE
Long INLISTs
Long INLISTs are considered to be IN clauses in which a long list of values is detailed. It is
advisable to replace long INLISTs with search tables (performing a join with a table that contains
values to be searched for).
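As an illustrative sketch (the table names are hypothetical), a long IN list can be replaced by a join against a search table that holds the wanted values:

    -- Long INLIST
    SELECT * FROM orders WHERE status_code IN (1, 2, 3, 4, 5, 7, 9, 11, 12, 15);

    -- Replaced by a join with a search table containing those codes
    SELECT o.*
    FROM   orders o, wanted_status_codes w
    WHERE  o.status_code = w.status_code;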

Unit 5
Module 1: In the WHERE
Operators over indexed fields
The use in the WHERE of operators over indexed fields (operators to the left of the comparer)
may make the use of indexes impossible. Here are some examples of good and bad practices:

Bad practice in WHERE:  WHERE amount + 5000 = advance_payment
Good practice in WHERE: WHERE amount = advance_payment - 5000

Bad practice in WHERE:  WHERE TRUNC(date) = TRUNC(sysdate)
Good practice in WHERE: WHERE date BETWEEN TRUNC(sysdate) AND TRUNC(sysdate) + 0.99999

Bad practice in WHERE:  WHERE account_name = NVL(:acc_name, account_name)
Good practice in WHERE: WHERE account_name LIKE NVL(:acc_name, '%')

Unit 5
Module 1: In the WHERE
Comparers over indexed fields
The use in the WHERE of some comparers may make the use of indexes impossible. Here are
some examples:
Comparer                                  Does it use the index?
WHERE field = value                       Yes
WHERE field <> value                      No
WHERE field > value                       Yes
WHERE field < value                       Yes
WHERE field BETWEEN value1 AND value2     Yes
WHERE field LIKE 'va%'                    Yes
WHERE field LIKE '%lor'                   No
WHERE field LIKE '%alo%'                  No
WHERE field NOT LIKE 'val%'               No
WHERE field IS NULL                       No *
WHERE SUBSTR(field,1,2) = 'MA'            No **

* Null values are not stored in the index.
** Unless an index based on the same function (SUBSTR(field,1,2)) is used, the index will not be used.

Unit 5
Module 1: In the WHERE
Implicit conversions
When in a WHERE clause the data type of the column does not match the data type of the
value asked about, Oracle does not generate an error, but internally performs a conversion
(implicit conversion).
These implicit conversions in many cases cause the indexes not to be used.
Letting Oracle perform the implicit conversion is considered to be a bad practice. We must try to
make the types of the variables and the fields match, or perform the conversion expressly.
For example, in the DEPT table there is an index with the field "description" (varchar2). This is
what happens:
Condition                          Does it use the index?
WHERE description = '1'            Yes
WHERE description = TO_CHAR(1)     Yes
WHERE description = 1              No (Oracle implicitly applies TO_NUMBER(description), disabling the index)

Unit 5
Module 1: In the WHERE
Having and Where
It is important to correctly distinguish and use the HAVING and the WHERE.
WHERE is used to filter the records to be returned.
HAVING is applied after the WHERE, with grouping functions (GROUP BY) and is used to filter groups.
It is good to always filter with the WHERE the greatest number of records so that the grouping
functions can manage fewer records, rather than not filtering with WHERE and overloading the
grouping.

Bad practice with HAVING versus good practice with HAVING (example below).
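A sketch of both cases (the EMP demo table is assumed):

    -- Bad practice: filtering after grouping what could be filtered before
    SELECT deptno, COUNT(*)
    FROM   emp
    GROUP BY deptno
    HAVING deptno = 10;

    -- Good practice: filter the rows with WHERE, group only the remainder
    SELECT deptno, COUNT(*)
    FROM   emp
    WHERE  deptno = 10
    GROUP BY deptno;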

Unit 5
Module 1: In the WHERE
ROWNUM
The ROWNUM clause in a WHERE is used to return only a limited number of records out of the total
of records that meet the rest of the WHERE condition.
The ROWNUM clause can only be used with the operators "<" and "<=".
With respect to the suitable use or not of ROWNUM, in our opinion it must be used, as it is the
only way we have of "cutting down" the number of records returned to a given total, but we must be
very clear about what really happens internally in Oracle, as although ROWNUM is
used, Oracle may process many more records.
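A minimal sketch of the usual usage (the flights table is hypothetical):

    SELECT *
    FROM   flights
    WHERE  origin = 'MAD'
    AND    ROWNUM < 10;   -- returns at most 9 of the rows that meet the condition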
First example
In this example the optimizer manages to propagate the ROWNUM limit through almost all of the
execution plan, so it does not seem to be very damaging.

This is its execution plan with its estimation of cost and of records:

And this is the actual execution (of records), obtained with tkprof:

Let's study the execution plan obtained. As explained in SQL Overview -> Execution plan ->
Interpreting the plan, Oracle executes the sentences performing operations two by two, so:

1. Oracle performs "TABLE ACCESS FULL" on the entire VUELOS table, going through a
total of 57,711 records (not limited by ROWNUM). Then by it goes through the primary
key of the PLAZAS table, called PLA_PK, but in this case it does limit the number of
ROWNUM records (9). A Hash Join is performed on these two data blocks, returning a
total of 9 records.

2. A "TABLE ACCESS FULL" is performed on the entire COMPANIAS table, without being
limited by ROWNUM, finding a total of 8 records. Then it performs a Hash Join on the 8
COMPANIAS records and the 9 records returned in step 1. The hash join returns 9
records.

3. The "COUNT STOPKEY" is performed, which is really counting up to the 9 records that
must be returned to respect the ROWNUM<10 that appears in the WHERE clause.
Actually this COUNT in this execution does not filter, because in the previous step only
the 9 desired records were returned.

As observed in this plan, the "FULL TABLE SCAN" operations have not been limited by the
ROWNUM, whereas the "INDEX (FAST FULL SCAN)" operation has limited the records
analysed, returning 9 records and thus managing to return what was requested in the ROWNUM.
Second example

Let's see the same sentence, but in this case we have added a pair of hints to force a much
worse execution plan, which allows us to show how the plan is not completely limited by the
ROWNUM condition.

This is its execution plan with its estimation of cost and of records:

And this is the actual execution (of records), obtained with tkprof:

This execution plan shows that to return the nine records requested by ROWNUM<10, a total of
3,313,493 records have been gone through (in various tables).

Unit 5
Module 1: In the WHERE
Antijoin
Anti-Join Overview
An antijoin is a query that returns records from a table that do not correspond to records from
another table. It is in some way the opposite operation to a join.
In Oracle there are the following ways of performing an Anti-Join:

With NOT IN

With NOT EXISTS

With MINUS

With OUTER-JOIN

Use of NOT EXISTS versus NOT IN


Use NOT EXISTS rather than NOT IN when there is an index in the join field of the table of the
subselect.
We have the following sentence with NOT IN with its corresponding execution plan:
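A sketch of the sentence discussed (the DEPT/EMP demo tables are assumed; the plan itself is not reproduced in this extract):

    SELECT d.deptno, d.dname
    FROM   dept d
    WHERE  d.deptno NOT IN (SELECT e.deptno FROM emp e);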

Oracle first performs the subselect (SELECT deptno FROM emp), so as it does not have any
WHERE it goes through the entire emp table. Then it goes through the dept table, looking for
the departments that do not have any employee associated.
However, reconstructing the sentence with NOT EXISTS obtains the following result:
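The equivalent sentence rewritten with NOT EXISTS would be, as a sketch:

    SELECT d.deptno, d.dname
    FROM   dept d
    WHERE  NOT EXISTS (SELECT 1 FROM emp e WHERE e.deptno = d.deptno);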

Oracle now performs a NESTED LOOP, so for each department (DEPT) it performs a search
BY INDEX in the employees table (EMP). So we have avoided performing the FULL SCAN of
emp, accessing now by the index of the join field.
Take into account that we assume that the volume of the EMP table is big enough to make it
worth using an index rather than a full scan of the entire table.

Unit 6
Module 1: Orderings/Groupings
Overview of ordering
Orderings are costly operations for the database, and they may be one of the major reasons
why our sentences are not as quick as we would like.
As a general rule they must be avoided when not essential.
The operations that require an ordering are:

Creation of an index

Use of the grouping clauses GROUP BY, DISTINCT

Use of the clause ORDER BY

Join using SORT-MERGE

Use of the set operators UNION, INTERSECT and MINUS

Unit 6
Module 1: Orderings/Groupings
ORDER BY
Use the ORDER BY clause only when it is important for the application to obtain ordered data.
ORDER BY is frequently used without ordering being necessary.

Unit 6
Module 1: Orderings/Groupings
DISTINCT and GROUP BY
Use the DISTINCT clause only when you are sure that duplicated records can be returned and they
need to be eliminated. DISTINCT is often used when it is not necessary, forcing an ordering.
The same happens with the GROUP BY clause, which is sometimes used unnecessarily, without
grouping functions.

Unit 6
Module 1: Orderings/Groupings
UNION ALL versus UNION
The UNION set operators aggregate the results of various SELECTs. UNION, unlike UNION
ALL, performs an additional ordering to eliminate repeated records.
Use UNION ALL instead of UNION when we want to receive repeated records or we are sure that
no repeated records will be returned.

Unit 7
Module 1: In the INSERT

Specify Fields in the INSERT


Performing an INSERT indicating the VALUES directly and presupposing the order of the fields in
the table is considered to be a very bad practice. Changes in the data model (eliminating
columns, or changes in order) can cause our programs to fail or, worse, not to fail but to insert
values in the wrong columns.

Bad practice:
INSERT INTO employees
VALUES ('García', 'Aranda', 'Oscar', '979123456');

Good practice:
INSERT INTO employees (surname1, surname2, name, telephone1)
VALUES ('García', 'Aranda', 'Oscar', '979123456');

Unit 8
Module 1: Hints
General considerations about the use of Hints
The optimizer does not always obtain the best execution plans. In some cases, knowledge of
the business and of the information stored can help to choose the best execution plans.
Hints are proposals made to the Oracle Optimizer on how to perform execution plans. The
optimizer considers them to be proposals and in some cases they might not be taken into
account by the optimizer.
The use of Hints implies the following risks:

Their use is proposed with a current data volume, for which it is verified that they bring
benefits. However, changes in volume due to the evolution of the system can mean that
hints introduced at a given moment are no longer suitable a few months later.

The evolution of Oracle Software, modifying the hints or expanding functionality, can
cause major changes in the optimizer. It may be that in new versions of Oracle, certain
Hints will no longer be used, so we must review the sentences that use them.

In cases in which hints are used it must never be forgotten that:

Hints must be used with moderation, in cases when really essential, after discarding
other possibilities.

They force the maintenance of clear documentation on where they are used, as
periodically the sentences that use them must be reviewed, to check whether they are
still suitable (due to Oracle version changes or due to the evolution of the volume and
the data).

Unit 8
Module 1: Hints
Why does my Hint not work?
Hints are proposals made to the optimizer and in some cases they might not be taken into
account by the optimizer.
In any case, if your Hint does not work, do not forget to check that you are respecting the
following rules:

They must be written like this: /*+ HINT [HINT ...] */ (several hints can be specified within the same comment).

Using any hint (except RULE) means that the cost-based optimizer will be used.
Remember that the statistics must be gathered and kept up to date.

The use of the RULE hint (rule-based optimizer) is annulled if a functionality that
requires the cost-based optimizer is used.

Hints must not refer to the name of the schema. It can be solved by using an alias in the
table. The following example is incorrect:

SELECT /*+ index(scott.emp emp1) */ ...

If an alias is used, the hint must use the alias and not the name of the table. For
example:

SELECT /*+ FULL ( myalias ) */ empno FROM emp myalias WHERE empno > 10;

Just after the "+" of the hint there must be a space, in PL/SQL blocks.

Do not use hints that do not make sense in the sentence. For example, using the hint
"FIRST_ROWS" in a sentence that has an ORDER BY.

For the INDEX hint:

- Its format is: SELECT /*+ INDEX(table_name index_name) */ col1...

- It is compulsory to use the name of the table (or the alias, if one is used).

- The name of the index is optional. If not specified, the optimizer chooses the
index. If specified, it must be the correct name, as if not, the hint is invalidated.

For distributed queries, in remote tables, the only hints that work are join order and join
type.

Unit 8
Module 1: Hints
Hints for the use of the optimizer
The hints related to the use of optimizer are:

ALL_ROWS

FIRST_ROWS

CHOOSE

RULE

ALL_ROWS
It specifies that the cost-based optimizer is used, trying to achieve the best throughput.
(Minimum consumption of resources, maybe penalising the response time.)
Format:

FIRST_ROWS
It specifies that the cost-based optimizer is used, trying to achieve the best response time,
perhaps penalising the consumption of resources.
The optimizer in FIRST_ROWS mode is more likely to use indexes, although it might mean an
increase in disk accesses. (Not always advisable.)
Format

CHOOSE
It specifies that the cost-based optimizer is used if a table has statistics. If not, the rule-based
optimizer will be used.
Format

RULE
It specifies that the rule-based optimizer is used.
Format
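The formats referred to above follow the general hint syntax; a sketch (the emp table is only an example):

    SELECT /*+ ALL_ROWS */   ename FROM emp;
    SELECT /*+ FIRST_ROWS */ ename FROM emp;
    SELECT /*+ CHOOSE */     ename FROM emp;
    SELECT /*+ RULE */       ename FROM emp;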

Unit 8
Module 1: Hints
Hints for the access method
The hints related to the table access method are:

FULL

ROWID

CLUSTER

HASH

INDEX

INDEX_ASC

INDEX_COMBINE

INDEX_JOIN

INDEX_DESC

INDEX_FFS

NO_INDEX

AND_EQUAL

USE_CONCAT

NO_EXPAND

REWRITE

NOREWRITE

FULL
It specifies that an access is performed going through the entire table (Full table scan)
Format:

ROWID
It specifies that an access is performed by means of the RowId
Format

INDEX
It specifies that a certain index is used.
Format
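A sketch of the formats for these three hints (the emp table and the index name are assumptions):

    SELECT /*+ FULL(e) */               ename FROM emp e WHERE deptno = 10;
    SELECT /*+ ROWID(e) */              ename FROM emp e WHERE rowid = :rid;
    SELECT /*+ INDEX(e emp_deptno_i) */ ename FROM emp e WHERE deptno = 10;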

Unit 8
Module 1: Hints
Hints for the performance order of the JOIN
The hints related to the performance order of the JOIN are:

ORDERED

STAR

Unit 8
Module 1: Hints
Hints for the JOIN operations
The hints related to the JOIN operations are:

USE_NL

USE_MERGE

USE_HASH

DRIVING_SITE

LEADING

HASH_AJ

MERGE_AJ

HASH_SJ

MERGE_SJ

Unit 8
Module 1: Hints
Hints for parallel executions
The hints related to parallel executions are:

PARALLEL

NOPARALLEL

PQ_DISTRIBUTE

APPEND

NOAPPEND

PARALLEL_INDEX

NOPARALLEL_INDEX

Unit 8
Module 1: Hints
Other hints
Other hints, with varying uses, are provided below:

CACHE

NOCACHE

MERGE

NO_MERGE

UNNEST

NO_UNNEST

PUSH_PRED

NO_PUSH_PRED

PUSH_SUBQ

STAR_TRANSFORMATION

ORDERED_PREDICATES

Unit 9
Module 1: Programming and SQL
Commits in loop
The commit operation is a complex, although not necessarily slow, operation for the database.
Including a commit inside a loop ("in each cycle") is not suitable, as it overloads the
database and causes the performance to worsen.
It should only be done when the business itself needs to ensure that in each cycle of the loop
the changes have been made and are visible to the rest of the users. The use of commits after a
large volume of changes is usual, but not after every cycle.

Bad practice:
WHILE (i < num_elements) LOOP
  ..... SELECT ...
  ...... INSERT .....
  COMMIT;
END LOOP;

Good practice:
WHILE (i < num_elements) LOOP
  ..... SELECT ...
  ...... INSERT .....
END LOOP;
COMMIT;

Unit 9
Module 1: Programming and SQL
Commits and closing of cursors
In some programming languages, such as Pro*C, performing a commit in our program can
mean a closure of the cursors that are open.
It is advisable to not produce this closure of the cursors, because the process of reopening the
cursors may be costly and cause unnecessary delays.
How could it be solved?

- Not performing a commit during the process, so that the cursors are only closed at the
final commit. When the cursor processes many records, this consumes a lot of
rollback segment, which is not suitable.

- Parameterising our programming language so that the cursors are not closed.

Parameterising our programming language:

Pro*C (compilation parameters):
MODE=ORACLE
CLOSE_ON_COMMIT=NO
(Be careful when changing from ANSI to ORACLE: the value of the SQLCODE variable when
there is no more data in a cursor changes. For ANSI it is 100 and for ORACLE it is 1403.)

Unit 10
Module 1: PL/SQL
Use of "Cursor FOR" instead of CURSOR loop.
Up to version 8 of Oracle, the treatment of cursors consisted of an OPEN of the cursor, a loop
that processes the data and a CLOSE that closes the cursor.
From version 8i of Oracle, there is another simpler and better performing way. It is a special
version of the FOR loop designed for cursors. The FOR performs OPEN, FETCH and CLOSE of
the cursors.
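A sketch of both styles (the emp table and DBMS_OUTPUT are assumed to be available):

    -- Classic treatment: OPEN / FETCH in a loop / CLOSE
    DECLARE
      CURSOR c_emp IS SELECT ename, sal FROM emp;
      r_emp c_emp%ROWTYPE;
    BEGIN
      OPEN c_emp;
      LOOP
        FETCH c_emp INTO r_emp;
        EXIT WHEN c_emp%NOTFOUND;
        DBMS_OUTPUT.PUT_LINE(r_emp.ename || ' ' || r_emp.sal);
      END LOOP;
      CLOSE c_emp;
    END;
    /

    -- Cursor FOR loop: OPEN, FETCH and CLOSE are implicit
    BEGIN
      FOR r_emp IN (SELECT ename, sal FROM emp) LOOP
        DBMS_OUTPUT.PUT_LINE(r_emp.ename || ' ' || r_emp.sal);
      END LOOP;
    END;
    /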

Unit 10
Module 1: PL/SQL
Use of %TYPE and %ROWTYPE.
When defining our variables in PL/SQL, to store the data of our tables, we can either indicate in
our code the data type or make it dynamic and in execution have it take the data type of the
data model.
This second option using %TYPE and %ROWTYPE enables us to ensure that a change in the
data model does not mean having to make changes to all our PL/SQL programs.
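A minimal sketch (the emp table of the demo schema is assumed):

    DECLARE
      v_sal  emp.sal%TYPE;    -- takes the data type of the column at run time
      r_emp  emp%ROWTYPE;     -- a record with the structure of the table
    BEGIN
      SELECT sal INTO v_sal FROM emp WHERE empno = 7839;
      SELECT *   INTO r_emp FROM emp WHERE empno = 7839;
      DBMS_OUTPUT.PUT_LINE(r_emp.ename || ' earns ' || v_sal);
    END;
    /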

Unit 10
Module 1: PL/SQL
Unnecessary use of the DUAL
Using DUAL unnecessarily to obtain function results causes unwanted accesses to the
database. These problems are usually aggravated when this use is done within loops.
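A sketch of the typical case in PL/SQL: assigning the value directly avoids the query against DUAL.

    DECLARE
      v_date DATE;
    BEGIN
      -- Unnecessary round trip to the database through DUAL
      SELECT SYSDATE INTO v_date FROM dual;

      -- The same value assigned directly, without querying DUAL
      v_date := SYSDATE;
    END;
    /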

Unit 10
Module 1: PL/SQL
Improper use of SELECT COUNT
Some programmers are used to performing a SELECT COUNT to check whether a cursor will
return data, and only if data exists is the cursor then opened. This way of programming
forces the query to be executed twice.
It is preferable to open the cursor directly and check whether there is data on the first
FETCH.
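A sketch of the preferred approach (the emp table is assumed): open the cursor and decide based on the first FETCH, instead of issuing a prior SELECT COUNT.

    DECLARE
      CURSOR c_emp IS SELECT ename FROM emp WHERE deptno = 10;
      v_ename emp.ename%TYPE;
    BEGIN
      OPEN c_emp;
      FETCH c_emp INTO v_ename;
      IF c_emp%NOTFOUND THEN
        DBMS_OUTPUT.PUT_LINE('No data for this department');
      ELSE
        LOOP
          DBMS_OUTPUT.PUT_LINE(v_ename);
          FETCH c_emp INTO v_ename;
          EXIT WHEN c_emp%NOTFOUND;
        END LOOP;
      END IF;
      CLOSE c_emp;
    END;
    /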

Unit 10
Module 1: PL/SQL
Explicit Cursors versus Implicit Cursors
Implicit cursors are those that are not expressly declared as CURSOR in the code. Internally
Oracle is responsible for performing all the open, fetch and close steps. They can be used in
PL/SQL only when they return a single record.
Explicit cursors are those that we declare expressly as CURSOR.
A cursor that returns a single record could be used in PL/SQL as explicit or implicit. It is
advisable to use explicit cursors, as they have better performance, although their syntax is a
little more complex. Implicit cursors always force the execution of two fetches: the first to fetch
the record and the second to make sure there are no more values (as they must return exactly
one record).
With an explicit cursor, on the contrary, I fetch the first record and close the cursor; I do not
perform two fetches (I "assume" that the cursor does not return more than one record, and I
avoid a new fetch).

Other advantages of explicit cursors:

They can be closed by code, previously releasing resources, without having to wait until
the PL/SQL block completely finishes.

If the cursor has Bind Variables, they can be reopened, taking on the new values. It is
quicker to use Oracle's internal structures created in the definition of the cursor.

Unit 11
Module 1: SQL Tricks
Ranking Queries
As of Oracle 8i there are extensions to SQL, with the inclusion of analytical functions.
They include RANK and DENSE_RANK, which allow us to rank the records obtained. An
example:
We want to rank the salaries of the employees of a company:
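A sketch (the emp table is assumed):

    SELECT ename, sal,
           RANK()       OVER (ORDER BY sal DESC) AS sal_rank,
           DENSE_RANK() OVER (ORDER BY sal DESC) AS sal_dense_rank
    FROM   emp;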

The difference between RANK and DENSE_RANK is that DENSE_RANK leaves no "gaps" in the
ranking values when the ranked values are repeated.

6. Using the cost-based optimizer, would these sentences have the same execution plan?

SELECT /*+ ORDERED */ e.name, d.name, o.office_name
FROM emp e, dept d, office o
WHERE e.deptid = d.deptid AND e.officeid = o.officeid;

SELECT /*+ ORDERED */ e.name, d.name, o.office_name
FROM office o, emp e, dept d
WHERE e.deptid = d.deptid AND e.officeid = o.officeid;

Yes, as the order of the tables in the FROM does not affect the execution plan.
No, as the plan always varies when the order of the tables is changed in the FROM.
It could vary as it is using the ORDERED hint.

7. Which characteristic pertains to the cost-based optimizer?

The execution plan does not vary and is based on Oracle's internal statistics, in accordance with the
data loaded in the tables.
It is based on Oracle's internal statistics, in accordance with the data loaded in the tables.
The execution plan is exclusively based on the existing data model.

8. Which of the following SQL clauses causes a join of the anti-join type?

NOT IN
UNION
Both.

9. From how many indexes over a table is it advisable to reframe the data model and the forms
of accessing the data, seeking alternative solutions with fewer indexes?

12.
4.
6.

10. What enables assurance of the use of %TYPE and %ROWTYPE when defining variables in
PL/SQL?

A change in the data of the table does not mean we have to touch up all our PL/SQL programs.
A change in the data model does not mean we have to touch up all our PL/SQL programs.
A change in the data model means we have to touch up all our PL/SQL programs.
