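[Figure: processing of an SQL statement in three phases: PARSE, EXECUTE and FETCH. During PARSE the statement is transformed and the optimizer decides on the plan. The RBO (rule-based optimizer) decides on the plan in accordance with rules. The CBO (cost-based optimizer) decides on the plan with respect to the statistics of the objects involved, calculating the cost of each object and its cardinality, and the costs with different join orders. The figure also shows the creation of structures for execution.]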
Unit 1
Module 1: Processing and Complexity of SQL statements
G1. Complexity of SQL statements
Extremely long and complex statements are not considered suitable, as they are more costly to maintain and more difficult for people to understand. Moreover, such complex statements usually result in unsuitable execution plans.
It is advisable to break these statements down into several SQL calls, even if this means more code in our applications. In the long run they are easier to maintain and give better response times.
Unit 1
Module 2: The Optimizer
What is the optimizer?
Every time a sentence is executed against the database, one of the things that Oracle must
decide is the optimal execution plan (determine how to access each object, and in which order
the joins to the various tables involved are performed).
The optimizer is responsible for doing this. Currently there are two types of optimizer for Oracle:
Rule-based: the criteria for deciding on the plan are in accordance with a series of fixed rules,
regardless of the volume or distribution of data (for example, the existence of indexes and types
of indexes). Maintenance of the rule-based optimizer was discontinued in Oracle version 8 and
it will no longer be supported from version 10.
Cost-based: Oracle takes into account the volume and distribution of data in accordance with
previously gathered statistics. Depending on the volume, it decides on the join orders or the use
or not of indexes, for example.
It is clear that for the cost-based optimizer it is essential to have reliable statistics, so they must
be periodically updated.
Which optimizer shall I use?
Of the two optimizers (rule-based and cost-based) it is always advisable to use the cost-based
optimizer. (Maintenance of the rule-based optimizer was discontinued in Oracle version 8 and it
will no longer be supported from version 10.)
Ways to propose the use of one type of optimizer:
Session: ALTER SESSION SET OPTIMIZER_GOAL= (from Oracle 9i: ALTER SESSION SET OPTIMIZER_MODE=)
Database: optimizer_mode=
Statement: using Hints, /*+ RULE */, /*+ FIRST_ROWS */, /*+ ALL_ROWS */
Possible values:
RULE: rule-based.
CHOOSE
The use of the following Oracle features forces the use of the cost-based optimizer rather than the rule-based optimizer:
Tables defined with PARALLEL
Partitioning
Function-based indexes
Inverted indexes
Histograms
Histograms are more complete statistics that must sometimes be used to provide the cost-based optimizer with more information.
The normal statistics gathered with ANALYZE presuppose a uniform distribution of the data. For example, a table has a field called status. ANALYZE obtains the total number of records of the table (for example, 2,000,000) and the number of different values that the status field contains (for example, 2 values: PENDING and COMPLETED). With these statistics, when we perform a select on this table filtering only on the status field, Oracle assumes that we will be looking for 1,000,000 records (total records / number of different values).
This assumption of a uniform distribution can cause many problems when the data is not actually distributed uniformly. If 90% of the records in the table have the value "COMPLETED", it would be good for Oracle to know this and to know that a query for the "PENDING" status will return far fewer records.
Accordingly, for fields in which there is a major deviation from a uniform distribution of data, it is advisable to create histograms, which, in short, record the number of existing records for the various values of the field.
Histograms are also Oracle statistics, so they must be periodically updated. Creating histograms slows down the gathering of statistics, so they must be used in moderation and only when really necessary (non-uniform distribution).
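The difference between the two estimates can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not Oracle's actual implementation: the function names are hypothetical, and the 90%/10% split is the assumed distribution from the example.

```python
from collections import Counter

def uniform_estimate(total_rows, distinct_values):
    # Without a histogram: rows are assumed to be spread evenly over the values.
    return total_rows / distinct_values

def histogram_estimate(value_counts, value):
    # With a histogram: the stored number of rows per value is used instead.
    return value_counts.get(value, 0)

# Assumed data: 2,000,000 rows, 90% COMPLETED and 10% PENDING.
status_counts = Counter({"COMPLETED": 1_800_000, "PENDING": 200_000})
total = sum(status_counts.values())

uniform = uniform_estimate(total, len(status_counts))
pending = histogram_estimate(status_counts, "PENDING")
print("uniform estimate for PENDING:  ", uniform)
print("histogram estimate for PENDING:", pending)
```

With the uniform assumption the optimizer would plan for half the table, while the per-value counts show that a search for PENDING touches only a tenth of it.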
Unit 1
Module 3: Execution Plan
Execution plan: how to consult it and interpret it
For the optimization of SQL sentences it is essential to understand the access plan (execution
plan) that Oracle uses in the execution of the sentence.
There are various methods to understand the execution plan:
1. From SQL*Plus executing EXPLAIN PLAN FOR.
2. From SQL*Plus executing SET AUTOTRACE.
3. Getting the trace of the session and using tkprof.
Execution plan with EXPLAIN PLAN
A table called PLAN_TABLE, where the execution plan is stored, must exist in the user's schema. If this table does not exist in the schema of the user executing the statement, and given that the structure of PLAN_TABLE varies between the different versions of Oracle, ask the Database Administration team to create the table in your schema.
From SQL*Plus, execute for example:
With the sentence ALTER SESSION SET SQL_TRACE= you can activate or deactivate
dumping in the trace file. Locate the trace file you have generated (tec01_ora_6857.trc):
Example of the interpretation of a plan
To interpret the execution plan obtained by any of these three methods, it is necessary to read
from inside to outside to understand how access to the data is achieved. This execution plan
shows:
The execution order of this tree is read from bottom to top and left to right. If you have any
doubts when interpreting execution plans, you can consult the Database Administration team.
Types of Join
When in a SELECT sentence more than one table is indicated in the FROM clause, Oracle,
when it obtains the data, must join the data of each table, in accordance with the conditions of
the WHERE. This union is called a join.
Oracle, regardless of the number of tables in the FROM, always performs table joins two by two,
applying the results of joining two tables to the next table and so on until it has gone through all
the tables of the FROM.
There are currently three different types of join:
1. Nested Loop
2. Sort-merge Join
3. Hash join
The behaviour of the optimizer can be changed so that it chooses another join method, using
the hints USE_NL, USE_MERGE, USE_HASH.
Nested Loop Join
The optimizer chooses one table as the master (outer) table. For each record of the outer table, Oracle searches the inner table for all the records that join to it. In the example below, the dept table is the master, and for each record found in dept, matches are searched for in emp.
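The nested loop idea can be sketched in Python as two loops, one inside the other. This is a simplified illustration, not Oracle's implementation: the sample dept and emp rows are invented, and the inner table is scanned in full for each outer row, whereas Oracle would normally probe an index on the inner table.

```python
def nested_loop_join(outer, inner, outer_key, inner_key):
    # For each record of the outer (master) table...
    for outer_row in outer:
        # ...search the inner table for all the records that join to it.
        for inner_row in inner:
            if outer_row[outer_key] == inner_row[inner_key]:
                yield {**outer_row, **inner_row}

# Invented sample data: dept is the outer table, emp the inner one.
dept = [
    {"deptno": 10, "dname": "ACCOUNTING"},
    {"deptno": 20, "dname": "RESEARCH"},
    {"deptno": 40, "dname": "OPERATIONS"},
]
emp = [
    {"ename": "KING", "deptno": 10},
    {"ename": "SMITH", "deptno": 20},
    {"ename": "FORD", "deptno": 20},
]

rows = list(nested_loop_join(dept, emp, "deptno", "deptno"))
for row in rows:
    print(row["dname"], row["ename"])
```

The cost grows with the number of outer rows multiplied by the cost of each inner search, which is why this method suits joins that return few rows.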
Sort-merge Join
A sort-merge join is typically used with an equijoin (=), although Oracle can also apply it to inequality join conditions. Oracle orders the two data sources (if they are not already ordered) by the join columns. It then merges the two ordered sources, comparing the rows pair by pair and returning those that match on the join fields.
In the next example, Oracle first goes through the entire DEPT table using the index PK_DEPT.
As the data is already ordered by the index, it is not necessary to order the DEPT data. Then it
goes through the entire EMP table and orders the data by the deptno field (SORT by the field of
the equijoin).
Once the two groups of data have been ordered, Oracle joins one to the other (MERGE JOIN).
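The sort and merge phases can be sketched as follows for the equijoin case. This is a minimal Python illustration under stated assumptions: the sample rows are invented, both inputs are sorted explicitly (Oracle would skip the sort for a source already ordered by an index), and the function name is hypothetical.

```python
def sort_merge_join(left, right, key):
    # SORT phase: order both data sources by the join column.
    left = sorted(left, key=lambda r: r[key])
    right = sorted(right, key=lambda r: r[key])

    # MERGE phase: advance through both ordered sources in step.
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows that share this key value.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # Pair every left row of this key with the whole right run.
            while i < len(left) and left[i][key] == lk:
                for r in right[j:j_end]:
                    result.append({**left[i], **r})
                i += 1
            j = j_end
    return result

# Invented sample data.
dept = [{"deptno": 20, "dname": "RESEARCH"}, {"deptno": 10, "dname": "ACCOUNTING"},
        {"deptno": 40, "dname": "OPERATIONS"}]
emp = [{"ename": "KING", "deptno": 10}, {"ename": "SMITH", "deptno": 20},
       {"ename": "FORD", "deptno": 20}]

merged = sort_merge_join(dept, emp, "deptno")
for row in merged:
    print(row["deptno"], row["dname"], row["ename"])
```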
Hash Join
It can only be used with equijoin (=), and using the cost-based optimizer. Given the following
two data groups:
S={1,1,1,3,3,4,4,4,4,5,8,8,8,8,10}
B={0,0,1,1,1,1,2,2,2,2,2,2,3,8,9,9,9,10,10,11}
The process is as follows:
1. The smaller source table (S) is selected and fully read to form a hash table. If this hash table
does not fit in the memory, it is carried out by partitions or pieces (applying an Oracle internal
hash function), which are stored in the disk (fan-out). In addition to creating a hash table (with
the maximum number of existing partitions that fit in the memory), a vector bitmap is created
with the values existing in the join field.
Vector Bitmap of S: {1,3,4,5,8,10}
2. The other table (B) starts to be read.
2.1. Bit-Vector Filtering: it is checked whether the value of the join field exists in the created
vector bitmap. If it is not in that vector, that row is directly rejected.
The next rows of B are directly discarded: {0,0,2,2,2,2,2,2,9,9,9, 11}
2.2. The hash function is applied to the join field of B. If that partition is in the memory, the joined
row is directly returned.
2.3. If that hash partition is not in the memory, it is written in a temporary segment in the form of
a partition or piece as was done with table S. Accordingly, sets of partitions, which have the
records to be joined, will be obtained.
This process is repeated until the entire data source of B is read.
3. Comparison of partitions. Partitions will now be collected two by two, one from each data
source (S and B). The Hash table of the smaller partition is created in the memory and
compared in the memory with the other partition, returning the rows that meet the join.
Step 3 is repeated until all the partitions created have been processed.
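The build, filter and probe steps can be sketched in Python with the S and B sets given above. This is a simplified illustration that assumes everything fits in memory, so the partitioning to disk (step 2.3 and step 3) is not modelled; a Python set stands in for the vector bitmap, and the function name is hypothetical.

```python
from collections import defaultdict

S = [1, 1, 1, 3, 3, 4, 4, 4, 4, 5, 8, 8, 8, 8, 10]
B = [0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 8, 9, 9, 9, 10, 10, 11]

def hash_join(small, big):
    # Step 1: read the smaller source in full and build the hash table.
    table = defaultdict(list)
    for s in small:
        table[s].append(s)
    # The set of join values stands in for the vector bitmap.
    bit_vector = set(small)

    joined, discarded = [], []
    # Step 2: read the other source row by row.
    for b in big:
        # Step 2.1: bit-vector filtering rejects rows that cannot match.
        if b not in bit_vector:
            discarded.append(b)
            continue
        # Step 2.2: probe the in-memory hash table and return the joined rows.
        for s in table[b]:
            joined.append((s, b))
    return joined, discarded, sorted(bit_vector)

joined, discarded, vector = hash_join(S, B)
print("vector bitmap of S:", vector)
print("rows of B discarded:", discarded)
print("joined rows:", len(joined))
```

The discarded list matches the rows of B rejected in step 2.1, and only the remaining rows of B ever reach the hash table.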
Which join to use?
If the rows to be returned by the join are not many (fewer than 10,000 rows approx.):
Oracle tends to use Nested Loop.
If the rows to be returned by the join are many (more than 10,000 rows approx.):
It is advisable to use the Hash Join. (Provided that the cost-based optimizer is used. If not, there
is no other option but to use the sort-merge join.)
The sort-merge join is not advisable (for many rows), due to the ordering costs, so it is
preferable to use the hash join with cost-based optimizer.
Actually, the choice Oracle makes is much more complex and does not depend exclusively on the number of records. It also takes into account the cost of I/O operations, the sort and hash memory areas, the existing statistics, the data model, the access indexes, etc. Only an overview of the choice of one plan or another is provided here; it is the optimizer that has the most information to decide on the optimal join plan (and even so it does not always make the best decision).
Unit 2
Module 1: In the SELECT
COUNT(*) versus COUNT(1)
In versions prior to 8i, the use of count(*), count(1) or count(c1) could give different response times depending on whether indexes were used or not.
From Oracle 8i, count(*) and count(1) behave the same, as both can use index access (index fast full scan), obtaining the same response times.
Order of the fields in the SELECT
The order of the fields in the SELECT clause does not affect the performance of the sentences
at all.
S1. Fields necessary in the select
It is advisable to avoid the use of SELECT *, restricting the fields listed in the "select" to those that are really necessary. This reduces the volume of information accessed and transported from the server to the client. It also makes it possible to answer the query from the index alone, rather than by index->table access (when there is an index that allows it).
If not all the fields of the table are needed, avoiding SELECT * also makes the SQL code clearer to read and avoids errors in cursors when assigning values to variables.
Unit 3
Module 1: In the FROM
Number of tables in the FROM
The cost-based optimizer must choose the order in which the joins between the tables are performed (joins are always performed on pairs of tables). This choice is made during the PARSE of the statement, trying all possible join combinations and choosing the one with the least cost.
When the number of tables in the FROM (number of joins) is very high, the optimizer decides
not to test all the possible combinations as it would take a long time, only testing some of the
combinations, so there is a greater probability of not choosing the best execution plan.
When the total number of tables in the FROM is more than eight, Oracle can no longer
perform all the necessary checks and non-optimal plans are chosen.
It is advisable to try to avoid putting more than 8 tables in the FROM, performing the query in
various separate SELECTs. If it is completely impossible, as a last resort, the "ORDERED" hint
could be used, ordering the tables in the FROM in accordance with the order most suitable for
performing the JOINs.
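A rough calculation shows why testing every combination soon becomes impractical. The sketch below assumes, as a simplification, that each possible join order is one permutation of the tables in the FROM; Oracle's real search space and its cut-off rules are not modelled, and the function name is hypothetical.

```python
from math import factorial

def join_orders(n_tables):
    # Simplifying assumption: one candidate plan per ordering of the tables.
    return factorial(n_tables)

for n in (2, 4, 6, 8, 10):
    print(f"{n:2d} tables -> {join_orders(n):,} possible join orders")
```

Going from 4 to 8 tables multiplies the number of orderings from 24 to 40,320, which illustrates why the optimizer stops testing them all.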
Order of the tables in the FROM
For the cost-based optimizer (which is the one that should be used), the order of the tables in
the FROM does not affect the execution plans nor the performance of the sentence.
The only exception is when the "ORDERED" hint is used.
Distributed sentences
Oracle enables, transparently for the user, queries of objects that are in remote databases.
However, this transparency is not maintained in the execution plans or in the performance, so
special care must be taken with sentences over distributed databases.
How distributed sentences are processed
1. Remote SQL: when all the tables in the FROM belong to remote tables in a single database,
the sentence is sent as is to the remote database, and the execution plan is obtained and
executed as if it were in local. The resulting data must go from the remote database to the local
database.
2. Distributed SQL: when in the FROM there are local and remote tables, or remote tables in
various databases, Oracle must break down the sentence, to execute the part that corresponds
to each database and also to execute the local part in the local database.
The local database "becomes the master" (if not specified to the contrary with a Hint), so it is the
database that receives all the data and is responsible for the joins, groupings and orderings.
Recommendations for distributed sentences
- Although distributed access is transparent for the application, it is important to know that the execution plans are strongly modified, and it is a good idea to analyse how the query is divided among the various databases.
- It is not true that when moving from a non-distributed database to a distributed one there is nothing to worry about. Rather the opposite: all mixed plans (local/remote) must be reviewed.
- Special care must be taken with the data volume to be "moved" over the network, as all the data is fetched to the "master" server. This data volume is not the final volume the query returns, but the volume required to perform the joins. (Although the select returns a single record, it can make thousands of records move to perform the joins.)
- Take into account the topology of the network, knowing its bandwidth and its availability. In accordance with these parameters, evaluate the data movement and network outages.
- In some cases, the use of views in remote databases helps to improve response times by simplifying the process of "breaking down" the statements, in some cases obtaining better plans.
Unit 4
Module 1: In the Data Model
Excess of indexes
Indexes can allow us to improve the time taken to access information stored in tables. However,
an excess of indexes worsens the INSERT, UPDATE and DELETE operations over the table, as
in addition to storing the information in the table, the index must be updated.
The extent of this worsening depends largely on the degradation of the index, the data volume, the quantity and frequency of the modifications to the table, the size of the fields to be indexed, etc. Even so, it is advisable to review tables that have more than six indexes, reconsidering whether each index is truly necessary.
Redundant indexes
An index is considered to be redundant and should be eliminated when there is another index that begins with the same columns on the left. The following indexes would be redundant:
Redundant index | Index that contains it
c1 | c1, c2
c1, c2 | c1, c2, c3
The index on the left is redundant because it is contained in the one on the right.
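The redundancy rule amounts to a leading-prefix check, which can be sketched in Python. This is a hypothetical helper for illustration only: the index names are invented, and a real check would read the index definitions from the data dictionary.

```python
def is_redundant(index_cols, other_cols):
    # Redundant when the index is a strict leading prefix of the other index.
    return (len(index_cols) < len(other_cols)
            and other_cols[:len(index_cols)] == index_cols)

def redundant_indexes(indexes):
    # Return the names of the indexes contained in some other index.
    return sorted(
        name for name, cols in indexes.items()
        if any(is_redundant(cols, other_cols)
               for other, other_cols in indexes.items() if other != name)
    )

# Invented index definitions: name -> ordered tuple of columns.
indexes = {
    "idx_a": ("c1",),
    "idx_b": ("c1", "c2"),
    "idx_c": ("c1", "c2", "c3"),
    "idx_d": ("c2", "c3"),
}

redundant = redundant_indexes(indexes)
print(redundant)
```

Note that idx_d is not reported: its columns appear in idx_c, but not at the left, so it is not contained in the sense used here.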
Indexes and Foreign Keys
If update or delete operations are to be performed on the "parent" table, an index must be defined on the "child" table over the same fields as the foreign key, for two reasons:
1. Performance: when an update or delete is performed on the "parent", the index speeds up the search in the child that checks whether the update or delete can be performed.
2. Locking between the "parent" and "child" tables.
Operation performed on the "parent" table: DELETE, UPDATE.
Without index in the FK fields of the "child" table: [...] table.
From Oracle 9i: only the records affected by the change in the parent table are locked, and the fields of the foreign key cannot be changed.
Use of composite indexes
A composite index is an index formed by several fields. It is important to remember that, for the composite index to be used, at least the first field of the index must appear in the WHERE.
For example, I have an index formed by fields c1, c2 and c3.
In the WHERE I ask for | Could I use the index?
c1, c2, c3 | Yes
c1, c2 | Yes
c1 | Yes
c2 | No
c3 | No
c2, c3 | No
c1, c3 | Yes
Exceptions:
1. Index Fast Full Scan: when there is an index that has all the fields specified in the
SELECT, so instead of performing a Full Table Scan, it performs a complete scan of the
index (Index Fast Full Scan).
2. An ORDER BY is requested and the fields belong to an index that, moreover, does not
allow null.
3. Oracle 9i introduces the "INDEX SKIP SCAN", which in some cases can improve performance with respect to a "FULL TABLE SCAN". In any case, "INDEX SKIP SCAN" is not considered to be very effective, and it is advisable to continue defining the right order of the fields in the index.
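The basic rule (leaving the exceptions aside) can be sketched in Python as a search for the leading fields of the index that appear in the WHERE. The helper name is hypothetical and the sketch ignores the exceptions listed above; it only shows which leading part of the index can drive the access.

```python
def usable_prefix(index_cols, where_cols):
    # Collect the leading index fields that appear in the WHERE, stopping
    # at the first index field that is missing.
    used = []
    for col in index_cols:
        if col not in where_cols:
            break
        used.append(col)
    return used

index = ["c1", "c2", "c3"]
cases = (["c1", "c2", "c3"], ["c1", "c2"], ["c1"],
         ["c2"], ["c3"], ["c2", "c3"], ["c1", "c3"])
for where in cases:
    prefix = usable_prefix(index, where)
    answer = "yes, using " + ", ".join(prefix) if prefix else "no"
    print(", ".join(where).ljust(12), "->", answer)
```

In the last case (c1, c3) the index can be used, but only its first field narrows the search; c3 is skipped because c2 is missing.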
Effective indexes
Indexes are important to make data searches quicker. It is important to always create the most
effective indexes possible, in accordance with the following rules:
1. They must be created in accordance with the fields used in the WHERE.
2. They will be useful when a small set of data is searched for, within the total volume of
the table. (If a lot of data is searched for, a "Full Scan" of the table may be more
effective.)
3. It is advisable not to abuse the number of indexes over tables. ("Excess of indexes".)
4. It must not be redundant with respect to another existing index. ("Redundant indexes".)
5. For composite indexes, try to place the fields with the greatest selectivity first in the index. The selectivity of a field (and therefore of the index) is measured by how much the values of this field repeat within the table. For example, in a table of employees, the selectivity of the sex, town and ID fields will be as follows:
Field | Selectivity
ID | Very selective
town | Less selective
sex | Not very selective
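One common way to put a number on selectivity is the ratio of distinct values to total rows; the source does not give a formula, so this ratio is an assumption made for the sketch, and the employee rows are invented.

```python
def selectivity(values):
    # Assumed measure: distinct values / total rows (1.0 = every row unique).
    return len(set(values)) / len(values)

# Invented sample of an employees table.
employees = [
    {"id": 1, "town": "Madrid", "sex": "F"},
    {"id": 2, "town": "Madrid", "sex": "M"},
    {"id": 3, "town": "Sevilla", "sex": "F"},
    {"id": 4, "town": "Bilbao", "sex": "M"},
    {"id": 5, "town": "Madrid", "sex": "F"},
    {"id": 6, "town": "Sevilla", "sex": "M"},
]

scores = {col: selectivity([e[col] for e in employees])
          for col in ("id", "town", "sex")}
ranking = sorted(scores, key=scores.get, reverse=True)
for col in ranking:
    print(col, round(scores[col], 2))
```

The ranking suggests the order in which the fields would be placed in a composite index under rule 5.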
Obsolete data types
The following data types are obsolete for Oracle (updated for version 9.0.2):
Obsolete data type | It must be replaced with the data type | Became obsolete in version
LONG RAW (binary up to 2 GB) | BLOB (binary up to 4 GB) | 8.0
For data types that are still supported by compatibility, their migration to the new data types is
very important to assure the operation of applications in future versions of Oracle and to be able
to use the improvements that the new data types provide.
My index is not used
Some of the reasons why my index is not used are covered below.
1. Does the index exist?
3. At least the leading field of the index must appear in the WHERE (Rule "M4. Use of composite indexes"). Exceptions:
- Index Fast Full Scan: when there is an index that has all the fields specified in the
SELECT, so instead of performing a Full Table Scan, it performs a complete scan of the
index (Index Fast Full Scan).
- An ORDER BY is requested and the fields belong to an index that, moreover, does not
allow nulls.
- Oracle 9i introduces the "INDEX SKIP SCAN", which in some cases can improve performance with respect to a "FULL TABLE SCAN". In any case, "INDEX SKIP SCAN" is not considered to be very effective, and it is advisable to continue defining the right order of the fields in the index.
4. Are the fields of the index used for the join? If so, the use of the index depends on:
- The type of JOIN that is performed. (NESTED LOOP is the only one that enables the use of an index.)
- The order in which the JOINs are performed. Specifically, with NESTED LOOP, the index is used or not depending on which table is the outer one and which is the inner one.
5. Is a function being applied to the indexed field? (Rule "W4. Comparers and indexes") A function such as SUBSTR(field1,1,2) prevents the index from being used, unless a function-based index is used (an index created by applying the SUBSTR function).
Locks in bitmap indexes
Bitmap indexes are a special type of Oracle index, which might be more beneficial than the
B*Trees, when the number of different values that the field can have is very small.
These indexes, for each one of the different values, will internally store a 1 or a 0 per record,
depending on whether this record has this value or not. They might occupy much more space,
but they might be much quicker.
However, they have a significant disadvantage for environments in which modifications or
insertions are made concurrently by various users.
Any DML (Data Manipulation Language) operation, such as insert, update or delete, will cause an exclusive lock on "part" of the index, which might prevent other modifications of that table whenever they affect the fields that form part of the bitmap index.
Normally, they are used in a very beneficial manner in decision-making environments (data
warehouse) and in a not very beneficial manner in online transactional environments (OLTP).
Unit 5
Module 1: In the WHERE
- The optimizer set for the two sessions that execute the two sentences must be exactly the
same (rule, first_rows, all_rows).
- The configuration of the NLS (National Language Support) must be the same for the two
sessions that execute the two sentences.
What are bind variables?
Using bind variables consists of replacing the values used in the searches of the WHERE conditions with variables. For example, the following statements:
SELECT name, surname, address FROM employees WHERE dept_id = 20;
SELECT name, surname, address FROM employees WHERE dept_id = 30;
express in the WHERE the literals 20 and 30 as department identifiers (dept_id). These two statements are not identical character for character, which means they are not "shareable". Executing them requires a hard parse if they were not already in the cache.
However, if the statement is changed to:
SELECT name, surname, address FROM employees WHERE dept_id = :bindA;
it can be shared every time it is executed, increasing the probability of performing a soft parse, regardless of the value assigned to the variable ":bindA".
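From application code, the same idea looks roughly as follows. The sketch uses Python's standard sqlite3 module as a stand-in for an Oracle driver, so it only illustrates that the statement text stays identical while the value changes; SQLite has no shared pool, and the sample rows and the function name are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (name TEXT, surname TEXT, address TEXT, dept_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        ("Ana", "Lopez", "Street 1", 20),
        ("Luis", "Perez", "Street 2", 30),
        ("Eva", "Ruiz", "Street 3", 30),
    ],
)

# One statement text for every department: the value travels separately.
QUERY = "SELECT name, surname, address FROM employees WHERE dept_id = :bindA"

def employees_in(dept_id):
    # The dictionary key matches the :bindA placeholder in the statement.
    return conn.execute(QUERY, {"bindA": dept_id}).fetchall()

rows_20 = employees_in(20)
rows_30 = employees_in(30)
print(len(rows_20), "row(s) in department 20")
print(len(rows_30), "row(s) in department 30")
```

Building the statement by concatenating the literal into the string would instead produce a different text for each department, which is the non-shareable case described above.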
Advantages of bind variables and shareable sentences
They favour the use of the cache of sentences in the Shared Pool, reducing the consumption of
memory and CPU in the database. Especially useful in very common sentences.
Disadvantages of bind variables
In some cases when performing the parse of the sentence and obtaining the execution plan, not
knowing the value that we are asking for (the variable) makes the execution plans worse. These
cases are:
- Using histograms. The histogram helps the optimizer to know the distribution of data for a
specific value. Using bind variables, hiding the value sought, prevents the use of histograms.
- Partitioning. In partitioned tables or indexes, the search values define the partitions to be
used (going through all the partitions or only those that are necessary). When using bind
variables, the optimizer does not know the values I am looking for, so it cannot restrict the
number of partitions to be used, tending therefore to perform processes that go through all the
partitions, with the consequent degradation of the searches.
- Pseudo-dynamic sentences. It is a typical example of sentences used on free-search
screens, which are increasingly used in applications. Accordingly, these screens allow the user
to freely search in a significant number of different fields. However, the SQL sentence that the
application generates is fixed and does not adapt to what the user is searching for at all times.
This practice is not at all advisable because of the complexity of the sentences executed, the
This sentence uses the bind variables "vsurname1", "vsurname2" and "vname" correctly to
identify the search values by surnames and name.
The bind variables "flagsurname2" and "flagname" are used to determine whether we are
searching or not by second surname and by name. If these flags have a value of 0, it means
they do not search by this field, and thanks to the OR they impede the search in the rest of the
condition (surname2, customer_name). If they have a value of 1, they search by the condition
after the OR.
These flags partially improve the performance of these sentences, provided that these flags are
not used with bind variables. If we search by name of the customer, for example, the sentence
should be executed as follows:
- Queries by ranges. In some queries that ask for ranges (>, <, between), or when LIKE is
used, for example, the use of bind variables can result in worse plans. This is because the
optimizer does not know the exact value of the variable, and it cannot calculate how big the
range to be queried is. The optimizer assumes fixed estimated rules and it can result in worse
plans.
As a general rule, it is advisable to use bind variables due to their significant impact on the
performance of the database. If partitioning, histograms or pseudo-dynamic sentences are
used, they should be used in fields not affected by the partitioning, histogram or flag,
respectively. If in doubt, you can consult the Database Administration team.
Long INLISTs
Long INLISTs are considered to be IN clauses in which a long list of values is detailed. It is
advisable to replace long INLISTs with search tables (performing a join with a table that contains
values to be searched for).
Operators over indexed fields
The use in the WHERE of operators over indexed fields (operators to the left of the comparer)
may make the use of indexes impossible. Here are some examples of good and bad practices:
WHERE TRUNC(date) =
TRUNC(sysdate)
Comparers over indexed fields
The use in the WHERE of some comparers may make the use of indexes impossible. Here are
some examples:
Implicit conversions
When in a WHERE clause the data type of the column does not match the data type of the value compared with it, Oracle may not raise an error and may instead perform a conversion internally (implicit conversion).
In many cases these implicit conversions prevent the indexes from being used.
Letting Oracle perform the implicit conversion is considered to be a bad practice. We must try to make the types of the variables and the fields match, or perform the conversion explicitly.
For example, in the DEPT table there is an index on the field "description" (varchar2). This is what happens:
Condition
WHERE description = '1'
WHERE description = TO_CHAR(1)
WHERE description = 1
Having and Where
It is important to correctly distinguish and use the HAVING and the WHERE.
WHERE is used to filter the records to be returned.
HAVING is applied after the WHERE, with grouping functions (GROUP BY) and is used to filter groups.
It is good to always filter with the WHERE the greatest number of records so that the grouping
functions can manage fewer records, rather than not filtering with WHERE and overloading the
grouping.
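The difference can be tried out with a small example. The sketch below uses Python's standard sqlite3 module as a stand-in for Oracle, with an invented emp table; it shows that a row filter gives the same result in WHERE or in HAVING, while only HAVING can filter on a grouping function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, deptno INTEGER, sal INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [("KING", 10, 5000), ("CLARK", 10, 2450), ("SMITH", 20, 800),
     ("FORD", 20, 3000), ("JAMES", 30, 950)],
)

# Filtering late: every department is grouped, then groups are discarded.
filter_late = conn.execute(
    "SELECT deptno, SUM(sal) FROM emp GROUP BY deptno "
    "HAVING deptno IN (10, 20) ORDER BY deptno"
).fetchall()

# Filtering early: unwanted records never reach the grouping.
filter_early = conn.execute(
    "SELECT deptno, SUM(sal) FROM emp WHERE deptno IN (10, 20) "
    "GROUP BY deptno ORDER BY deptno"
).fetchall()

# HAVING kept for what only it can do: a condition on a grouping function.
group_filter = conn.execute(
    "SELECT deptno, SUM(sal) FROM emp WHERE deptno IN (10, 20) "
    "GROUP BY deptno HAVING SUM(sal) > 4000 ORDER BY deptno"
).fetchall()

print(filter_late, filter_early, group_filter)
```

Both of the first two queries return the same rows, but the second one groups fewer records, which is the practice recommended above.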
ROWNUM
The ROWNUM clause in a WHERE is used to return only a given number of records out of all the records that meet the rest of the WHERE condition.
The ROWNUM clause can only be used with the operators "<" and "<=".
With respect to whether ROWNUM should be used or not, in our opinion it must be used, as it is the only way we have of "cutting down" the return of records to a given total. However, we must be very clear about what really happens internally in Oracle: although ROWNUM is used, Oracle may still read many more records.
First example
In the example, the optimizer manages to carry the ROWNUM record limit through almost all of the execution plan, so it does not seem to be very damaging.
This is its execution plan with its estimation of cost and of records:
And this is the actual execution (of records), obtained with tkprof:
Let's study the execution plan obtained. As explained in SQL Overview -> Execution plan ->
Interpreting the plan, Oracle executes the sentences performing operations two by two, so:
1. Oracle performs "TABLE ACCESS FULL" on the entire VUELOS table, going through a total of 57,711 records (not limited by ROWNUM). Then it goes through the primary key of the PLAZAS table, called PLA_PK, and in this case it does limit the number of records by ROWNUM (9). A Hash Join is performed on these two data sets, returning a total of 9 records.
2. A "TABLE ACCESS FULL" is performed on the entire COMPANIAS table, without being
limited by ROWNUM, finding a total of 8 records. Then it performs a Hash Join on the 8
COMPANIAS records and the 9 records returned in step 1. The hash join returns 9
records.
3. The "COUNT STOPKEY" is performed, which is really counting up to the 9 records that
must be returned to respect the ROWNUM<10 that appears in the WHERE clause.
Actually this COUNT in this execution does not filter, because in the previous step only
the 9 desired records were returned.
As observed in this plan, the "FULL TABLE SCAN" operations have not been limited by the
ROWNUM, whereas the operation "INDEX (FAST FULL SCAN)" has limited the analysed
records, and returning 9 records, has managed to return what was requested in the ROWNUM.
Second example
Let's see the same sentence, but in this case we have added a pair of Hints to achieve a much
worse execution plan, but it does allow us to show how the plan is not completely limited by the
ROWNUM condition.
This is its execution plan with its estimation of cost and of records:
And this is the actual execution (of records), obtained with tkprof:
This execution plan shows that to return the nine records requested by ROWNUM<10, a total of
3,313,493 records have been gone through (in various tables).
Antijoin
Anti-Join Overview
An antijoin is a query that returns records from a table that do not correspond to records from
another table. It is in some way the opposite operation to a join.
In Oracle there are the following ways of performing an Anti-Join:
With NOT IN
With MINUS
With OUTER-JOIN
Oracle first performs the subselect (SELECT deptno FROM emp), so as it does not have any
WHERE it goes through the entire emp table. Then it goes through the dept table, looking for
the departments that do not have any employee associated.
However, reconstructing the sentence with NOT EXISTS obtains the following result:
Oracle now performs a NESTED LOOP, so for each department (DEPT) it performs a search
BY INDEX in the employees table (EMP). So we have avoided performing the FULL SCAN of
emp, accessing now by the index of the join field.
Take into account that we assume that the volume of the EMP table is big enough to make it
worth using an index rather than a full scan of the entire table.
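The different ways of writing an antijoin can be compared on a small data set. The sketch below uses Python's standard sqlite3 module as a stand-in for Oracle, so EXCEPT replaces Oracle's MINUS and the execution plans are not comparable; the tables and rows are invented, and emp.deptno is declared NOT NULL because NOT IN behaves differently when the subquery returns nulls.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dept (deptno INTEGER PRIMARY KEY, dname TEXT);
CREATE TABLE emp (ename TEXT, deptno INTEGER NOT NULL);
CREATE INDEX emp_deptno_idx ON emp (deptno);
INSERT INTO dept VALUES (10, 'ACCOUNTING'), (20, 'RESEARCH'), (40, 'OPERATIONS');
INSERT INTO emp VALUES ('KING', 10), ('SMITH', 20), ('FORD', 20);
""")

# Four formulations of "departments without employees".
queries = {
    "NOT IN": "SELECT deptno FROM dept "
              "WHERE deptno NOT IN (SELECT deptno FROM emp)",
    "NOT EXISTS": "SELECT deptno FROM dept d WHERE NOT EXISTS "
                  "(SELECT 1 FROM emp e WHERE e.deptno = d.deptno)",
    "EXCEPT (MINUS in Oracle)": "SELECT deptno FROM dept "
                                "EXCEPT SELECT deptno FROM emp",
    "OUTER JOIN": "SELECT d.deptno FROM dept d "
                  "LEFT JOIN emp e ON e.deptno = d.deptno "
                  "WHERE e.deptno IS NULL",
}

results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}
for name, rows in results.items():
    print(name.ljust(26), rows)
```

All four return the same department here; which one performs best depends on the plan Oracle chooses, as the NOT IN and NOT EXISTS comparison above shows.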
Unit 6
Module 1: Orderings/Groupings
Overview of ordering
Orderings (sorts) are costly operations for the database, and they may be one of the major reasons
why our statements are not as quick as we would like.
As a general rule they should be avoided when they are not essential.
The operations that require an ordering are:
Creation of an index
ORDER BY
Use the ORDER BY clause only when it is important for the application to obtain ordered data.
ORDER BY is frequently used without ordering being necessary.
DISTINCT and GROUP BY
Use the DISTINCT clause only when duplicate records can actually be returned and must be
eliminated. DISTINCT is often used when it is not necessary, forcing a needless ordering.
The same happens with the GROUP BY clause when it is used unnecessarily, without any grouping
functions.
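As an illustration of the two cases above, here is a minimal sketch that assumes empno is the primary key of emp and that ename is one of its columns.

```sql
-- empno is assumed unique, so every row is already distinct:
-- DISTINCT only adds work without changing the result.
SELECT DISTINCT empno, ename FROM emp;

-- Same result without the duplicate-elimination step.
SELECT empno, ename FROM emp;

-- GROUP BY with no grouping function only removes duplicates...
SELECT deptno FROM emp GROUP BY deptno;

-- ...which states the intent more clearly when written as DISTINCT.
SELECT DISTINCT deptno FROM emp;
```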
UNION ALL versus UNION
The UNION set operators combine the results of several SELECTs. Unlike UNION ALL, UNION
performs an additional ordering to eliminate repeated records.
Use UNION ALL instead of UNION when we want to receive repeated records or when we are sure that
no repeated records can be returned.
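A minimal sketch of the difference, assuming the demo tables emp and dept and an ename column on emp:

```sql
-- UNION: department numbers present in both tables appear once,
-- at the price of an extra duplicate-elimination step.
SELECT deptno FROM emp
UNION
SELECT deptno FROM dept;

-- UNION ALL: the two branches filter on different departments, so no row
-- can appear twice and the duplicate elimination would be wasted work.
SELECT empno, ename FROM emp WHERE deptno = 10
UNION ALL
SELECT empno, ename FROM emp WHERE deptno = 20;
```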
Unit 7
Module 1: In the INSERT
Unit 8
Module 1: Hints
General considerations about the use of Hints
The optimizer does not always obtain the best execution plans. In some cases, knowledge of
the business and of the information stored can help to choose the best execution plans.
Hints are proposals made to the Oracle Optimizer on how to perform execution plans. The
optimizer considers them to be proposals and in some cases they might not be taken into
account by the optimizer.
The use of hints implies the following risks:
A hint is chosen for the current data volume, at which its benefit has been verified.
However, as the system evolves, changes in volume can make a hint that was suitable
when it was introduced unsuitable a few months later.
The evolution of the Oracle software, which modifies hints or extends functionality, can
bring major changes in the optimizer. In new versions of Oracle, certain
hints may no longer be used, so the statements that use them must be reviewed.
Hints must be used in moderation, only when really essential and after ruling out
other possibilities.
They require clear documentation of where they are used, because
the statements that use them must be reviewed periodically to check whether they are
still suitable (after Oracle version changes or changes in the volume and
the data).
Why does my Hint not work?
Hints are proposals made to the optimizer and in some cases they might not be taken into
account by the optimizer.
In any case, if your Hint does not work, do not forget to check that you are respecting the
following rules:
Using any hint (except RULE) means that the cost-based optimizer is used.
Remember that the statistics must have been gathered and kept up to date.
The RULE hint (rule-based optimizer) is ignored if a feature that
requires the cost-based optimizer is used.
Hints must not refer to the name of the schema. It can be solved by using an alias in the
table. The following example is incorrect:
If an alias is used, the hint must use the alias and not the name of the table. For
example:
SELECT /*+ FULL ( myalias ) */ empno FROM emp myalias WHERE empno > 10;
Just after the "+" of the hint there must be a space, in PL/SQL blocks.
Do not use hints that do not make sense in the sentence. For example, using the hint
"FIRST_ROWS" in a sentence that has an ORDER BY.
The name of the index is optional. If it is not specified, the optimizer chooses the
index. If it is specified, it must be the correct name; otherwise the hint is ignored.
For distributed queries, in remote tables, the only hints that work are join order and join
type.
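Some of the rules above can be illustrated with a short sketch. The schema name scott and the index name emp_pk are hypothetical and only serve the example.

```sql
-- Not recommended: the hint names the schema-qualified table, so it is likely to be ignored.
SELECT /*+ FULL(scott.emp) */ empno FROM scott.emp WHERE empno > 10;

-- Recommended: give the table an alias and refer to the alias in the hint.
SELECT /*+ FULL(e) */ empno FROM scott.emp e WHERE empno > 10;

-- Index named explicitly: emp_pk (hypothetical) must be an existing index on emp.
SELECT /*+ INDEX(e emp_pk) */ empno FROM emp e WHERE empno > 10;

-- Index name omitted: the optimizer chooses among the indexes of emp.
SELECT /*+ INDEX(e) */ empno FROM emp e WHERE empno > 10;
```

Because an ignored hint raises no error, the only reliable check is to compare the execution plan with and without the hint.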
Hints for the use of the optimizer
The hints related to the use of the optimizer are:
ALL_ROWS
FIRST_ROWS
CHOOSE
RULE
ALL_ROWS
It specifies that the cost-based optimizer is used, trying to achieve the best throughput.
(Minimum consumption of resources, maybe penalising the response time.)
Format:
FIRST_ROWS
It specifies that the cost-based optimizer is used, trying to achieve the best response time,
perhaps penalising the consumption of resources.
The optimizer in FIRST_ROWS mode is more likely to use indexes, although it might mean an
increase in disk accesses. (Not always advisable.)
Format
CHOOSE
It specifies that the cost-based optimizer is used if a table has statistics. If not, the rule-based
optimizer will be used.
Format
RULE
It specifies that the rule-based optimizer is used.
Format
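The original format diagrams are not reproduced here. As a sketch of the general syntax, each of the four optimizer hints is written as a comment immediately after the SELECT keyword; the query itself (demo table emp with an ename column) is only an assumption for illustration.

```sql
-- Cost-based optimizer, best throughput.
SELECT /*+ ALL_ROWS */ empno, ename FROM emp WHERE deptno = 10;

-- Cost-based optimizer, best response time for the first rows.
SELECT /*+ FIRST_ROWS */ empno, ename FROM emp WHERE deptno = 10;

-- Cost-based if statistics exist, rule-based otherwise.
SELECT /*+ CHOOSE */ empno, ename FROM emp WHERE deptno = 10;

-- Rule-based optimizer.
SELECT /*+ RULE */ empno, ename FROM emp WHERE deptno = 10;
```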
Hints for the access method
The hints related to the table access method are:
FULL
ROWID
CLUSTER
HASH
INDEX
INDEX_ASC
INDEX_COMBINE
INDEX_JOIN
INDEX_DESC
INDEX_FFS
NO_INDEX
AND_EQUAL
USE_CONCAT
NO_EXPAND
REWRITE
NOREWRITE
FULL
It specifies that an access is performed going through the entire table (Full table scan)
Format:
ROWID
It specifies that an access is performed by means of the RowId
Format
INDEX
It specifies that a certain index is used.
Format
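The original format diagrams are not reproduced here. The three access hints described above can be sketched as follows; the alias e, the bind variable :rid and the index name emp_deptno_idx are hypothetical.

```sql
-- FULL: read the whole table.
SELECT /*+ FULL(e) */ empno, ename FROM emp e WHERE deptno = 10;

-- ROWID: access the row directly by its rowid.
SELECT /*+ ROWID(e) */ empno, ename FROM emp e WHERE rowid = :rid;

-- INDEX: use the named index (it must exist on emp).
SELECT /*+ INDEX(e emp_deptno_idx) */ empno, ename FROM emp e WHERE deptno = 10;
```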
Hints for the join order
The hints related to the order in which the joins are performed are:
ORDERED
STAR
Hints for the JOIN operations
The hints related to the JOIN operations are:
USE_NL
USE_MERGE
USE_HASH
DRIVING_SITE
LEADING
HASH_AJ
MERGE_AJ
HASH_SJ
MERGE_SJ
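As a sketch of how join-order and join-operation hints combine, the following queries assume the demo tables dept and emp with dname and ename columns; the aliases are part of the example.

```sql
-- ORDERED: join the tables in the order given in FROM (dept first, then emp).
-- USE_NL(e): join emp with a nested loop.
SELECT /*+ ORDERED USE_NL(e) */ d.dname, e.ename
  FROM dept d, emp e
 WHERE e.deptno = d.deptno;

-- LEADING(d): start the join order with dept, whatever the FROM order.
-- USE_HASH(e): join emp with a hash join.
SELECT /*+ LEADING(d) USE_HASH(e) */ d.dname, e.ename
  FROM emp e, dept d
 WHERE e.deptno = d.deptno;
```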
Hints for parallel executions
The hints related to parallel executions are:
PARALLEL
NOPARALLEL
PQ_DISTRIBUTE
APPEND
NOAPPEND
PARALLEL_INDEX
NOPARALLEL_INDEX
Other hints
Other hints, with varying uses, are provided below:
CACHE
NOCACHE
MERGE
NO_MERGE
UNNEST
NO_UNNEST
PUSH_PRED
NO_PUSH_PRED
PUSH_SUBQ
STAR_TRANSFORMATION
ORDERED_PREDICATES
Unit 9
Module 1: Programming and SQL
Commits in loop
The commit operation is a complex, but not necessarily slow, operation for the database.
Executing a commit on every iteration of a loop is not advisable, as it overloads
the database and degrades performance.
It should only be done when the business itself requires that, on each iteration of the loop,
the changes are made permanent and visible to the rest of the users. Committing after a
large volume of changes is usual, but not on every iteration.
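One possible way to commit per batch rather than per iteration is sketched below. The table load_log(id NUMBER) and the batch size of 1000 are assumptions made for the example, not values taken from the course.

```sql
DECLARE
  c_batch_size CONSTANT PLS_INTEGER := 1000;  -- assumed batch size
BEGIN
  FOR i IN 1 .. 10000 LOOP
    INSERT INTO load_log (id) VALUES (i);     -- load_log is a hypothetical table

    -- Commit once per batch instead of once per row.
    IF MOD(i, c_batch_size) = 0 THEN
      COMMIT;
    END IF;
  END LOOP;

  COMMIT;  -- make any remaining, uncommitted rows permanent
END;
/
```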
Commits and closing of cursors
In some programming languages, such as Pro*C, performing a commit in our program can
close the cursors that are open.
It is advisable to avoid this closure of the cursors, because reopening
them may be costly and cause unnecessary delays.
How could it be solved?
Unit 10
Module 1: PL/SQL
Use of "Cursor FOR" instead of CURSOR loop.
Up to version 8 of Oracle, the treatment of cursors consisted of an OPEN of the cursor, a loop
that fetches and processes the data, and a CLOSE of the cursor.
From version 8i of Oracle, there is another simpler and better performing way: a special
version of the FOR loop designed for cursors. The FOR loop performs the OPEN, FETCH and CLOSE of
the cursor implicitly.
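The two styles can be compared in a minimal sketch, assuming the demo table emp with empno, ename and deptno columns.

```sql
DECLARE
  CURSOR c_emp IS
    SELECT empno, ename FROM emp WHERE deptno = 10;
  r_emp c_emp%ROWTYPE;
BEGIN
  -- Classic form: explicit OPEN, FETCH loop and CLOSE.
  OPEN c_emp;
  LOOP
    FETCH c_emp INTO r_emp;
    EXIT WHEN c_emp%NOTFOUND;
    DBMS_OUTPUT.PUT_LINE(r_emp.empno || ' ' || r_emp.ename);
  END LOOP;
  CLOSE c_emp;

  -- Cursor FOR loop: OPEN, FETCH and CLOSE are handled by the loop,
  -- and the record r is declared implicitly.
  FOR r IN c_emp LOOP
    DBMS_OUTPUT.PUT_LINE(r.empno || ' ' || r.ename);
  END LOOP;
END;
/
```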
Use of %TYPE and %ROWTYPE.
When defining the PL/SQL variables that store data from our tables, we can either write the
data type in our code or anchor the variable so that it takes its data type from the
data model.
This second option, using %TYPE and %ROWTYPE, ensures that a change in the
data model does not mean having to make changes to all our PL/SQL programs.
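A minimal sketch of both declaration styles, assuming the demo table emp and an existing employee number 7839 (an assumption; any existing empno works):

```sql
DECLARE
  v_ename_fixed VARCHAR2(10);     -- hard-coded: must be edited if emp.ename changes
  v_ename       emp.ename%TYPE;   -- follows the definition of the column
  r_emp         emp%ROWTYPE;      -- one field per column of emp
BEGIN
  SELECT ename INTO v_ename FROM emp WHERE empno = 7839;
  SELECT *     INTO r_emp   FROM emp WHERE empno = 7839;
  DBMS_OUTPUT.PUT_LINE(v_ename || ' / ' || r_emp.ename);
END;
/
```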
Unnecessary use of the DUAL
Using DUAL unnecessarily to obtain function results causes unwanted accesses to the
database. These problems are usually aggravated when this use is done within loops.
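The following sketch contrasts the two approaches; the variable names and the literal 'smith' are illustrative only.

```sql
DECLARE
  v_today DATE;
  v_name  VARCHAR2(30);
BEGIN
  -- Unnecessary: each SELECT ... FROM dual is a separate SQL call.
  SELECT SYSDATE        INTO v_today FROM dual;
  SELECT UPPER('smith') INTO v_name  FROM dual;

  -- Equivalent direct assignments, evaluated by PL/SQL without a query.
  v_today := SYSDATE;
  v_name  := UPPER('smith');
END;
/
```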
Improper use of SELECT COUNT
Some programmers are used to performing a SELECT COUNT to check whether a cursor will
return data. If data is returned, the cursor is opened with the data. This way of programming
forces the cursor to be executed twice.
It is preferable to open the cursor the first time and check whether there is data in the first
FETCH.
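The recommended pattern can be sketched as follows, assuming the demo table emp: the cursor is opened once and the first FETCH tells us whether there is any data.

```sql
DECLARE
  CURSOR c_emp IS
    SELECT empno, ename FROM emp WHERE deptno = 10;
  r_emp c_emp%ROWTYPE;
BEGIN
  OPEN c_emp;
  FETCH c_emp INTO r_emp;

  IF c_emp%NOTFOUND THEN
    -- No prior SELECT COUNT(*) was needed to find this out.
    DBMS_OUTPUT.PUT_LINE('No employees found');
  ELSE
    LOOP
      DBMS_OUTPUT.PUT_LINE(r_emp.ename);
      FETCH c_emp INTO r_emp;
      EXIT WHEN c_emp%NOTFOUND;
    END LOOP;
  END IF;

  CLOSE c_emp;
END;
/
```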
Explicit Cursors versus Implicit Cursors
Implicit cursors are those that are not expressly declared as CURSOR in the code. Internally
Oracle is responsible for performing all the open, fetch and close steps. They can be used in
PL/SQL only when they return a single record.
Explicit cursors are those that we declare expressly as CURSOR.
A cursor that returns a single record can be written in PL/SQL as either explicit or implicit. It is
advisable to use explicit cursors, as they perform better, although their syntax is a
little more complex. Implicit cursors always force the execution of two fetches: the first to fetch
the record and the second to make sure there are no more values (as they should only
return one record).
By contrast, with an explicit cursor we fetch the first record and close the cursor. We do not perform
a second fetch (we assume that the cursor does not return more than one record,
and so avoid the extra fetch).
Explicit cursors have further advantages:
They can be closed in code, releasing resources early, without having to wait until
the PL/SQL block completely finishes.
If the cursor has bind variables, it can be reopened with the new values, which is
quicker because it reuses Oracle's internal structures created in the definition of the cursor.
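The two shapes can be sketched as follows, assuming the demo table emp and two existing employee numbers (7839 and 7788 are assumptions). The sketch only shows the syntax; whether the explicit form is actually faster depends on the Oracle version and should be measured.

```sql
DECLARE
  v_ename emp.ename%TYPE;

  CURSOR c_one (p_empno emp.empno%TYPE) IS
    SELECT ename FROM emp WHERE empno = p_empno;
BEGIN
  -- Implicit cursor: SELECT ... INTO raises NO_DATA_FOUND or TOO_MANY_ROWS
  -- if the query does not return exactly one row.
  SELECT ename INTO v_ename FROM emp WHERE empno = 7839;

  -- Explicit cursor: one fetch, then close as soon as we are done.
  OPEN c_one(7839);
  FETCH c_one INTO v_ename;
  CLOSE c_one;

  -- The same cursor reopened with a new parameter value.
  OPEN c_one(7788);
  FETCH c_one INTO v_ename;
  CLOSE c_one;
END;
/
```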
Unit 11
Module 1: SQL Tricks
Ranking Queries
As of Oracle 8i, SQL has been extended with analytic functions.
They include RANK and DENSE_RANK, which allow us to rank the records obtained. An
example:
We want to rank the salaries of the employees of a company:
The difference between RANK and DENSE_RANK is that DENSE_RANK leaves no "gaps" in
the ranking values when the ranked values are repeated.
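A minimal sketch of such a ranking query, assuming the demo table emp with ename and sal columns:

```sql
SELECT ename,
       sal,
       RANK()       OVER (ORDER BY sal DESC) AS sal_rank,
       DENSE_RANK() OVER (ORDER BY sal DESC) AS sal_dense_rank
  FROM emp
 ORDER BY sal DESC;
```

If two employees tie for the second-highest salary, RANK numbers the first four rows 1, 2, 2, 4, whereas DENSE_RANK numbers them 1, 2, 2, 3.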
6. Using the cost-based optimizer, would these statements have the same execution plan?
SELECT /*+ ORDERED */ e.name, d.name, o.office_name
FROM emp e, dept d, office o
WHERE e.deptid = d.deptid
AND e.officeid = o.officeid;
SELECT /*+ ORDERED */ e.name, d.name, o.office_name
FROM office o, emp e, dept d
WHERE e.deptid = d.deptid
AND e.officeid = o.officeid;
Yes, as the order of the tables in the FROM does not affect the execution plan.
No, as the plan always varies when the order of the tables is changed in the FROM.
It could vary as it is using the ORDERED hint.
The execution plan does not vary and is based on Oracle's internal statistics, in accordance with the
data loaded in the tables.
It is based on Oracle's internal statistics, in accordance with the data loaded in the tables.
The execution plan is exclusively based on the existing data model.
8. Which of the following SQL clauses causes a join of the anti-join type?
NOT IN
UNION
Both.
9. Beyond how many indexes on a single table is it advisable to rethink the data model and the ways
of accessing the data, seeking alternative solutions with fewer indexes?
12.
4.
6.
10. What does the use of %TYPE and %ROWTYPE ensure when defining variables in
PL/SQL?
A change in the data of the table does not mean we have to modify all our PL/SQL programs.
A change in the data model does not mean we have to modify all our PL/SQL programs.
A change in the data model means we have to modify all our PL/SQL programs.