Course Guide
Informix 12.10 Database Administration
Course code IX223 ERC 1.0
IBM Training
Preface
September, 2017
NOTICES
This information was developed for products and services offered in the USA.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for
information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to
state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not
infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any
non-IBM product, program, or service. IBM may have patents or pending patent applications covering subject matter described in this document.
The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to:
IBM Director of Licensing
IBM Corporation
North Castle Drive, MD-NC119
Armonk, NY 10504-1785
United States of America
The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law:
INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND,
EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT,
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in
certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these
changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the
program(s) described in this publication at any time without notice.
Any references in this information to non-IBM websites are provided for convenience only and do not in any manner serve as an endorsement of
those websites. The materials at those websites are not part of the materials for this IBM product and use of those websites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you. Information
concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available
sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM
products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the
examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and
addresses used by an actual business enterprise is entirely coincidental.
TRADEMARKS
IBM, the IBM logo, ibm.com, and Informix are trademarks or registered trademarks of International Business Machines Corp., registered in many
jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is
available on the web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml.
Adobe, and the Adobe logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other
countries.
Microsoft, Windows, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
PuTTY is copyright 1997-2017 by Simon Tatham.
Contents
Preface................................................................................................................. P-1
Contents ............................................................................................................. P-3
Course overview............................................................................................... P-15
Document conventions ..................................................................................... P-17
Exercises.......................................................................................................... P-18
Additional training resources ............................................................................ P-19
IBM product help .............................................................................................. P-20
Creating databases and tables ............................................................ 1-1
Unit objectives .................................................................................................... 1-3
Prerequisites ...................................................................................................... 1-4
Before you create a database............................................................................. 1-5
Database names ................................................................................................ 1-6
Database logging ............................................................................................... 1-7
No logging .......................................................................................................... 1-8
NO LOGGING still has logging ......................................................................... 1-10
Unbuffered logging ........................................................................................... 1-11
Buffered logging ............................................................................................... 1-12
MODE ANSI databases .................................................................................... 1-13
The database dbspace ..................................................................................... 1-15
Creating a database ......................................................................................... 1-16
Creating a table ................................................................................................ 1-17
Select a valid table name ................................................................................. 1-18
Extents ............................................................................................................. 1-20
Estimating row and extent sizes ....................................................................... 1-22
Managing extents ............................................................................................. 1-24
Table lock modes ............................................................................................. 1-26
Tables and dbspaces ....................................................................................... 1-28
Creating a table ................................................................................................ 1-29
Creating a table: Simple large objects .............................................................. 1-30
Creating a table: Smart large objects................................................................ 1-31
Creating a temporary table ............................................................................... 1-32
DBCENTURY ................................................................................................... 1-34
The DBSCHEMA utility ..................................................................................... 1-35
Using ONCHECK and ONSTAT ....................................................................... 1-37
Course overview
Preface overview
In this course, students will learn the basic concepts of data management with
Informix 12.10. They will learn how to create, manage, and maintain tables and
indexes; how the Informix Optimizer works; and how to use the SET EXPLAIN feature
to determine query effectiveness.
Intended audience
The main audience for this course is Informix Database Administrators. It is also
appropriate for Informix System Administrators and Informix Application Developers.
Topics covered
Topics covered in this course include:
Creating, altering, and dropping databases
Creating, altering, and dropping tables
Creating, altering, and dropping indexes
Table and index partitioning
The Informix query optimizer and access plans
Updating statistics and data distributions
Referential and entity integrity
Creating and managing constraints
Modes and violation detection
Concurrency control and locking mechanisms
Data security
Views
Triggers
Course prerequisites
Students in this course should satisfy the following prerequisites:
• IX101 - Introduction to Informix terminology and data types (or equivalent
experience or knowledge)
• Knowledge of Structured Query Language (SQL)
• Experience using basic Linux functionality
Course Environment
The environment provided in this course is implemented as a virtual image deployed in
Skytap.
Document conventions
Conventions used in this guide follow Microsoft Windows application standards, where
applicable. In addition, the following conventions are observed:
• Bold: Bold style is used in demonstration and exercise step-by-step solutions to
indicate a user interface element that is actively selected or text that must be
typed by the participant.
• Italic: Used to reference book titles.
• CAPITALIZATION: All file names, table names, column names, and folder names
appear in this guide exactly as they appear in the application.
To keep capitalization consistent with this guide, type text exactly as shown.
Exercises
Exercise format
Exercises are designed to allow you to work at your own pace. Content
contained in an exercise is not fully scripted, to provide an additional challenge.
Refer back to the material in the unit and to your instructor if you need assistance
with a particular task. The exercises are structured as follows:
The purpose section
This section presents a brief description of the purpose of the exercise, followed by
a series of tasks. These tasks provide information to help guide you through the
exercise. Within each task, there may be numbered questions relating to the task.
Complete the tasks by using the skills you learned in the unit. If you need more
assistance, you can refer to your instructor or to the solutions section for more
detailed instruction.
The task sections
Each task section has a title with the overall goal of the task. This is followed by the
steps that describe what you need to do to meet the goal.
The solutions sections
The solutions section contains the solution to each task / task step that is to be
performed. You can refer to this section to get further information on how to complete a
task or task step.
Task-oriented: You are working in the product and you need specific task-oriented help. Use the IBM Product - Help link.
Unit objectives
• Review prerequisites
• Create databases and tables
• Determine database logging and storage requirements
• Locate where the database server stores a table on disk
• Create temporary tables
• Locate where the database server stores temporary tables
• Use the system catalog tables to gather information
• Use the dbschema utility
Unit objectives
Prerequisites
• Basic UNIX knowledge
• Basic Structured Query Language (SQL) knowledge
• dbaccess
• Informix terminology
• Informix data types
Prerequisites
Before you start working with your Informix database, you should have basic UNIX
knowledge. You should be familiar with basic SQL, as well as working with dbaccess.
You should also be familiar with the terminology used with Informix and various Informix
data types. Some of these terms and data types are summarized in Appendix A
(Terminology) and Appendix B (Data types). Please review this material. If you have
any questions, or need any further information, please contact your instructor.
Database names
• Maximum of 128 bytes (characters)
• Valid names can consist of:
Letters: A to Z, a to z
Digits: 0 - 9
Underscore ( _ )
• Must be unique among all databases within the database server
Database names
When you select a database name, you can choose any combination of letters, digits,
and the underscore character. If you use a non-default locale, database names can
contain any alphabetic characters that the locale supports. (Note: The first character of
a database name cannot be a digit (0-9) or a dollar sign ($)).
Database names cannot include hyphens, spaces, or other non-alphanumeric
characters. Database names cannot exceed 128 characters in length.
Each database name must be unique within its database server.
Database names are not case sensitive.
Database logging
• Logging involves:
Recording information about transactions in the logical log buffers in
memory
Flushing the logical log buffers to logical log files located on disk
• Types of logging:
No logging
Unbuffered
Buffered
MODE ANSI
(Figure: two example transactions, each bounded by BEGIN WORK and COMMIT WORK and containing insert and update operations.)
Database logging
Every database that the database server manages has a logging status. The logging
status indicates whether the database uses transaction logging and, if so, which log-
buffering mechanism the database uses.
The four types of database logging are:
• No logging
• Buffered logging
• Unbuffered logging
• Mode-ANSI
No logging
• UPDATE, INSERT, DELETE records are NOT written to logical logs
• Data definition language (DDL) is written to logical log
Logged (DDL): CREATE TABLE T1; ALTER TABLE T1; DROP TABLE T1;
Not logged (DML): INSERT INTO T1; UPDATE T1; DELETE FROM T1;
No logging
Informix allows you to create a database with no transaction logging.
If you create a database with no transaction logging, data manipulation language (DML)
records, such as UPDATE, INSERT, and DELETE, are not written to the logical logs.
Data definition language (DDL) records, such as CREATE, ALTER, and DROP, are
written to the logical log.
While it is recommended that production databases always use transaction logging, you
might want to create your database without logging, load all of its tables, and then turn
on logging. This significantly reduces the time required to load the database and
prevents long transactions.
A database without logging cannot be fully recovered when you have to restore the
system from a backup. Normally, after you apply a backup to recover a system, you
apply the logical-log files to recreate any transactions that committed after the backup.
Since the logs do not contain any record of the data manipulation operations that were
performed after the backup in a no-logging database, these transactions are lost.
You can enable logging for a no-logging database using either the ontape utility or the
ondblog and onbar utilities.
For example, if the stores database currently uses no logging, you can change it to
buffered logging as follows:
ontape -s -B stores
or
ondblog buf stores
onbar -b -F
Unbuffered logging
• All statements are written to the logical logs
• COMMIT flushes the logical log buffer
(Figure: transactions written to the logical log buffer in memory; each COMMIT causes the buffer to be flushed to the logical log files on disk.)
Unbuffered logging
When a database is created with unbuffered logging, all transaction activity is written to
the logical log buffers in shared memory and then flushed to disk when the COMMIT (or
COMMIT WORK) statement is executed. This ensures that all completed work is saved
on disk and guarantees all committed transactions can be successfully recovered
following any type of system failure.
Buffered logging
• All statements are written to the logical logs
• The logical log buffer is flushed to disk when it becomes full
(Figure: transactions accumulate in the logical log buffer; the buffer is flushed to the logical log files on disk only when it becomes full.)
Buffered logging
If the database is created with buffered logging, all transaction activity is written to the
logical log buffers in shared memory. These buffers in memory are flushed to logical
logs on disk as they become full.
The advantage of flushing the logical-log buffer when it becomes full is to reduce the
number of physical I/Os performed. Because a physical I/O is a relatively expensive
operation, this can improve database performance. The disadvantage of buffered
logging is, should a system crash occur, whatever is contained in the logical-log buffer
was not written to disk and is lost.
You can change the logging mode for a database from buffered to unbuffered at any
time.
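For example, a minimal sketch of such a change, assuming the stores database used earlier and that a level-0 backup follows the change, as in the previous example:

ondblog unbuf stores
onbar -b -F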
If an instance contains both unbuffered log databases and buffered log databases, the
logical log buffers are flushed each time a COMMIT (or COMMIT WORK) statement is
executed for an unbuffered log database, or when they become full. They are also
flushed to disk whenever a checkpoint occurs, and whenever a connection is closed.
• Owner naming is enforced for MODE ANSI databases. Unless you are the table
or synonym owner, you must qualify the owner name in every SQL statement, for
example:
SELECT tabname
FROM 'informix'.systables
WHERE tabid > 99
• The default read isolation level for a MODE ANSI database is repeatable read.
This must be considered carefully by database administrators as repeatable read
requires every row that is read to be locked and can negatively affect concurrent
user access to data.
• No default table or synonym privileges are granted to the user PUBLIC.
Environment variable: NODEFDAC
For databases that are not MODE ANSI, the database server automatically grants table
level SELECT, INSERT, UPDATE, DELETE, and INDEX privileges to group PUBLIC.
To prevent default table privileges from being granted, set the NODEFDAC
environment variable before you create the table. A Korn shell example is shown here:
export NODEFDAC=yes
Isolation levels and database and table-level privileges are discussed in greater detail in
a later unit.
(Figure: the db1 and db2 databases stored across the rootdbs, dbspace1, and dbspace2 dbspaces.)
Creating a database
• Create a no-logging database in dbspace db_dbs:
CREATE DATABASE db IN db_dbs;
• Create a database with unbuffered logging in dbspace db_dbs:
CREATE DATABASE db IN db_dbs WITH LOG;
• Create a database with buffered logging in dbspace db_dbs:
CREATE DATABASE db IN db_dbs WITH BUFFERED LOG;
• Create a MODE ANSI database in dbspace db_dbs:
CREATE DATABASE db IN db_dbs WITH LOG MODE ANSI;
Creating a database
The examples show how to create a database using different logging modes.
Creating a table
(Figure: the stores database and one of its tables located in separate dbspaces, rootdbs and dbspace2.)
Creating a table
When creating a new table, the administrator must:
• Select a valid table name
• Identify appropriate columns and data types
• Determine first and next extent sizes
• Select the lock mode
• Choose dbspace location
• Determine if the table is to be logged
To use delimited identifiers, Informix requires that you set the DELIMIDENT
environment variable.
For example, using the UNIX Korn shell, execute the command:
export DELIMIDENT=ON
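With DELIMIDENT set, names enclosed in double quotation marks are treated as delimited identifiers rather than character strings. A minimal sketch, using a hypothetical table whose name contains a space:

CREATE TABLE "order history" (hist_id SERIAL, hist_date DATE);
SELECT * FROM "order history";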
Extents
• An extent is a collection of physically contiguous pages on a disk
• Space for tables is allocated in extents
• Extent sizes for a table are specified when the table is created
(Figure: an eight-page extent; page 0 is a bitmap page and pages 1 through 7 are free pages.)
Extents
Disk space for a table is allocated in units called extents. An extent is an amount of
contiguous space on disk; the amount is specified for each table when the table is
created. Each table has two extent sizes associated with it:
EXTENT SIZE The size of the first extent allocated for the table.
This first extent is allocated when the table is
created. The default is eight pages.
NEXT SIZE The size of each subsequent extent allocated as the
table grows.
When an extent is added, all pages are flagged as FREE except for one or more
bitmap pages. When the first extent allocated has no more space (that is, all pages
contain data), another extent is allocated for the table; when this extent is filled, another
extent is allocated, and so on.
Regardless of the logging mode, extent allocation, extent merging, and other extent
operations are always logged.
Tblspace
All the extents allocated in a specific dbspace for a given table are logically grouped
together and are referred to as a tblspace. While the space within an extent is
guaranteed to be contiguous, the space represented by the tblspace might not be
contiguous as extents can be spread across a device as space permits.
Extent size
The minimum size for an extent is four pages. There is no maximum size. An extent
size must be an even multiple of the page size for the system.
It is important to calculate extent requirements for your tables.
If rowsize is greater than pageuse, the database server divides the row
between pages. The initial portion of the row is the homepage. Subsequent
portions are stored in remainder pages.
The size of the table is calculated as:
number_of_data_pages = number_of_homepages +
number_of_remainder_pages
5. Calculate the total space required in kilobytes:
(number_of_data_pages * pagesize)/1024
To calculate an appropriate NEXT SIZE value for successive extents allocated for the
table, apply steps 1 through 5, but instead of using the initial number of rows, use the
number of rows by which you anticipate the table will grow over time. Also, be sure to
consider how much disk space you have available and whether you plan to add
additional disks to the system at a specific time.
Example
Assume that your table initially has 1,000,000 rows and is expected to grow
between 10 percent and 30 percent per year. Also, assume that you have
budgeted to purchase more disks in 12 months and that you will reload your
database to distribute it over existing and new devices at that time.
Given these assumptions, you might want to size additional extents to hold
100,000 rows. If your table grows at 10 percent per year, the database server
only allocates one extent during the year. If your table grows at 30 percent, the
database server might have to allocate 3 or 4 additional extents. In either
case, the number of extents allocated will be small enough to avoid affecting
performance, or the need to reorganize your table, before the scheduled
maintenance period.
Variable length rows (VARCHAR, LVARCHAR, and NVARCHAR) introduce uncertainty
into the calculations. When Informix allocates space to rows of varying size, it considers
a page to be full when no room exists for an additional row of the maximum size.
Managing extents
(Figure: a new extent allocated contiguous to an existing extent is concatenated with it into a single expanded extent.)
Managing extents
Informix implements several features to simplify extent management for the database
administrator.
Concatenation
The first of these features is automatic extent concatenation. When an extent can be
allocated that is next to (contiguous with) the most recently allocated extent, Informix
automatically concatenates the new extent to the existing extent to make one extent.
You most often see automatic extent concatenation when you perform batch loads on
one table at a time. Since each new extent allocated is contiguous to the previous
extent, the two extents are concatenated. This makes it possible to load a very large
table by using a default extent size and end up with the entire table contained in a
single extent.
Doubling
Another feature that the database server implements to ease the burden of extent
management is automatic extent doubling. Each time the number of extents allocated
for a particular tblspace reaches a multiple of 16, the database server doubles the size
of each successive extent allocated.
Manual modification
If the Database Administrator (DBA) detects that the initial extent-size specification for a
table was inadequate and that many extents are being allocated, the size of successive
extents can be increased by using the ALTER TABLE command. This does not alter
any existing extents, so if a very large number of extents was allocated it might be wise
to rebuild the table.
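For example, a sketch of increasing the next-extent size with ALTER TABLE (the table name is hypothetical; the size is specified in kilobytes):

ALTER TABLE history MODIFY NEXT SIZE 2048;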
Space limitation
When Informix allocates a new extent, it always attempts to locate contiguous free
space equal to or larger than the requested extent size. If no contiguous free space
remains in the dbspace large enough to accommodate the extent size, the database server
allocates the largest remaining segment of contiguous free space, even though it might
be less than the current extent size for the tblspace.
SYSEXTENTS
To collect information about the extents allocated for a given table or index, query the
sysextents table in the sysmaster database (sysmaster:sysextents).
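For example, a sketch of such a query, assuming the dbsname, tabname, and size columns of sysextents:

SELECT dbsname, tabname, size
FROM sysmaster:sysextents
WHERE tabname = 'customer';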
Page-level locking
Page-level locking locks an entire data page and an entire index page. Page-level
locking can reduce the level of concurrency, but might be beneficial for tables that are
always processed in physical order. For example, a temporary table built for processing
month-end financial data is a likely candidate for page-level locking. If the rows are
processed sequentially by a single batch application, concurrency is not an issue, and
page-level locking reduces the number of locks that must be acquired.
Changing the lock mode
To change the lock mode for a table, execute an ALTER TABLE statement. No physical
change is made to the table, but the locklevel column in systables catalog table is
updated to reflect the new mode.
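For example, a brief sketch that changes the lock mode of the orders table and then confirms the change through the locklevel column:

ALTER TABLE orders LOCK MODE (ROW);
SELECT tabname, locklevel FROM systables WHERE tabname = 'orders';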
Setting the default lock mode
You can set a default lock mode to use for all newly created tables by setting either a
configuration parameter or an environment variable.
To override the server default of page-level locking, add a configuration parameter
called DEF_TABLE_LOCKMODE and set it to ROW.
If you only want to override the server settings for the current session, set the
environment variable IFX_DEF_TABLE_LOCKMODE to ROW or PAGE,
as shown by this example:
$ export IFX_DEF_TABLE_LOCKMODE=ROW
The IFX_DEF_TABLE_LOCKMODE environment variable overrides the default set by the
DEF_TABLE_LOCKMODE configuration parameter, and the LOCK MODE option of
the CREATE TABLE or ALTER TABLE commands overrides both the configuration
parameter and the environment variable settings.
Page-level locking is the default mode if no lock mode is specified, either explicitly or by
setting the default lock mode.
Creating a table
CREATE TABLE orders(
order_num SERIAL NOT NULL,
customer_num INTEGER,
order_date DATE
)
IN dbspace1
EXTENT SIZE 64
NEXT SIZE 32
LOCK MODE ROW;
Creating a table
The CREATE TABLE statement:
• Assigns a name to the table that is unique within the database
• Inserts the table and column information into the systables and syscolumns
system catalog tables
• Allocates contiguous storage space, as specified by the EXTENT SIZE clause, in
the database dbspace or the dbspace specified
• Sets the lock level for the table
The example CREATE TABLE statement creates a table named orders with three
columns (order_num of data type SERIAL, customer_num of data type INTEGER, and
order_date of data type DATE). The order_num column is a required value (NOT
NULL); the customer_num column and the order_date column are optional. The table is
placed in the dbspace dbspace1. Initially, 64 kilobytes are allocated for the first extent.
Each successively added extent is 32 kilobytes in size. Table locking is performed at
the row level.
The DBSPACETEMP environment variable can be set to one or more of the specifically
designated temporary dbspaces. If the DBSPACETEMP environment variable is not
set, the database server uses the value of the DBSPACETEMP configuration
parameter.
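For example, a sketch using hypothetical temporary dbspace names (entries can be separated by commas or colons), followed by a non-logging temporary table that the server places in one of them:

$ export DBSPACETEMP=tmpdbs1,tmpdbs2

CREATE TEMP TABLE t_work (cust INTEGER, amount MONEY(8,2)) WITH NO LOG;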
DBCENTURY
• Today’s date: 10/31/1998
• Dates stored with different DBCENTURY settings
DBCENTURY
The environment variable DBCENTURY allows selection of the appropriate century
for two-digit year DATE and DATETIME values.
Acceptable values for DBCENTURY are: P, F, C, or R.
P Past. The year is expanded with both the current and past centuries.
The closest date before today’s date is chosen.
F Future. The year is expanded with both the current and future
centuries. The closest date after today’s date is chosen.
C Closest. The past, present, and next centuries are used to expand the
year value. The date closest to today's date is used.
R Present. The year is expanded with the current century.
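As a worked illustration of the C setting, using the slide's example date of 10/31/1998: a two-digit year of 40 could expand to 1840, 1940, or 2040, and the value 01/01/40 would be stored as 01/01/2040 because that date is closest to the current date.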
-p Print only GRANT statements for the user listed. Specify ALL
in place of the user name to include all users.
If you specify a filename at the end of the command, all output is redirected to that file,
which can then be executed as an SQL script. Otherwise, output is sent to the standard
output destination.
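For example, sketches of typical dbschema invocations, assuming the stores_demo database used in the exercises:

$ dbschema -d stores_demo -t customer customer.sql
$ dbschema -d stores_demo -p ALL grants.sql

The first command writes the schema statements for the customer table to customer.sql; the second writes the GRANT statements for all users to grants.sql.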
Exercise 1
Create databases and tables
• create databases in specific locations
• create regular and temporary tables
• create tables for storing large objects
• use the dbschema utility for generating database and table schemas
Exercise 1:
Create databases and tables
Purpose:
In this exercise, you will create databases and tables, including tables with
large object data types. You will also use the dbschema utility for generating
database and table schemas.
8. The SQL will be executed. Since it includes an error, an error message will be
displayed at the bottom of the screen. (There is no table in the database named
systable.) To correct the error, choose Modify from the menu. You can either
highlight it and press Enter, or just type in the letter m.
9. The cursor will be displayed in the typing area, as close to the error as possible.
Refer to the MODIFY menu at the top of the screen for editing options. Correct
the error so the SQL reads as follows:
SELECT * FROM SYSTABLES;
10. Press the ESC key on the keyboard. The SQL menu will be displayed.
11. Execute the SQL statement by choosing Run. If there are no further errors, the
first page of the results will be displayed.
12. Page through the output by choosing Next from the DISPLAY menu. You can
either highlight it and press enter, or just type in the letter n.
13. Continue paging through the output. When you have seen enough, choose Exit
from the DISPLAY menu. You can either highlight it and press enter, or just
type the letter e. The SQL menu will be displayed.
14. To save the SQL, choose Save from the SQL menu. You can either highlight it
and press Enter, or just type the letter s. The SAVE>> prompt will be displayed.
15. At the SAVE>> prompt, type in the filename into which you want to save the
SQL statement, in this case EXAMPLE, and then press Enter. The SQL menu
will be displayed.
16. To exit out of the editing options, choose the Exit option from the SQL menu.
You can either highlight it and press Enter, or just type the letter e. The
DBACCESS menu will be displayed.
17. To exit out of dbaccess, choose the Exit option from the SQL menu. You can
either highlight it and press Enter, or just type the letter e. The Unix prompt will
be displayed.
18. Enter dbaccess again. Choose the sysmaster database. Choose the Query-
language option. The SQL menu will be displayed.
19. To open an existing SQL file, select the Choose option from the SQL menu.
You can either highlight it and press Enter, or just type the letter c. A list of your
saved files will be displayed.
20. From the file list, choose the EXAMPLE file by highlighting it and pressing
Enter. The contents of the file will be loaded into the typing area. Use the
previously discussed commands to run the SQL and view the output.
21. Exit dbaccess and return to the UNIX prompt.
7. Use oncheck -pe to determine the location of the table extents. Save the output
to a file to compare later.
Hint: If you do not specify the dbspace when creating a table, where is the table
stored?
Hint: When using oncheck to create an output file of your tables for a specific
dbspace, use the following command:
$ oncheck -pe dbspace_name > filename
Have the tables been efficiently located and loaded in the best possible
manner?
Since your tables are already created, you can query the systables table to get
information that is needed to calculate extent sizes. What information would that
be?
8. Drop each of your tables using the following command:
DROP TABLE table_name;
9. Use the following information to recreate your customer table:
• Calculate the first extent size to initially store 1200 rows.
• Calculate a next extent size assuming that your table will grow approximately
10 percent per year for one year.
• Use row-level locking.
• Locate the table in dbspace2.
10. Use the following information to recreate your orders table:
• Calculate the first extent size to initially store 500 rows.
• Calculate a next extent size assuming that your table will grow approximately
10 percent per year for one year.
• Use row-level locking.
• Locate the table in dbspace3.
Exercise 1:
Create databases and tables - Solutions
Purpose:
In this exercise, you will create databases and tables, including tables with
large object data types. You will also use the dbschema utility for generating
database and table schemas.
12. At the "IDS-12.10 dev:" prompt, enter any valid Linux or Informix command. For
example,
dbaccess
13. When you are finished using the terminal window, enter the following at the
"IDS-12.10 dev:" prompt:
exit
Then enter the following at the "docker@default:~$" prompt:
exit
14. When you are finished using the virtual machine, click the X in the upper right-
hand corner. The "VMWare Workstation" dialog box will be displayed. Click on
either the "Suspend" button or the "Power off" button.
Task 2. Using dbaccess.
In this task, you will familiarize yourself with the dbaccess functions for entering,
running, saving, and loading SQL statements.
1. At the UNIX prompt, enter the following command:
dbaccess
This will open the dbaccess editor, with the menu at the top.
2. From the dbaccess menu, choose Query-language. You can either highlight it
and press enter, or just type in the letter q. The SELECT DATABASE prompt
will be displayed.
3. At the SELECT DATABASE prompt, select sysmaster@dev by highlighting it
and pressing Enter. The SQL menu will be displayed.
4. From the SQL menu, choose New. You can either highlight it and press Enter,
or just type in the letter n. The cursor will move to the typing area.
5. In the typing area, type in the following SQL statement:
SELECT * FROM SYSTABLE;
Refer to the top of the screen for editing options (ESC done editing, CTRL-X
delete a character, and so on).
6. After entering the SQL, press the ESC key on the keyboard. The SQL menu will
be displayed.
7. To execute the SQL, choose Run. You can either highlight it and press enter, or
just type in the letter r.
8. The SQL will be executed. Since it includes an error, an error message will be
displayed at the bottom of the screen. (There is no table in the database named
systable.) To correct the error, choose Modify from the menu. You can either
highlight it and press Enter, or just type in the letter m.
9. The cursor will be displayed in the typing area, as close to the error as possible.
Refer to the MODIFY menu at the top of the screen for editing options. Correct
the error so the SQL reads as follows:
SELECT * FROM SYSTABLES;
10. Press the ESC key on the keyboard. The SQL menu will be displayed.
11. Execute the SQL statement by choosing Run. If there are no further errors, the
first page of the results will be displayed.
12. Page through the output by choosing Next from the DISPLAY menu. You can
either highlight it and press enter, or just type in the letter n.
13. Continue paging through the output. When you have seen enough, choose Exit
from the DISPLAY menu. You can either highlight it and press enter, or just
type the letter e. The SQL menu will be displayed.
14. To save the SQL, choose Save from the SQL menu. You can either highlight it
and press Enter, or just type the letter s. The SAVE>> prompt will be displayed.
15. At the SAVE>> prompt, type in the filename into which you want to save the
SQL statement, in this case EXAMPLE, and then press Enter. The SQL menu
will be displayed.
16. To exit out of the editing options, choose the Exit option from the SQL menu.
You can either highlight it and press Enter, or just type the letter e. The
DBACCESS menu will be displayed.
17. To exit out of dbaccess, choose the Exit option from the SQL menu. You can
either highlight it and press Enter, or just type the letter e. The Unix prompt will
be displayed.
18. Enter dbaccess again. Choose the sysmaster database. Choose the Query-
language option. The SQL menu will be displayed.
19. To open an existing SQL file, select the Choose option from the SQL menu.
You can either highlight it and press Enter, or just type the letter c. A list of your
saved files will be displayed.
20. From the file list, choose the EXAMPLE file by highlighting it and pressing
Enter. The contents of the file will be loaded into the typing area. Use the
previously discussed commands to run the SQL and view the output.
21. Exit dbaccess and return to the UNIX prompt.
3. Create a stock table (type the code and Run) in your database with the
following columns:
Column Description
name
stock_num Manufacturer stock number that identifies the specific item. It is a
number, no greater than 1,000.
manu_code Manufacturer code for the item. The code is 3 characters in length.
description Item description. Allow for a length of 15.
unit_price Item price per unit. It has a maximum of 6 digits, including 2
decimal places.
unit Unit by which the item is ordered. Allow for a length of 4.
unit_descr Description of the unit. Allow for a length of 15.
CREATE TABLE stock (
stock_num SMALLINT,
manu_code CHAR(3),
description CHAR(15),
unit_price MONEY(6,2),
unit CHAR(4),
unit_descr CHAR(15)
)
;
4. Create an orders table (type the code and Run) in your database with the
following columns:
Column name Description
order_num Number starting at 1001 and increasing by 1 for every order.
order_date Date the order is placed.
customer_num Customer number this order belongs to. This refers to the
customer number in the customer table.
po_num Customer purchase order number. It can contain letters and
numbers. Allow for a length of 10.
ship_date Date the order is shipped.
ship_weight Shipping weight of the order. It contains a maximum of 8 digits,
including 2 decimal places.
ship_charge Shipping charge amount. It contains a maximum of 6 digits,
including 2 decimal places.
paid_date Date the order is paid.
CREATE TABLE orders (
order_num SERIAL(1001),
order_date DATE,
customer_num INTEGER,
po_num CHAR(10),
ship_date DATE,
ship_weight DECIMAL(8,2),
ship_charge MONEY(6,2),
paid_date DATE
)
;
5. Create an items table (type the code and Run) in your database with the
following columns:
Column Description
name
item_num Numeric identifying the individual line number of this item. It has a
maximum value of 1,000.
order_num Order number this item belongs to. This refers to the order number
in the orders table.
stock_num Stock number for the item. This refers to the stock number in the
stock table.
manu_code Manufacturer code for the item ordered. This refers to the
manufacturer code in the stock table.
quantity Quantity ordered. This has a maximum value of 2,000.
total_price Quantity ordered times unit price (from the stock table). Allow for a
maximum of 8 digits including 2 decimal places.
CREATE TABLE items (
item_num SMALLINT,
order_num INTEGER,
stock_num SMALLINT,
manu_code CHAR(3),
quantity SMALLINT,
total_price MONEY(8,2)
)
;
6. Run the load script load.sql. This script loads the tables you created. Use the
following command to execute the load script:
$ dbaccess stores_demo load.sql
Use oncheck -pe to determine the location of the table extents. Save the output
to a file to compare later.
Hint: If you do not specify the dbspace when creating a table, where is the table
stored? It is stored in the same dbspace as the database (in this case,
dbspace1).
Hint: When using oncheck to create an output file of your tables for a specific
dbspace, use the following command:
$ oncheck -pe dbspace_name > filename
For example:
$ oncheck -pe dbspace1 > D1T3S6 (represents Demo 1, Task 3, Step 6)
Have the tables been efficiently located and loaded in the best possible
manner?
No, the tables are all stored in dbspace1, and may have multiple
fragments interleaved with fragments of other tables.
Since your tables are already created, you can query the systables table to get
information that is needed to calculate extent sizes. What information would that
be? Note: nrows may show up as 0 since UPDATE STATISTICS has not been
run yet.
Assuming the structure given above and a 2-kilobyte page size, the initial extent
calculation is:
-Initial rows = 500
-Rowsize = 2 + 4 + 2 + 3 + 2 + 5 + 4 (slot) = 22
-Pageuse = 2048 - 28 = 2020
Since rowsize is less than the page size:
-Rows per page = (pageuse/rowsize) = 2020 / 22 = 91.81 ==> 91
-Number of data pages required = 500 / 91 = 5.494 ==> 6
-Number of kilobytes required = (6 * 2048) / 1024 = 12
The EXTENT SIZE is 12.
Assuming that the table will grow by 10 percent per year, the DBA can calculate
the NEXT SIZE by applying the calculations above for growth rows:
-Number of rows at 10 percent growth = 500 / 10 = 50
-Number of data pages required = 50 / 91 = 0.549 ==> 1
However, the minimum extent size is 4 pages, so the NEXT SIZE must be:
-Number of kilobytes required = (4 * 2048) / 1024 = 8
The NEXT SIZE is 8.
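A sketch of how the calculated sizes plug into a CREATE TABLE statement; here the orders table definition from earlier is reused as an illustration with the 12-kilobyte first extent and 8-kilobyte next extent computed above, row-level locking, and the dbspace named in the exercise instructions (use the values from your own calculation):

CREATE TABLE orders (
order_num SERIAL(1001),
order_date DATE,
customer_num INTEGER,
po_num CHAR(10),
ship_date DATE,
ship_weight DECIMAL(8,2),
ship_charge MONEY(6,2),
paid_date DATE
)
IN dbspace3
EXTENT SIZE 12
NEXT SIZE 8
LOCK MODE ROW;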
Assuming the structure given above and a 2-kilobyte page size, the initial extent
calculation is:
-Initial rows = 74
-Rowsize = 2 + 3 + 15 + 4 + 4 + 15 + 4 (slot) = 47
-Pageuse = 2048 - 28 = 2020
Since rowsize is less than the page size:
-Rows per page = (pageuse/rowsize) = 2020 / 47 = 42.98 ==> 42
-Number of data pages required = 74 / 42 = 1.761 ==> 2
-Number of kilobytes required = (2 * 2048) / 1024 = 4
However, the minimum extent size is 4 pages, so the EXTENT SIZE must be:
-Number of kilobytes required = (4 * 2048) / 1024 = 8
The EXTENT SIZE is 8.
Assuming that the table will grow by 10 percent per year, the DBA can calculate
the NEXT SIZE by applying the calculations above for growth rows:
-Number of rows at 10 percent growth = 74 / 10 = 7.4
-Number of data pages required = 7.4 / 42 = 0.176 ==> 1
However, the minimum extent size is 4 pages, so the NEXT SIZE must be:
-Number of kilobytes required = (4 * 2048) / 1024 = 8
For example:
$ oncheck -pe dbspace1 > D1T3S6 (represents Demo 1, Task 3, Step 13)
• Query the systables system catalog for the customer table for the
following:
• First extent size
• Next extent size
• Type of table locking
• Row size
Is the row size correct? Why or why not?
The rowsize is correct for the data columns, but it does not include the
extra four bytes for the slot table entry.
SELECT * FROM systables
WHERE tabname = "customer";
Assuming the structure given above and a 2-kilobyte page size, the initial extent
calculation is:
Initial rows = 74
Rowsize = 4 + 2 + 56 + 56 + 256 + 4 (slot) = 378
Use the maximum length value for the cat_advert column. In this case it is 255
bytes, plus 1 for the length byte for the VARCHAR.
Pageuse = 2048 - 28 = 2020
Since rowsize is less than the page size:
Rows per page = (pageuse/rowsize) = 2020 / 378 = 5.34 ==> 5
Number of data pages required = 74 / 5 = 14.8 ==> 15
Number of kilobytes required = (15 * 2048) / 1024 = 30
The EXTENT SIZE is 30.
Assuming that the table will grow by 10 percent per year, the DBA can calculate
the NEXT SIZE by applying the calculations above for growth rows:
Number of rows at 10 percent growth = 74 / 10 = 7.4
Number of data pages required = 7.4 / 5 = 1.48 ==> 2
However, the minimum extent size is 4 pages, so the NEXT SIZE must be:
Number of kilobytes required = (4 * 2048) / 1024 = 8
The NEXT SIZE is 8.
This only calculates the extent sizes for the data rows, and does not include the
storage space required for the cat_descr and cat_picture objects themselves.
These objects will be stored in the same tablespace as the data rows, so their
size estimates must also be included.
2. Load the catalog table using the load script loadcat.sql.
$ dbaccess stores_demo loadcat.sql
Unit summary
• Review prerequisites
• Create databases and tables
• Determine database logging and storage requirements
• Locate where the database server stores a table on disk
• Create temporary tables
• Locate where the database server stores temporary tables
• Use the system catalog tables to gather information
• Use the dbschema utility
Unit summary
Unit objectives
• Drop a database
• Drop a table
• Alter a table
• Convert a simple large object to a smart large object
Unit objectives
Altering a table
• The ALTER TABLE statement allows you to:
Add new columns to the end of a table
Add new columns before another column in the table
Add a new column with a NOT NULL constraint or a default value
Drop columns
Add and drop integrity constraints
Modify the definition of an existing column
Change the size of successive extent allocations
Change the lock mode for a table
• Must have exclusive access to table:
Places exclusive lock
Duration of lock depends on type of ALTER TABLE
Altering a table
Informix offers a robust set of ALTER TABLE capabilities. It uses one of three
sophisticated algorithms to execute ALTER TABLE statements. The three algorithms
are:
• Fast alter
• In-place alter
• Slow alter
All ALTER TABLE statements require an exclusive lock on the table being altered, but
the duration of the lock depends on which alter method is used.
Fast ALTER
A fast alter is performed when the ALTER TABLE command:
Modifies the lock mode
ALTER TABLE orders
LOCK MODE (ROW);
Changes the next extent size
ALTER TABLE customer
MODIFY NEXT SIZE 20;
Adds or drops a constraint
ALTER TABLE manufact
DROP CONSTRAINT con_name;
Fast ALTER
When the ALTER TABLE statement performs an alter operation that does not affect
the table data, Informix performs a fast alter. Only the system catalog tables are
updated, since there is no need to modify any data pages. The table is unavailable
to users only for the brief time required to execute the update operation on the
system catalog tables.
Internally, a fast alter is performed when the ALTER TABLE statement modifies the lock
mode of an existing table, changes the next extent size of an existing table, or adds or
drops a constraint. (Adding and dropping a constraint are covered in a future unit.)
In-place ALTER
Generally, an in-place alter is performed when the ALTER TABLE
command:
Adds a column or list of columns
ALTER TABLE customer
ADD birthday DATE;
Drops a column
ALTER TABLE customer
DROP birthday;
Modifies the data type of a column
ALTER TABLE customer
MODIFY birthday DATETIME YEAR TO MINUTE;
Modifies a column that is part of a fragmentation expression
In-place ALTER
For most ALTER TABLE statements that actually modify rows and affect data pages,
in-place alter logic is applied. This sophisticated algorithm allows the server to simply
record the alterations in the system catalog tables, delaying the overhead of rewriting
data pages until other modifications necessitate page updates.
Table definition versions
The in-place alter table algorithm accomplishes the alter operation by creating a
new version of the table definition. Each data page is associated with a version.
After the in-place ALTER TABLE statement, new rows are inserted into data pages
with the new version only. When rows on old pages are updated, all the rows on the
data page are updated to the new version, if there is enough room. If there is not
enough room, the row is deleted from the old page and inserted into a page with the
new version. Up to 255 versions of a table definition are allowed by the database
server. Information about versioning is available by using the oncheck utility:
oncheck -pT database:table
Each subsequent in-place ALTER TABLE statement on the same table takes more
time to execute. Informix recommends no more than 50 to 60 outstanding alters on
a table. If you want to eliminate multiple versions of a table, force an immediate
change to all rows. For example, use a dummy UPDATE statement that sets the
value of a column to itself.
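For example, a minimal sketch of such a dummy update, using one of the customer table columns:

UPDATE customer SET fname = fname;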
Logging and in-place alter
The ALTER TABLE statement, like all DDL statements, creates log entries even
when the database is not logged. Using the in-place alter algorithm, each data page
is logged at the time that the change physically takes place (that is, when a row is
inserted or updated).
An in-place alter DOES NOT occur on fragmented tables that use ROWIDs.
Slow ALTER
A slow ALTER is performed when the ALTER TABLE command:
• Adds or drops a column created with the ROWIDS or CRCOLS
keyword
• Drops a column of data type TEXT or BYTE
• Modifies the data type of a column in such a way that possible values
of the old type cannot be converted to the new type
• Modifies the data type of a column used in a FRAGMENT clause in such a way
that value conversion might cause rows to move to another fragment
Slow ALTER
There are a number of situations where Informix must perform a slow alter instead
of a fast ALTER or an in-place ALTER, including the ones listed here.
There are important considerations which must be taken into account when doing a
slow ALTER. These considerations are discussed on the following page.
When a table is renamed, references to the table within any views are changed. The
table name is replaced if it appears in a trigger definition. It is not replaced if it is inside
any triggered actions. The RENAME TABLE command operates on synonyms as well
as tables.
Column and table names within the text of routines are not changed by RENAME
COLUMN or RENAME TABLE. The routine returns an error when it references a non-
existent column or table.
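For reference, minimal sketches of the two statements (the new names are hypothetical):

RENAME TABLE orders TO order_header;
RENAME COLUMN customer.phone TO phone_num;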
Exercise 2
Alter and drop databases and tables
• create and drop databases
• create, drop, and alter tables
• convert simple large objects to smart large objects
Exercise 2:
Alter and drop databases and tables
Purpose:
In this exercise, you will alter and drop databases and tables. You will also
convert a simple large object to a smart large object.
3. Alter the orders table to add the following columns before the po_num column.
The columns must appear in this order:
Column Description
name
ship_instruct The special shipping instructions. Allow for a length of 40.
backlog Flag to indicate whether the order has been filled or not (values
will be ‘y’ or ‘n’).
4. Alter the catalog table to add the following column after the stock_num
column:
Column Description
name
manu_code Manufacturer code for the item ordered. This refers to the
manufacturer code in the stock table.
5. Run the load script called alterload.sql. This script loads data into the new
columns for each table. To execute the load script, run the following command:
$ dbaccess stores_demo alterload.sql
6. Execute the oncheck -pT commands again and notice any differences in the
version information.
Hint: Since your tables are already created, you could query the systables table
to get information that is needed to calculate extent sizes. What information
would that be?
Task 4. Convert a simple large object to a smart large object.
In this task, you alter the catalog table to change the cat_descr column from TEXT
to CLOB and relocate the column in an sbspace.
1. Alter the catalog table and change the cat_descr column from TEXT to CLOB
and relocate the column in the sbspace named s9_sbspc.
2. Verify that the table has been altered using dbaccess.
Results:
In this exercise, you altered and dropped databases and tables. You also
converted a simple large object to a smart large object.
Exercise 2:
Alter and drop databases and tables - Solutions
Purpose:
In this exercise, you will alter and drop databases and tables. You will also
convert a simple large object to a smart large object.
The customer table has only one version with all 73 data pages listed in
version 0, the current version.
$ oncheck -pT stores_demo:orders
The orders table has only one version with all 11 data pages listed in
version 0, the current version.
The catalog table has only one version with all 9 data pages listed in
version 0, the current version.
2. Alter the customer table to add the following column after the zipcode column:
Column name Description
phone The phone number of the customer. Allow for a length of 18.
ALTER TABLE customer
ADD phone CHAR(18);
3. Alter the orders table to add the following columns before the po_num column.
The columns must appear in this order:
Column Description
name
ship_instruct The special shipping instructions. Allow for a length of 40.
backlog Flag to indicate whether the order has been filled or not (values
will be ‘y’ or ‘n’).
ALTER TABLE orders
ADD ship_instruct CHAR(40)
BEFORE po_num;
ALTER TABLE orders
ADD backlog CHAR(1)
BEFORE po_num;
4. Alter the catalog table to add the following column after the stock_num
column:
Column Description
name
manu_code Manufacturer code for the item ordered. This refers to the
manufacturer code in the stock table.
ALTER TABLE catalog
ADD manu_code CHAR(3)
BEFORE cat_descr;
5. Run the load script called alterload.sql. This script loads data into the new
columns for each table. To execute the load script, run the following command:
$ dbaccess stores_demo alterload.sql
6. Execute the oncheck -pT commands again and notice any differences in the
version information.
customer table:
The customer table now has two versions, but all 84 data pages are listed
in version 1, the current version.
orders table:
Notice that the orders table now has three versions, while the others have
only two. This is because the changes to the orders table were done in
two separate ALTER TABLE steps instead of being combined into one
ALTER TABLE statement, each creating a new version. All 22 data pages
are shown in version 2, the current version.
A much better way of doing this alter, resulting in only one new version,
would have been:
ALTER TABLE orders
ADD (ship_instruct CHAR(40)
BEFORE po_num,
backlog CHAR(1)
BEFORE po_num);
catalog table:
The catalog table now has two versions, but all nine data pages are
listed in version 1, the current version.
Results:
In this exercise, you altered and dropped databases and tables. You also
converted a simple large object to a smart large object.
Unit summary
• Drop a database
• Drop a table
• Alter a table
• Convert a simple large object to a smart large object
Unit summary
Unit objectives
• Build an index
• Alter, drop, and rename an index
• Identify the four index characteristics
Unit objectives
(Figure: a B+ tree index with a root node, branch nodes, and leaf nodes across levels 0, 1, and 2; the leaf-node key values point to the data rows.)
If you run oncheck -pT databasename:tablename on a table with an index that has two
levels, you can see that level 1 is the root node and level 2 is the leaf node. These are
actually nodes 0 and 1, but are displayed as levels 1 and 2.
When you access a row through an index, you read the B+ tree starting at the root
node and follow the nodes down to the lowest level, which contains the pointer to the
data. In the example above, three read operations are required to find the pointer to the
data.
Keep key size to a minimum for two reasons:
• A smaller key size means that one page in memory holds more key values, which
potentially reduces the number of read operations necessary to look up several
rows.
• A smaller key size can cause fewer B+ tree levels to be used. This is important
from a performance standpoint. An index with a 4-level tree requires one more
read per row than an index with a 3-level tree. If 100,000 rows are read in an
hour, the 3-level index requires 100,000 fewer reads than the 4-level index to obtain
the same data.
For Informix, the size of a node is the size of one page.
R-tree indexes
Informix also provides R-tree indexes as a registered secondary access method for
tables. R-tree indexes are useful for searching multidimensional spatial data. They are
not covered in this course.
B+ tree splits
B+ tree splits
When a new index item is inserted into a full index node, the node must split. B+
trees grow toward the root. Attempting to add a key into a full node forces a split into
two nodes and promotes the middle key value to a node at a higher level. If the key
value that causes the split is greater than the other keys in the node, it is put into a
node by itself during the split. The promotion of a key to the next higher level can
also cause a split in the higher level node. If the full node at this higher level is the
root, it also splits. When the root splits, the tree grows by one level and a new root
node is created.
In the example, key 88 needs to be added, but the node is full. A split forces half the
keys (378 and 414) into one node and half the keys (88, 150, and 292) into the
other node on the same level. Key 292 is promoted to the next highest level.
Figure: a unique index on the customer_num column and a duplicate index on the lname column, each shown with its key values pointing to rows in the customer table. A unique index allows only one row per key value; a duplicate index allows a key value to point to more than one row.
Composite index
Index limitations:
Maximum 16 columns
Maximum length 380 bytes
Composite index
An index on two or more columns is called a composite index.
The principal functions of a composite index are to:
• Facilitate multiple column joins
• Increase uniqueness of indexed values
Index limitations
The maximum number of columns that you can use in a composite index is 16.
In addition to the 16-column limit, the maximum size of an Informix index key is 380
bytes. This size is calculated by summing up the lengths of the data types in the index
columns.
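For example (the index name is illustrative), a composite index on the stock_num and manu_code columns of the catalog table supports joins and filters on both columns together, and can also be used for queries that reference only the leading stock_num column:
CREATE INDEX catalog_stock_manu_ix
ON catalog(stock_num, manu_code);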
Cluster indexes
Cluster indexes
When you create a cluster index or alter an existing index to a cluster index, the server
rewrites the data rows in the table to match the order of the index. Since the data is
physically written in the order of the cluster index, each table can have only one cluster
index.
When you use or consider cluster indexes, you need to know that:
• Informix does not maintain clustering of the data rows as new rows are inserted
or as existing key values are updated. Therefore, cluster indexes are most
effectively used on relatively static tables and are less effective on very dynamic
tables.
• You can recluster an index and the data rows at any time with the statement:
ALTER INDEX index_name TO CLUSTER;
ALTER INDEX requirements
When the ALTER INDEX index_name TO CLUSTER statement is executed, the server
makes a copy of the entire table on disk in the order of the index before dropping the
old table. You must have sufficient space available in the dbspace to hold a copy of the
table. ALTER INDEX also requires exclusive access to the table.
Since Informix B+ tree indexes can be traversed in either direction, you do not need to
specify the ASC (ascending) or DESC (descending) keyword when you create an index
on a single column. However, you might find it useful to use the DESC keyword for
specific columns in multicolumn indexes. For example, perhaps your applications
frequently retrieve order information sorted by order number and order date in
descending order. An index, such as defined in the following example, eliminates
repeated sorts by the database server:
CREATE INDEX order_ix1 ON orders (order_num, order_date desc);
Detached indexes
• A detached index is one that does not follow the table:
Created in different dbspace from table
Uses different fragmentation strategy
• Index extents are stored separately from table extents
• By default, index extents are created in the dbspace that holds the
data extents
• An index can be placed in a separate dbspace
Detached indexes
The database server automatically determines the extent size for a detached index.
For a detached index, the database server uses the ratio of the index key size, plus
some overhead bytes, to the row size to assign the extent size for the index. The
server-generated index extent size is calculated as follows:
index_extent_size = ((index_key_size + 9) / table_row_size) * table_extent_size
Examples:
ALTER INDEX ix_man_cd TO CLUSTER;
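For example (the index and dbspace names are illustrative), a detached index is created by naming a dbspace other than the one that holds the table data:
CREATE INDEX cust_zip_ix
ON customer(zipcode)
IN dbspace2;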
Figure: a traditional B-tree index structure with a root node, branch nodes, and leaf nodes.
Exercise 3
Create, alter, and drop indexes
• Create indexes
• Alter indexes
• Drop indexes
Exercise 3:
Create, alter, and drop indexes
Purpose:
In this exercise, you will create, alter, and drop indexes.
Exercise 3:
Create, alter, and drop indexes - Solutions
Purpose:
In this exercise, you will create, alter, and drop indexes.
or…
4. Use the oncheck utility to examine the index growth. Use the following
command and answer the following questions:
$ oncheck -pT stores_demo:customer | more
• How many indexes are on the customer table?
Count the number of Index Usage Reports displayed by the
command. You should have only one index on the customer table.
• How many pages are allocated to the table and index? What do they
contain?
Add up the 'Number of pages allocated' values from each TBLSpace
Usage Report. You should have around 489 pages for the customer
table and its index, including data pages, bit-map pages, and index
pages. (Your number may vary.)
7. Query the system catalog tables to find information about the customer_dup
index (use one or the other, below - you have 2 options).
SELECT i.* FROM sysindices i, systables t
WHERE tabname = "customer"
AND t.tabid = i.tabid;
or…
SELECT ix.* FROM sysindexes ix, systables t
WHERE tabname = "customer"
AND t.tabid = ix.tabid;
3. Use the oncheck utility to examine the index growth. Use the following
command and answer the following questions:
oncheck -pT stores_demo:orders | more
How many indexes are on the orders table?
Count the number of Index Usage Reports in the oncheck output.
There should be only one index.
• How many pages are allocated to the table and index? What do they
contain?
Add up the "Number of pages allocated" values from each TBLSpace
Usage Report. You should have around 39 pages for the orders table
and its index including data pages, free pages, bit-map pages, and
index pages.
• How many levels and average free bytes?
Obtain this information from the Index Usage Report section. You
should see two levels and about 504 average free bytes. (Your
number may vary.)
4. To see the most recent orders for a customer first, drop and recreate the
orders_ix index to put the order_date in descending order.
DROP INDEX orders_ix;
CREATE INDEX orders_ix
ON orders(customer_num, order_date desc, order_num);
5. Rename the orders_ix index to dateorder_ix.
RENAME INDEX orders_ix TO dateorder_ix;
6. Query the system catalog tables again to find if your index name has changed.
SELECT ix.* FROM sysindexes ix, systables t
WHERE tabname = "orders"
AND t.tabid = ix.tabid;
What is the difference between the two system catalog table query results for
the orders table?
The part2 column value is now a negative value for the descending
order_date column.
Task 3. Create an index on the items table in dbspace3.
In this task, you will create an index for the items table in dbspace3. You will verify
that the index was created using the sysindices system catalog table and use the
oncheck utility to examine the index growth.
1. The items table needs an index on the item_num and order_num columns.
Create an index called items_ix and place it in dbspace3.
CREATE INDEX items_ix
ON items(item_num, order_num)
IN dbspace3;
2. Query the system catalog tables to find information about the items_ix index.
SELECT i.* FROM sysindices i, systables t
WHERE tabname = "items"
AND t.tabid = i.tabid;
3. Use the oncheck utility to examine the index growth. Use the following
command to answer the following questions:
$ oncheck -pT stores_demo:items | more
• Where is the items_ix index located?
The dbspace location is indicated in the following report header:
Index items_ix fragment partition dbspace3 in DBspace dbspace3
• How many pages are allocated to the index? What do they contain?
Note the "Number of pages allocated" value from each TBLSpace
Usage Report. You should have about 23 pages allocated for the table
and 16 pages for the index including bit-map, data, and index pages.
(Your numbers may vary.)
Results:
In this exercise, you created, altered, and dropped indexes.
Unit summary
• Build an index
• Alter, drop, and rename an index
• Identify the four index characteristics
Unit summary
Informix (v12.10)
Unit objectives
• Explain the benefits of indexing
• Evaluate the costs involved when indexing
• Explain the maintenance necessary with indexes
• Describe effective management of indexes
• Enable and disable indexes
Unit objectives
Benefits of indexing
• Use filtering to reduce the number of pages read (I/O)
• Eliminate sorts
• Ensure uniqueness of key values
• Reduce the number of pages read by using key-only reads
Benefits of indexing
Filtering with indexes
An index on a column or columns can be used to filter the data to identify which data
pages must be read to complete the query.
Sorting with indexed reads
An index on a column or columns can be used to retrieve data in sorted order. By
reading the data using the index, the database server can return the data in the order
requested, in either ascending or descending order, while eliminating the need to
perform a sort operation.
Enforcing uniqueness
When you create an index on a column with the UNIQUE keyword, only one row in the
table can have a column with that value. This prevents the need to perform any
uniqueness checking through the application program.
Key-only selects
When all columns listed in the query are part of the same index, Informix does not read
the data rows (pages), as all of the data is already available in the index.
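For example, assuming a unique index exists on the customer_num column (as created in the earlier exercises), the following query can be satisfied with a key-only read because every column it references is in the index:
SELECT customer_num FROM customer
WHERE customer_num BETWEEN 104 AND 111;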
Costs of indexing
Figure: every insert, update, and delete against a table must also be applied to each of the table's indexes.
Costs of indexing
Disk space costs
The first cost associated with an index is disk space. An index contains a copy of every
unique data value in the indexed columns and an associated 4-byte slot table entry. It
also contains a 4-byte pointer for every row in the table and a 1-byte delete flag. For
indexes on fragmented tables, the 4-byte pointer is expanded to 8 bytes to
accommodate a fragment ID. This can add many pages to the space requirements of
the table. It is not unusual to have as much disk space dedicated to index data as to
row data.
Processing time costs
The second cost is the processing time required while the table is modified. Before a
row is inserted, updated, or deleted, the index key must be located in the B+ tree.
Assume that you need an average of two I/O operations to locate an index entry. Some
index nodes might be in shared memory, while other index nodes that need modification
might have to be read from disk. Under these assumptions, index maintenance requires
more time to handle different kinds of modifications as the following sections show.
Delete overhead
When a row is deleted from a table, delete flags are set for the keys in all indexes for
the row for later deletion (more overhead). The slot entry in the data page is set to zero.
Insert overhead
When a row is inserted, the related entries are inserted in all indexes. The node for the
inserted row entry is found and rewritten for each index.
Update overhead
When a row is updated, the related entries are located in each index that applies to a
column that was altered. The index entry is rewritten to eliminate the old entry; the new
column value is then located in the same index or a new entry is made.
Many insert and delete operations can also cause a major restructuring of the B+ tree
index, which requires more I/O activity.
B+ tree maintenance
Figure: inserting 'Brown' into a full node. Before the insert, the node holds Adams, Downing, Johnson, and Smith; after the insert, the node splits into one node containing Adams and Brown and another containing Downing, Johnson, and Smith, with Downing promoted as the separator key.
B+ tree maintenance
B+ tree maintenance requires that nodes are split, merged, and shuffled to maintain an
efficient tree while accommodating inserts, updates, and deletes of key items. Informix
has built sophisticated node-management techniques to minimize the performance
effect of B+ tree management.
Delete compression
To free index pages and maintain a compact tree, Informix evaluates each node after
physical deletes of index items to determine if the node is a candidate for compression.
If the node is not a root node and it has fewer than three index items, the node
becomes a compression candidate.
Merging
If either the right or left sibling node can accommodate the index keys that remain in the
compression candidate, the index items are merged to the sibling node and the
candidate node is freed for reuse.
Shuffling
If neither sibling can accommodate the index items, the server attempts to balance the
nodes by selecting the sibling node with the most items and shuffling some of those
items to the compression candidate so that both nodes have an equal or nearly equal
number of keys.
Shuffling helps maintain a balanced tree and prevents a node from becoming full if its
adjacent nodes are not. This in turn helps reduce the likelihood that node splitting will
be required.
Splits
When a new key value must be added to a full index node, the node must split to make
room for the new key. To perform a node split, the database server must write three
pages.
Goal of index management
The goal of index management is to minimize splits, merges, and shuffles because:
• Extra processing is required.
• The affected index pages must be locked and written by the database server.
• Increased disk space is required.
• After a split, there are new pages that are not full. The partially full nodes increase
the disk space requirements for storing your index.
• Many splits reduce caching effectiveness. If one full node holds 200 key values,
after a split it might be necessary to cache three pages in memory to have
access to the same 200 keys.
Indexing guidelines
• Create an index on:
Join columns
Selective filter columns
Columns frequently used for ordering
• Avoid highly duplicate indexes
• Limit the number of indexes on tables used primarily for data entry
• Keep key size small
• Use composite indexes to increase uniqueness
• Use clustered indexes to speed up retrieval
• Disable indexes before large update, delete, or insert operations
Indexing guidelines
The following visuals review some general guidelines that you can apply to help
determine which indexes to create to ensure optimal query performance without placing
unnecessary overhead on the system.
Figure: indexes on the customer and orders tables, including an index on the zipcode filter column of the customer table, used to support joins and selective filters.
Volatile tables
• Avoid heavy indexing of volatile tables
Volatile tables
Because of the extra reads that must occur when indexes are updated, some
degradation occurs when many indexes are placed on a table that is updated
frequently.
A volatile table should not be heavily indexed unless the amount of querying on the
table outweighs the overhead of maintaining the index file.
During periods of heavy querying (for example, reports), you can improve
performance by creating an index on the appropriate column. Creating indexes for a
large table, however, can be a time-consuming process. Also, while the index is
being created, the table might be exclusively locked, preventing other users from
accessing it.
Composite indexes
Figure: a composite index on columns a, b, and c can also satisfy queries on the leading column combinations (a, b) and (a).
Composite indexes
When you create a composite index to improve query performance, some of the
component columns can also take advantage of this index.
If several columns of one table join with several columns in another table, create a
composite index on the columns of the table with the larger number of rows. If several
columns in a query have filter conditions placed on them regularly, create a composite
index corresponding to filter columns used in the query.
Use a composite index to speed up an INSERT into an indexed column that contains
many duplicate values. Adding a unique (or more unique) column to a column that has
many duplicate values increases the uniqueness of the keys and reduces the length of
the duplicate lists. The query can perform a partial key search by using the first (highly
duplicate) column, which is faster than searching the duplicate lists.
When a table is commonly sorted on several columns, a composite index
corresponding to those columns can help avoid repetitive sorts.
Clustered indexes
Clustered indexes can speed up retrieval
Figure: the customer table before and after clustering by lname. Before, the rows are stored in customer_num order (101 Pauli, 102 Sadler, 103 Currie, 104 Higgins); after clustering by lname, the rows are physically rewritten in lname order.
Clustered indexes
Clustering is most useful for relatively static tables.
Clustering and reclustering take a lot of space and time. You can avoid some clustering
by loading data into the table in the desired order. The physical order of the rows is their
insertion order, so if the table is initially loaded with ordered data, no clustering is
needed.
Figure: index builds for massive updates or large loads. Parallel scan threads read the chunks, pass the keys through an exchange to parallel sort threads, and a B-tree appender writes the index.
4. Estimate the size of a typical index entry, entrysize, with one of the following
formulas, depending on whether the table is fragmented or not:
- For non-fragmented tables or fragmented tables with attached indexes,
use the following formula:
entrysize = keysize * propunique + 5
- For fragmented tables with detached indexes, use the following formula:
entrysize = keysize * propunique + 9
5. Estimate the number of entries per index page with the following formula:
pagents = trunc(pagefree / entrysize)
The trunc function notation indicates that you should round down to the
nearest integer value; pagefree is the pagesize minus the page header (2,020
for a 2-kilobyte pagesize).
6. Estimate the number of leaf pages with the following formula:
leaves = ceiling(rows/pagents)
The ceiling function notation indicates that you should round up to the nearest
integer value; rows is the number of rows that you expect to be in the table.
7. Estimate the number of branch pages at the second level of the index with the
following formula:
branches_0 = ceiling(leaves / pagents)
If the value of branches_0 is greater than 1, more levels remain in the index.
To calculate the number of pages contained in the next level of the index, use
the following formula:
branches_n+1 = ceiling(branches_n / pagents)
where:
- branches_n is the number of branches for the last index level that you
calculated.
- branches_n+1 is the number of branches in the next level.
8. Repeat the calculation in step 7 for each level of the index until the value of
branches_n+1 equals 1.
9. Add the total number of pages for all branch levels calculated in steps 7
through 8. This sum is called the branchtotal.
10. Use the following formula to calculate the number of pages in the compact
index:
compactpages = (leaves + branchtotal)
11. If necessary, incorporate the fill factor into your estimate for index pages:
indexpages = 100 * compactpages / FILLFACTOR
The default fill factor for indexes is determined by the value of the FILLFACTOR
configuration parameter. The fill factor for a specific index can be modified by
including the FILLFACTOR clause in the CREATE INDEX statement in SQL.
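As an illustration only (all of the input values below are assumed), consider a detached index with a key size of 8 bytes, all unique values (propunique = 1), 100,000 rows, a 2-kilobyte page size, and a FILLFACTOR of 90:
entrysize = 8 * 1 + 9 = 17
pagents = trunc(2020 / 17) = 118
leaves = ceiling(100000 / 118) = 848
branches_0 = ceiling(848 / 118) = 8
branches_1 = ceiling(8 / 118) = 1
branchtotal = 8 + 1 = 9
compactpages = 848 + 9 = 857
indexpages = 100 * 857 / 90 = approximately 952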
Exercise 4
Managing and maintaining indexes
manage database indexes
decide which columns to use for indexes
create and drop indexes
Exercise 4:
Managing and maintaining indexes
Purpose:
In this exercise, you will learn how to manage and maintain indexes.
Exercise 4:
Managing and maintaining indexes - Solutions
Purpose:
In this exercise, you will learn how to manage and maintain indexes.
Examine each table in your database and find the duplicate values that should
be indexed for that table.
Customer table: zipcode
Catalog table: stock_num, manu_code
Items table: order_num; stock_num, manu_code
Orders table: customer_num
Stock table: manu_code
3. Examine the following tables in your database and find the values that should
be indexed for reordering the table.
The following answers will vary depending on how the data is accessed.
• Customer table: cluster by lname, fname or by zipcode Catalog table:
cluster by catalog_num
• Items table: cluster by order_num
• Orders table: cluster by order_date or customer_num
• Stock table: cluster by stock_num, manu_code
4. For each table, consider the following questions. If in a classroom, answer these
questions as a group:
• Did I ensure uniqueness of key values?
• Did I index columns that will most likely be included in a SELECT
statement?
• Would a cluster index help with retrieval of data on the table?
• Did I index columns that will most likely be used for ordering the data?
Task 2. Drop and create indexes
In this task, you will drop the existing indexes and create the new indexes for your
database.
1. Drop any existing indexes on your tables.
DROP INDEX index_name;
Hint: You can get a list of index names for your tables by running the
following query:
SELECT idxname FROM sysindexes WHERE tabid > 99;
2. Create your indexes for each table. For the customer table, create the unique
index on customer_num in dbspace4.
Customer table: This index was created to ensure unique values in the
customer_num column.
CREATE UNIQUE INDEX customer_num_ix
ON customer(customer_num) IN dbspace4;
Customer table: This index was created as the clustering index.
CREATE CLUSTER INDEX zipcode_ix
ON customer(zipcode);
Catalog table: This index was created to ensure unique values in the
catalog_num column, since the SERIAL data type will allow duplicates into the
column. It is also the clustering index.
CREATE UNIQUE CLUSTER INDEX catalog_num_ix
ON catalog (catalog_num);
Catalog table: This index was created to improve performance on queries.
CREATE INDEX catalog_stock_manu_ix
ON catalog(stock_num, manu_code);
Items table: This index was created to ensure unique values. Because the
item_num column can contain duplicates, you need the order_num column to
ensure a unique row is returned to the query.
CREATE UNIQUE INDEX item_num_ix
ON items(item_num,order_num);
Items table: This index was created as the clustering index.
CREATE CLUSTER INDEX item_order_num_ix
ON items(order_num);
Items table: This index was created to improve query performance.
CREATE INDEX item_stock_num_ix
ON items(stock_num, manu_code);
Orders table: This index was created to ensure unique values in the order_num
column.
CREATE UNIQUE CLUSTER INDEX ordernum_ix
ON orders(order_num);
Orders table: This index was created to improve query performance.
CREATE INDEX order_custnum_ix
ON orders(customer_num);
Stock table: This index was created to ensure unique values. Because the
stock_num column can contain duplicates, you need the manu_code
column to ensure a unique row is returned to the query. It is also the
clustering index.
CREATE UNIQUE CLUSTER INDEX stocknum_manucode_ix
ON stock(stock_num, manu_code);
Stock table: This index was created to improve query performance.
CREATE INDEX manu_code_ix
ON stock(manu_code);
Catalog table:
The following catalog_num_ix index output is from oncheck -pT.
Items table:
The following item_num_ix index output is from oncheck -pT.
Orders table:
Stock table:
Unit summary
• Explain the benefits of indexing
• Evaluate the costs involved when indexing
• Explain the maintenance necessary with indexes
• Describe effective management of indexes
• Enable or disable indexes
Unit summary
Informix (v12.10)
Unit objectives
• List the ways to fragment a table
• Create a fragmented table
• Create a detached fragmented index
• Describe temporary fragmented table and index usage
Unit objectives
What is fragmentation?
Fragmentation is the distribution of data from one table across separate
dbspaces
What is fragmentation?
Informix supports intelligent horizontal table and index partitioning, referring to it as table
and index fragmentation. Fragmentation allows you to create a table that is treated as a
single table in SQL statements, but consists of multiple tblspaces.
Normal fragmentation calls for one fragment per dbspace. This effectively breaks up the
larger table into multiple smaller table spaces (tables) since a table space cannot span
dbspaces.
The feature called partitioning allows multiple fragments from a fragmented table to co-
exist in the same dbspace. With partitioning, the dbspace name can no longer
represent the fragment since more than one fragment can be in the same dbspace.
Therefore, a partition name is added.
All fragments/partitions of a table must exist in dbspaces with the same pagesize.
Figure: a fragmented table stored as two tblspaces (tblspace1 and tblspace2), each with its own extents, in dbspace_1 and dbspace_2.
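As a minimal sketch (the column definitions are assumptions), the fragmented table shown in the figure could be created with a round-robin distribution scheme:
CREATE TABLE mytable (
col1 INTEGER,
col2 CHAR(10))
FRAGMENT BY ROUND ROBIN IN dbspace_1, dbspace_2;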
Advantages of fragmentation
• Advantages of fragmentation include:
Parallel scans and other parallel operations
Balanced I/O
Finer granularity of archives and restores
Higher availability
Increased security
Joins, sorts, aggregates, groups, and inserts
Advantages of fragmentation
The primary advantages of fragmentation include:
• Balanced I/O: By balancing I/O across disks, you can reduce disk contention and
eliminate bottlenecks. This is advantageous in OLTP systems where a high
degree of throughput is critical.
DSS queries
• Read many rows resulting in little or no transaction activity
• Read data sequentially
• Execute complex SQL operations
• Create large temporary files
• Measure response times in hours and minutes
• Relatively few concurrent queries
DSS queries
Since PDQ can take up more resources than non-PDQ, this feature should generally be
reserved for decision support (DSS) queries with the characteristics listed in the visual.
Always enable parallelism for DSS queries by using the SET PDQPRIORITY statement
or the PDQPRIORITY environment variable.
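For example (the priority values are illustrative), a session can request PDQ resources before a DSS query and release them afterward:
SET PDQPRIORITY 50;
-- run the DSS query here
SET PDQPRIORITY 0;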
OLTP queries
• Relatively few tables are accessed and few rows are read
• Transaction activity (inserts, updates, and deletes)
• Data is accessed by using indexes
• Simple SQL operations
• Response times measured in seconds and fractions of a second
• Many concurrent queries
OLTP queries
OLTP queries are characterized by:
• Relatively few rows and tables are read
• Transaction activity (inserts, updates, and deletes)
• Data is accessed by using indexes
• Simple SQL operations
• Response times measured in seconds and fractions of a second
• Many concurrent queries
Be sure to use a PDQPRIORITY value of 0 for OLTP queries. This ensures that OLTP
queries are not limited by PDQ resource allocations. The database server is still able to
perform fragmentation elimination for these queries, which is the primary benefit of table
and index partitioning in an OLTP environment.
• Round robin
INSERT INTO t1 VALUES (…);
INSERT INTO t1 VALUES (…);
INSERT INTO t1 VALUES (…);
• Expression-based
INSERT INTO t1 (col1, col2)
VALUES (800,"Active");
INSERT INTO t1 (col1, col2)
VALUES (220,"Active");
INSERT INTO t1 (col1, col2)
VALUES (240,"Active");
Fragment expressions shown in the visual:
col1 <= 100 AND col2 = "Active"
col1 > 100 AND col1 < 500 AND col2 = "Active"
remainder
In the visual example, the row where col1 = 800 is put in the remainder
fragment because it does not match the criteria for the first (col1 <= 100) or
second (col1 > 100 AND col1 < 500) fragment.
Expression-based fragmentation
Each expression is evaluated in order. The row is placed in the first fragment where
the expression evaluates to true and the rest of the expressions are skipped.
Expression-based fragmentation
The FRAGMENT BY EXPRESSION option provides control in placing rows in
fragments. You specify a series of SQL expressions and designated dbspaces. If the
expression is evaluated to true, the row is placed in the corresponding dbspace.
A row should only evaluate to true for one expression. If a row evaluates to true for
more than one expression, it is placed in the dbspace for the first expression that is
true.
The REMAINDER IN clause specifies a dbspace that holds rows that do not evaluate
into any of the expressions.
You can use any column in the table as part of the expression. Columns in other local
or remote tables are not allowed. No subqueries or stored procedures are allowed as
part of the expression.
Using the syntax shown on the visual, only one fragment can exist in a dbspace,
meaning that a dbspace can only be listed once in the fragmentation scheme.
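A minimal sketch of this syntax, reusing the column names and expressions from the earlier visual (the dbspace names are assumptions):
CREATE TABLE t1 (
col1 INTEGER,
col2 CHAR(10))
FRAGMENT BY EXPRESSION
col1 <= 100 AND col2 = "Active" IN dbspace1,
col1 > 100 AND col1 < 500 AND col2 = "Active" IN dbspace2,
REMAINDER IN dbspace3;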
Using PARTITIONING
• Allows multiple PARTITIONs to be stored in same dbspace:
Use PARTITION keyword
Name the partition
• Example:
CREATE TABLE tab1(a int)
PARTITION BY EXPRESSION
PARTITION part1 (a >=0 AND a < 5)
IN dbspace1,
PARTITION part2 (a >=5 AND a < 10)
IN dbspace1,
... ;
Using PARTITIONING
Partitioning, an enhancement to fragmentation, allows multiple fragments to be stored
in the same dbspace. In order to use this feature, use the PARTITION keyword and
give the partition a name.
Partition information is stored in the sysfragments table. If a fragmented table is created
with partitions, each row in the sysfragments catalog contains a partition name in the
partition column. If a regular fragmented table without partitions is created, the name of
the dbspace appears in the partition column.
You can also use the syntax FRAGMENT BY instead of PARTITION BY when creating
a partitioned table.
The interval fragments are created in round-robin fashion in the dbspaces specified in
the STORE IN clause. If the dbspace selected for the interval fragment is full or down,
the system skips that dbspace and selects the next one in the list.
In the example, rows with a prod_price (after rounding) that is less than 100 are
assigned to a fragment in dbspace dbs0. If the next row inserted has a prod_price of
102, an interval fragment is created in dbs1 for rows with a prod_price in the range of
100–300 (base value + interval_value). If the next row has a prod_price of 502, then an
interval fragment is created in dbs2 for rows with a prod_price in the range of 500–700.
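The CREATE TABLE statement that this example refers to is not reproduced in these notes; a sketch that is consistent with the description (the table and column names, the rounding expression, and the interval value of 200 are inferred from the text) might look like this:
CREATE TABLE products (
prod_id SERIAL,
prod_price DECIMAL(8,2))
FRAGMENT BY RANGE (ROUND(prod_price))
INTERVAL (200) STORE IN (dbs1, dbs2, dbs3)
PARTITION p0 VALUES < 100 IN dbs0;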
Fragmented/partitioned indexes
Figure: a table with two fragments and its index fragments, stored in dbspace1 and dbspace2.
Fragmented/partitioned indexes
You can decide whether or not you want to fragment indexes. If you fragment your
indexes, you must use an expression-based fragmentation scheme. You cannot use
round robin fragmentation for indexes.
Non-fragmented indexes
If you do not fragment the index, you can put the entire index in a single dbspace. In
this strategy, the resulting index and data pages are separate.
When to use fragmented indexes
Since OLTP applications frequently use indexed access instead of sequential access, it
can be beneficial to fragment indexes in an OLTP environment. DSS applications
generally access data sequentially. Therefore, it is generally not recommended to
fragment indexes in a DSS environment.
System indexes, created to support constraints, remain unfragmented and are created
in the dbspace where the database is created.
ROWIDS
• Fragmented tables do not contain unique rowids
• To access a fragmented table by rowid, you must explicitly create a
rowid column:
CREATE TABLE orders(
order_num SERIAL,
customer_num INTEGER,
part_num CHAR(20))
WITH ROWIDS
FRAGMENT BY ROUND ROBIN IN dbs1,dbs2;
ROWIDS
Rowid in a nonfragmented table is an implicit column that uniquely identifies a row in
the table. In a fragmented table, rowids are no longer unique because they can be
duplicated in different fragments. In a fragmented table, rowids require a 4-byte rowid
column.
When you add rowids to a fragmented table, the database server creates an index that
maps the internal unique row address to the new rowid. Access to the table using rowid
is always through the index.
If your application uses rowids, performance can be affected when it accesses
fragmented tables because an index is required for rowid mapping. It is recommended
that you use primary keys instead of rowids for unique row access.
SELECT * FROM table1
INTO TEMP temp_table
WITH NO LOG;
Figure: the temporary table temp_table is created from table1 in the temporary dbspaces tempdbs1, tempdbs2, and tempdbs3.
Expression-based fragmentation:
CREATE TEMP TABLE temp_table (
column1 INTEGER,
column2 CHAR(10))
WITH NO LOG
FRAGMENT BY EXPRESSION
column1 < 1000 in tempdbs1,
column1 < 2000 in tempdbs2,
column1 >= 2000 in tempdbs3;
Fragmenting an index
Fragmenting an index
The database server has the following restrictions on fragmented indexes:
• You cannot fragment indexes by round robin.
• You cannot fragment unique indexes by an expression that contains columns that
are not in the index key.
Case Study
Exercise 5
Table and index partitioning
• work with fragmentation strategies on tables
• work with fragmentation strategies on indexes
Exercise 5:
Table and index partitioning
Purpose:
In this exercise, you will learn how to partition tables and indexes.
Exercise 5 :
Table and index partitioning - Solutions
Purpose:
In this exercise, you will learn how to partition tables and indexes.
5. Create the new orders table by running your orders.sql files in dbaccess.
$ dbaccess stores_demo orders.sql
6. Load the orders table using the orders.unl file.
LOAD FROM "orders.unl" INSERT INTO orders;
7. Use the oncheck -pt command to verify the information about your
fragmentation on the orders table.
$ oncheck -pt stores_demo:orders | more
Table orders:
Attempts to execute these commands should result in the following error on the
second statement:
872: Invalid fragment strategy or expression for the unique index.
5. Use the oncheck -pt command to verify the location of the order_custnum_ix
and ordernum_ix indexes on the orders table.
$ oncheck -pt stores_demo:orders | more
Index order_custnum_ix:
Index ordernum_ix:
Unit summary
• List the ways to fragment a table
• Create a fragmented table
• Create a detached fragmented index
• Describe temporary fragmented table and index usage
Unit summary
Informix (v12.10)
Unit objectives
• Change the fragmentation strategy of a table
• Change the fragmentation strategy of an index
• Explain how to skip inaccessible fragments
Unit objectives
Dropping a fragment
• Use the DROP clause to drop a fragment and move all the rows (or
index keys) in the dropped fragment to another fragment.
Dropping a fragment
When dropping a fragment, make sure the other fragments have enough space to hold
the rows that are to be moved there. For example, in an expression-based
fragmentation scheme, the rows in the dropped fragment are most likely to go to the
remainder fragment.
Dropping the number of fragments to less than two is not allowed.
To drop a partition, use the partition name instead of the dbspace in which it resides.
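For example (the table and dbspace names are illustrative), the fragment stored in dbspace3 could be dropped from the orders table as follows:
ALTER FRAGMENT ON TABLE orders DROP dbspace3;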
• The index must have the same properties (unique, duplicate) as the index of the
target table.
• The index for the newly added fragment cannot reside in any of the dbspaces
used by the index fragments of the target table.
DETACH
You can extract a fragment to create a separate table by using the DETACH clause.
Once a fragment is detached, the table that is created can be dropped. This is useful in
situations where a rolling set of fragments is maintained over time and new fragments
are added and old fragments removed.
Index rebuilds with DETACH
Index rebuilds on the original table are not necessary if the index fragmentation strategy
of the detached fragment is identical to or highly parallel with the table fragmentation. In
that case, the index fragments corresponding to the detached fragment are simply
dropped.
The DETACH command does not work on tables created WITH ROWIDS.
Use the same syntax to attach and detach partitions, except you would use the
PARTITION partition_name syntax instead of dbspace_name syntax, as shown in the
last example on the visual.
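A minimal sketch of the DETACH clause (the table, dbspace, partition, and new table names are assumptions):
ALTER FRAGMENT ON TABLE orders DETACH dbspace3 orders_old;
ALTER FRAGMENT ON TABLE orders DETACH PARTITION part3 orders_old;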
Defragmenting partitions
• After appending data to partitions, you might end up with many
extents; mapping a logical page number to a physical address
becomes slow
• Chunk allocation (allocating space from a chunk), which is a common operation,
is also much slower if you have many small extents
• Defragment by table name (call task or admin):
EXECUTE FUNCTION task("defragment",
"database:[owner.]table");
• Defragment by partition number (call task or admin)
EXECUTE FUNCTION task("defragment partnum",
partition_number_list);
• Example
EXECUTE FUNCTION task("defragment","oltr:tab1");
Defragmenting partitions
As rows are inserted into tables over time, the number of extents allocated for the table
increases and may become fragmented within a partition. This could lead to a decrease
in server performance.
The Partition Defragmenter feature addresses this problem by reorganizing the table
into fewer, larger extents. Defragmentation can be performed while the database
server is online, so no downtime is required.
The Partition Defragmenter is initiated by executing an administrative function (task or
admin) and specifying defragment as the first argument. If you specify a table name as
the next argument, then all partitions for the table are defragmented. To defragment just
one partition of a table, or to defragment index partitions, specify the partition number of
the table or index partition.
Exercise 6
Maintaining table and index partitioning
Modify fragmentation schemes
Exercise 6:
Maintaining table and index partitioning
Purpose:
In this exercise, you will learn how to maintain table and index partitioning.
Exercise 6 :
Maintaining table and index partitioning - Solutions
Purpose:
In this exercise, you will learn how to maintain table and index partitioning.
2. Is the data distributed evenly for the balance of I/O across disks? No.
3. Modify the existing fragments using the ALTER FRAGMENT statement to
create an even distribution of data in the customer table. Hint: You can get an
idea of the distribution of data by using the following SQL statement:
SELECT lname[1,1], count(*)
FROM customer
GROUP BY 1
ORDER BY 1;
5. Is the data distributed evenly for the balance of I/O across disks?
6. Why or why not? No, the data is not redistributed in round robin
fragmentation when you have altered the schema to add another dbspace.
The newly added fragment only contains its portion of the data just
inserted.
Task 3. Alter fragmentation of an index on the customer table.
1. Initialize a fragmentation strategy on the customer_num_ix index of the
customer table using dbspaces dbspace2 and dbspace3. Use the following
distribution for the customer_num column:
• 100 - 2499 in dbspace2
• 2500 and greater in dbspace3
ALTER FRAGMENT ON INDEX customer_num_ix
INIT FRAGMENT BY EXPRESSION
customer_num < 2500 AND customer_num >= 100
IN dbspace2,
customer_num >= 2500
IN dbspace3;
2. Using the oncheck command, how many used pages are in each dbspace of
the customer_num_ix index?
$ oncheck -pt stores_demo:customer
Results: In this exercise, you learned how to maintain table and index
partitions.
Unit summary
• Change the fragmentation strategy of a table
• Change the fragmentation strategy of an index
• Explain how to skip inaccessible fragments
Unit summary
Informix (v12.10)
Unit objectives
• Understand query plans, access plans, and join plans
• Write queries that produce various index scans
Unit objectives
Access plan
• Which access method to use:
Index scan
Sequential scan
• Which order to process tables
Access plan
Informix can choose from various methods to retrieve data from disk. The method
chosen for retrieving data is called the access plan. Informix uses the following access
plans:
• Sequential scan: The database server reads all the rows of the table in physical
order.
• Index scan: The database server reads the index pages, applying filters where
possible, and uses the ROWID values stored in the index to retrieve qualifying
rows.
• Key-only index scan: When all of the data required to satisfy the query is
contained within the index, the database server retrieves the data requested from
the index, eliminating the need to read the associated data pages.
• Key-first index scan: A key-first index scan uses index-key filters, in addition to
upper and lower filters, to reduce the number of rows that a query must read.
• Auto-index scan: The auto-index scan is a feature that allows the database server
to automatically build a temporary index on one or more columns used as query
filters. The database server reads this temporary index to locate the required
data. The index is only available for the duration of the query. The auto-index
feature of Informix can be beneficial to some OLTP batch activities. It allows the
query to benefit from the index without the overhead of index maintenance during
insert, update, and delete activity. In active OLTP environments, the overhead of
modifying indexes when a table has rows inserted, updated, or deleted can be
significant.
Join plan
• Which join method to use:
Nested-loop join
Dynamic hash join
• Which order to join tables
Join plan
When a query contains more than one table, the query optimizer must determine how and
in which order to join the tables using filters in the query. The way that the optimizer
chooses to join the tables is the join plan. Informix uses the following join plans:
• Nested-loop join: In a nested-loop join, the database server scans the first, or
outer table, and then joins each of the rows that pass table filters to the rows
found in the second, or inner table.
• Dynamic hash join: In a dynamic hash join, the rows of the first table, or build
table, are processed through an internal hash algorithm, and then the hashed
rows are stored in a hash table. The second table, or probe table, is then read
sequentially, and each row is checked against the hash table to see if there are
matches from the build table.
Figure: dynamic hash join. The rows of the build table (Table 1) are passed through a hash algorithm into a hash table; the probe table (Table 2) is then read and checked against the hash table.
Finally, the optimizer decides whether the table is to be scanned sequentially or with an
index.
FILTER SELECTIVITY ASSIGNMENTS
Figure: the join-order combinations (ab, ac, ad, bc, ..., abc, abd, ...) evaluated by the optimizer.
Windows:
− %INFORMIXDIR%\sqexpln\username.out
Query statistics
• Query statistics:
Statistics aggregated and printed out at iterator level
Only available after query completes
• Query statistics ON by default:
Set with ONCONFIG parameter EXPLAIN_STAT:
− 1 = on (default)
− 0 = off
Configure instance using onmode -wf and onmode -wm
$ onmode -wf EXPLAIN_STAT=0
Configure session using onmode -Y <session_id> [0|1|2]
− Dynamic explain feature
− Prints out query statistics by default when enabled:
• 0 = turn off dynamic explain
• 1 = turn on dynamic explain, query plan and statistics
• 2 = turn on dynamic explain, query plan only
Query statistics
The output from SET EXPLAIN ON can include detailed information about the tables
scanned in a query. The output displays the estimated and actual number of rows
produced and scanned, and the amount of time it took for each scan and join.
These statistics are provided at the iterator level (each process/scan/join), but the
values are only available after the query completes.
Query statistics are on by default. They can be disabled by setting the onconfig
parameter EXPLAIN_STAT to 0. Query stats can also be configured dynamically at the
instance level by using the onmode -wf and onmode -wm commands with the
parameter EXPLAIN_STAT with the values 0 (to turn off) or 1 (to turn on).
Query stats for an individual session can be dynamically managed using the
onmode -Y command with a session ID and value.
0 – Turns off Dynamic Explain
1 – Enables Dynamic Explain with a query plan and query statistics
2 – Enables Dynamic Explain with a query plan only
Estimated cost: 20
Estimated # of Rows Returned: 74
Temporary Files Required For: Order By
Temporary file
When a temporary table or file is created for the query, the reason for the temporary file
or table is listed in the sqexplain.out file. In the example, you can see that a sort was
required to process the ORDER BY clause. The sort requires space to hold
intermediate files. No temporary file is created if you can use an index to order the
tuples (rows).
Only the selected path is reported by the SET EXPLAIN command. You cannot
determine what alternate paths were considered.
Table access strategy
Finally, the access strategy for each table in the query is shown in the report. In the
example on the slide, the table is accessed by a SEQUENTIAL SCAN, which means
that the entire table is read from beginning to end.
Query statistics
If the EXPLAIN_STAT configuration parameter is enabled, a query statistics section is
also included in the explain output file. You can use this information to debug possible
performance problems with the SQL statement. For this query, the query statistics
section appears as follows:
Estimated cost: 5
Estimated # of Rows Returned: 25
Query statistics
The query statistics for this query are as follows:
Estimated cost: 2
Estimated # of Rows Returned: 1
Estimated cost: 2
Estimated # of Rows Returned: 8
1) informix.customer INDEX PATH
(1) Index Name: informix.customer_num_ix
Index Keys: customer_num (Serial, fragments: 0)
Fragments scanned: (0) dbspace2
Lower Index Filter:
(informix.customer.customer_num >= 104)
Upper Index Filter:
(informix.customer.customer_num <= 111)
Note: The product and sales tables were created for this and following queries.
Note: An index containing the customer_num and zipcode was created on the customer
table for this and following queries, as follows:
CREATE INDEX customer_num_zip_ix on customer(customer_num, zipcode);
Optimizing subqueries
Subquery:
SELECT * FROM customer
WHERE customer_num IN (
SELECT customer_num FROM orders
WHERE order_date = TODAY);
Correlated subquery:
SELECT * FROM customer c
WHERE exists (
SELECT customer_num FROM orders
WHERE orders.customer_num = c.customer_num
AND order_date = TODAY);
Join:
SELECT customer.* FROM customer, orders
WHERE customer.customer_num = orders.customer_num
AND orders.order_date = TODAY;
Optimizing subqueries
A subquery is a SELECT statement that is contained in the WHERE clause of a
SELECT, INSERT, UPDATE, or DELETE statement. It is executed once and the
results passed back to the outer query.
Most subqueries can also be written as join statements, which are much more efficient.
The join query shown on the visual retrieves the same data as the two subquery
examples.
The EXPLAIN output for the join query would be as follows:
Multi-index scan
• Access method that allows optimizer to use multiple indexes
• Suppose two indexes are defined on table tab1
CREATE INDEX tab1_ix1 ON tab1(col1);
CREATE INDEX tab1_ix2 ON tab1(col2);
• Execute the following query on tab1
SELECT * FROM tab1
WHERE col1 = 1 and col2 BETWEEN 20 AND 30;
• Server performs index scans on tab1_ix1 and tab1_ix2, combines results, then
accesses only data rows that satisfy both index scans
• Use skip scan to access data rows; this looks like a sequential scan,
but it only retrieves necessary data rows
Multi-index scan
Queries with AND and OR predicates may benefit from a multi-index scan, which can
choose indexes on individual columns.
Suppose two indexes are defined on table tab1. Index tab1_ix1 is defined on column
col1, and index tab1_ix2 on column col2. Execute the following query on table tab1:
SELECT ... FROM tab1 WHERE col1 = 1 AND col2 BETWEEN 20 AND 30;
The server performs index scans on both tab1_ix1 and tab1_ix2, and gets rowid lists
for each index. It then merges these rowid lists based on the Boolean expression
(AND/OR) in the WHERE clause. It fetches data using the rowids from the merged and
sorted rowid list using a skip scan. Skip scan is a table scan method similar to a
sequential scan, but uses the rowid lists.
Skip scan
• Index scan retrieves rowids from the index and fetches data using the
rowid
• Skip scan is implemented on sorted rowid list
• Random I/O can be avoided; page is only read once
• Non-essential pages are skipped
• Significant CPU savings over a full-table scan
• Can be used to retrieve rows from a single index or from multiple
indexes
Skip scan
The skip scan works like this:
• An index scan is performed on the index(es) involved in the query.
• For each index, a list of matching rowids is created.
• The resulting lists are sorted into rowid order.
• The lists are then logically joined together (merged with an OR, intersected with
an AND) to produce one list of rowids for rows that match the query.
• The rows are then accessed from the table in rowid order using the resulting
single list. The rows are fetched in sequential order; but since some intervening
rows may not be needed, a skip scan is performed.
A skip scan implements functionality similar to sequential scan, but uses a sorted rowid
list. Since the values in the rowid list are sorted, random I/O can be avoided and a page
does not need to be read more than once. Non-essential (that is, unneeded) pages are
skipped. This results in significant CPU savings over a full-table scan because the
number of data rows retrieved and costly expression evaluation is reduced.
A skip scan can be used to retrieve rows corresponding to any rowid list, either one
constructed from a single index or from multiple indexes.
The onstat -g sql command includes summary information about the last SQL
statement that each session executed. The fields included in onstat -g sql are:
Session ID: Session ID of the user who executes the SQL statement.
Exercise 7
The Informix query optimizer
• use the SET EXPLAIN feature
• write SQL queries to illustrate various optimizer access and join plans
Exercise 7:
The Informix query optimizer
Purpose:
In this exercise, you will learn how to use the SET EXPLAIN feature of the
optimizer to determine query access and join plans.
Task 5. Joins.
In this task, you will run queries that will perform a join. You will examine the output
from the ex7.expl file. You will use the onstat -g sql utility to monitor your SQL
session.
1. To demonstrate an index scan using nested loop join, execute the following
SQL statement:
SET EXPLAIN FILE TO 'ex7.expl';
UNLOAD TO /dev/null
SELECT * FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND c.lname MATCHES "Ed*";
SET EXPLAIN OFF;
2. Examine the ex7.expl file and notice the nested loop join that was used on the
customer and orders tables. Which table was used as the outer table and which
was used as the inner table?
Task 6. Key-first index scan.
In this task, you will run queries that will use the key first index scan. You will then
examine the ex7.expl file generated by the query.
1. To demonstrate a key first index scan, execute the following SQL statement:
SET EXPLAIN FILE TO 'ex7.expl';
UNLOAD TO /dev/null
SELECT * FROM customer
WHERE (zipcode > "111")
AND ((zipcode = "20744")
OR (zipcode = "30127"));
SET EXPLAIN OFF;
2. Examine the ex7.expl file and notice that the key first index scan was
performed on the zipcode column.
What steps did the optimizer take to reduce the number of rows read?
Results:
In this exercise, you learned how to use the SET EXPLAIN feature of the
optimizer to determine query access and join plans.
Exercise 7:
The Informix query optimizer - Solutions
Purpose:
In this exercise, you will learn how to use the SET EXPLAIN feature of the
optimizer to determine query access and join plans.
3. Write a SELECT statement that selects the minimum value in the zipcode
column of the customer table.
SET EXPLAIN FILE TO 'ex7.expl';
SELECT min(zipcode) FROM customer;
SET EXPLAIN OFF;
4. Examine the ex7.expl file and notice that a key-only index scan was used on
the zipcode column.
3. Examine the ex7.expl file and notice that the lower index filter was used on
the order_num and customer_num columns. Notice also the fragment
elimination on the index. Only fragment 1 was scanned.
4. Write a SELECT statement that selects customer_num > 2000 from the
customer table and where the zipcode is between "09*" and "6*".
SET EXPLAIN FILE TO 'ex7.expl';
UNLOAD TO /dev/null
SELECT * FROM customer
WHERE customer_num > 2000
AND zipcode BETWEEN "09" AND "6";
SET EXPLAIN OFF;
5. Examine the ex7.expl file and notice that an index path on the customer table
was used, with filters applied.
6. To demonstrate that an index scan is using a lower and upper filter, execute the
following SQL statement:
SET EXPLAIN FILE TO 'ex7.expl';
OUTPUT TO /dev/null
SELECT * FROM catalog
WHERE catalog_num BETWEEN 10010 AND 10040;
SET EXPLAIN OFF;
7. Examine the ex7.expl file and notice that the lower and upper index filter was
used on the catalog_num column.
8. Write a SELECT statement that selects the fname, lname, and company
columns from the customer table where the company name starts with “Golf"
and the customer number is between 110 and 115.
SET EXPLAIN FILE TO 'ex7.expl';
UNLOAD TO /dev/null
SELECT fname, lname, company FROM customer
WHERE company MATCHES "Golf*"
AND customer_num BETWEEN 110 AND 115
ORDER BY lname;
SET EXPLAIN OFF;
9. Examine the ex7.expl file and notice the lower and upper index filters that were
used on the customer_num column.
Task 5. Joins.
In this task, you will run queries that will perform a join. You will examine the
output from the ex7.expl file. You will use the onstat -g sql utility to monitor your
SQL session.
1. To demonstrate an index scan using nested loop join, execute the following
SQL statement:
SET EXPLAIN FILE TO 'ex7.expl';
UNLOAD TO /dev/null
SELECT * FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND c.lname MATCHES "Ed*";
SET EXPLAIN OFF;
2. Examine the ex7.expl file and notice the nested loop join that was used on the
customer and orders tables. Which table was used as the outer table and which
was used as the inner table?
The customer table was the outer table and the orders table was the
inner table.
What steps did the optimizer take to reduce the number of rows read?
The optimizer first applies the filter condition zipcode > "111" AND
zipcode = "20744". Then it applies the filter zipcode > "111" AND
zipcode = "30127". Applying the key-first index filters ensures that the
fewest data pages are read.
Results:
In this exercise, you learned how to use the SET EXPLAIN feature of the
optimizer to determine query access and join plans.
Unit summary
• Understand query plans, access plans, and join plans
• Write queries that produce various index scans
Unit summary
Informix (v12.10)
Unit objectives
• Execute the UPDATE STATISTICS statement and explain the results
• Use the system catalog tables to monitor data distributions
Unit objectives
Figure: the sysdistrib system catalog table and its relationships to systables, syscolumns, and sysindices.
The sysdistrib table has columns for storing distribution and sampling information.
If SAMPLING SIZE is specified, either as a percentage or a number of rows, the
value is stored in the smplsize column.
The actual number of rows sampled is stored in the rowssmpld column.
Because SAMPLING is a keyword, a table named sampling cannot be used in an
UPDATE STATISTICS statement.
4. Scans each sorted value and retrieves the first value, last value, and every Nth
value where N is:
resolution / 100 * number_of_values.
This information is used to divide the data into bins with each bin containing an
equal number of values. If 10 bins are created, each bin holds one tenth of the
rows in the set. If the database server finds a value that has many duplicates of a
particular value, that value is placed in an overflow bin.
The first and last values are always obtained from the true data, not from the
sample.
• dbschema output
Distribution for branch_nbr
( 0)
1: ( 6005, 16, 16)
2: ( 6005, 9, 26)
3: ( 6005, 4, 31)
4: ( 6005, 4, 36)
5: ( 6005, 4, 41)
6: ( 6005, 4, 46)
7: ( 6005, 4, 51)
8: ( 6005, 4, 56)
9: ( 6005, 13, 69)
10: ( 5957, 29, 99)
Each distribution entry shows (count, distinct values, high value); each overflow entry shows (count, value).
Distribution output
The hd option of dbschema displays the information kept for each bin, as well as the
overflow values and their frequency. The -hd option requires a table name or ALL for all
tables.
Only the owner of the table, users that have SELECT permission on the column, or the
DBA can list distributions for a column with dbschema.
The sample dbschema output shown has two sections: the distribution and the overflow
section.
The distribution section shows the values in each bin.
In the example, bin 1 represents 6005 instances of values between 0 and 16. Within
this interval are 16 unique values.
The overflow section shows each value that has many duplicates. For example, value
80 has 1946 duplicates.
Resolution
Resolution
You can use resolution to specify the number of bins to use.
The formula for calculating the number of bins is:
100/resolution = number of bins
A resolution of 1 means that one percent of the data goes into each bin (100/1 = 100
bins). A resolution of 10 means that 10 percent of the data goes into each bin
(100/10 = 10 bins). The resolution can be a number between 0.005 and 10. However,
you cannot specify a number less than 1/(rows in the table).
The lower the resolution value, the more bins are created. The more bins you have, the
more accurately the optimizer can estimate the number of rows that satisfy the SELECT
filter. However, if too many bins are allocated, the optimization time can increase
slightly because the system catalog pages that hold the distribution must be read (from
memory if they are in the cache or from disk if they are not).
In actuality, the number of bins allocated in a data distribution can vary slightly from the
results of the formula in the example, due to highly duplicated values in a column and the
degree to which column values are clustered.
Confidence
resolution
confidence
Confidence
Confidence is an estimate of the probability that you will stay within the resolution you
choose.
In the example, with a confidence value of 99 percent (confidence .99), you can be highly
confident that the results (that is, the number of rows per bin) of a sample taken to
create the distribution are roughly equivalent to what you would get if all the rows were
examined.
Default confidence
The confidence is expressed as a value between 0.80 and 0.99. The default value is
0.95. Confidence is used only when sampling data for a medium distribution (UPDATE
STATISTICS MEDIUM). The resolution and confidence are used to determine the
sample size for medium distributions.
Default sample size
By default, the size of the sample that is taken for UPDATE STATISTICS MEDIUM
depends on the resolution and confidence. Increasing the confidence or decreasing
the resolution value increases the sample size.
The sample size does not depend on the size (population) of the table, so for larger
tables it might not be truly representative of the data.
For larger tables, the sample size can be specified using the SAMPLING SIZE
parameter.
(Slide diagram: parallel distribution build, with a mini-bin sorter feeding mini-bin
builders, a queue, appenders, and a B-tree (BT) merger.)
Problem queries (1 of 2)
• To resolve a problem query:
Run the query with SET EXPLAIN ON to record the query plan.
Run UPDATE STATISTICS HIGH for the columns listed in the WHERE
clause.
Run the query with SET EXPLAIN ON. Was the estimated cost less?
Compare the query plans.
Compare runtime statistics stored in syssqlcurses.
syssqlcurses
Problem queries
In most cases, the recommended strategy should yield a good enough sample size for
the optimizer to pick the correct path for most queries.
However, if a query is a problem (one that you perceive to be running slower than it
should), then take the following steps:
• Run the query with SET EXPLAIN ON to record the query plan.
• Run UPDATE STATISTICS HIGH for the columns listed in the WHERE clause.
• Run the query with SET EXPLAIN ON. Was the estimated cost less?
• Compare the query plans. If UPDATE STATISTICS HIGH produced a different
query plan and the estimated cost is less, the optimizer made a better choice and
the SELECT statement benefited from having more data available to the
optimizer.
• In addition to comparing before and after query plans, you might want to compare
the corresponding runtime statistics. To do so, set the SQLSTATS environment
variable to 2, run the query, and then use your session ID to query the
syssqlcurses table in the sysmaster database.
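For example, a minimal sketch (the session-ID column name scs_sessionid and the
session ID 57 are assumptions; verify the actual column names of syssqlcurses in
your instance before using it):
DATABASE sysmaster;
-- 57 is a placeholder: substitute your own session ID (for example, from onstat -g ses)
SELECT * FROM syssqlcurses
WHERE scs_sessionid = 57;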
Problem queries (2 of 2)
• If UPDATE STATISTICS HIGH produced improved query
performance:
Run UPDATE STATISTICS MEDIUM with CONFIDENCE of .99 and an
increased RESOLUTION.
Rerun the query with SET EXPLAIN ON.
Check the query plan to see if it produced the same results as with
UPDATE STATISTICS HIGH.
syssqlcurses
You might have received better results with UPDATE STATISTICS HIGH. However,
it might not be feasible for you to take the extra time each day to run HIGH mode on
these columns. Instead, you can move back to UPDATE STATISTICS MEDIUM for
the columns involved in the query, but this time set the confidence to 0.99 and
adjust the resolution value slightly lower so that the sample size is higher. Then
rerun the query and check the query plan to see if it returned the same results as
HIGH mode. You can repeat this process until the query plan matches the query
plan of HIGH mode.
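For example, a sketch using the stores_demo orders table (in the RESOLUTION clause,
the confidence value follows the resolution value):
UPDATE STATISTICS MEDIUM FOR TABLE orders(customer_num)
RESOLUTION 0.5 0.99;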
Dropping distributions
• To drop distributions while you update other statistics:
Dropping distributions
You can use the DROP DISTRIBUTIONS clause in an UPDATE STATISTICS LOW
statement to drop the existing distributions while you update other statistics such as
the number of levels of the B+ tree, the number of pages that the index uses, and so
on.
When you run UPDATE STATISTICS LOW without the DROP DISTRIBUTIONS
clause, only the statistics in systables, sysindexes, and syscolumns are updated.
The distributions are not dropped or altered in any way.
When you run UPDATE STATISTICS LOW on a table or specific column with the
DROP DISTRIBUTIONS clause, the statistics in systables, sysindexes, and
syscolumns for that table or specific column are updated and any distributions for
the table or specific column listed are dropped.
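For example, to drop only the distributions for the customer table while refreshing
its low-level statistics:
UPDATE STATISTICS LOW FOR TABLE customer DROP DISTRIBUTIONS;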
Who can drop distributions?
Only a DBA-privileged user or the owner of a table can drop distribution information.
sysdistrib
Space utilization
• DBUPSPACE
Limits the amount of disk space used for sorts during UPDATE STATISTICS
− Minimum used is 5 MB
Also limits amount of memory used for sorts to 4 MB
export DBUPSPACE=max_disk_space:max_memory
Space utilization
The UPDATE STATISTICS statement attempts to construct distributions
simultaneously for as many columns as possible. This minimizes the number of scans
needed for a table, and makes UPDATE STATISTICS run more efficiently. However,
with more distributions being created at once, the need for temporary disk space
increases.
Environment variable DBUPSPACE
You can set the DBUPSPACE environment variable before you run UPDATE
STATISTICS to constrain the amount of temporary disk space used for sorts. The
database server calculates how much disk space is needed for each sort and starts as
many distributions at once as can fit in the space allocated. At least one distribution is
created at one time, even if DBUPSPACE is set too low to accommodate it. If
DBUPSPACE is set to any value less than 1000 kilobytes, it is ignored and the value of
5000 kilobytes is used.
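For example (illustrative value; as described above, the figure is in kilobytes):
# ksh
export DBUPSPACE=10000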
Memory utilization
In addition to limiting the amount of disk space used for sorts during UPDATE
STATISTICS, the database server limits the amount of memory used to 4 megabytes.
However, at least one distribution is created at one time, even if more than 4
megabytes are needed.
Thread use
A sort can occur for every column for which you are building a distribution. If
PSORT_NPROCS is set, each sort can use up to PSORT_NPROCS threads.
Fragment-level statistics
• Store statistics at the fragment level and aggregate table-level
statistics from the statistics of the constituent fragments
• Catalog table sysfragdist stores statistics for each fragment for each
table and column
• Fragment-level statistics of constituent fragments are merged to form
table-level statistics and stored in sysdistrib
• Fragment-level statistics are encrypted and stored in an sbspace
defined by SYSSBSPACENAME in the ONCONFIG file. Column
encdist in the sysfragdist catalog stores the large object specifications
• Controlled by table property STATLEVEL
Fragment-level statistics
When tables are stored across multiple partitions, statistics for the table can be
maintained at the fragment level. Fragment-level statistics are stored in the
sysfragdist catalog table.
STATLEVEL property
• STATLEVEL defines the granularity of statistics for a table
• Set using CREATE or ALTER TABLE
• Syntax:
CREATE TABLE...STATLEVEL {TABLE | FRAGMENT | AUTO}
TABLE – the entire table dataset is read and table-level statistics are stored
in sysdistrib catalog
FRAGMENT – the dataset of each fragment is read and fragment-level
statistics are stored in sysfragdist catalog (only allowed for fragmented
tables)
AUTO (default) – the system determines whether TABLE or FRAGMENT
statistics are created when UPDATE STATISTICS is run
STATLEVEL property
When you create or alter a table, you can set the granularity of statistics that are
maintained by specifying the STATLEVEL. You can specify a STATLEVEL of TABLE,
FRAGMENT, or AUTO.
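A sketch, assuming a hypothetical fragmented table named sales and dbspaces named
dbspace1 and dbspace2 (verify the exact clause placement against the CREATE TABLE
syntax for your release):
CREATE TABLE sales (
sale_id SERIAL,
region CHAR(2),
amount MONEY(12,2))
FRAGMENT BY ROUND ROBIN IN dbspace1, dbspace2
STATLEVEL FRAGMENT;
-- Later, let the server choose the granularity instead:
ALTER TABLE sales STATLEVEL AUTO;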
Exercise 8
Updating statistics and data distributions
• create and run UPDATE STATISTICS commands
• compare optimizer query plans based on generated statistics
Exercise 8:
Updating statistics and data distributions
Purpose:
In this exercise, you will learn how to use the UPDATE STATISTICS statement
and how to create data distributions using UPDATE STATISTICS.
9. Query the sysdistrib system catalog table and gather the following information
for column 1 on the customer table.
SELECT *
FROM sysdistrib d, systables t
WHERE t.tabname = "customer"
AND t.tabid = d.tabid
AND d.colno = 1;
Table customer:
What is the resolution used to create the distribution?
What is the confidence used to create the distribution?
What is the mode?
10. Use dbschema to generate distribution information to a file for the items and
customer tables and answer the following questions about the stock_num
column from the items table and lname column from the customer table:
Table items.stock_num:
How many bins were created?
How many distinct values are represented by each regular bin?
How many overflow bins are created?
Do all the overflow bins contain the same number of values?
Table customer.lname:
How many bins were created?
How many distinct values are represented by each regular bin?
How many overflow bins are created?
Do all the overflow bins contain the same number of values?
Task 2. Understanding the optimizer choice.
1. Using the following query from the previous task, capture the Explain plan and
compare the optimizer’s choices with the choices from the previous exercise.
SET EXPLAIN FILE TO 'ex08.expl';
UNLOAD TO /dev/null
SELECT * FROM customer
WHERE fname = 'Douglas';
SET EXPLAIN OFF;
How have the optimizer’s choices changed?
Result:
In this exercise, you learned how to use the UPDATE STATISTICS statement
and how to create data distributions using UPDATE STATISTICS.
Exercise 8:
Updating statistics and data distributions - Solutions
Purpose:
In this exercise, you will learn how to use the UPDATE STATISTICS statement
and how to create data distributions using UPDATE STATISTICS.
--HIGH for the indexed columns in the items table (or for
--those with a composite index, the first column in the
--index)
UPDATE STATISTICS HIGH FOR TABLE items(item_num)
RESOLUTION 10;
UPDATE STATISTICS HIGH FOR TABLE items(order_num)
RESOLUTION 10;
UPDATE STATISTICS HIGH FOR TABLE items(manu_code)
RESOLUTION 10;
4. Create a duplicate index on the lname and fname columns of the customer
table. Name this index cust_name_idx.
CREATE INDEX cust_name_idx ON customer(lname, fname);
5. Drop all data distributions. Since data distributions were automatically built when
you created your indexes, they must be dropped in order to see the effect of
distributions on the optimizer. Execute the following SQL statement in your
stores_demo database to drop the existing distributions:
UPDATE STATISTICS LOW DROP DISTRIBUTIONS;
Table customer:
What is the resolution used to create the distribution? 10.0
What is the confidence used to create the distribution? 0
What is the mode? H for HIGH
10. Use dbschema to generate distribution information to a file for the items and
customer tables and answer the following questions about the stock_num
column from the items table and lname column from the customer table:
Table items.stock_num:
$ dbschema -hd items -d stores_demo
How many bins were created? 5
Table customer.lname:
$ dbschema -hd customer -d stores_demo
How many bins were created? 4
Unit summary
• Execute the UPDATE STATISTICS statement and explain the results
• Use the system catalog tables to monitor data distributions
Unit summary
Informix (v12.10)
Unit objectives
• Describe the effect on the engine of the different values of
OPTCOMPIND
• Describe the effects of setting the OPT_GOAL parameter
• Write optimizer directives to improve performance
Unit objectives
OPTCOMPIND
OPTCOMPIND
The OPTCOMPIND configuration parameter allows you to influence the behavior of the
optimizer for all queries executed against the database server. You can override this
global behavior by setting the OPTCOMPIND environment variable in a session
environment.
The OPTCOMPIND value influences both the access plan and join plan chosen by the
optimizer. An OPTCOMPIND setting of 0, or 1 with an active isolation level of
repeatable read, instructs the optimizer to consider:
• Index-scan access paths only when an index is available
• A sequential-scan access path only when no index is available
• Nested-loop joins
When OPTCOMPIND is 2, or OPTCOMPIND is 1 and the isolation level is not
repeatable read, the optimizer chooses the lowest cost path. An OPTCOMPIND value
of 2 is the default onconfig.std setting.
The value entered using the SET ENVIRONMENT OPTCOMPIND command takes
precedence over the current setting specified in the configuration file. The default
setting of the OPTCOMPIND environment variable is restored when the current session
terminates. No other user sessions are affected by SET ENVIRONMENT
OPTCOMPIND statements that a session executes.
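For example, to change the value for the current session only:
SET ENVIRONMENT OPTCOMPIND '2';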
The quotes shown around the number are required.
Running SET ENVIRONMENT OPTCOMPIND DEFAULT reverts the session back to
the values specified by the configuration parameter.
SET OPTIMIZATION
SET OPTIMIZATION
The SET OPTIMIZATION SQL statement allows you to specify the optimization goal
and time that the optimizer spends considering alternative query paths.
FIRST_ROWS versus ALL_ROWS
The goal in optimizing the query might be to retrieve the first buffer of rows as quickly as
possible, or to retrieve all rows in the quickest manner. If the application is an end-user
query tool, you might choose FIRST_ROWS optimization.
Suppose the end user is a financial analyst performing a series of what-if scenarios. To
run each scenario, they submit a query that retrieves many rows. After viewing just a
few rows, they realize that this scenario does not produce the needed result and they
move on to the next query. By choosing FIRST_ROWS optimization, the user might
receive a quicker response from the database server, which in turn, allows them to be
more productive.
For a batch application that processes payroll updates for several thousand employees,
however, ALL_ROWS optimization is probably the most desirable optimization method
and will likely produce the best performance.
OPTIMIZATION LOW
(Slide diagram: two-way joins ab, ac, ad, bc, ...; three-way joins acb, acd; four-way
join acdb. Only the lowest-cost path at each level is examined further.)
OPTIMIZATION LOW
The example shows how the optimizer might choose a query path when optimization is
set to LOW. At each level (two-way join, three-way join, four-way join), the lowest-cost
join is chosen and the other paths are not examined further.
In the SET OPTIMIZATION LOW example, ac is chosen as the least-cost two-way join.
The other two-way joins are not examined any further. Next, the three-way joins
possible from the ac join are examined. Again, only the least-cost join is followed down
to the next level. As you can see, the number of joins that must be examined is
drastically reduced.
The SET OPTIMIZATION LOW statement reduces the time spent optimizing the query,
but increases the risk of a less efficient query plan being chosen.
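For example:
SET OPTIMIZATION LOW;
-- run the queries, then restore the default level
SET OPTIMIZATION HIGH;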
FIRST_ROWS
• Useful for decision support environments and online query and
reporting activities
• Instructs the optimizer to choose the query path that returns the first
buffer of rows quickest (Note: Total query time might be longer!)
• To use statement by statement:
SQL statement:
SET OPTIMIZATION FIRST_ROWS;
Use FIRST_ROWS optimizer directive
• OPT_GOAL (-1 = ALL_ROWS(Default), 0 = FIRST_ROWS)
Place in ONCONFIG file to set as default for the database server
FIRST_ROWS
The SET OPTIMIZATION FIRST_ROWS statement is useful for decision support
environments and online query and reporting activities. It instructs the optimizer to
choose a query path that returns the first buffer of rows most quickly, even if the
time for retrieving all rows increases.
When you use the SQL statement:
SET OPTIMIZATION FIRST_ROWS;
the optimization goal remains in effect until the end of the process or until you
execute a statement to set ALL_ROWS optimization.
You can also specify FIRST_ROWS optimization as the default for your instance by
setting the OPT_GOAL parameter in your configuration file. For example:
# Optimization goal: -1 = ALL_ROWS(Default), 0 = FIRST_ROWS
OPT_GOAL 0
Alternatively, you can set an optimization goal for a particular user environment
using the environment variable OPT_GOAL.
# ksh
export OPT_GOAL=0
If you only want to use FIRST_ROWS optimization for a single query, you can use
the optimizer directive FIRST_ROWS.
SELECT --+ FIRST_ROWS
fname, lname FROM customer
ORDER BY lname;
Optimizer directives are covered next in this unit.
Using directives
• Positive directives
• Negative directives
Directive
Using directives
Informix provides both positive and negative optimizer directives. A positive directive
instructs the optimizer to limit its choice to a certain set of paths. A negative
directive instructs the optimizer to avoid certain less-than-optimal paths. The
optimizer is still free to consider all other paths. If a new index provides an improved
path, the directive does not need to be changed. The optimizer is automatically free
to choose the new path.
If the outer circle is the set of all possible query paths, a positive directive instructs
the optimizer to limit considered paths to paths that are in the inner circle. A
negative directive allows the optimizer to choose any path in the set defined by the
outer circle, but not the inner circle.
Use negative directives rather than positive directives whenever possible. This
ensures that directives require less maintenance, because as indexes are added
and removed or data distributions change, the optimizer is free to select a more
optimal path whenever one is available.
Identifying directives
• Directives are identified by using comment notation followed by a plus
sign:
SELECT --+ORDERED,INDEX(c cust_idx)
*
FROM customer c, orders o
WHERE c.customer_num BETWEEN 101 AND 1001
AND c.customer_num = o.customer_num;
• SQEXPLAIN output
...
DIRECTIVES FOLLOWED:
ORDERED
DIRECTIVES NOT FOLLOWED:
INDEX ( c cust_idx ) Invalid Index Name Specified.
Estimated cost: 4
− ...
Identifying directives
Directives are identified by using comment notation followed by a plus sign (+). Valid
syntax includes:
--+directive text
{+directive text}
/*+directive text */
In order for directives to be allowed in Informix ESQL products, the ESQL
preprocessor has been modified to pass comments containing directives to the
database server instead of stripping them out.
Directives can also tell the optimizer what to AVOID, rather than what to choose.
This allows you to write a directive to avoid certain actions that are known to cause
performance problems for a query. The optimizer can still explore using any new
indexes or table attributes as they are added.
When directives are used, two headings are added to the SET EXPLAIN output:
DIRECTIVES FOLLOWED:
DIRECTIVES NOT FOLLOWED:
If the directive is followed, it is listed under the DIRECTIVES FOLLOWED heading.
• FULL (tablename): This forces the optimizer to perform a sequential scan on the
table specified, even if an index exists on a column in the table. For example:
SELECT --+FULL(e)
name, salary
FROM emp e;
• AVOID_FULL (tablename): The optimizer considers the various indexes it can
scan. If no indexes exist, the optimizer performs a full-table scan. This directive
might be used with REPEATABLE READ isolation level, for example, to avoid the
full-table scan and subsequent locking.
• INDEX_SJ: This directive forces an index self-join path using the specified index,
or the least costly index in a list of indexes, even if data distribution
statistics are not available for the leading index key columns of the index.
• AVOID_INDEX_SJ: Tells the optimizer not to use an index self-join path for the
specified index or indexes.
The optimizer automatically considers the index self-join path if you specify the INDEX
or AVOID_FULL directive. Use the INDEX_SJ directive only to force an index self-join
path using the specified index (or the least costly index in a comma-separated list of
indexes). The INDEX_SJ directive can improve performance when a multicolumn index
includes columns that provide only low selectivity as index key filters.
Specifying the INDEX_SJ directive circumvents the usual optimizer requirement for
data distribution statistics on the lead keys of the index. This directive causes the
optimizer to consider an index self-join path, even if data distribution statistics are not
available for the leading index key columns. In this case, the optimizer only includes the
minimum number of index key columns as lead keys to satisfy the directive.
For example, if an index is defined on columns c1, c2, c3, c4, and the query specifies
filters on all four of these columns but no data distributions are available on any column,
specifying INDEX_SJ on this index results in column c1 being used as the lead key in
an index self-join path.
If you want the optimizer to use an index but not to consider the index self-join path,
then you must specify an INDEX or AVOID_FULL directive to choose the index, and
you must also specify an AVOID_INDEX_SJ directive to prevent the optimizer from
considering any other index self-join path.
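A sketch of the scenario just described (the table name tab4 and index name i_c1234
are hypothetical):
SELECT --+INDEX_SJ(tab4 i_c1234)
*
FROM tab4
WHERE c1 > 10
AND c2 = 5
AND c3 < 100
AND c4 = 7;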
Multiple directives can be used as long as they are in the same comment block.
SELECT --+ORDERED
c.customer_num, sum(total_price)
FROM customer c, orders o, items i
WHERE c.customer_num = o.customer_num
AND o.order_num = i.order_num
GROUP BY 1;
AVOID_HASH
EXPLAIN directive
SELECT --+ORDERED, EXPLAIN
c.customer_num, sum(total_price)
FROM customer c, orders o, items i
WHERE c.customer_num = o.customer_num
AND o.order_num = i.order_num
GROUP BY 1;
DIRECTIVES FOLLOWED:
ORDERED
EXPLAIN
DIRECTIVES NOT FOLLOWED:
Estimated Cost: 17
Estimated # of Rows Returned: 1
1) informix.c: INDEX PATH ...
EXPLAIN directive
The EXPLAIN directive allows you to turn on the SQL explain feature directly from
the query. The explain feature is very helpful for testing directives.
Directives
• CAN be used:
In SELECT, UPDATE, and DELETE statements
In SELECT statements embedded in INSERT statements
In views, stored procedures and triggers
• CANNOT be used:
In distributed queries that access remote tables
For UPDATE/DELETE WHERE CURRENT OF statements
• Multiple compatible directives can be used within the same comment
block
Directives
The visual describes when directives can and cannot be used.
External directives
• Allow dynamic substitution of SQL with directive in packaged
applications
• Use the SAVE EXTERNAL DIRECTIVES SQL statement
• Requires ONCONFIG configuration parameter and session
environment variable
• Saved in sysdirectives system catalog table
• Example:
SAVE EXTERNAL DIRECTIVES /*+ avoid_full(table1) */
ACTIVE
FOR SELECT * FROM table1 WHERE col_1 > 5000;
External directives
The external directive feature of Informix allows the dynamic rewrite of an SQL
statement to add directives. External optimizer directives are useful when it is not
feasible to rewrite a query for a short-term solution to a problem, such as when a
query starts to perform poorly.
For external directives to work, the EXT_DIRECTIVES configuration parameter
must be set to a value greater than zero at server initialization time, and the
IFX_EXTDIRECTIVES variable might need to be set on the client.
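For example (a sketch; a value of 1 enables external directives only for sessions that
request them through the client environment):
# ONCONFIG
EXT_DIRECTIVES 1
# client shell (ksh)
export IFX_EXTDIRECTIVES=1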
Exercise 9
Managing the optimizer
• examine the effect on the optimizer of using Optimizer Directives
Exercise 9:
Managing the optimizer
Purpose:
In this exercise, you will learn how to manage the optimizer using optimizer
parameters and directives.
UNLOAD TO /dev/null
SELECT --+INDEX(c zipcode_ix)
* FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND o.customer_num > 122
AND c.zipcode > "65";
UNLOAD TO /dev/null
SELECT --+AVOID_INDEX(c zipcode_ix)
* FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND o.customer_num > 122
AND c.zipcode > "65";
UNLOAD TO /dev/null
SELECT --+FULL(customer)
* FROM customer
WHERE zipcode > "65";
UNLOAD TO /dev/null
SELECT --+AVOID_FULL(customer)
* FROM customer
WHERE zipcode > "65";
UNLOAD TO /dev/null
SELECT --+ORDERED
fname, lname, order_num, o.customer_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num;
UNLOAD TO /dev/null
SELECT --+AVOID_HASH(orders)
fname, lname, order_num, o.customer_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num;
UNLOAD TO /dev/null
SELECT --+USE_NL(orders)
fname, lname, order_num, o.customer_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND o.customer_num > 122
AND c.zipcode > "65";
UNLOAD TO /dev/null
SELECT --+USE_NL(customer)
fname, lname, order_num, o.customer_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND o.customer_num > 122
AND c.zipcode > "65";
UNLOAD TO /dev/null
SELECT --+AVOID_NL(orders)
fname, lname, order_num, o.customer_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND o.customer_num > 122
AND c.zipcode > "65";
SET EXPLAIN OFF;
Task 4. Using optimizer goal directives
In this task, you will demonstrate the use of optimization goal directives and analyze
the query plan developed for the following SQL statement. Use the EXPLAIN
directive to generate an Explain plan file for the query.
1. Run the following query and analyze the query plan developed for the SQL
statement generated in the sqexplain.out file.
SELECT --+ORDERED, EXPLAIN
fname, lname, order_num, o.customer_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num;
Results:
In this exercise, you learned how to manage the optimizer using optimizer
parameters and directives.
Exercise 9:
Managing the optimizer - Solutions
Purpose:
In this exercise, you will learn how to manage the optimizer using optimizer
parameters and directives.
UNLOAD TO /dev/null
SELECT --+INDEX(c zipcode_ix)
* FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND o.customer_num > 122
AND c.zipcode > "65";
UNLOAD TO /dev/null
SELECT --+AVOID_INDEX(c zipcode_ix)
* FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND o.customer_num > 122
AND c.zipcode > "65";
UNLOAD TO /dev/null
SELECT --+FULL(customer)
* FROM customer
WHERE zipcode > "65";
UNLOAD TO /dev/null
SELECT --+AVOID_FULL(customer)
* FROM customer
WHERE zipcode > "65";
In the above query, the optimizer used the zipcode_ix index from the
customer table as directed.
In the above query, the optimizer was directed to AVOID using the
zipcode_ix index on the customer table, so it did a sequential scan. Even
though there was a filter on customer_num, which is indexed, the
optimizer would have to read such a large part of the table that reading
the index would add more I/O than simply doing a sequential scan on the
table.
UNLOAD TO /dev/null
SELECT --+ORDERED
fname, lname, order_num, o.customer_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num;
In this query, the optimizer was directed to join the tables based on the
order of the tables listed in the FROM clause. The customer table was
processed first. Notice also that the optimizer chose to do a sequential
scan on both tables instead of using an index, and performed a dynamic
hash join.
Task 3. Using join method directives.
In this task, you will demonstrate the use of join-method optimizer directives and
analyze the query plans developed for the following SQL statements. Use SET
EXPLAIN to generate an ex09.expl Explain plan file for the queries.
1. Run the following queries and compare the query plans developed for the SQL
statements generated in the ex09.expl file. Make sure you include the “SET
EXPLAIN OFF” statement.
SET EXPLAIN FILE TO "ex09.expl";
UNLOAD TO /dev/null
SELECT --+USE_HASH(orders)
fname, lname, order_num, o.customer_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num;
UNLOAD TO /dev/null
SELECT --+AVOID_HASH(orders)
fname, lname, order_num, o.customer_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num;
UNLOAD TO /dev/null
SELECT --+USE_NL(orders)
fname, lname, order_num, o.customer_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND o.customer_num > 122
AND c.zipcode > "65";
UNLOAD TO /dev/null
SELECT --+USE_NL(customer)
fname, lname, order_num, o.customer_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND o.customer_num > 122
AND c.zipcode > "65";
UNLOAD TO /dev/null
SELECT --+AVOID_NL(orders)
fname, lname, order_num, o.customer_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND o.customer_num > 122
AND c.zipcode > "65";
SET EXPLAIN OFF;
In this query, the optimizer was directed to use a hash join. Since there
is no Build Outer notation, we know that the orders table was used as
the build table and the customer table was used as the probe table.
In this query, the optimizer was directed to avoid using a hash join on
the orders table, so it used a nested loop join.
In the first query, the optimizer was directed to use a nested-loop join on
the orders table. In the second one, it was directed to use a nested-loop
join on the customer table. Notice the difference in the cost of the two
queries. Specifying which table to do the nested-loop join on can make a
significant difference in performance.
Unit summary
Having completed this unit, you should be able to:
• Describe the effect on the engine of the different values of
OPTCOMPIND
• Describe the effects of setting the OPT_GOAL parameter
• Write optimizer directives to improve performance
Unit summary
Informix (v12.10)
Unit objectives
• Explain the benefits of referential and entity integrity
• Specify default values, check constraints, and referential constraints
Unit objectives
Definitions
• Referential integrity: Are the relationships between tables enforced?
• Entity integrity: Does each row in the table have a unique identifier?
• Semantic integrity: Does the data in the columns properly reflect the
types of information the column was designed to hold?
Definitions
Integrity is the accuracy or correctness of the data in the database. More definitions of
terms used in this and the next unit are:
• Referential integrity
Referential integrity enforces the primary and foreign key relationships between
tables. For example, a customer record must exist before an order can be placed
for that customer.
• Entity integrity
Entity integrity is enforced by creating a primary key that uniquely identifies each
row in a table.
• Semantic integrity
Semantic integrity is enforced by using the following:
• Data types: The data type defines the type of values that you can store in a
column. For example, the data type smallint allows you to enter values from
-32,767 to 32,767.
• Default values: The default value is the value inserted in a column when an
explicit value is not specified. For example, the user_id column of a table might
default to the login name of the user if a name is not entered.
(Slide diagram: application programs (prog.4gl) with forms form1.per and form2.per
accessing data through the IBM Informix database server.)
Constraint names
CREATE TABLE orders (
order_num INTEGER UNIQUE
CONSTRAINT order_num_uq,
order_date DATE NOT NULL
CONSTRAINT order_date_nn
DEFAULT TODAY);
Constraint names
A constraint is identified by its name. You can assign a name or use the default name
that the database server assigns. The names must be unique within a database.
Names are assigned to all constraints: primary key, foreign key, unique, check, and
NOT NULL. Constraint names are stored in the sysconstraints system catalog table.
If you want to change the mode (for example, enabled, disabled, or filtering) of a
specific constraint, you must know its name (see the Modes and Violation Detection
unit):
SET CONSTRAINTS pk_items, fk1_items TO DISABLED;
You can also set constraints for an entire table without having to know the constraint
name. For example:
SET CONSTRAINTS FOR TABLE orders TO DISABLED;
To drop a constraint without altering the table in any other way, use the DROP clause
with the ALTER TABLE command:
ALTER TABLE items DROP CONSTRAINT pk_items;
System default names are a composite of a constraint ID code, a table ID, and a unique
constraint ID. Name your constraints using a naming convention instead of taking the
system default. This makes identifying a constraint and its purpose easier.
CHECK constraint
ALTER TABLE items ADD CONSTRAINT
CHECK (quantity >= 1 AND quantity <= 10)
CONSTRAINT ck_items_qty;
CHECK constraint
You can add constraints at both the table level and the column level.
Examples
• The first example adds a table-level constraint.
• The second example adds an equivalent constraint at the column level. If you
add a constraint to a column, you cannot reference other columns.
• The third example illustrates the error that occurs when you try to add
constraints to a table and a column.
• The fourth example shows how a constraint that references multiple columns
can be added at the table level. The columns must be from the same table.
When you modify a column, you modify everything about that column, which is why
the MODIFY clause must include the data type. If you do not list all constraints with
the MODIFY clause, any constraints not listed are dropped.
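For example, a minimal sketch (assuming quantity is a SMALLINT); per the note above,
any constraint on the column that is not restated in the MODIFY clause is dropped:
ALTER TABLE items MODIFY (quantity SMALLINT NOT NULL);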
• When a user inserts a row into a child table, if all FOREIGN KEYS are non-NULL
and no corresponding PRIMARY KEY is present, the insert fails.
To fully enforce referential integrity, do not allow NULL values in the primary and foreign
key columns.
Cascading deletes
CREATE TABLE customer (              -- Parent
    customer_num INT,
    PRIMARY KEY(customer_num));

CREATE TABLE orders (                -- Child
    order_num INT,
    customer_num INT,
    PRIMARY KEY(order_num),
    FOREIGN KEY(customer_num) REFERENCES customer
        ON DELETE CASCADE);
---------------------------------------------------
DELETE FROM customer WHERE customer_num = 1;
Cascading deletes
Cascading deletes let you define a referential constraint in which the database server
automatically deletes child rows when the corresponding parent row is deleted. This
feature is useful in simplifying application code and logic.
Cascading deletes provide a performance enhancement: because the database server
deletes the child rows automatically rather than requiring the application to delete
them first, fewer SQL statements are processed. The database server can process the
deletes more efficiently because the overhead of an SQL statement is not incurred.
If for any reason the original DELETE statement fails or the resulting DELETE
statements on the child rows fail, the entire DELETE statement is rolled back (parent
and child rows are rolled back).
To invoke cascading deletes, add the ON DELETE CASCADE clause after the
REFERENCES clause in the CREATE TABLE statement for the child table.
Restrictions
• Restrictions on using cascading deletes:
The database must have logging
The child table cannot be referenced in a correlated subquery of a DELETE
statement on the parent table
Restrictions
Here are some of the restrictions in using cascading deletes:
• Referential integrity with cascading deletes can be created with logging off.
However, the cascading deletes are not activated. If logging is turned off,
cascading deletes are deactivated (you receive a referential integrity error). Once
you turn on logging, cascading deletes are automatically reactivated; no action is
necessary by the administrator.
• A correlated subquery that uses the child table in a DELETE statement for the
parent table does not use cascading deletes. Instead, you receive the error:
735: Cannot reference table that participates in a cascaded delete.
sysdefaults sysconstraints
Exercise 10
Referential and entity integrity
• Work with referential and check constraints
Exercise 10:
Referential and entity integrity
Purpose:
In this exercise, you will learn how to create and maintain referential and
entity integrity using constraints.
Exercise 10:
Referential and entity integrity - Solutions
Purpose:
In this exercise, you will learn how to create and maintain referential and
entity integrity using constraints.
4. Query the sysconstraints system catalog table for each constraint added.
Table customer:
SELECT constrid, constrname, c.owner, c.tabid,
constrtype, idxname
FROM sysconstraints c, systables t
WHERE t.tabname = "customer" AND c.tabid = t.tabid;
Table orders:
SELECT constrid, constrname, c.owner, c.tabid,
constrtype, idxname
FROM sysconstraints c, systables t
WHERE t.tabname = "orders" AND c.tabid = t.tabid;
Table items:
SELECT constrid, constrname, c.owner, c.tabid,
constrtype, idxname
FROM sysconstraints c, systables t
WHERE t.tabname = "items" AND c.tabid = t.tabid;
5. Did you notice additional constraints? If so, what columns are they for? Yes.
The additional constraints are in the customer and orders tables. These
are the NOT NULL constraints (constrtype ‘N’) automatically created for
the serial columns.
Task 2. Create the manufact table with referential integrity.
In this task, you will create a new table manufact and include the referential
constraints. You will use dbaccess to gather information about the manufact table.
1. Create a manufact table in your database in the dbspace dbspace4 with the
following columns and assign a Primary Key. Name the Primary Key pk_manu.
Column Name   Description
manu_code     Manufacturer code. This will be a 3-character identifier.
manu_name     Name of manufacturer. Allow for a length of 15.
lead_time     The interval of time, in days, to allow for delivery of orders
              from this manufacturer.
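A possible solution (a sketch: the data types shown match the stores_demo manufact
table, but the course solution might differ):
CREATE TABLE manufact (
manu_code CHAR(3),
manu_name CHAR(15),
lead_time INTERVAL DAY(3) TO DAY,
PRIMARY KEY (manu_code) CONSTRAINT pk_manu)
IN dbspace4;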
3. Insert a couple of rows into the items table to verify the ALTER TABLE
statement in Step 2.
INSERT INTO items VALUES (1,1004,110,"ANZ",1,840);
INSERT INTO items VALUES (1,1005,201,"ANZ",0,200);
In the second INSERT statement above, the value of 0 for quantity violates
the check constraint on the quantity column, which requires a value of at
least 1. The constraint catches this and returns the following error (your
constraint name might be different):
530: Check constraint (informix.c121_28) failed.
Results:
In this exercise, you learned how to create and maintain referential and entity
integrity using constraints.
Unit summary
• Explain the benefits of referential and entity integrity
• Specify default values, check constraints, and referential constraints
Unit summary
Managing constraints
Informix (v12.10)
Unit objectives
• Determine when constraint checking occurs
• Drop a constraint
• Delete and update a parent row
• Insert and update a child row
Unit objectives
For referential constraints, a memory buffer or temp table records the violations.
One temp table records violations for each referential pair. The temp table contains
key values that were violated. As rows are inserted, deleted, and updated, the temp
tables are updated to reflect new violations and the removal of old ones. Later, when
checking is done, the temp tables are scanned and, for those keys that are still valid,
the violations are revalidated. As violations are resolved, records are removed from
the temp table.
For check constraints, a memory buffer or temp table is also used. However, this
time the temp table records only the rowids of the violating rows. As rows are
updated, rows that now pass the check constraints are removed. When checking is
done, the temp table should now be empty.
For unique indexes, the checking is done on a row-by-row basis instead of at the
end of the statement. If you want end-of-statement checking, use unique
constraints rather than creating unique indexes.
BEGIN WORK;
SET CONSTRAINTS ALL DEFERRED;
UPDATE orders SET order_num = 1006
WHERE order_num = 1001;
UPDATE items SET order_num = 1006
WHERE order_num = 1001;
COMMIT WORK;
In the example, if constraint mode is not set to deferred, the statement fails. This
happens because the referential constraint enforces the rule that all items must
have an order. The failure would occur at the first update statement because there
would be items that did not have an order (order number 1001 no longer exists in
the orders table).
Deferred checking is implemented similarly to immediate checking. However, the
checks for violations are made at the end of the transaction as opposed to the end
of the statement.
You must put the SET CONSTRAINTS ALL DEFERRED statement within a
transaction. It is valid from the time that it is set until the end of the transaction.
You can also replace the keyword ALL with the constraint name to defer only a
specific constraint. For example:
SET CONSTRAINTS uniq_ord DEFERRED;
Unique indexes (that is, indexes created using the UNIQUE keyword) are not used
by deferred checking. If you want the checking done at commit time, use unique
constraints rather than creating unique indexes.
Performance effect
• Referential constraints are implemented by using indexes on both
Primary and Foreign Keys.
• Indexes must be updated on UPDATE, INSERT, and DELETE.
• Index lookups on each UPDATE, INSERT, and DELETE.
• Numerous locks are required. Shared locks are held on all indexes
being used.
Performance effect
Unique constraints are enforced by internally creating unique indexes on
appropriate columns.
When a referential constraint is created, a non-unique index is built on the foreign
key columns. If an index exists, then that index is used.
A column can have both a referential and a unique constraint. A column can also
have multiple different referential constraints. In these situations, a single index is
used to enforce the multiple constraints.
Dropping a constraint (1 of 2)
Dropping a constraint
When a primary key constraint is dropped, any corresponding foreign key
constraints are automatically dropped as well. When a foreign key constraint is
dropped, the corresponding primary key constraint is not affected.
Dropping a constraint (2 of 2)
CREATE TABLE orders (
order_num SERIAL,
order_date DATE,
PRIMARY KEY (order_num)
CONSTRAINT pk_orders);
When a column that has a constraint is dropped, the action can affect more than just
the table that is mentioned in the ALTER TABLE statement. Any constraints that
reference the dropped column are also dropped.
In the example, dropping the primary key column in table orders requires table items
to be locked while the constraint is dropped. Also, the index that was used to
implement the constraint is dropped only if it was built for this constraint.
(Slide figure: matching order_num values, such as 1004 and 1005, in the orders and
items tables.)
Exercise 11
Managing constraints
• investigate the effects of constraints and constraint checking
Exercise 11:
Managing constraints
Purpose:
In this exercise, you will learn how to maintain referential and entity integrity
constraints.
Exercise 11:
Managing constraints - Solutions
Purpose:
In this exercise, you will learn how to maintain referential and entity integrity
constraints.
3. Increment the col1 column by 1. Did the update execute? Why or why not?
UPDATE parent_tab SET col1 = col1 + 1;
Yes.
Because during immediate constraint checking, the checking occurs at the
end of the statement. After all the rows were updated, no duplicate
values existed.
2. Update the tables again using the correct syntax to allow deferred constraint
checking.
BEGIN WORK;
SET CONSTRAINTS ALL DEFERRED;
COMMIT WORK;
3. Query each table for the manu_code "PRC" to verify the update was
successful.
Table manufact:
SELECT count(*) FROM manufact
WHERE manu_code = "PRC";
Table stock:
SELECT count(*) FROM stock
WHERE manu_code = "PRC";
Table items:
SELECT count(*) FROM items
WHERE manu_code = "PRC";
577: A constraint of the same type already exists on the column set.
There is already a unique constraint on col1. In order to change the
unique constraint into a Primary Key constraint, the unique
constraint must be dropped first.
3. Create a table called child_tab in your database with the following column and
include a Foreign Key called fk_child_parent.
Column Name Description
c_col1 Integer.
CREATE TABLE child_tab (
c_col1 INTEGER,
FOREIGN KEY (c_col1)
REFERENCES parent_tab
CONSTRAINT fk_child_parent
);
5. Update the parent table parent_tab and change the col1 column value 5 to 25.
Why did the statement fail?
UPDATE parent_tab
SET col1 = 25
WHERE col1 = 5;
The statement fails because a row in child_tab still references the value 5;
changing the parent key would leave that child row without a matching
parent row.
6. Delete from the parent table parent_tab where the col1 column value is 5.
Why did the statement fail?
DELETE FROM parent_tab
WHERE col1 = 5;
The statement fails for the same reason: a child_tab row still references the
value 5, and the foreign key was not created with ON DELETE CASCADE, so the
parent row cannot be deleted while the child row exists.
2. Update the child table child_tab and change the c_col1 column value 5 to 25.
Why did the statement fail?
UPDATE child_tab
SET c_col1 = 25
WHERE c_col1 = 5;
The statement fails because no row in parent_tab contains the value 25, so the
new foreign key value would have no matching parent row.
Unit summary
• Determine when constraint checking occurs
• Drop a constraint
• Delete and update a parent row
• Insert and update a child row
Unit summary
Informix (v12.10)
Unit objectives
• Enable and disable constraints and indexes
• Use the filtering mode for constraints and the indexes
• Reconcile the violations recorded in the database
Unit objectives
Disabling an object
• Disabling individual objects:
SET CONSTRAINTS c117_11, c117_12 DISABLED;
SET INDEXES idx_x1 DISABLED;
SET TRIGGERS upd_cust DISABLED;
• Disabling all objects for a table:
SET CONSTRAINTS, INDEXES, TRIGGERS
FOR customer DISABLED;
• Results:
Constraints are not checked
Triggers do not fire
Indexes are not updated or used for queries
Disabling an object
The two methods to disable objects are:
• Individually: To disable a constraint, index, or trigger, specify the object name.
You can find the constraint and index names with the dbschema utility or
dbaccess tool, or by querying the system catalog.
• By table: You can disable all constraints, triggers, and indexes for a table with
one SQL statement, as shown in the example. Any trigger that names the table in
the trigger event is disabled.
A disabled constraint is not checked and a disabled trigger does not execute. A
disabled index is neither updated nor used by the optimizer when it chooses query
paths.
Indexes created by constraints
If an index is created as a result of adding a referential or unique constraint, the
index is always enabled as long as the constraint is enabled.
The SET CONSTRAINTS statement places an exclusive table lock on the target
table for the duration of the statement.
Enabling a constraint
• Enabling individual objects:
SET CONSTRAINTS c117_11, c117_12 ENABLED;
SET INDEXES idx_x1 ENABLED;
SET TRIGGERS upd_cust ENABLED;
• Enabling all objects for a table:
SET CONSTRAINTS, INDEXES, TRIGGERS
FOR customer ENABLED;
If a constraint that is set to disabled is enabled, all existing rows must satisfy the
constraint. If some rows violate the constraint, an error is returned.
Enabling a constraint
Two methods can be used to enable objects that exist in a database:
• Individually: To enable a constraint, index, or trigger, you must specify the
object name. The constraint and index names can be found by using
dbschema or dbaccess, or by querying the system catalog.
• By table: You can enable all constraints, triggers, and indexes for a table with
one SQL statement, as shown in the example. Any trigger that names the
table in the trigger event is enabled.
When a constraint is enabled from a disabled mode, all existing rows are checked to
see if they satisfy the constraint. If any rows do not satisfy the constraint, an error is
returned and the constraint remains disabled.
When a constraint is enabled from filtering, the existing rows are not rechecked
because they already satisfy the constraint.
When an index is enabled from disabled mode, the entire index is effectively rebuilt.
Recording violations
Violations table    Diagnostics table
Recording violations
When the constraint mode is changed to filtering or enabled, you can record any
subsequent violations in two tables: the violations table and the diagnostics table. All
violations for constraints or indexes on a table are placed in its corresponding
violations table and diagnostics table. Only one pair of these tables can be present
for each database table.
The violations table holds information about the row where the violation occurred.
The diagnostics table contains one row for every violation that occurred. In some
cases, one row can have multiple violations.
Example
An inserted row might violate the NOT NULL constraint, a referential constraint, and
a primary key constraint. In this case, three rows are placed in the diagnostics table
and one row is placed in the violations table.
In addition to the violation recorded in the violations table, you might also receive an
error. If the constraint is enabled, you receive an error when a violation occurs. Error
handling for violations in filtering mode is discussed later in this unit.
Table permissions
In general, if a user has INSERT, UPDATE, or DELETE permissions on the target
table, the user also has permissions to INSERT into the violations tables.
Table extents
The extent sizes for the violations and diagnostics table are set at the default value.
The violations table and the diagnostics table are placed in the same dbspace as
the target table. If the target table is fragmented, the violations table is fragmented
in the same manner. The diagnostics table is fragmented in a round robin fashion
over the same dbspaces on which the target table is fragmented.
Violations table

Violations table columns: informix_tupleid, informix_optype, informix_recowner (user).
informix_optype codes: I = Insert, D = Delete, O = Update (original values),
N = Update (new values), S = Created by the SET command.

Diagnostics table columns: informix_tupleid, objtype, objowner (owner of the constraint),
objname (name of the constraint).
objtype codes: C = Constraint violation, D = Unique index violation.
Filtering mode
• Set individual objects to filtering:
SET CONSTRAINTS c117_11, c117_12 FILTERING;
SET INDEXES idx_x1 FILTERING;
• Set all objects for a table to filtering:
SET CONSTRAINTS,INDEXES
FOR customer FILTERING;
• Cause an error to be returned to the application if a violation occurs:
SET CONSTRAINTS,INDEXES
FOR customer FILTERING WITH ERROR;
Filtering mode
You can set constraints and unique indexes to filtering mode. In filtering mode, any
constraint or unique index violations are recorded in the violations table as they
occur.
Before you set an object to filtering mode, make sure that violation logging is
enabled for the table with the START VIOLATIONS TABLE statement.
You cannot set triggers and indexes other than unique indexes to filtering mode.
Error handling
In filtering mode, a violation does not cause the statement to roll back. In the default
filtering mode (WITHOUT ERROR), the application tool is not informed that a
violation occurred. Be careful with this mode, as you could incorrectly assume that
the transaction was completed in full when, in fact, it might not have been.
If the WITH ERROR clause is included in the SET statement, an error is returned to
the user:
971: Integrity violations detected.
However, unlike enabled mode, the error does not automatically roll back the
statement. To roll back the transaction, explicitly execute ROLLBACK WORK.
The INSERT statements that add the errors to the diagnostics and violations table
are a part of the current transaction. If you roll back the transaction, the rows in the
violations and diagnostics tables are also rolled back.
Example 1 (1 of 4)
CREATE TABLE customer (
customer_num SERIAL NOT NULL CONSTRAINT nn_cn,
name CHAR(15),
PRIMARY KEY (customer_num) CONSTRAINT pk_cn);
Example 1
The next few pages illustrate how you can use object modes.
The example shows two tables, customer and orders. The customer table is the
parent table with a primary key and the orders table is the child table with a
constraint referencing the customer table. This referential constraint means that a
corresponding customer row must be present for every orders row.
Example 1 (2 of 4)
• The following statement produces an error:
SET CONSTRAINTS, TRIGGERS, INDEXES
FOR customer DISABLED;
• Instead:
SET CONSTRAINTS, TRIGGERS, INDEXES
FOR orders DISABLED;
SET CONSTRAINTS, TRIGGERS, INDEXES
FOR customer DISABLED;
• Now start violation logging:
START VIOLATIONS TABLE FOR customer;
START VIOLATIONS TABLE FOR orders;
Suppose you want to perform a large load where some data might cause some
temporary referential integrity errors. You decide to disable the constraints for the
customer and orders table.
SET CONSTRAINTS order
You must execute the SET CONSTRAINTS statement in the proper order. You
cannot disable an object when other enabled objects refer to it. Because the
referential constraint for orders refers to the customer table, disabling customer
constraints first produces an error. Instead, disable the orders constraint first.
START VIOLATIONS TABLE
To log violations, inform the database server by using the START VIOLATIONS
TABLE statement. This statement creates the two violations tables for the table
listed. In the example, four additional tables are created: customer_vio,
customer_dia, orders_vio, and orders_dia.
Example 1 (3 of 4)
• This statement is successful:
INSERT INTO orders
(order_num, customer_num, ship_instruct)
VALUES (0, 2, "ship tomorrow");
• However, an error occurs when constraints are enabled:
SET CONSTRAINTS, TRIGGERS, INDEXES
FOR customer ENABLED;
SET CONSTRAINTS, TRIGGERS, INDEXES
FOR orders ENABLED;
971: Integrity violations detected.
Once you disable constraints, they are not checked. That is why the INSERT
statement shown in the example is successful. If constraints are enabled, the
statement fails because you cannot add an orders row without a corresponding
customer row.
But when you try to enable the constraints, the database server must check all rows
to make sure that there are no violations. The SET CONSTRAINTS... ENABLED
statement fails because of the violation introduced with the INSERT statement.
Example 1 (4 of 4)
• Errors placed in the violations table:
SELECT * FROM orders_vio, orders_dia
WHERE orders_vio.informix_tupleid =
orders_dia.informix_tupleid;
order_num 1
customer_num 2
ship_instruct ship tomorrow
informix_tupleid 1
informix_optype S
informix_recowner informix
informix_tupleid 1
objtype C
objowner informix
objname fk_cust
Example 2 (1 of 2)
• Checking for violations during normal SQL activity:
SET CONSTRAINTS, INDEXES
FOR customer FILTERING;
SET CONSTRAINTS, INDEXES
FOR orders FILTERING;
• This row is not inserted, but no error is returned!
INSERT INTO orders
(order_num, customer_num, ship_instruct)
VALUES (0, 4, "ship tomorrow");
Example 2
Suppose for some reason that the administrator wants to know what violations are
occurring during a load process without causing the load process to fail. The
administrator can determine violations by setting constraints and indexes to
FILTERING.
Since the WITH ERROR clause is not included in the SET statement, the
application is not notified when any error is returned. The INSERT statement in the
example fails and an entry is put in the violations table. However, the application
receives no error.
Serial values
Even though the row in the example is not inserted, the serial counter for the table is
incremented. The violations table shows the serial value of order_num as it would
have been if the row had been successfully inserted.
Example 2 (2 of 2)
• Enable constraints:
SET CONSTRAINTS, INDEXES FOR customer ENABLED;
SET CONSTRAINTS, INDEXES FOR orders ENABLED;
• Turn off violations logging:
STOP VIOLATIONS TABLE FOR customer;
STOP VIOLATIONS TABLE FOR orders;
• Fix errors that caused violations to occur:
INSERT INTO customer (customer_num, name)
VALUES (4, "SCHMIDT");
• Insert rows that caused violations into target table:
INSERT INTO orders
SELECT order_num, customer_num, ship_instruct
FROM orders_vio;
Now the administrator wants to reconcile any violations. First, the administrator
enables the constraints and indexes. Then the administrator turns off violations
logging. These two steps are required to insert the violations back into the target
table, thus avoiding any endless cycles (violations added to the violations table and
later inserted back into the customer table).
Next, the administrator fixes any errors that caused the violations to occur. In the
example above, the parent row is missing for a referential constraint.
Finally, the administrator can copy any rows in the violations table into the target
table with the INSERT INTO... SELECT FROM statement.
Exercise 12
Modes and violation detection
• use modes and violations to record and correct errors in data
Exercise 12:
Modes and violation detection
Purpose:
In this exercise, you will learn how to capture data that violates constraints.
Exercise 12:
Modes and violation detection – Solutions
Purpose:
In this exercise, you will learn how to capture data that violates constraints.
UPDATE items
SET manu_code = "HSK"
WHERE item_num = 2
AND order_num = 1001;
8. Enable the constraints and other objects again.
SET CONSTRAINTS, INDEXES FOR items ENABLED;
Task 2. Using filtering mode.
In this task, you will query the sysobjstate system catalog to ensure violation
logging is enabled and set constraints for the items table to filtering. You will insert
a row into the items table and query the items table to see if the row inserted. You
will enable the constraints and stop violation logging. You will find errors have
occurred in the violations tables, fix the errors, and enable the constraints again.
1. Query the sysobjstate system catalog to ensure that the objects are enabled
for the items table.
SELECT s.objtype, s.name, s.state,
s.tabid, t.tabname
FROM sysobjstate s, systables t
WHERE t.tabname = "items" AND s.tabid = t.tabid;
2. Set the filtering mode on all constraints for the items table.
SET CONSTRAINTS, INDEXES
FOR items FILTERING;
3. Insert the following row into the items table:
INSERT INTO items
(item_num, order_num, stock_num, manu_code,
quantity, total_price)
VALUES (3, 1010, 1, "WIL", 5, 300.00);
4. Query the items table to see if the row inserted.
SELECT * FROM items
WHERE order_num = 1010;
Unit summary
• Enable and disable constraints and indexes
• Use the filtering mode for constraints and indexes
• Reconcile the violations recorded in the database
Unit summary
Concurrency control
Informix (v12.10)
Unit objectives
• Use the different concurrency controls
• Monitor the concurrency controls for lock usage
• Use the Retain Update Lock feature
Unit objectives
Informix isolation
• Informix allows users to control the level of isolation for their query by
using:
SET TRANSACTION
− ANSI compliant
− Supports access modes
− Can be set once per transaction
SET ISOLATION
− Not ANSI compliant
− Does not support access modes
− Can be changed within a transaction
Informix isolation
Informix provides two SQL statements for setting the current isolation level in an
application. The first, SET TRANSACTION, complies with the ANSI SQL-92
specification. The second statement, SET ISOLATION, is not ANSI compliant and does
not support access modes, but does allow you to specify an Informix defined isolation
level, cursor stability.
A primary difference between SET TRANSACTION and SET ISOLATION is that a SET
TRANSACTION statement remains in effect only for the duration of the transaction.
Additionally, you can execute only one SET TRANSACTION statement within a
transaction.
The SET ISOLATION statement allows you to change the effective isolation level within
a single transaction. For example:
BEGIN WORK;
SET ISOLATION TO DIRTY READ;
SELECT * FROM customer;
SET ISOLATION TO REPEATABLE READ;
INSERT INTO cust_info …
Once you set an isolation level by using the SET ISOLATION statement, it remains in
effect until the next SET ISOLATION statement or until the end of the session.
Comparison
Comparison
Informix supports each of the defined ANSI isolation levels and two additional isolation
levels, CURSOR STABILITY (specific to cursor manipulation) and COMMITTED READ
LAST COMMITTED.
Access methods
• By default, Informix transactions are always read/write capable
• To control the access mode, use the statements:
SET TRANSACTION READ WRITE;
SET TRANSACTION READ ONLY;
• Read-only transactions cannot:
Update, insert, or delete rows
Add, drop, alter, or rename database objects
Update database statistics
Grant or revoke privileges
Access methods
The ANSI SQL-92 standard defines both read/write and read-only transactions.
READ UNCOMMITTED
• ANSI:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
• Informix:
SET ISOLATION TO DIRTY READ;
(Slide graphic: a database server process reading a table.)
READ UNCOMMITTED
When the isolation level is read uncommitted or dirty read, the database server does
not place any locks or check for existing locks when resolving your query. During
retrieval, you can look at any row, even rows that contain uncommitted changes.
Dirty-read isolation makes it possible for your query to retrieve phantom rows. A
phantom row is a row inserted by a transaction, but the transaction is rolled back rather
than committed. Although the phantom row is never committed to the database and
therefore never truly exists in the database, it is visible to any process using dirty-read
isolation.
Dirty-read isolation can be useful when:
• The table is static
• 100 percent accuracy is not as important as speed and freedom from contention
• You cannot wait for locks to be released
Dirty read isolation is the only isolation level available for non-logging databases.
READ COMMITTED
• ANSI:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
• Informix:
SET ISOLATION TO COMMITTED READ;
(Slide graphic: a database server process checking "Can lock be acquired?" before reading a row from a table.)
READ COMMITTED
Queries in logged databases default to ANSI read-committed isolation. Read committed
isolation is synonymous with the Informix committed read isolation. Read committed
isolation ensures that all rows read are committed to the database. To perform a
committed read, the database server attempts to acquire a shared lock on a row before
trying to read it. It does not place the lock; rather, it checks whether it can acquire the
lock. If it can, it is guaranteed that the row exists and is not being updated by another
process while it is being read. Remember, a shared lock cannot be acquired on a row
that is locked exclusively, which is always the case when a row is being updated.
As you are scanning rows using committed read, you are not looking at any phantom
rows or dirty data. You know that the current row was committed (at least when your
process read it). After your process reads the row, however, other processes can
change it.
Committed read can be useful for:
• Lookups
• Queries
• Reports that yield general information
CURSOR STABILITY
• Informix:
SET ISOLATION TO CURSOR STABILITY;
CURSOR STABILITY
With CURSOR STABILITY, a shared lock is acquired on each row as it is read by a
cursor. This shared lock is held until the next row is retrieved. If data is retrieved by
using a cursor, the shared lock is held until the next FETCH is executed.
At this level, not only can you look at committed rows, but you are assured the row
continues to exist while you are looking at it. No other process (UPDATE or DELETE)
can change that row while you are looking at it.
You can use SELECT statements that use an isolation level of CURSOR STABILITY
for:
• Lookups
• Queries
• Reports yielding operational data
Example
A SELECT statement that uses CURSOR STABILITY is useful for detail-type reports
like price quotes or job-tracking systems.
If the isolation level of CURSOR STABILITY is set and a cursor is not used, CURSOR
STABILITY behaves in the same manner as READ COMMITTED (the shared lock is
never placed).
(Slide graphic: locks are put on all rows examined.)
COMMITTED READ LAST COMMITTED
• SQL statements:
SET ISOLATION TO COMMITTED READ LAST COMMITTED;
SET ENVIRONMENT USELASTCOMMITTED 'ALL|COMMITTED READ|DIRTY READ|NONE';
(Slide graphic: another session holds an uncommitted update: BEGIN WORK; UPDATE t1 SET c2="JOHN" WHERE c1=3;)
Lock types
Informix supports three types of locks:
• Shared locks prevent other processes from updating the locked object. Other
users maintain read access to the object. Multiple shared locks can be placed on a
single object.
• Exclusive locks are automatically placed by the database server on any object
actively being modified. Exclusive locks prevent all reads except dirty reads. A
DBA or user can also explicitly request an exclusive lock on a database or table
to perform administrative or batch operations.
• Promotable or update locks are placed on objects that are retrieved for update
but are not yet being updated. They prevent other users from acquiring exclusive
or promotable locks on the object. As an example, when you open a cursor with
the FOR UPDATE clause, the database server acquires an update lock on each
row fetched. The lock is promoted to an exclusive lock when the UPDATE
WHERE CURRENT statement is executed.
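To illustrate this promotion (a sketch only; the cursor name and the WHERE condition are
hypothetical, and in practice the FOR UPDATE cursor is declared in an ESQL/C or SPL program):
DECLARE ord_curs CURSOR FOR
SELECT order_num, ship_instruct
FROM orders
WHERE paid_date IS NULL
FOR UPDATE;
-- each FETCH through ord_curs places an update (promotable) lock on the fetched row
UPDATE orders
SET ship_instruct = "express"
WHERE CURRENT OF ord_curs;
-- the update lock on the current row is promoted to an exclusive lock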
Database-level locking
(Slide graphic: the stores database locked exclusively.)
Database-level locking
Locking the entire database in this way can be useful if you are:
• Executing many updates that involve many tables.
• Archiving the database files for backups.
• Altering the structure of the database.
You can lock the entire database by using the DATABASE statement with the
EXCLUSIVE option. The EXCLUSIVE option opens the database in an exclusive mode
and allows only the current user access to the database.
To allow other users access to the database, you must execute the CLOSE
DATABASE statement and then reopen the database.
Users with any level of database permission can open the database in exclusive mode.
Doing so does not give them any greater level of access than they normally have.
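For example (a sketch, assuming the stores_demo demonstration database):
DATABASE stores_demo EXCLUSIVE;
-- perform the administrative or batch work here
CLOSE DATABASE;
DATABASE stores_demo;
-- reopening without EXCLUSIVE lets other users access the database again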
Table-level locking
(Slide graphics: with a shared lock on the customer table, others can SELECT from the
table, but cannot INSERT, UPDATE, or DELETE; with an exclusive lock on the customer
table, others cannot SELECT from the table, nor can they INSERT, UPDATE, or DELETE.)
Unlocking a table
(Slide graphic: the customer table with no lock; SELECT, UPDATE, DELETE, and INSERT are all allowed.)
Unlocking a table
When a database is logged, table locks are automatically released when the
transaction commits. When the database does not use logging, transactions are not
supported and the table lock persists until the process completes or until you execute
an explicit UNLOCK TABLE statement.
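A minimal sketch for a logged database (LOCK TABLE must be issued inside a transaction in
that case, and the lock is released when the transaction commits):
BEGIN WORK;
LOCK TABLE customer IN EXCLUSIVE MODE;
-- batch updates against the customer table go here
COMMIT WORK;
-- in a non-logging database you would instead release the lock explicitly:
-- UNLOCK TABLE customer;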
Deadlock detection
(Slide graphic: Process A and Process B requesting locks on the same resources.)
Deadlock detection
In some databases, when multiple users request locks on the same resources,
deadlocks can occur. Deadlocks are serious problems, as they can halt a major portion
of the activity in a database system.
Informix has a built-in, sophisticated mechanism that detects potential deadlocks and
prevents them from happening. To prevent local deadlocks from occurring, the
database server maintains a list of locks for every user on the system. Before a lock is
granted, the lock list for each user is examined. If a lock is currently held on the
resource that the process wants to lock, the owner of that lock is identified and that
owner's lock list is traversed to see whether the owner is waiting on any lock held by the
user who wants the new lock. If so, the deadlock is detected at that point and an error
message is returned to the user who requested the lock.
The ISAM error code returned is:
-143 ISAM error: deadlock detected
Deadlocks cannot occur when the isolation level is set to COMMITTED READ LAST
COMMITTED.
B+ tree scanner
(Slide graphic: an index key page and the B+ tree scanner pool.)
1. When a deleted item is committed, its page number is put in the B+ tree scanner pool.
2. The btscanner thread reads the pool occasionally and removes all committed deleted
items on each page it finds.
Row versioning
• Used to determine if row changed:
Can help detect collisions in ER
Can be used to ensure application not updating stale data
• Creates two shadow columns:
ifx_insert_checksum
− Generated on initial insert
− Never changes
ifx_row_version
− Incremented each time row updated
• Not required, but improves performance of redirected writes
Row versioning
Row versioning is a feature that allows an application to check that it is not updating a
stale row. It is also used by Informix Enterprise Replication to help detect collisions in
data rows.
When row versioning is enabled, two shadow columns are created:
ifx_insert_checksum: A checksum generated when the row is initially inserted. This
value never changes.
ifx_row_version: This value is incremented each time the row is updated.
These columns are not visible, and are not returned with a SELECT * from the table. In
order to view their values, they must be explicitly listed in the SELECT list.
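For example (a sketch, assuming a table tab1 that was created or altered WITH VERCOLS):
SELECT col1, ifx_insert_checksum, ifx_row_version
FROM tab1;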
While row versioning is not required, it can help improve performance in redirected
writes, since only the checksum and version values are checked instead of comparing
the entire row.
Managing versioning
• Can be created when table created:
CREATE TABLE tab1 (
col1 INTEGER,
...
) WITH VERCOLS;
• Can be added/dropped using ALTER TABLE:
ALTER TABLE tab1 ADD VERCOLS;
ALTER TABLE tab1 DROP VERCOLS;
Managing versioning
Row versioning is implemented using the VERCOLS keyword.
It can be implemented when the table is created, or later using the ALTER TABLE
statement, as shown in the visual.
Exercise 13
Concurrency control
• Use isolation/transaction levels and lock modes to control the
effects of SQL statements
Exercise 13:
Concurrency control
Purpose:
In this exercise, you will learn how to use concurrency control by setting
isolation / transaction levels and lock modes.
Exercise 13:
Concurrency control - Solutions
Purpose:
In this exercise, you will learn how to use concurrency control by setting
isolation / transaction levels and lock modes.
BEGIN WORK;
SET TRANSACTION READ ONLY;
INSERT INTO customer (customer_num, fname, lname,
company, address1, city, state, zipcode)
VALUES ( 0, "Henry", "Smith", "Smith Studios",
"12345 Barney Road", "St. Bob", "CA", "90123");
What happened?
This results in the following error:
878: Invalid operation for a READ-ONLY transaction.
The statement failed because the transaction was set to READ-ONLY.
READ-ONLY ensures that data can only be read and not altered.
BEGIN WORK;
INSERT INTO customer (customer_num, fname, lname,
company, address1, city, state, zipcode)
VALUES ( 0, "Patrick", "Edmonds", "Edmonds Goods",
"789 West Stuckey Ave", "Central", "IL", "60580");
3. In the second session, run the following SELECT statement:
SET ISOLATION TO DIRTY READ;
SELECT * FROM customer
WHERE fname LIKE "Pat%"
AND lname LIKE "Edmonds%";
Why did the newly inserted customer display?
With DIRTY READ you can retrieve rows being inserted before the
transaction has been committed. These are sometimes called phantom
rows.
4. In the first session, rollback your transaction.
ROLLBACK WORK;
5. In the second session, run the query again. Notice that the customer "Patrick
Edmonds" is no longer in the customer table.
BEGIN WORK;
INSERT INTO STOCK(stock_num, manu_code, description,
unit_price, unit, unit_descr)
VALUES (0,"ANZ", "golf balls", 20.00, "case",
"12/case");
3. To find the locks being held on your table, you will need to know the partnum of
the stock table.
In the second session, run the following query:
SELECT tabname, hex(partnum)
FROM systables
WHERE tabname = "stock";
4. Open a new Informix Server (putty) window (login = docker/tcuser; run docker
exec -it iif_developer_edition bash), and run the following command:
$ onstat -K
Find your partnum in the tblsnum column of the onstat -k output. Notice the X
in the type column. This indicates an exclusive lock is being held.
BEGIN WORK;
UPDATE customer
SET lname = "Smith"
WHERE customer_num = 125;
To find the locks being held on your table, you will need to know the partnum of
the manufact table.
2. In the second session window, run the following query:
SELECT tabname, hex(partnum)
FROM systables
WHERE tabname = "manufact";
3. Run the following command to find the locks currently held by the query in
Step 4.1:
$ onstat -K
Notice that there is an IS type lock (intent-shared lock) on the entire table (rowid
= 0 is a table lock). An S type lock (or shared lock) is being held on row 105.
5. Run the following command to find the additional locks currently held for the
manufact table:
$ onstat -K
Notice the additional IX type lock, or intent-exclusive. This lock is for the session
trying to delete the manu_code HRO. The delete session will wait until the
other locks have been released before executing.
This session runs part way, but is waiting on the release of the
exclusive lock held by the first session running the update transaction.
3. In the third session window, run the following command to find the locks
currently held in the above steps:
$ onstat -K
What happened and why?
Unit summary
• Use the different concurrency controls
• Monitor the concurrency controls for lock usage
• Use the Retain Update Lock feature
Unit summary
Data security
Informix (v12.10)
Unit objectives
• Use the database, table, and column-level privileges
• Use the GRANT and REVOKE statements
• Use role-based authorization
Unit objectives
(Slide graphic: privileges can apply at the database, table, view, and column level, and
to table fragments in different dbspaces, such as dbs1 and dbs2.)
Database-level privileges
• The three levels of database access are:
Connect
Resource
DBA
Database-level privileges
Informix organizes database privileges into three levels of access.
CONNECT
The CONNECT privilege allows you to connect to a database, create temporary tables
and indexes, create views and synonyms, grant permission to objects that you create
and own, and drop or alter any objects that you own. You cannot create permanent
tables and indexes.
RESOURCE
The RESOURCE privilege gives you all CONNECT privileges and the ability to create
permanent tables and indexes, stored procedures, and functions.
DBA
A user with DBA privilege has full access to the database. The only restriction placed
on users with DBA status is the inability to revoke the DBA privilege from themselves.
However, a user with DBA status can grant the privilege to another user who can then
revoke it from the grantor.
Default privileges
• Database level:
When you create a database, you are automatically granted DBA privileges
• Table level:
Non-ANSI databases:
− All table-level privileges except ALTER and REFERENCES granted to all users
− Can use environment variable NODEFDAC to grant no privileges
MODE ANSI databases:
− No default privileges granted
Default privileges
Informix grants certain default database and table-level privileges.
• Default database-level privileges: If you want to allow other users to access the
database, you must grant them CONNECT, RESOURCE, or DBA privileges.
• Default table-level privileges: For non-ANSI databases, table-level privileges are
automatically granted to the public when a table is created. For ANSI-compliant
databases, table-level privileges are not automatically granted; you must explicitly
grant privileges to specific users or to public.
Environment variable NODEFDAC
To prevent the granting of default table-level privileges in a non-ANSI database, set the
NODEFDAC environment variable to YES before you execute the CREATE TABLE
statement.
PUBLIC The keyword that you use to specify access privileges for
all users.
user-list A list of login names for the users to whom you are
granting access privileges. You can enter one or more
names, separated by commas.
In the first example, the CONNECT privilege is granted to all users (PUBLIC). In the
second example, the RESOURCE privilege is granted only to the user maria and the
user joe. In the third example, janet is given DBA privilege.
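The statements described here might look like this (a sketch; only the user names are
taken from the text above):
GRANT CONNECT TO PUBLIC;
GRANT RESOURCE TO maria, joe;
GRANT DBA TO janet;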
PUBLIC The keyword that you use to specify access privileges for all
users.
user-list A list of login names for the users to whom you are granting
access privileges. You can enter one or more names,
separated by commas.
If you revoke the DBA or RESOURCE privilege from one or more users, they are left
with the CONNECT privilege. To revoke all database privileges from users with DBA or
RESOURCE status, you must revoke CONNECT as well as DBA or RESOURCE.
In the first example, the CONNECT privilege is revoked from mike.
In the second example, the RESOURCE privilege is revoked from the user maria. That
user now has the CONNECT privilege.
Even though CONNECT has been revoked from user mike in this example, remember
that CONNECT TO PUBLIC was granted in the previous slide. Since mike is always a
member of PUBLIC, this statement has no effect unless CONNECT was only granted
to specific users.
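The statements described here might look like this (a sketch):
REVOKE CONNECT FROM mike;
REVOKE RESOURCE FROM maria;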
table or view The name of the table or view for which you are granting
access privileges.
PUBLIC The keyword that you use to specify access privileges for all
users.
user-list A list of login names for the users to whom you are granting
access privileges. You can enter one or more names,
separated by commas.
WITH GRANT Allows the user or users listed in the GRANT statement to
OPTION grant the same privileges to other users.
In the first example shown above, all privileges are granted to all users (PUBLIC) on the
customer table.
In the second example, liz is given update permissions on the order table with the ability
to give that permission to other users.
The third example has the INSERT and DELETE privileges granted to the user mike by
maria.
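The statements described here might look like this (a sketch; the orders table is assumed
for the second and third examples, and the third statement would be executed while
connected as maria):
GRANT ALL ON customer TO PUBLIC;
GRANT UPDATE ON orders TO liz WITH GRANT OPTION;
GRANT INSERT, DELETE ON orders TO mike;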
table or view The name of the table or view for which you are granting
access privileges.
PUBLIC The keyword that you use to specify access privileges for all
users.
user-list A list of login names for the users to whom you are granting
access privileges. You can enter one or more names,
separated by commas.
Although you can grant UPDATE and SELECT privileges for specific columns, you
cannot revoke these privileges column by column. If you revoke UPDATE or SELECT
privileges from a user, all UPDATE and SELECT privileges that you have granted to
that user are revoked.
In the first example shown above, all privileges are revoked from all users (PUBLIC) on
the orders table.
In the second example, the DELETE and UPDATE privileges are revoked from the
users mike and maria.
The third example shows INSERT and UPDATE permissions revoked from mike by
maria.
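The statements described here might look like this (a sketch; the orders table is named
only in the first example and assumed in the others, and the third statement would be
executed while connected as maria):
REVOKE ALL ON orders FROM PUBLIC;
REVOKE DELETE, UPDATE ON orders FROM mike, maria;
REVOKE INSERT, UPDATE ON orders FROM mike;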
Routine privileges
• Examples:
Routine privileges
When you create a user-defined routine, either a stored procedure or a function, you
own and are automatically granted execute privileges for that routine. The execute
privilege allows you to issue an EXECUTE PROCEDURE or EXECUTE FUNCTION
statement for the routine. If you want to allow other users to execute the routine, you
must grant them the EXECUTE privilege by using the GRANT EXECUTE ON
statement.
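For example (a sketch; the routine name discount matches the procedure used later in this
course, and the user name is hypothetical):
GRANT EXECUTE ON PROCEDURE discount TO liz;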
DataBlade privileges
• Examples:
DataBlade privileges
DataBlades are a set of user-defined routines bundled into a shared library.
In the example, if the DBA has the IFX_EXTEND_ROLE configuration parameter set to
1 (on), the GRANT statement allows the user to create or drop a user-defined routine
(UDR) that has the EXTERNAL clause. The REVOKE statement does not allow the
users to do any manipulation of UDRs that have the EXTERNAL clause.
If the DBA leaves IFX_EXTEND_ROLE set to 0 (off), then no restrictions are applied to
which users can manipulate UDRs.
When you grant the EXTEND role to a specific user, that user is automatically granted
EXECUTE permission on the sysbldprepare UDR, and the sysroleauth system catalog
table is updated to reflect the new built-in role.
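For example (a sketch; the user name is hypothetical):
GRANT EXTEND TO joe;
REVOKE EXTEND FROM joe;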
Roles
(Slide graphic: the sales role containing the users of the marketing and salesadmin roles.)
Roles
A role is a group of users who can be granted security privileges. Roles make the job of
administering security easier. Once a user is assigned to a role, the system
administrator needs only to grant and revoke table and column privileges to the role.
You can nest roles within other roles.
The example shows users that are part of the marketing role, and users that are part of
the salesadmin role. However, all users are also part of the sales role.
Creating roles
• Examples:
Creating roles
The CREATE ROLE statement creates a role. The statement effectively puts an entry
in the sysusers system catalog table where the user type is G. The role name must be
less than or equal to 32 characters. All user and role names on the system must be
unique. For example, if you have a user name gus that can connect to the database
server, you cannot create a role called gus. In order to enforce this rule, the following
checks are in place:
• The CREATE ROLE statement checks to make sure that the role name is not
present in the password file.
• A user is not able to connect if the user name is created as a role name.
The CREATE ROLE statement can only be executed by a user who has DBA
permissions on the database. A role is a database object, meaning that it is only
applicable for the database in which it is created.
Once the role is created, the next step is to assign users to roles. The GRANT
statement assigns one or more users to the role specified. You can also assign one role
to another, as shown in the example. A successful GRANT statement puts an entry in
the sysroleauth system catalog table.
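For example (a sketch using role and user names that also appear in the discussion later
in this unit):
CREATE ROLE mkting;
CREATE ROLE sales;
GRANT mkting TO jim, mary;
GRANT sales TO mkting;
-- a role (mkting) assigned to another role (sales)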
Using roles (1 of 2)
• Examples:
Using roles
Table and column-level privileges can be assigned to roles by using the GRANT
statement. However, database privileges cannot be assigned to roles.
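For example (a sketch using the sales and slsadmin roles; the column chosen for the
column-level grant is illustrative):
GRANT SELECT ON orders TO sales;
GRANT UPDATE (ship_instruct) ON orders TO slsadmin;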
Using roles (2 of 2)
• A user can either inherit a default role, or specify a role to use in their
session:
Default roles are assigned by the DBA using the GRANT ROLE statement:
GRANT DEFAULT ROLE slsadmin TO liz;
A user can set their own role through the SET ROLE SQL statement:
SET ROLE slsadmin;
SET ROLE DEFAULT;
• Default roles can be granted to PUBLIC:
GRANT DEFAULT ROLE slsadmin TO PUBLIC;
• Default roles can be revoked with the REVOKE statement:
REVOKE DEFAULT ROLE FROM ram;
Before a user can gain access to ROLE permissions, they have to either inherit them
through default roles or put themselves into the ROLE through the SET ROLE
statement. Default roles can be granted and revoked from users through the GRANT
and REVOKE SQL statements as described in the example.
If different default roles are assigned to a user and to PUBLIC, the default role of the
user takes precedence. If a default role is not assigned to a user, the user only has
individually granted and public privileges.
Discussion
Assume that the DBA executes the following statements:
CREATE ROLE mkting;
CREATE ROLE slsadmin;
CREATE ROLE sales;
GRANT mkting to jim, mary, ram;
GRANT slsadmin to andy, liz, sam;
GRANT sales to mkting, slsadmin;
REVOKE ALL on orders from public;
GRANT select ON orders TO sales;
GRANT insert, update, delete ON orders to slsadmin;
The following statements are run by user mary. Which statements will fail? Why?
SELECT * FROM orders;
SET ROLE mkting;
SELECT * FROM orders;
Discussion
• The orders table is fragmented so that orders for customer numbers 1 -
10,000 are in dbspace1 and orders for customer numbers 10,001 -
20,000 are in dbspace2.
• Given the GRANT and REVOKE FRAGMENT statements on the
previous page, which of these statements would fail (if executed by
user1)?
INSERT INTO orders (cust_nbr) VALUES (100);
SELECT * FROM orders;
UPDATE orders SET cust_nbr = 12200
WHERE cust_nbr = 220;
Discussion
The INSERT statement shown in the example succeeds because user1 has INSERT
permissions into the fragment in dbspace1.
The SELECT statement succeeds because user1 has SELECT permissions on the
table (fragment permissions are only for INSERT, UPDATE, and DELETE statements).
The UPDATE statement fails because user1 does not have UPDATE permissions for
the fragment in dbspace2. The user requires UPDATE permissions both on the fragment
the row is moving from and on the fragment it is moving to.
Exercise 14
Data security
• assign and revoke privileges at the user and role levels
Exercise 14:
Data security
Purpose:
In this exercise, you will learn how to use the built-in features of Informix that
control data security.
Exercise 14:
Data security - Solutions
Purpose:
In this exercise, you will learn how to use the built-in features of Informix that
control data security.
What happened?
The update failed because sam only has update privileges on the
manu_code column and cannot select the order_num column.
272: No SELECT permission for items.order_num.
Task 2. Using GRANT and REVOKE statements.
In this task, you will revise the privileges previously assigned to the other users
using the GRANT and REVOKE statements.
1. In your first session:
User jane no longer needs to select from or insert into the items table.
• Execute the SQL statement needed to change jane’s access privileges.
REVOKE SELECT, INSERT ON items FROM jane;
User joe needs to select from only the order_num and total_price columns of
the items table.
• Execute the SQL statement needed to change joe’s access privileges.
GRANT SELECT (order_num, total_price) ON items TO joe;
2. In your second session:
• Connect to your database as user joe (password joe).
• Select all rows from the items table.
SELECT * FROM items;
What happened?
Only the columns granted SELECT privileges (order_num, total_price) are
returned to the user.
Unit summary
• Use the database, table, and column-level privileges
• Use the GRANT and REVOKE statements
• Use role-based authorization
Unit summary
Views
Informix (v12.10)
Unit objectives
• Create views
• Use views to present derived and aggregate data
• Use views to hide joins from users
Unit objectives
What is a view?
• A virtual table
• A dynamic window
(Slide graphic: a view acting as a window on the data.)
What is a view?
A view is often called a virtual table. As far as the user is concerned, it acts like an
ordinary table. But in fact, a view has no existence in its own right. Rather, it is
derived from columns in real tables.
A view can also be called a dynamic window on your database.
A view can calculate the results of a computation like sum (total_price). Yet as
individual prices change, the value of the calculated sum is always up to date.
Creating a view
CREATE VIEW ordsummary AS
SELECT order_num, customer_num, ship_date
FROM orders;
Creating a view
The CREATE VIEW statement consists of a CREATE VIEW clause and a SELECT
statement.
You can also give names to the columns in a view by listing them in parentheses after
the view name. If you do not assign names, the view uses the names of columns in the
underlying table.
Exceptions to syntax
Follow normal rules for writing the SELECT statement, except the following syntax is
prohibited:
• FIRST
• ORDER BY
• INTO TEMP
In the first example, the view ordsummary has three columns. They are given the same
names as the columns in the orders table.
In the second example, the view they_owe also has three columns. However, the
column names for the view differ from the column names in the orders table. They are
called ordno, orddate, and cnum instead of order_num, order_date, and
customer_num. In addition, the view they_owe shows only certain rows of the orders
table where paid_date is NULL.
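Based on that description, the second view might have been defined like this (a sketch):
CREATE VIEW they_owe (ordno, orddate, cnum) AS
SELECT order_num, order_date, customer_num
FROM orders
WHERE paid_date IS NULL;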
Dropping a view
• Example:
Dropping a view
The DROP VIEW command allows you to remove a view from your database.
When you drop a view, no data is deleted. The underlying tables remain intact.
You cannot ALTER a view. To change a view, you must first remove the view by using
DROP VIEW, then recreate it with CREATE VIEW.
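For example, to remove the view created earlier in this unit:
DROP VIEW ordsummary;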
Restrictions on views
• You cannot create indexes on a view
• A view depends on its underlying tables
• Some views restrict inserts, updates, and deletes
• You must have full SELECT privileges on all columns
Restrictions on views
Several restrictions are imposed on views:
• You cannot create indexes on a view. However, when querying, you do receive
the benefit of existing indexes on columns in the underlying tables.
• A view depends on its underlying tables (and views). If you drop a table, all views
derived from that table are automatically dropped. If you drop a view, any views
derived from that view are automatically dropped.
• Some views restrict inserts, updates, and deletes. These restrictions are
described on the next page.
• You must have full SELECT privileges on all columns in order to create a view on a
table.
The example shows what could happen when the WITH CHECK OPTION clause is not
used:
• A user inserts a row through the view no_check.
• A moment later, the user runs the following:
SELECT * FROM no_check;
• The newly added row does not show up in the output.
How do you determine whether the soccer ball was successfully entered into the
database?
If the user was using yes_check instead of no_check, then the INSERT would have
been rejected with an error message (data value out of range).
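The two views discussed here might look something like this sketch (the stock table, the
unit_price column, and the price limit are assumptions chosen only to illustrate the
WITH CHECK OPTION clause):
CREATE VIEW no_check AS
SELECT * FROM stock
WHERE unit_price < 100.00;
CREATE VIEW yes_check AS
SELECT * FROM stock
WHERE unit_price < 100.00
WITH CHECK OPTION;
With yes_check, an INSERT of a row whose unit_price does not satisfy the WHERE clause is
rejected; with no_check, the same INSERT succeeds silently and the row simply never
appears when you select from the view.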
Exercise 15
Views
• create simple views
• create complex views
Exercise 15:
Views
Purpose:
In this exercise, you will learn how to create simple and more complex views
and how you can do data validation using views.
Exercise 15:
Views - Solutions
Purpose:
In this exercise, you will learn how to create simple and more complex views
and how you can do data validation using views.
6. Create another view called customer_restrict that will only return companies
with “Golf” in the company name.
CREATE VIEW customer_restrict
(first, last, city, st, company) AS
SELECT fname, lname, city, state, company
FROM customer
WHERE company MATCHES "*Golf*";
7. Select from the customer_restrict view.
SELECT * FROM customer_restrict;
Task 2. Creating a view with two tables.
In this task, you will be creating a view that matches customers with orders that
have been placed.
1. Create a view that will match up each customer with the orders he or she has
placed. Include only the orders that have not been shipped yet (ship_date is
NULL). Display the following information:
• Customer number
• Company name
• Order number
• Order date
• Date paid
CREATE VIEW order_view AS
SELECT c.customer_num, company,
order_num, order_date, paid_date
FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND ship_date IS NULL;
The view only allows rows with a po_num starting with a B to be entered into the
table.
Results:
In this exercise, you learned how to create simple and more complex views
and how you can do data validation using views.
Unit summary
• Create views
• Use views to present derived and aggregate data
• Use views to hide joins from users
Unit summary
Introduction to stored
procedures
Informix (v12.10)
Unit objectives
• Explain the purpose of stored procedures
• Explain advantages of using stored procedures
Unit objectives
(Slide graphic: the application passes the SQL statement to the database server, where it
is parsed, optimized, and executed.)
• Using the extra level of security that a stored procedure provides, you can use
stored procedures to enforce business rules. For example, you can prohibit users
from deleting a row without first storing it in an archive table by writing a stored
procedure to accomplish both tasks and prohibit users from directly accessing the
table.
• Different programs requiring use of the same code can execute a stored
procedure rather than having the same code included in each program. The code
is stored in only one place, eliminating duplicate code.
• Stored procedures are especially helpful in a client/server environment. If a
change is made to application code, it must be distributed to every client. If a
change is made to a stored procedure, it resides in only one location.
• Instead of centralizing database code in applications, you move this code to the
database server. This allows applications to concentrate on user interface
interaction. This is especially important if there are multiple types of user
interfaces required.
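As a simple illustration of what a stored procedure looks like (a sketch only; the name
matches the discount procedure used in the next exercise, but the discount rule shown
here is an assumption):
CREATE PROCEDURE discount (mcode CHAR(3), price MONEY(8,2))
RETURNING MONEY(8,2);
DEFINE net MONEY(8,2);
-- apply a 10 percent discount for one (hypothetical) manufacturer code
IF mcode = "HSK" THEN
LET net = price * 0.90;
ELSE
LET net = price;
END IF
RETURN net;
END PROCEDURE;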
Exercise 16
Introduction to stored procedures
• Create a stored procedure
Exercise 16:
Introduction to stored procedures
Purpose:
In this exercise, you will learn how to create a stored procedure, execute it
explicitly and how to use it in a SELECT statement.
2. Execute the procedure you just wrote using sample manufacturer’s codes and
prices.
3. Write a SELECT statement to select all rows in the items table for order
numbers 1047, 1062, 1065 and 1080. Display the following information sorted
by order and item number:
• Order number
• Item number
• Manufacturer code
• Original Item price
• Discounted price
Results:
In this exercise, you learned how to create a stored procedure, execute it
explicitly and how to use it in a SELECT statement.
Exercise 16:
Introduction to stored procedures - Solutions
Purpose:
In this exercise, you will learn how to create a stored procedure, execute it
explicitly and how to use it in a SELECT statement.
2. Execute the procedure you just wrote using sample manufacturer’s codes and
prices.
EXECUTE PROCEDURE discount ("HSK", 1000);
EXECUTE PROCEDURE discount ("NRG", 1000);
EXECUTE PROCEDURE discount ("PRO", 1000);
3. Write a SELECT statement to select all rows in the items table for order
numbers 1047, 1062, 1065 and 1080. Display the following information sorted
by order and item number:
• Order number
• Item number
• Manufacturer code
• Original Item price
• Discounted price
SELECT order_num, item_num, manu_code, total_price,
discount(manu_code, total_price) AS net_price
FROM items
WHERE order_num IN (1047, 1062, 1065, 1080)
ORDER BY order_num, item_num;
Results:
In this exercise, you learned how to create a stored procedure, execute it
explicitly, and use it in a SELECT statement.
Unit summary
• Explain the purpose of stored procedures
• Explain advantages of using stored procedures
Unit summary
Triggers
Informix (v12.10)
Unit objectives
• Create and execute a trigger
• Drop a trigger
• Use the system catalogs to access trigger information
Unit objectives
What is a trigger?
(Slide graphic: a trigger maps an EVENT on a table (INSERT, UPDATE, DELETE, or
SELECT) to an ACTION (INSERT, UPDATE, DELETE, or EXECUTE PROCEDURE).)
What is a trigger?
A trigger is a database mechanism that automatically executes an SQL statement when
a certain event occurs. The event that can trigger an action can be an INSERT,
UPDATE, DELETE, or a SELECT statement on a specific table. The statement that
triggers an action can specify either a table, or one or more columns within the table.
The table on which the trigger event operates is called the triggering table.
When the trigger event occurs, the trigger action is executed. The action can be any
combination of one or more INSERT, UPDATE, DELETE, or EXECUTE PROCEDURE
statements.
Triggers are a feature of the database server, so the type of application tool you use to
access the database is irrelevant in the execution of a trigger. By invoking triggers from
the database, a DBA can ensure that data is treated consistently across application
tools and programs. Triggers are frequently used with stored procedures, since the
SQL statement that a trigger executes can be an EXECUTE PROCEDURE statement.
CREATE TRIGGER
CREATE TRIGGER trigger_name ...
CREATE TRIGGER
The CREATE TRIGGER statement is used to create a trigger in the database.
Trigger events
• Trigger events:
INSERT
DELETE
UPDATE
SELECT
• Can define multiple triggers for same event on table
• Can define multiple INSTEAD OF triggers for same event on same
view
Trigger events
The trigger event can be an INSERT, UPDATE, SELECT, or DELETE SQL statement.
• You can define multiple triggers for INSERT, DELETE, UPDATE, and SELECT
types of triggering events on the same table.
• You can define multiple INSTEAD OF triggers for INSERT, DELETE, and
UPDATE types of triggering events on the same view.
Remote tables
The table specified by the trigger event must be a table in the current database. You
cannot specify a remote table.
Trigger action
• The trigger action specifies the action that should occur and when it
should occur:
Execute before rows are processed:
− BEFORE (define action here)
Execute after each row is processed:
− FOR EACH ROW (define action here)
Execute after all rows are processed:
− AFTER (define action here)
• The action can be either a SQL statement or a stored procedure.
Trigger action
Trigger actions are executed at the following times:
• Before the trigger event occurs: The BEFORE triggered action list executes once
before the trigger event executes. Even if no rows are processed by the trigger
event, the BEFORE trigger actions are still executed.
• After each row is processed by the trigger event: The FOR EACH ROW trigger
action occurs once after each row is processed by the trigger event.
• After the trigger event completes: The AFTER triggered action list executes once
after the trigger event executes. If no rows are processed by the triggering
statement, the AFTER triggered action list is still executed.
You cannot reference the triggering table in any of the trigger action SQL statements.
Exceptions include an UPDATE statement that updates columns not listed in the
triggering table, and SELECT statements in a subquery or stored procedure.
REFERENCING example
• Use the REFERENCING clause to provide before- and after- values to
the action:
CREATE TRIGGER salary_upd
UPDATE OF salary ON employee
REFERENCING NEW AS post OLD AS pre
FOR EACH ROW
(INSERT INTO salary_audit
(update_dtime, whodunit, old_salary, new_salary)
VALUES (CURRENT, USER, pre.salary, post.salary));
CREATE TRIGGER items_upd
UPDATE OF total_price ON items
REFERENCING NEW AS post OLD AS pre
FOR EACH ROW
(UPDATE orders
SET order_price = order_price +
post.total_price - pre.total_price
WHERE order_num = post.order_num);
REFERENCING example
The salary_upd trigger shown is an example of how to insert a row into an audit table
whenever the salary of an employee is changed.
The items_upd trigger is an example of how to update the derived value order_price in
the orders table whenever the price of the items in the order has changed.
The NEW and OLD correlation values are needed for the actions in both triggers.
Please note: These examples are for illustration purposes only. The column
order_price does not exist in the demonstration database, nor does the employee table.
Cascading triggers
CREATE TRIGGER del_cust --cascading delete
DELETE ON customer
REFERENCING OLD AS pre_del
FOR EACH ROW (
DELETE FROM orders
WHERE customer_num = pre_del.customer_num,
DELETE FROM cust_calls
WHERE customer_num =
pre_del.customer_num);
CREATE TRIGGER del_orders
DELETE ON orders
REFERENCING OLD AS pre_del
FOR EACH ROW (
DELETE FROM items
WHERE order_num = pre_del.order_num);
Cascading triggers
Executing one trigger can cause another trigger to be executed, as shown in the
example. Deleting a customer row causes the del_cust trigger to execute. The del_cust
trigger deletes a row from the orders table, which in turn triggers the del_orders trigger.
When these triggers complete, the DELETE statements are executed in this order:
DELETE customer
DELETE orders
DELETE items
DELETE cust_calls
This technique was frequently used before cascading deletes became a feature of the
CREATE TABLE statement. Cascading deletes make it possible to define a referential
constraint in which the database server automatically deletes child rows when a parent
row is deleted.
Cascading is not supported with ON SELECT triggers.
You can place a comment within a trigger on a line by prefixing it with two dashes (--), as
shown in the example. You can also include a comment by enclosing it between two
braces ({}). The use of two dashes is the ANSI-compliant method of introducing a
comment.
Multiple triggers
• Multiple triggers on a table can include the same or different columns
Multiple triggers
Multiple triggers on a table can include the same or different columns.
In the example, update trigger trig3 on the items table includes stock_num in its column
list, which is also a triggering column in trig1.
The following example shows a triggering statement for the update trigger:
UPDATE taba SET (b, c) = (b + 1, c + 1);
Trig1 for columns a and c executes first, and trig2 for columns b and d executes next. In
this case, the lowest column number in the two triggers is column 1 (a), and the next is
column 2 (b).
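The two triggers on taba described here might be defined like this (a sketch; only the
triggering columns come from the text above, and the trigger actions and the log_tab
table are assumptions):
CREATE TRIGGER trig1
UPDATE OF a, c ON taba
AFTER (INSERT INTO log_tab VALUES ("trig1", CURRENT));
CREATE TRIGGER trig2
UPDATE OF b, d ON taba
AFTER (INSERT INTO log_tab VALUES ("trig2", CURRENT));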
Trigger procedures
• SPL routine which can only be invoked from FOR EACH ROW section
• Must include WITH TRIGGER REFERENCES when using EXECUTE
PROCEDURE statement to invoke trigger
CREATE TRIGGER ins_trig_tab1 INSERT ON tab1
REFERENCING NEW AS post
FOR EACH ROW
(EXECUTE PROCEDURE proc1()
WITH TRIGGER REFERENCES);
Trigger procedures
A trigger procedure is an SPL routine that EXECUTE PROCEDURE can invoke only
from the FOR EACH ROW section of the action clause of a trigger definition.
You must include the WITH TRIGGER REFERENCES keywords when you use the
EXECUTE PROCEDURE statement to invoke a trigger procedure.
Such procedures must include the REFERENCING clause and the FOR clause in the
CREATE PROCEDURE statement that defined the procedure.
This REFERENCING clause declares names for correlated variables that the
procedure can use to reference the old column value in the row when the trigger event
occurred, or the new value of the column after the row was modified by the trigger.
The FOR clause specifies the table or view on which the trigger is defined.
The following statement defines an insert trigger on tab1 that calls proc1 from the FOR
EACH ROW section as its triggered action, and performs an INSERT operation that
activates this trigger:
CREATE TRIGGER ins_trig_tab1 INSERT ON tab1
REFERENCING NEW AS post
FOR EACH ROW(EXECUTE PROCEDURE proc1() WITH TRIGGER REFERENCES);
CREATE PROCEDURE proc1()
REFERENCING OLD AS o NEW AS n FOR tab1;
{ The correlation names above are assumed; the INSERTING and UPDATING
  branches of the procedure are omitted from this excerpt. }
IF (SELECTING) THEN
INSERT INTO temptab1 VALUES(o.col1,0,o.col2,0);
END IF
IF (DELETING) THEN
DELETE FROM temptab1 WHERE temptab1.col1 = o.col1;
END IF
END PROCEDURE;
This trigger procedure illustrates that the triggered action can be a different DML
operation from the triggering event, the insert trigger from the previous visual.
This procedure inserts a row when an insert trigger calls it and deletes a row when a
delete trigger calls it. It also performs INSERT operations if it is called by a select trigger
or by an update trigger.
The proc1 trigger procedure in this example uses Boolean conditional operators that
are valid only in trigger routines.
The INSERTING operator returns true only if the procedure is called from the FOR
EACH ROW action of an INSERT trigger. This procedure can also be called from other
triggers whose trigger event is an UPDATE, SELECT, or DELETE statement because
the UPDATING, SELECTING and DELETING operators return true if the procedure is
invoked in the triggered action of the corresponding type of triggering event.
The REFERENCING clause of the trigger declares a correlation name for the NEW
value that is different from the correlation name that the trigger procedure declared.
These names do not need to match, because the correlation name that was declared in
the trigger procedure has that procedure as its scope of reference.
The following statement activates the ins_trig_tab1 trigger, which executes the proc1
procedure.
INSERT INTO tab1 VALUES (111,222);
Because the trigger procedure increments the new value of col1 by 1, the values
inserted are (112 and 222), rather than the value 111 that the original triggering event
(INSERT) specified.
Discontinuing an operation
• Use a stored procedure to roll back a triggering event:
Discontinuing an operation
Stored procedure language has a statement called RAISE EXCEPTION that
discontinues the stored procedure with an error (if the error is not trapped in a stored
procedure with the ON EXCEPTION statement) and returns control to the application.
The RAISE EXCEPTION statement can be used to discontinue both the trigger event
and the trigger action. If the database has been created with logging, the application
can then roll back the transaction.
Error number -745 is reserved for use with triggers. The error message that the users
receive is:
745: Trigger execution has failed.
The application code is responsible for checking for errors after the triggering SQL
statement and issuing a ROLLBACK WORK.
Any error code could be used in the RAISE EXCEPTION statement that is called from
the trigger. You do not have to use error 745.
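A minimal sketch of this pattern (the procedure and trigger names are hypothetical):
CREATE PROCEDURE stop_delete()
RAISE EXCEPTION -745;
-- returns "745: Trigger execution has failed." to the application
END PROCEDURE;
CREATE TRIGGER orders_del_protect
DELETE ON orders
BEFORE (EXECUTE PROCEDURE stop_delete());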
Failure of a trigger action can be captured by checking the sqlcode:
BEGIN WORK
INSERT INTO tab1 VALUES(1,30)
IF (sqlca.sqlcode < 0) THEN
DISPLAY 'error ', sqlca.sqlcode, ' on insert statement'
ROLLBACK WORK
ELSE
COMMIT WORK
END IF
....
Dropping a trigger
• The DROP TRIGGER statement deletes the trigger from the database.
Dropping a trigger
Deleting a table causes triggers that reference that table in the Trigger Event clause to
be deleted.
When you alter a table and drop a column, the column is dropped from the trigger column
list in the trigger event. Triggers that reference the table in the trigger action are not
deleted. You must find and drop those triggers yourself.
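For example, to remove the cascading-delete trigger created earlier in this unit:
DROP TRIGGER del_cust;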
• UPDATE cursors:
Each UPDATE WHERE CURRENT OF statement executes the complete
trigger.
Managing triggers
• If a table is dropped, all associated triggers are dropped
• If the database is dropped, all triggers are dropped
• Managing triggers as if they were application code is recommended
Managing triggers
Triggers are created with SQL statements, which makes them easy to create and drop.
However, triggers contain important rules for the data and can be easily overlooked
when it comes to proper source-code maintenance procedures.
If a DBA unwittingly dropped a database (perhaps to recreate it later) without saving the
triggers associated with the database, the triggers are lost.
Exercise 17
Triggers
• create a trigger that will record any deletes from the customer table
Exercise 17:
Triggers
Purpose:
In this exercise, you will learn how to create and manage triggers.
6. Query the order_queries and hit_list tables to check existing data for
order 1017.
7. Select information from the orders table for order 1017.
8. Query the order_queries and hit_list tables again for order 1017 to verify the
trigger actions.
Task 3. Retrieve trigger information from the database server.
In this task, you will use the dbschema utility to retrieve information about the
trigger created in a previous task.
1. Run the following command and view the information on the triggers:
$ dbschema -d stores_demo -t orders
2. Challenge: Execute a SELECT statement to retrieve the CREATE TRIGGER
statement for the del_cust trigger from the system catalog tables.
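One possible approach is sketched below (it assumes that the English text of the trigger
header and action is stored in the systrigbody catalog table under the datakey values D
and A, with seqno ordering the text segments):
SELECT b.datakey, b.seqno, b.data
FROM systriggers t, systrigbody b
WHERE t.trigname = "del_cust"
AND b.trigid = t.trigid
AND b.datakey IN ("D", "A")
ORDER BY b.datakey DESC, b.seqno;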
RESULTS:
In this exercise, you learned how to create and manage triggers.
Exercise 17:
Triggers - Solutions
Purpose:
In this exercise, you will learn how to create and manage triggers.
4. Create a SELECT trigger on the order_num column of the orders table. This
trigger will insert an audit record into the order_queries table that consists of
the order number, customer number, the user ID of the user executing the
query, and the timestamp of the date and time (to the second) when the query
was made.
CREATE TRIGGER trig1
SELECT OF order_num ON orders
REFERENCING OLD AS pre
FOR EACH ROW
(INSERT INTO order_queries
VALUES (pre.order_num, pre.customer_num, USER,
CURRENT));
5. Create another select trigger on the order_num column of the orders table.
This trigger will increment the num_selects column of the hit_list table every
time a select of the order_num column is made.
CREATE TRIGGER trig2
SELECT OF order_num ON orders
REFERENCING OLD AS pre
FOR EACH ROW
(UPDATE hit_list SET num_selects = num_selects + 1
WHERE order_num = pre.order_num);
6. Query the order_queries and hit_list tables to check existing data for
order 1017.
SELECT * FROM order_queries
WHERE order_num = 1017;
SELECT * FROM hit_list
WHERE order_num = 1017;
7. Select information from the orders table for order 1017.
SELECT * FROM orders
WHERE order_num = 1017;
8. Query the order_queries and hit_list tables again for order 1017 to verify the
trigger actions.
SELECT * FROM order_queries
WHERE order_num = 1017;
SELECT * FROM hit_list
WHERE order_num = 1017;
Unit summary
• Create and execute a trigger
• Drop a trigger
• Use the system catalogs to access trigger information
Unit summary
Terminology
Informix (12.10)
Unit objectives
• Review Informix terminology
Unit objectives
There are many terms that you must be familiar with as an Informix DBA. This is a
partial list with brief definitions. If you need more information about a term, please
consult the Informix Knowledge Center or ask your instructor.
• Database server - A database server is the program that manages the content of
the database as it is stored on disk. The database server knows how tables,
rows, and columns are organized in physical computer storage. The database
server also interprets and executes all SQL commands. An Informix database
server, or instance, is the set of database server processes together with the
shared memory and disk space that the server processes manage. Multiple
instances can exist on the same computer.
• Shared Memory - Informix shared memory consists of the resident portion, the
virtual portion, and the message portion. Shared memory is used for caching data
from the disk (resident portion), maintaining and controlling resources needed by
the processors (virtual portion), and providing a communication mechanism for
the client and server (message portion).
• Disk - The disk component is a collection of one or more units of disk space
assigned to the database server. All the data in the databases, plus all the system
information necessary to maintain the server, are stored within the disk
component.
• Processes - The processes that make up the database server are known as
virtual processors. These processes are each called oninit. Each virtual
processor (VP) belongs to a virtual processor class. A VP class is a set of
processes responsible for a specific set of tasks.
• Chunk - A chunk is a contiguous unit of space that is assigned to the server to
use; the server manages the use of space within that chunk.
• Dbspace - A dbspace is a logical collection of chunks. You can create databases
and tables in dbspaces.
• Root dbspace - The root dbspace is a required dbspace where all the system
information that controls the database server is located.
• Temporary dbspace - Special dbspaces, called temporary dbspaces, can be
created for the storing of temporary tables or temporary files. Having temporary
dbspaces prevents temporary tables and files from unexpectedly filling file
systems or contending for space in the dbspace with data tables, and can speed
up the creation of temporary tables.
• Page - When a chunk is assigned to the database server, it is broken down into
smaller units called pages. The page is the basic unit of I/O for an Informix server.
All data in a server is stored in pages. All pages used by the server have a fixed
data structure. When data is read from disk into a buffer in shared memory, the
entire page on which that data is stored is read into the buffer.
• Extent - An extent is a contiguous group of physical pages allocated to a single
table, index, or table fragment. The size of an extent that stores rows for a table is
specified when the table is created or altered. Each table has two extent sizes
defined: an initial (or first) extent size and a size for all subsequent extents.
• Tblspace - A tblspace is the logical collection of all the pages allocated for a given
table or, if the table is fragmented across dbspaces, a fragment of the table
located in a dbspace. The space represented by a tblspace is not necessarily
contiguous; pages can be spread out on a single chunk, or pages for a table can
be on different chunks. A tblspace is always contained within a single dbspace.
• Simple large object - Simple large objects, or binary large objects (blobs), are
streams of bytes of arbitrary value and length. A simple large object might be a
digitized image or sound, an object module or a source code file. There are two
types of simple large objects: TEXT and BYTE. The TEXT data type is used for
the storage of printable ASCII text such as source code and scripts. The BYTE
data type is used for storing any kind of binary data such as saved spreadsheets,
program load modules, and digitized images or sound.
• Blobspace - To improve the efficiency of storage and retrieval of simple large
objects, Informix also offers special dbspaces with characteristics customized for
these data types, called a blobspace. The blobspace forms a pool of storage
space that can be used for any simple large object columns in the server. Any
single blobspace can contain blob data from different columns in different tables,
even in different databases.
• Blobpage - When a blobspace is created, a blobpage size is specified for that
blobspace. This value is the number of pages that make up a single blobpage.
Simple large object data stored in a blobspace is stored on one or more
blobpages.
• Smart large objects - Smart large objects are a category of large objects that
support random access to the data. There are two smart large object types:
BLOB (binary large object) and CLOB (character large object).
• Sbspace - Another special-purpose dbspace, called a smart blobspace, or
sbspace, is used for storing smart large objects. Unlike blobspaces, an sbspace
stores data in standard Informix pages, just like a dbspace. Since smart-large-object
values can be very large, a single value can occupy many pages.
• System Monitoring Interface (SMI) - The SMI provides you with point-in-time
information about the contents of the shared memory data structures used by
Dynamic Server, as well as information about the various objects contained in the
Informix instance. The SMI is implemented as the sysmaster database.
• Sysmaster - The sysmaster database holds the tables for the System Monitoring
Interface. One sysmaster database is automatically created for each Dynamic
Server instance the first time the instance is brought online. The sysmaster
database contains its own system-catalog tables and views, and a set of virtual
tables that serve as pointers to shared memory data. It can be used to gather
status, performance, and diagnostic information about the Informix instance.
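For example, a quick point-in-time look at the databases in the instance can be taken by
querying sysmaster; the columns referenced here assume the standard sysdatabases view:
DATABASE sysmaster;
SELECT name, owner, is_logging FROM sysdatabases;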
• Dbaccess - Dbaccess is a query tool designed for character-based environments.
In addition to being a query tool with a full-screen text editor, dbaccess offers
many other features. It provides a menu-driven interface for creating
databases and tables, as well as an option for selecting column, index,
permission, constraint, and status information for existing tables. Status
information includes such things as the number of rows in the table and the row
size of the table.
• Dbaccess can also be used to execute a text file containing SQL commands.
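For example, dbaccess can be started from the command line with a database name and
a file of SQL statements; the database and file names here are only illustrative:
$ dbaccess stores_demo myqueries.sql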
• OpenAdmin Tool (OAT) - The OpenAdmin Tool, also known as OAT, is a web-
based tool for administering one or more Informix database servers from a single
location. OAT's graphical interface includes options to perform administrative
tasks and to analyze system performance.
Unit summary
• To review Informix terminology
Unit summary
Data types
Informix (12.10)
Unit objectives
• Review Informix data types
Unit objectives
Informix supports many data types that you must be familiar with as an Informix DBA.
This is a partial list with brief definitions. If you need more information about a data type,
please consult the Informix Knowledge Center or ask your instructor.
Informix contains a number of built-in data types that allow you to store and manage
data for most application needs. Built-in data types include character, Boolean,
numeric, time, large object, JSON and BSON data types.
Additionally, Informix includes DataBlade data types. A DataBlade (extension or
add-on) is a package of user-defined data types and routines that are designed to
extend the database server capabilities for a particular purpose. Some DataBlades
are included with some Informix editions, and others can be purchased separately.
Finally, Informix includes the ability to handle extended data types. Extended data
types enable you to characterize data that cannot be easily represented with the
built-in data types. They include complex data types and user-defined data types.
(Use of extended data types is not covered in this course.)
Character data types
Three of the character data types in Informix are CHAR (or CHARACTER),
VARCHAR, and LVARCHAR. CHAR holds fixed length character strings;
VARCHAR and LVARCHAR hold varying length character strings. Character data
types are also known as alphanumeric data types.
• CHARACTER - The CHARACTER (or CHAR) data type stores any combination
of letters, numbers, and symbols. Tabs and spaces can be included. No other
non-printable characters are allowed. A CHAR(n) column has a fixed length of n
bytes, where n is a value between 1 and 32767.
• VARCHAR - The VARCHAR(m,r) data type stores character strings of varying
length, where m is the maximum size (in bytes) of the column and r is the
minimum number of bytes reserved for that column.
• LVARCHAR - The LVARCHAR(m) data type also stores character strings of
varying length, where m is the maximum size (in bytes) of the column.
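A minimal table definition using the three character data types (the column names and
sizes are arbitrary):
CREATE TABLE contact_notes (
    contact_id INTEGER,
    state_code CHAR(2),
    full_name  VARCHAR(50,10),
    comments   LVARCHAR(2000)
);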
Boolean data types
• BOOLEAN - The BOOLEAN data type stores the Boolean values for true and
false using a single byte of storage. The Boolean data type can be set to a value
of either 'T' (or 't') or 'F' (or 'f'). The Boolean data type is not case sensitive.
Large objects
Large object data types store large ASCII or binary data values. Informix provides
four large object data types, BYTE, TEXT, BLOB, and CLOB. The BYTE and TEXT
types are referred to as simple large objects. The BLOB and CLOB data types are
referred to as smart large objects.
• BYTE - BYTE data types represent large amounts of unstructured data with
unpredictable contents.
• TEXT - TEXT data types represent large text files, and can contain both single-
byte and multibyte characters that the locale supports.
• BLOB - The BLOB data type stores any type of binary data. BLOBs offer features
not available with BYTE, including random access to object data.
• CLOB - The CLOB data type stores any type of text data. CLOBs offer features
not available with TEXT, including random access to object data.
JSON and BSON data types
JSON and BSON are Informix built-in data types that are used to support relational
database operations on data in JSON or BSON document store format. The JSON data
type is in plain text format, while the BSON data type is the binary representation of the
JSON data type.
• JSON - The acronym JSON stands for JavaScript Object Notation. It is a plain
text format for entering and displaying structured data. It is language
independent, and is a self-describing data-interchange format.
• BSON - BSON is the binary (internal) representation of a JSON document.
Extended data types allow you to characterize data that cannot easily be represented
with the built-in data types. Extended data types include complex and user-defined.
• Complex - A complex data type is usually a composite of other existing data
types. Complex data types include collection data types (LIST, SET, MULTISET)
and row types (named and unnamed).
• User-Defined - A user-defined data type (UDT) is a data type that is derived from
an existing data type. UDTs can be used to extend the built-in types already
available and to create customized data types.
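Although extended data types are not covered in this course, a simple user-defined
distinct type might be created as follows (the type name is hypothetical):
CREATE DISTINCT TYPE account_id AS INTEGER;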
Unit summary
• To review Informix data types
Unit summary
XML publishing
Informix (v12.10)
Objectives
• Describe the XML capabilities of the Informix server
• Create XML documents using SQL
Objectives
XML publishing
• Provides way to transform results of SQL Query to XML structure
• Can optionally include XML schema and header
• Special characters automatically handled
• Can store results in Informix database
• Must start idsxmlvp VP to use XML functions
XML publishing
XML publishing provides a way to transform the results of SQL queries into XML
structures.
When you publish an XML document using the built-in XML publishing functions, you
transform the result set of an SQL query into an XML structure, optionally including an
XML schema and header.
Special characters, such as the less than (<), greater than (>), double quote (“),
apostrophe (‘), and ampersand (&) characters are automatically converted to their XML
notation.
You can store the XML in the Informix database for use in XML-based applications.
To use these functions, you must start the idsxmlvp virtual processor.
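The idsxmlvp class can be configured in the onconfig file or added dynamically with
onmode; the VP count of one shown here is only an example:
VPCLASS idsxmlvp,num=1        # onconfig entry
$ onmode -p +1 idsxmlvp       # add one idsxmlvp VP dynamically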
XML functions (1 of 2)
• Two sets of functions publish XML from SQL queries
• Functions return LVARCHAR or CLOB:
genxml, genxmlclob
− Return rows of SQL results as XML elements
genxmlelem, genxmlelemclob
− Return each column value as separate elements
genxmlschema, genxmlschemaclob
− Return an XML schema in XML format
genxmlquery, genxmlqueryclob
− Return results of SQL query in XML format
genxmlqueryhdr, genxmlqueryhdrclob
− Return results of SQL query in XML format with XML header
XML functions
Several functions let you publish XML from SQL queries. The functions that are
provided in Informix are of two types: functions that return LVARCHAR and functions
that return CLOB. All of the functions handle NULL values and special characters.
The functions are:
• genxml and genxmlclob: Return rows of SQL results as XML elements.
• genxmlelem and genxmlelemclob: Return each column value as separate
elements.
• genxmlschema and genxmlschemaclob: Return an XML schema and result in
XML format.
• genxmlquery and genxmlqueryclob: Return the result set of a query in XML
format. These functions accept an SQL query as a parameter.
• genxmlqueryhdr and genxmlqueryhdrclob: Return the result set of a query in XML
with the XML header. These functions provide a quick method for generating the
required XML header.
XML functions (2 of 2)
• Functions return LVARCHAR or CLOB (continued):
extract, extractxmlclob
− Evaluate XPATH expression on XML column, document, or string and return XML
fragment
extractvalue, extractxmlclobvalue
− Evaluate XPATH expression on XML column, document, or string and return value
of the XML node
existsnode
− Determines whether XPATH evaluation results in at least one XML element.
idsxmlparse
− Parses XML document to determine whether it is well formed
The classes table used in the following examples contains these rows
(classid, class, subject):
1    125    Chemistry
2    250    Physics
3    375    Mathematics
4    500    Biology
• The query:
SELECT genxml(classes, "row") FROM classes;
Returns data in XML format
If first parameter is name of table, returns all columns
Second parameter is name of XML element
• The results:
<row classid="1" class="125" subject="Chemistry"/>
<row classid="2" class="250" subject="Physics"/>
<row classid="3" class="375" subject="Mathematics"/>
<row classid="4" class="500" subject="Biology"/>
From the same classes table as the example in the previous slide, this example uses
the row( ) construct to return only the columns classid and class.
This example uses the same classes table as the previous examples.
The genxmlelem function returns the columns in the table as individual elements.
The first parameter, the name of the table, specifies that all columns are to be returned.
The same syntax as in the previous visual could be used to only return specific
columns:
SELECT genxmlelem(row(classid, subject), "classes")
FROM classes
WHERE classid = 1;
The results of that query would be:
<classes>
<row>
<classid>1</classid>
<subject>Chemistry</subject>
</row>
</classes>
The genxmlqueryhdr function produces XML output similar to the genxmlelem function
shown on the previous slide.
The difference is that in addition to the data, the genxmlqueryhdr function also produces
an XML header in the output.
The genxmlschema function is identical to the genxml function, but also generates an
XML schema along with the data output.
Using the name of the table in the first parameter specifies that all columns are to be
returned.
You can use the row(col1, col2, …) format in the first parameter to specify the list of
columns desired.
The second parameter is the name of the XML element to be returned in the results.
<?xml version="1.0" encoding="en_US.819" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
  targetNamespace="http://schemas.ibm.com/informix/2006/sqltypes"
  xmlns="http://schemas.ibm.com/informix/2006/sqltypes"
  ElementFormDefault="qualified">
  <xs:element name="classes">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="classid" type="xs:serial"/>
        <xs:element name="class" type="xs:smallint"/>
        <xs:element name="subject" type="xs:char(15)"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
<classes>
  <row> <classid>1</classid> <class>125</class> <subject>Chemistry </subject> </row>
  <row> <classid>2</classid> <class>250</class> <subject>Physics </subject> </row>
  <row> <classid>3</classid> <class>375</class> <subject>Mathematics </subject> </row>
  <row> <classid>4</classid> <class>500</class> <subject>Biology </subject> </row>
</classes>
Exercise
XML publishing
• Configure the Informix instance for XML publishing
• Generate XML output from SQL queries
Exercise:
XML publishing
Purpose:
In this exercise, you will learn how to use the XML publishing features of
Informix.
Exercise:
XML publishing - Solution
Purpose:
In this exercise, you will learn how to use the XML publishing features of
Informix.
Unit summary
• Describe the XML capabilities of the Informix server
• Create XML documents using SQL
Unit summary
Informix (v12.10)
Objectives
• Describe the features of the Basic Text Search DataBlade Module
• Search the database for text content using the Basic Text Search
DataBlade Module
• Use the XML data index and search features of the Basic Text Search
DataBlade Module
Objectives
Data type    Operator class
BLOB         bts_blob_ops
CHAR         bts_char_ops
CLOB         bts_clob_ops
LVARCHAR     bts_lvarchar_ops
VARCHAR      bts_varchar_ops
NCHAR        bts_nchar_ops
NVARCHAR     bts_nvarchar_ops
Stopwords
• Can create custom stopword list
• Specify stopwords parameter when creating bts index
• Stopwords must be lowercase
• Input can be:
Inline with comma-delimited values
− stopwords="(word1,word2,word3)"
An external file
− stopwords="file:/directory/filename"
A table column
− stopwords="table:table_name.column_name"
Stopwords
You can optionally create a custom stopword list to replace the default list. You invoke
a custom stopword list by specifying the stopwords parameter when creating the
BTS index.
The stopword list can be entered either inline in the stopwords parameter, or by
referencing an external file or a column in a database table.
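Following the CREATE INDEX pattern used in the exercise solutions later in this unit, a
BTS index with an inline custom stopword list might look like this; the stopword values
are arbitrary:
CREATE INDEX bts_idx
ON customer (thoughts bts_lvarchar_ops)
USING bts (stopwords="(a,an,and,or,the)")
IN bts_space;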
The difference is that you only need to know the names of the database, table, and
index for the oncheck command, while you must know the directory path to the BTS
index in order to use the bts_index_compact function.
The index is locked while these processes run.
The deferred mode is best for large indexes that are updated frequently.
XML searches
• Must specify fields to be searched:
With xmltags index parameter, default field is first field
With all_xmltags index parameter, no default field
• Specify field/path name followed by colon (:)
bts_contains(column,'fruit:Orange')
bts_contains(column,'fruit:"Orange Juice"')
bts_contains(column,'/fruit/citrus:"Orange Juice"')
• If the include_namespaces index parameter is enabled, the colon within the
namespace must be escaped
bts_contains(column,'fruit\:citrus:Orange')
• BTS search modifiers also apply to XML searches
XML searches
Searches of XML data are also done using the bts_contains predicate, but now you can
specify specific tags or XML paths to search. This is done by specifying the field or path
name followed by a colon (:) and then the search word or phrase.
When creating an index for searching XML data there are a number of parameters you
can specify that are specific to XML data. These parameters are discussed in the
following pages.
The search modifiers discussed in previous pages, such as fuzzy, proximity, range, and
wildcards also apply to XML searches.
XML tags are case-sensitive. When you use the inline comma-separated field names
for input, the field names are transformed to lowercase characters. If the field names
are uppercase or mixed case, use an external file or a table column for input instead.
The file or table that contains the field names must be readable by the user creating the
index. The file or table is read only when the index is created. If you want to add new
field names to the index, you must drop and re-create the index. The field names in the
file or table column can be separated by commas, whitespaces, newlines, or a
combination.
This visual shows the CREATE INDEX statement using the xmltags index parameter to
specify indexing the title, author, and date XML fields.
The fields indexed are:
• title:graph theory
• author:stewart
• date:january 14, 2008
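The CREATE INDEX statement itself is not reproduced in these notes. A reconstruction,
assuming the xmltags parameter accepts an inline list like the stopwords parameter and
that the XML documents are held in an LVARCHAR column named xml_data, might look
like this:
CREATE INDEX bts_books_idx
ON books (xml_data bts_lvarchar_ops)
USING bts (xmltags="(title,author,date)")
IN bts_space;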
Given the XML data and CREATE INDEX statement from the previous visual, the
example query returns the data shown in XML format.
To create an index for all the XML tags, use the SQL statement:
CREATE INDEX ... USING bts(all_xmltags="yes") ...;
The index contains three fields that can be searched:
• title:graph theory
• author:stewart
• date:january 14, 2008
The top-level <book></book> tags are not indexed because they do not contain text
values. If you enable path processing with the xmlpath_processing parameter, you can
index the full paths:
CREATE INDEX ... USING bts(all_xmltags="yes",xmlpath_processing="yes") ...;
The index contains three fields with full paths that can be searched:
• /book/title:graph theory
• /book/author:stewart
• /book/date:january 14, 2008
Given the XML data and CREATE INDEX statement from the previous visual, the query
in this example returns the data shown in XML format.
This example shows the CREATE INDEX statement using the xmlpath_processing
index parameter.
This indexes the following paths:
• /book/title:graph theory
• /book/author:stewart
• /book/date:january 14, 2008
Given the XML data and CREATE INDEX statement from the previous visual, the query
in this example returns the data shown in XML format.
This visual shows the CREATE INDEX statement using the include_contents index
parameter.
This indexes the title, author, and date fields, and adds the contents field, which
includes all text and tags.
The actual fields indexed are:
• title:graph theory
• author:stewart
• date:january 14,2008
• contents:<book> <title>Graph Theory</title> <author>Stewart</author>
<date>January 14, 2008</date> </book>
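A sketch of such a statement, under the same assumptions as the earlier xmltags
reconstruction:
CREATE INDEX bts_books_idx
ON books (xml_data bts_lvarchar_ops)
USING bts (xmltags="(title,author,date)", include_contents="yes")
IN bts_space;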
Given the XML data and CREATE INDEX statement from the previous visual, the query
on the contents field in this example returns the data shown in XML format.
Given the XML fragment shown, if you create a BTS index with the include_subtag_text
parameter disabled (include_subtag_text="no"), the index has three separate comment
fields:
• comment:this
• comment:text is very
• comment:to me
If you create a BTS index with the include_subtag_text parameter enabled
(include_subtag_text="yes"), all of the text is indexed in a single comment field:
comment:this highlighted text is very important to me
To create an index with the untagged values only, use the statement:
CREATE INDEX ... USING bts(strip_xmltags="yes") ...;
The index contains a single contents field:
contents:Graph Theory Stewart January 14, 2008
To create an index that has XML tag fields as well as a field for the untagged values,
use the statement:
CREATE INDEX ... USING
bts(all_xmltags="yes",include_contents="yes",strip_xmltags="yes") ...;
The index contains XML tag fields as well as the untagged values in the contents field:
title:graph theory author:stewart date:january 14, 2008 contents:Graph
Theory Stewart January 14, 2008
Exercise
Basic Text Search DataBlade module
• configure the Informix instance for basic text searches using the Basic
Text Search DataBlade module
• conduct searches of textual data using the functions of the BTS
DataBlade
• explore indexing and searching of XML data
Exercise:
Basic Text Search DataBlade module
Purpose:
In this exercise, you will learn how to use the Basic Text Search DataBlade.
Exercise:
Basic Text Search DataBlade module - Solutions
Purpose:
In this exercise, you will learn how to use the Basic Text Search DataBlade.
12. In dbaccess, create a BTS index on the newly-created and loaded thoughts
column of the customer table. Use the deferred delete method.
CREATE INDEX bts_idx
ON customer (thoughts bts_lvarchar_ops)
USING bts
(delete = 'deferred')
IN bts_space;
Task 2. Conduct BTS searches.
1. Create and run an SQL query that lists the customer full name and thoughts text
where the thoughts text contains either of the words vegetable or fruit.
SELECT TRIM(fname) || " " || lname, thoughts
FROM customer
WHERE bts_contains(thoughts,'vegetable fruit');
2. Create and run an SQL query that lists the customer full name and thoughts text
where the thoughts text contains the exact phrase white chocolate.
SELECT TRIM(fname) || " " || lname, thoughts
FROM customer
WHERE bts_contains(thoughts,'"white chocolate"');
3. Create and run an SQL query that lists the customer full name and thoughts text
where the thoughts text contains the words eat and chocolate within 8 words of
each other.
SELECT TRIM(fname) || " " || lname, thoughts
FROM customer
WHERE bts_contains(thoughts,'"eat chocolate"~8');
5. Create and run an SQL query that lists the customer full name and thoughts text
where the thoughts text contains a word similar to contract.
SELECT TRIM(fname) || " " || lname, thoughts
FROM customer
WHERE bts_contains(thoughts,'contract~');
3. Create and run an SQL query that returns the XML data for records that have
the word “Black” anywhere in the record.
SELECT * FROM boats
WHERE bts_contains(xml_data, ' contents:black ');
4. Create and run an SQL query that returns the XML data for records that have a
word similar to “Quinn” in the boat name.
SELECT * FROM boats
WHERE bts_contains(xml_data, ' boatname:quinn~ ');
Results:
In this exercise, you learned how to use the Basic Text Search DataBlade.
Unit summary
• Describe the features of the Basic Text Search DataBlade Module
• Search the database for text content using the Basic Text Search
DataBlade Module
• Use the XML data index and search features of the Basic Text Search
DataBlade Module
Unit summary
Informix (v12.10)
Objectives
• Describe how to use the Node DataBlade Module to index
hierarchical data
• Write SQL queries to return hierarchical data using the Node
DataBlade
Objectives
Employee/Manager
Standard query
• For each level, select the employee count:
SELECT count(*) FROM employee e1, employee e2, employee e3,
employee e4, employee e5
WHERE e5.mgr_id = :value AND e4.mgr_id = e5.emp_id
AND e3.mgr_id = e4.emp_id AND e2.mgr_id = e3.emp_id
AND e1.mgr_id = e2.emp_id;
• Without an extensible engine, must flatten relationships into master-
detail relationship:
The relationships are nested, not regular 1:N model format
Requires multiple passes through the data to find all the recursive
relationships
Can only be solved with procedural or set processing
As levels increase, programming becomes more complex, losing the ability
to dynamically create SQL operations
Standard query
To run a query such as the one shown above against the tree shown on the previous
page requires multiple passes through the data. It can have a significant performance
impact, particularly as the number of levels increases.
Without an extensible database engine such as Informix, the relationships would have
to be flattened into master-detail or parent-child relationships.
A workable hierarchy
(Slide diagram: the employee hierarchy represented with Node values such as 1.0 and
1.2.3.4.5.)
CREATE TABLE employee (
    emp_id node PRIMARY KEY,
    name varchar(50));
A workable hierarchy
This chart depicts how the employee table is represented in a hierarchical format using
the Node data type of the Node DataBlade Module.
Because of the way the Node DataBlade represents the hierarchy, processing
becomes linear instead of recursive, using either table scans or partial index scans.
This representation allows functional comparisons such as equal, less than, greater
than, less than or equal to, equal to or greater than, and not equal.
Other functions allow administrative tasks to be performed on the structure:
• Graft: Moves sections of the node tree.
• Increment: Determines the next node at the same level.
• NewLevel: Creates a node level.
• GetMember: Returns information about a node level.
• GetParent: Returns the parent node.
• Ancestors: Returns the ancestor list back to the root node.
Node functions
• Equals
• NotEqual
• LessThan
• LessThanOrEqual
• GreaterThan
• GreaterThanOrEqual
• Compare
• Increment
• Length
• Depth
• NewLevel
• GetParent
• IsParent
• IsChild
• Ancestors
• Graft
• IsDescendant
• GetMember
Node functions
A list of the functions that are part of the Node DataBlade Module is shown here.
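As a sketch only, single-argument functions such as GetParent and Depth might be used
against the employee table as shown below; the exact argument forms and return types
should be verified in the Node DataBlade documentation:
SELECT name, GetParent(emp_id) AS parent_node, Depth(emp_id) AS level
FROM employee;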
Exercise
Node DataBlade module
• configure and use the Node DataBlade module for indexing
hierarchical data
Exercise:
Node DataBlade module
Purpose:
In this exercise, you will learn how to use the Node DataBlade.
Exercise:
Node DataBlade module - Solutions
Purpose:
In this exercise, you will learn how to use the Node DataBlade.
4. Examine the output for the access and join methods and the cost.
Note that the engine processed the query as if it were a 5-table join, essentially
making 5 passes through the table.
$ more sqexplain.out
Results:
In this exercise, you learned how to use the Node DataBlade.
Unit summary
• Describe how to use the Node DataBlade module to index
hierarchical data
• Write SQL queries to return hierarchical data using the Node
DataBlade
Unit summary
Informix (v12.10)
Objectives
• Set environment variables necessary for Global Language Support
• List the components of a locale
• Use the NCHAR and NVARCHAR data types
• Explain the effect of collation sequence on various SQL statements
Objectives
Using Global Language Support
What is a locale?
• A locale is a language environment composed of:
A code set
A collation sequence
A character classification
Numeric (non-money) formatting
Monetary formatting
Date and time formatting
Messages
• Define a locale with an environment variable. For example:
$ setenv CLIENT_LOCALE ja_jp.sjis
What is a locale?
A GLS locale represents the language environment for a specific location. It contains
language specifications as well as regional and cultural information. A locale consists of
a code set, a collation sequence, formatting specifications for numeric, money, date,
and time values, and message definitions.
You can define separate locales for the client application, the database, and the
database server. The three environment variables which you can set are:
• CLIENT_LOCALE
• DB_LOCALE
• SERVER_LOCALE
The specification of a locale defines the GLS behavior. No other flags or environment
variables need to be set. The default locale for either the application, database, or
database server is US 8859-1 English (en_us.8859-1).
(Slide diagrams: the string "ACME Co" shown as its ASCII byte values, and multibyte
characters shown as logical characters A, B, C, D stored across multiple physical bytes,
with byte positions numbered 1 through 6.)
For data types BYTE and TEXT, the database server returns all bytes without partial
character replacement.
Substring designators should be used only when it is possible to determine the physical
location of the logical characters desired. The SQL functions LENGTH,
OCTET_LENGTH, and CHAR_LENGTH can be used to determine the physical and
logical lengths of strings in columns. Function LENGTH returns the string length in
bytes minus trailing whitespace. Function OCTET_LENGTH returns the number of
bytes, and function CHAR_LENGTH returns the number of logical characters.
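For example, the three length functions can be compared on a character column such as
customer.lname:
SELECT lname,
       LENGTH(lname)       AS len_bytes_trimmed,
       OCTET_LENGTH(lname) AS len_bytes,
       CHAR_LENGTH(lname)  AS len_chars
FROM customer;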
SQL identifiers
GLS allows you to use any alphabetic characters of a code set to form most SQL
identifiers (names of tables, columns, views, indexes, and so on). The servername,
dbspace names, and blobspace names are the exceptions. The locale defines which
characters within a code set are considered alphabetic. Multibyte characters can be
used within an identifier, but the physical length of an identifier must be 18 bytes or less.
An identifier with multibyte characters has fewer logical characters than the length of the
identifier in bytes.
• Code set order: the physical order of characters in the code set
• Localized order: the language-specific order of characters
Code set order: A, C, a, b, c, À
Localized order: A, À, a, b, C, c
Localized collation sequences can specify case folding (case insensitivity) or characters
which are equivalents. For example, if collation is in code set order (data types CHAR
or VARCHAR), the statement:
SELECT lname FROM customer
WHERE lname IN ('Azevedo','Llaner','Oatfield')
returns only one of Azevedo, azevedo, or Àzevedo, whereas if the collation is done in
localized order, all three might be returned.
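The NCHAR and NVARCHAR data types collate in the localized order defined by the
locale, whereas CHAR and VARCHAR collate in code set order. A minimal table
definition using the national character types (the column names and sizes are arbitrary):
CREATE TABLE customer_gls (
    customer_num SERIAL,
    lname        NVARCHAR(50),
    country_code NCHAR(2)
);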
• Numeric:
US English
− 3,225.01
French
−3 225,01
• Monetary:
US English
− $100,000.49
French
− 100 000,49FF
(Slide table: Gregorian years mapped to era-based year values used by some locales for
date formatting: 1993 → 82, 1912 → 01, 1911 → -01, 1910 → -02, 1900 → -12.)
The server locale determines how the database server performs I/O operations on the
server computer. These I/O operations include reading or writing the following files:
• Diagnostic files that the database server generates to provide additional
diagnostic information
• Log files that the database server generates to record events
• Explain file, sqexplain.out, that is generated by executing the SQL statement SET
EXPLAIN
The database server is the only IBM Informix product that needs to know the server
locale.
Locale compatibility
The languages and territories of the client, database, and server locales might be
different if the code sets are the same. Be careful, however, because GLS does not
provide semantic translation. If the locale stored in the database is en_us.8859-1 and
the CLIENT_LOCALE is fr_fr.8859-1, a value stored in the database as $10.00 is
displayed on the client as 10,00FF. There is no exchange rate calculation.
Additionally, the code set of the locale stored in the database might differ from the
CLIENT_LOCALE code set. However, there are restrictions. If a database is created
with DB_LOCALE = aa_bb.cs1, then the CLIENT_LOCALE might equal any locale,
cc_dd.cs2, but only if locale cc_dd.cs1 exists and there is code set conversion between
cs1 and cs2 (code set conversion is explained later in the unit). If cc_dd.cs1 does not
exist, then an error -23101 is returned.
If the SERVER_LOCALE is not compatible with the DB_LOCALE (that is, the code sets
are different and not convertible), data is written to external files without code set
conversion.
Most processing relating to collation sequence or character classification is handled
by the database server. Most processing related to formatting of date, number, and
money values is performed by the client.
Specifying locales
• Default:
setenv CLIENT_LOCALE en_us.8859-1
setenv DB_LOCALE en_us.8859-1
setenv SERVER_LOCALE en_us.8859-1
• Example:
setenv CLIENT_LOCALE ja_jp.sjis
setenv DB_LOCALE ja_jp.ujis
setenv SERVER_LOCALE ja_jp.ujis
Specifying locales
The following three environment variables specify the locales for the client application,
database, and database server.
• CLIENT_LOCALE
• DB_LOCALE
• SERVER_LOCALE
When the client requests a connection, it sends CLIENT_LOCALE and DB_LOCALE to
the database server. If the client and database locales sent by the client are not
compatible with what is stored in the database, a warning is returned to the client in the
SQL communications area (SQLCA) via the SQLWARN7 warn flag (except when the
code sets differ and code set conversion is available). The client application should
check this flag after connecting to a database.
The server locale, specified by SERVER_LOCALE, determines how the database
server reads and writes external files.
Code set conversion does not convert words to different languages. For example, it
does not convert the English word yes to the French word oui. It only ensures that each
character is processed or printed the same regardless of how it is encoded. Code set
conversion does not:
• Perform semantic translation. Words are not translated from one language to
another.
• Create characters which do not exist in the target code set. Conversion is from a
valid source character to a valid target character via a conversion file.
Code set conversion file
A code set conversion file is used to map source characters to target characters. If a
conversion file does not exist for the source-to-target relationship, an error is returned to
the client application when it begins execution. BYTE data is never converted. Use the
glfiles utility to generate a listing of the code set conversion files available on your
system.
Compatible locales
The code set of the CLIENT_LOCALE (cc_dd.cs2) might differ from the code set of
the locale stored in the database (aa_bb.cs1), only if locale cc_dd.cs1 exists and
there is a code set conversion file between cs1 and cs2.
• Informix utilities: onaudit, onshowaudit, dbaccess, dbload, onstat, dbexport,
oncheck, onunload, dbimport, onload, dbschema
• ESQL/C
• ESQL/COBOL
Unit summary
• Set environment variables necessary for Global Language Support
• List the components of a locale
• Use the NCHAR and NVARCHAR data types
• Explain the effect of collation sequence on various SQL statements
Unit summary