
Course Guide
Informix 12.10 Database Administration
Course code IX223 ERC 1.0

IBM Training
Preface

September 2017
NOTICES
This information was developed for products and services offered in the USA.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for
information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to
state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not
infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any
non-IBM product, program, or service. IBM may have patents or pending patent applications covering subject matter described in this document.
The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to:
IBM Director of Licensing
IBM Corporation
North Castle Drive, MD-NC119
Armonk, NY 10504-1785
United States of America
The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law:
INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND,
EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT,
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in
certain transactions; therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these
changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the
program(s) described in this publication at any time without notice.
Any references in this information to non-IBM websites are provided for convenience only and do not in any manner serve as an endorsement of
those websites. The materials at those websites are not part of the materials for this IBM product and use of those websites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you. Information
concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available
sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM
products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the
examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and
addresses used by an actual business enterprise is entirely coincidental.
TRADEMARKS
IBM, the IBM logo, ibm.com, and Informix are trademarks or registered trademarks of International Business Machines Corp., registered in many
jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is
available on the web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml.
Adobe and the Adobe logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States and/or other
countries.
Microsoft, Windows, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
PuTTY is copyright 1997-2017 by Simon Tatham.

VMware is a trademark of VMware, Inc.

© Copyright International Business Machines Corporation 2017.


This document may not be reproduced in whole or in part without the prior written permission of IBM.
US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

© Copyright IBM Corp. 2001, 2017 P-2


Course materials may not be reproduced in whole or in part without the prior written permission of IBM.
Preface

Contents
Preface................................................................................................................. P-1
Contents ............................................................................................................. P-3
Course overview............................................................................................... P-15
Document conventions ..................................................................................... P-17
Exercises.......................................................................................................... P-18
Additional training resources ............................................................................ P-19
IBM product help .............................................................................................. P-20
Creating databases and tables ............................................................ 1-1
Unit objectives .................................................................................................... 1-3
Prerequisites ...................................................................................................... 1-4
Before you create a database............................................................................. 1-5
Database names ................................................................................................ 1-6
Database logging ............................................................................................... 1-7
No logging .......................................................................................................... 1-8
NO LOGGING still has logging ......................................................................... 1-10
Unbuffered logging ........................................................................................... 1-11
Buffered logging ............................................................................................... 1-12
MODE ANSI databases .................................................................................... 1-13
The database dbspace ..................................................................................... 1-15
Creating a database ......................................................................................... 1-16
Creating a table ................................................................................................ 1-17
Select a valid table name ................................................................................. 1-18
Extents ............................................................................................................. 1-20
Estimating row and extent sizes ....................................................................... 1-22
Managing extents ............................................................................................. 1-24
Table lock modes ............................................................................................. 1-26
Tables and dbspaces ....................................................................................... 1-28
Creating a table ................................................................................................ 1-29
Creating a table: Simple large objects .............................................................. 1-30
Creating a table: Smart large objects................................................................ 1-31
Creating a temporary table ............................................................................... 1-32
DBCENTURY ................................................................................................... 1-34
The DBSCHEMA utility ..................................................................................... 1-35
Using ONCHECK and ONSTAT ....................................................................... 1-37

Sysmaster table: sysdatabases ........................................................................ 1-38
System Catalog table: systables....................................................................... 1-39
System Catalog table: syscolumns ................................................................... 1-40
Exercise 1: Create databases and tables ......................................................... 1-41
Unit summary ................................................................................................... 1-70
Altering and dropping databases and tables ...................................... 2-1
Unit objectives .................................................................................................... 2-3
Altering a table ................................................................................................... 2-4
Fast ALTER........................................................................................................ 2-5
In-place ALTER .................................................................................................. 2-6
Slow ALTER ....................................................................................................... 2-8
Slow ALTER process ......................................................................................... 2-9
Data space reclamation: CLUSTER index ........................................................ 2-10
Data space reclamation: TRUNCATE............................................................... 2-12
Renaming columns, tables, and databases ...................................................... 2-13
Converting simple objects to smart objects ...................................................... 2-15
Dropping tables and databases ........................................................................ 2-16
Exercise 2: Alter and drop databases and tables.............................................. 2-17
Unit summary ................................................................................................... 2-27
Creating, altering, and dropping indexes ............................................ 3-1
Unit objectives .................................................................................................... 3-3
B+ tree index structure ....................................................................................... 3-4
B+ tree splits ...................................................................................................... 3-6
Indexes: Unique and duplicate ........................................................................... 3-7
Composite index................................................................................................. 3-8
Using composite indexes .................................................................................... 3-9
Cluster indexes................................................................................................. 3-10
The CREATE INDEX statement ....................................................................... 3-11
Detached indexes ............................................................................................. 3-13
Index fill factor .................................................................................................. 3-14
Altering, dropping, and renaming indexes ........................................................ 3-15
SYSINDICES and SYSINDEXES system catalogs ........................................... 3-16
Forest of trees index ......................................................................................... 3-17
Comparing B+ tree and forest of trees indexes ................................................ 3-18
Exercise 3: Create, alter, and drop indexes ...................................................... 3-19
Unit summary ................................................................................................... 3-27

Managing and maintaining indexes ..................................................... 4-1
Unit objectives .................................................................................................... 4-3
Benefits of indexing ............................................................................................ 4-4
Costs of indexing ................................................................................................ 4-5
B+ tree maintenance .......................................................................................... 4-7
Indexing guidelines ............................................................................................. 4-9
Index join columns ............................................................................................ 4-10
Index filter columns .......................................................................................... 4-11
Index columns involved in sorting ..................................................................... 4-12
Avoid highly duplicate indexes.......................................................................... 4-13
Volatile tables ................................................................................................... 4-14
Keeping key size small ..................................................................................... 4-15
Composite indexes ........................................................................................... 4-16
Clustered indexes ............................................................................................. 4-17
Drop versus disable indexes............................................................................. 4-18
Parallel index builds ......................................................................................... 4-19
Calculating index size ....................................................................................... 4-20
Exercise 4: Managing and maintaining indexes ................................................ 4-23
Unit summary ................................................................................................... 4-51
Table and index partitioning ................................................................ 5-1
Unit objectives .................................................................................................... 5-3
What is fragmentation?....................................................................................... 5-4
Fragments and extents ....................................................................................... 5-5
Advantages of fragmentation .............................................................................. 5-6
Parallel scans and fragmentation ....................................................................... 5-8
Parallel scans (PDQ queries) ............................................................................. 5-9
DSS queries ..................................................................................................... 5-10
Balanced I/O and fragmentation ....................................................................... 5-11
OLTP queries ................................................................................................... 5-12
Types of distribution schemes .......................................................................... 5-13
Round robin fragmentation ............................................................................... 5-15
Round robin for smart large objects .................................................................. 5-17
Expression-based fragmentation ...................................................................... 5-18
Using PARTITIONING ...................................................................................... 5-19
Logical and relational operators........................................................................ 5-20
Fragmentation by expression guidelines .......................................................... 5-22
Using hash functions ........................................................................................ 5-24
Fragmentation based on a list .......................................................................... 5-26
Fragmentation based on an interval ................................................................. 5-27
Fragmented/partitioned indexes ....................................................................... 5-29

CREATE INDEX statement .............................................................................. 5-31
ROWIDS .......................................................................................................... 5-32
Selecting a fragmentation strategy ................................................................... 5-33
Fragmentation of temporary tables ................................................................... 5-35
Creating fragmented temporary tables ............................................................. 5-36
Fragmenting an index ....................................................................................... 5-37
System Catalog: sysfragments ......................................................................... 5-38
Optional discussion: Case study....................................................................... 5-39
Exercise 5: Table and index partitioning ........................................................... 5-41
Unit summary ................................................................................................... 5-58
Maintaining table and index partitioning ............................................. 6-1
Unit objectives .................................................................................................... 6-3
The ALTER FRAGMENT statement ................................................................... 6-4
Initializing a new fragmentation strategy ............................................................. 6-5
Adding an additional fragment ............................................................................ 6-6
Dropping a fragment ........................................................................................... 6-7
Modifying an existing fragment ........................................................................... 6-8
Attaching and detaching fragments .................................................................... 6-9
How is ALTER FRAGMENT executed? ............................................................ 6-11
Skipping inaccessible fragments ...................................................................... 6-12
Defragmenting partitions .................................................................................. 6-13
Exercise 6: Maintaining table and index partitioning ......................................... 6-14
Unit summary ................................................................................................... 6-27
The Informix query optimizer ............................................................... 7-1
Unit objectives .................................................................................................... 7-3
The query plan ................................................................................................... 7-4
Access plan ........................................................................................................ 7-5
Join plan ............................................................................................................. 7-7
Join method: Nested-loop join ............................................................................ 7-8
Join method: Dynamic hash join ......................................................................... 7-9
Evaluate information for each table .................................................................. 7-10
Determining the query plan .............................................................................. 7-12
Generating the query plan ................................................................................ 7-13
The explain file ................................................................................................. 7-14
The query plan ................................................................................................. 7-15
Query statistics ................................................................................................. 7-16
EXPLAIN query statistics output ....................................................................... 7-17
Analyzing query plans ...................................................................................... 7-18
Sequential scan with temporary file .................................................................. 7-19
Sequential scan with filter ................................................................................. 7-21

Key-only index scan (aggregate) ...................................................................... 7-23
Index scan with lower index filter ...................................................................... 7-25
Index scan: Lower and upper index filters ........................................................ 7-27
Dynamic hash join ............................................................................................ 7-29
Hash join: parallel scan and sort threads .......................................................... 7-31
Nested loop join................................................................................................ 7-33
Key-first index scan .......................................................................................... 7-35
Key-only index scan ......................................................................................... 7-36
Skip duplicate index scan ................................................................................. 7-37
Index self joins.................................................................................................. 7-38
Optimizing subqueries ...................................................................................... 7-39
Multi-index scan ............................................................................................... 7-43
Skip scan.......................................................................................................... 7-44
Multi-index scan in sqexplain ............................................................................ 7-45
Multi-index ........................................................................................................ 7-46
Exercise 7: The Informix query optimizer .......................................................... 7-48
Unit summary ................................................................................................... 7-65
Updating statistics and data distributions .......................................... 8-1
Unit objectives .................................................................................................... 8-3
What does UPDATE STATISTICS do? .............................................................. 8-4
UPDATE STATISTICS modes............................................................................ 8-5
UPDATE STATISTICS statement....................................................................... 8-6
Statistics available with LOW mode .................................................................... 8-7
UPDATE STATISTICS LOW information ............................................................ 8-8
MEDIUM and HIGH modes ................................................................................ 8-9
UPDATE STATISTICS MEDIUM ...................................................................... 8-10
SAMPLING SIZE examples .............................................................................. 8-12
UPDATE STATISTICS HIGH/MEDIUM information.......................................... 8-13
How distributions are created ........................................................................... 8-14
What information is kept? ................................................................................. 8-16
The sysdistrib System Catalog table................................................................. 8-17
dbschema -hd display....................................................................................... 8-18
Distribution output ............................................................................................ 8-19
Resolution ........................................................................................................ 8-20
Confidence ....................................................................................................... 8-22
Updating distributions only ............................................................................... 8-23
Create index/build distribution process ............................................................. 8-24
Create index/build distribution architecture ....................................................... 8-25
Automatic UPDATE STATISTICS exceptions................................................... 8-26
How to update statistics.................................................................................... 8-27
Updating statistics on small tables .................................................................... 8-29

UPDATE STATISTICS on temporary tables ..................................................... 8-30
When to update statistics ................................................................................. 8-31
Problem queries ............................................................................................... 8-32
Dropping distributions ....................................................................................... 8-34
When table changes affect distributions ........................................................... 8-35
Space utilization ............................................................................................... 8-36
Update statistics tools....................................................................................... 8-38
Fragment-level statistics ................................................................................... 8-39
STATLEVEL property ....................................................................................... 8-40
Update statistics extensions ............................................................................. 8-41
Exercise 8: Updating statistics and data distributions ....................................... 8-42
Unit summary ................................................................................................... 8-52
Managing the optimizer ........................................................................ 9-1
Unit objectives .................................................................................................... 9-3
Influencing the optimizer..................................................................................... 9-4
OPTCOMPIND ................................................................................................... 9-5
SET OPTIMIZATION .......................................................................................... 9-7
OPTIMIZATION LOW ......................................................................................... 9-9
When to try SET OPTIMIZATION LOW ............................................................ 9-10
FIRST_ROWS .................................................................................................. 9-11
Using directives ................................................................................................ 9-13
Types of optimizer directives ............................................................................ 9-14
Identifying directives ......................................................................................... 9-15
Access method directives ................................................................................. 9-17
Access method directives in combination ......................................................... 9-19
Join order directive ........................................................................................... 9-20
Join method directive........................................................................................ 9-21
Optimizing goal directives ................................................................................. 9-23
EXPLAIN directive ............................................................................................ 9-24
Directives ......................................................................................................... 9-25
Tips for using directives .................................................................................... 9-26
External directives ............................................................................................ 9-27
Exercise 9: Managing the optimizer .................................................................. 9-28
Unit summary ................................................................................................... 9-43
Referential and entity integrity ........................................................ 10-1
Unit objectives .................................................................................................. 10-3
Definitions ........................................................................................................ 10-4
Integrity at the application level ........................................................................ 10-6
Integrity at the database server level ................................................................ 10-7
Types of integrity constraints ............................................................................ 10-8

UNIQUE, NOT NULL, and DEFAULT ............................................................... 10-9
Constraint names ........................................................................................... 10-10
CHECK constraint .......................................................................................... 10-11
What is a referential constraint? ..................................................................... 10-12
Types of referential constraints....................................................................... 10-14
Cyclic referential constraints ........................................................................... 10-15
Cyclic referential constraints: Example ........................................................... 10-16
Cascading deletes .......................................................................................... 10-17
Restrictions .................................................................................................... 10-18
Adding a cascading delete ............................................................................. 10-19
Multiple-path referential constraints ................................................................ 10-20
Self-referencing referential constraints ........................................................... 10-21
Creating primary key constraints .................................................................... 10-22
Creating foreign key constraints ..................................................................... 10-23
Adding a primary key constraint ..................................................................... 10-24
Adding a foreign key constraint ...................................................................... 10-25
System Catalog tables.................................................................................... 10-26
Exercise 10: Referential and entity integrity .................................................... 10-27
Unit summary ................................................................................................. 10-35
Unit 11 Managing constraints .......................................................... 11-1
Unit objectives .................................................................................................. 11-3
Constraint transaction modes ........................................................................... 11-4
Immediate constraint checking ......................................................................... 11-5
Deferred constraint checking ............................................................................ 11-7
Detached constraint checking........................................................................... 11-9
Performance effect ......................................................................................... 11-10
Dropping a constraint ..................................................................................... 11-11
Deleting and updating a parent row ................................................................ 11-13
Inserting and updating a child row .................................................................. 11-14
Exercise 11: Managing constraints ................................................................. 11-15
Unit summary ................................................................................................. 11-25
Unit 12 Modes and violation detection ........................................................ 12-1
Unit objectives .................................................................................................. 12-3
Types of database objects ................................................................................ 12-4
Database object modes .................................................................................... 12-5
Why use object modes? ................................................................................... 12-6
Disabling an object ........................................................................................... 12-7
Creating a disabled object ................................................................................ 12-8
Enabling a constraint ........................................................................................ 12-9
Recording violations ....................................................................................... 12-10


Violation tables setup ..................................................................................... 12-11


Violations table schema.................................................................................. 12-13
Filtering mode................................................................................................. 12-14
Turning off violation logging ............................................................................ 12-16
System catalog table: sysobjstate .................................................................. 12-17
System catalog table: sysviolations ................................................................ 12-18
Example 1 ...................................................................................................... 12-19
Example 2 ...................................................................................................... 12-23
Exercise 12: Modes and violation detection .................................................... 12-25
Unit summary ................................................................................................. 12-31
Unit 13 Concurrency control ........................................................................ 13-1
Unit objectives .................................................................................................. 13-3
ANSI SQL-92 transaction isolation ................................................................... 13-4
Informix isolation .............................................................................................. 13-5
Comparison ...................................................................................................... 13-6
Access methods ............................................................................................... 13-7
READ UNCOMMITTED.................................................................................... 13-8
READ COMMITTED ......................................................................................... 13-9
CURSOR STABILITY ..................................................................................... 13-10
REPEATABLE and SERIALIZABLE reads ..................................................... 13-11
COMMITTED READ LAST COMMITTED ...................................................... 13-12
COMMITTED READ LAST COMMITTED example ........................................ 13-13
Configuring COMMITTED READ LAST COMMITTED ................................... 13-14
LAST COMMITTED considerations ................................................................ 13-16
Locks and concurrency................................................................................... 13-17
Database-level locking ................................................................................... 13-19
Locking a table in Share Mode ....................................................................... 13-20
Locking a table in Exclusive Mode .................................................................. 13-21
Unlocking a table ............................................................................................ 13-22
Row and page locks ....................................................................................... 13-23
Configurable lock mode .................................................................................. 13-24
Setting the lock mode ..................................................................................... 13-25
RETAIN UPDATE LOCKS .............................................................................. 13-26
Deadlock detection ......................................................................................... 13-27
What happens after a delete?......................................................................... 13-28
Row versioning ............................................................................................... 13-29
Managing versioning ...................................................................................... 13-30
Ifx_row_id virtual column ................................................................................ 13-31
Versioning code example ............................................................................... 13-32
Exercise 13: Concurrency control ................................................................... 13-33
Unit summary ................................................................................................. 13-49


Unit 14 Data security ..................................................................... 14-1


Unit objectives .................................................................................................. 14-3
Levels of data security ...................................................................................... 14-4
Database-level privileges ................................................................................. 14-5
Table and column-level privileges .................................................................... 14-6
Default privileges .............................................................................................. 14-7
Granting database-level privileges ................................................................... 14-8
Revoking database-level privileges .................................................................. 14-9
Granting table-level privileges ........................................................................ 14-11
Revoking table-level privileges ....................................................................... 14-13
Granting column-level privileges..................................................................... 14-15
Routine privileges ........................................................................................... 14-16
DataBlade privileges ...................................................................................... 14-17
Roles .............................................................................................................. 14-18
Creating roles ................................................................................................. 14-19
Using roles ..................................................................................................... 14-20
GRANT and REVOKE FRAGMENT ............................................................... 14-22
Discussion ...................................................................................................... 14-23
Exercise 14: Data security .............................................................................. 14-24
Unit summary ................................................................................................. 14-35
Unit 15 Views ................................................................................................. 15-1
Unit objectives .................................................................................................. 15-3
What is a view? ................................................................................................ 15-4
Creating a view................................................................................................. 15-5
Dropping a view................................................................................................ 15-6
Views: Access to columns ................................................................................ 15-7
Views: Access to rows ...................................................................................... 15-8
Views: A virtual column .................................................................................... 15-9
Views: An aggregate function ......................................................................... 15-10
A view that joins two tables ............................................................................ 15-11
A view on another view................................................................................... 15-12
Restrictions on views ...................................................................................... 15-13
Views: INSERT, UPDATE, and DELETE ........................................................ 15-14
The WITH CHECK OPTION clause ................................................................ 15-15
Views and access privileges ........................................................................... 15-17
System catalog tables for views ..................................................................... 15-18
Exercise 15: Views ......................................................................................... 15-19
Unit summary ................................................................................................. 15-26


Unit 16 Introduction to stored procedures .................................................. 16-1


Unit objectives .................................................................................................. 16-3
What are stored procedures? ........................................................................... 16-4
Example of a stored procedure......................................................................... 16-5
SQL statements in a procedure ........................................................................ 16-6
Some advantages of stored procedures ........................................................... 16-8
Stored procedure performance ....................................................................... 16-10
System catalog tables .................................................................................... 16-11
Exercise 16: Introduction to stored procedures............................................... 16-12
Unit summary ................................................................................................. 16-17
Unit 17 Triggers............................................................................................. 17-1
Unit objectives .................................................................................................. 17-3
What is a trigger? ............................................................................................. 17-4
CREATE TRIGGER ......................................................................................... 17-5
Trigger events .................................................................................................. 17-6
Trigger action ................................................................................................... 17-7
REFERENCING example ................................................................................. 17-8
The WHEN condition ........................................................................................ 17-9
Cascading triggers ......................................................................................... 17-10
Multiple triggers .............................................................................................. 17-11
Multiple triggers: Execution order ................................................................... 17-12
INSTEAD OF trigger on view .......................................................................... 17-14
INSTEAD OF trigger on view: Example .......................................................... 17-15
Triggers and stored procedures...................................................................... 17-17
Trigger procedures ......................................................................................... 17-18
Procedure triggers and Boolean operators ..................................................... 17-19
Discontinuing an operation ............................................................................. 17-21
Capturing the error in the application .............................................................. 17-22
Customizable error messages ........................................................................ 17-23
Dropping a trigger........................................................................................... 17-24
Cursors and triggers ....................................................................................... 17-25
Example of an UPDATE cursor ...................................................................... 17-26
Triggers and constraint checking .................................................................... 17-27
System catalogs for triggers ........................................................................... 17-28
Managing triggers........................................................................................... 17-29
Security and triggers ...................................................................................... 17-30
Exercise 17: Triggers...................................................................................... 17-31
Unit summary ................................................................................................. 17-38


Terminology .................................................................................. A-1


Unit objectives .................................................................................................... A-3
Unit summary ..................................................................................................... A-7
Data types ..................................................................................... B-1
Unit objectives .................................................................................................... B-3
Unit summary ..................................................................................................... B-9
XML publishing ............................................................................. C-1
Objectives .......................................................................................................... C-3
XML publishing ................................................................................................... C-4
XML functions..................................................................................................... C-5
XML publishing examples ................................................................................... C-8
Exercise: XML publishing ................................................................................. C-13
Unit summary ................................................................................................... C-17
Basic Text Search DataBlade module ......................................... D-1
Objectives .......................................................................................................... D-3
Basic Text Search DataBlade module ................................................................ D-4
Configure Basic Text Search .............................................................................. D-5
Text DataBlade indexing .................................................................................... D-7
Creating a BTS index ......................................................................................... D-8
BTS Index Operator class .................................................................................. D-9
Stopwords ........................................................................................................ D-10
Maintaining BTS indexes: Manual maintenance ............................................... D-11
Maintaining BTS indexes: Automatic maintenance ........................................... D-13
BTS query syntax ............................................................................................. D-14
BTS search restrictions .................................................................................... D-15
BTS searches: Words and phrases .................................................................. D-16
BTS searches: Boolean operators .................................................................... D-17
BTS searches: Fuzzy and proximity searches .................................................. D-19
BTS searches: Range searches and boosting .................................................. D-20
XML searches .................................................................................................. D-22
XML index parameters ..................................................................................... D-23
XML index parameters: xmltags ....................................................................... D-24
XML index parameters: all_xmltags .................................................................. D-28
XML index parameters: xmljpath_processing ................................................... D-31
XML index parameters: include_contents ......................................................... D-34
XML index parameters: include_namespaces .................................................. D-37


XML index parameters: include_subtag_text .................................................... D-38


XML index parameters: strip_xmltags............................................................... D-40
Basic Text Search DataBlade restrictions ......................................................... D-43
Exercise: Basic Text Search DataBlade module............................................... D-44
Unit summary ................................................................................................... D-52
Node DataBlade module............................................................... E-1
Objectives .......................................................................................................... E-3
Solving the hierarchical problem......................................................................... E-4
Typical employee tree structure.......................................................................... E-5
Standard query ................................................................................................... E-6
Node DataBlade module .................................................................................... E-7
A workable hierarchy .......................................................................................... E-8
How the Node DataBlade works ......................................................................... E-9
Node DataBlade queries .................................................................................. E-11
Node functions ................................................................................................. E-12
Exercise: Node DataBlade module ................................................................... E-13
Unit summary ................................................................................................... E-23
Using Global Language Support ................................................. F-1
Objectives .......................................................................................................... F-3
Global Language Support................................................................................... F-4
What is a locale? ................................................................................................ F-5
A locale specifies a code set .............................................................................. F-7
Multibyte code sets ............................................................................................. F-8
Using Multibyte code sets ................................................................................... F-9
A locale specifies a collation order ................................................................... F-11
NCHAR and NVARCHAR ................................................................................. F-12
Collation Order and SQL statements ................................................................ F-13
Locales: Numeric and Monetary formats .......................................................... F-15
A locale specifies Date and Time formats ......................................................... F-16
Date and Time customization ........................................................................... F-17
Locales: Client, Database, and Server ............................................................. F-18
Specifying locales............................................................................................. F-20
Multiple locales: Code set conversion............................................................... F-21
Conversion: Performance consideration ........................................................... F-23
Multibyte character: Utilities and APIs .............................................................. F-24
The glfiles utility ................................................................................................ F-25
Migrating to GLS from NLS or ALS ................................................................... F-26
Unit summary ................................................................................................... F-27


Course overview
Preface overview
In this course, students will learn the basic concepts of data management with
Informix 12.10. They will learn how to create, manage, and maintain tables and
indexes; how the Informix Optimizer works; and how to use the SET EXPLAIN feature
to determine query effectiveness.
Intended audience
The main audience for this course is Informix Database Administrators. It is also
appropriate for Informix System Administrators and Informix Application Developers.
Topics covered
Topics covered in this course include:
Creating, altering, and dropping databases
Creating, altering, and dropping tables
Creating, altering, and dropping indexes
Table and index partitioning
The Informix query optimizer and access plans
Updating statistics and data distributions
Referential and entity integrity
Creating and managing constraints
Modes and violation detection
Concurrency control and locking mechanisms
Data security
Views
Triggers


Course prerequisites
Students in this course should satisfy the following prerequisites:
• IX101 - Introduction to Informix terminology and data types (or equivalent
experience or knowledge)
• Knowledge of Structured Query Language (SQL)
• Experience using basic Linux functionality
Course Environment
The environment provided in this course is implemented as a virtual image deployed in
Skytap.


Document conventions
Conventions used in this guide follow Microsoft Windows application standards, where
applicable. In addition, the following conventions are observed:
• Bold: Bold style is used in demonstration and exercise step-by-step solutions to
indicate a user interface element that is actively selected or text that must be
typed by the participant.
• Italic: Used to reference book titles.
• CAPITALIZATION: All file names, table names, column names, and folder names
appear in this guide exactly as they appear in the application.
To keep capitalization consistent with this guide, type text exactly as shown.


Exercises
Exercise format
Exercises are designed to let you work at your own pace. Exercise content is
intentionally not fully scripted, to provide an additional challenge.
Refer back to the material in the unit, or ask your instructor, if you need assistance
with a particular task. The exercises are structured as follows:
The purpose section
This section presents a brief description of the purpose of the exercise, followed by
a series of tasks. These tasks provide information to help guide you through the
exercise. Within each task, there may be numbered questions relating to the task.
Complete the tasks by using the skills you learned in the unit. If you need more
assistance, you can refer to your instructor or to the solutions section for more
detailed instruction.
The task sections
Each task section has a title with the overall goal of the task. This is followed by the
steps that describe what you need to do to meet the goal.
The solutions sections
The solutions section contains the solution to each task and task step. You can refer
to this section for more detailed guidance on how to complete a task or task step.


Additional training resources


• Visit IBM Analytics Product Training and Certification on the IBM website for
details on:
• Instructor-led training in a classroom or online
• Self-paced training that fits your needs and schedule
• Comprehensive curricula and training paths that help you identify the courses
that are right for you
• IBM Analytics Certification program
• Other resources that will enhance your success with IBM Analytics Software
• For the URL relevant to your training requirements outlined above, bookmark:
• IBM Analytics Learning Services:
https://www.ibm.com/analytics/us/en/services/learning.html
• IBM Informix 12.10 Knowledge Center:
https://www.ibm.com/support/knowledgecenter/en/SSGU8G_12.1.0/com.ibm.welcome.doc/welcome.htm


IBM product help


Task-oriented help
When to use: You are working in the product and you need specific task-oriented help.
Location: IBM Product - Help link

Books for printing (.pdf)
When to use: You want to use search engines to find information. You can then print
out selected pages, a section, or the whole book. Use the Step-by-Step online books
(.pdf) if you want to know how to complete a task but prefer to read about it in a
book. The Step-by-Step online books contain the same information as the online help,
but the method of presentation is different.
Location: Start/Programs/IBM Product/Documentation

IBM on the Web
When to use: You want to access any of the following:
• IBM Analytics Learning Services: https://www.ibm.com/analytics/us/en/services/learning.html
• Online support: https://www.ibm.com/support/home/
• IBM Web site: http://www.ibm.com

Unit 1 Creating databases and tables


Unit objectives
• Review prerequisites
• Create databases and tables
• Determine database logging and storage requirements
• Locate where the database server stores a table on disk
• Create temporary tables
• Locate where the database server stores temporary tables
• Use the system catalog tables to gather information
• Use the dbschema utility

Creating databases and tables © Copyright IBM Corporation 2017

Unit objectives


Prerequisites
• Basic UNIX knowledge
• Basic Structured Query Language (SQL) knowledge
• dbaccess
• Informix terminology
• Informix data types


Prerequisites
Before you start working with your Informix database, you should have basic UNIX
knowledge. You should be familiar with basic SQL, as well as working with dbaccess.
You should also be familiar with the terminology used with Informix and various Informix
data types. Some of these terms and data types are summarized in Appendix A
(Terminology) and Appendix B (Data types). Please review this material. If you have
any questions, or need any further information, please contact your instructor.


Before you create a database

Before you create a database, you must:


• Select a database name
• Identify the appropriate transaction logging mode for the database
• Determine whether the database must be MODE ANSI
• Choose a dbspace location


Before you create a database


Informix defines a database as a collection of information (contained in tables) that is
useful to a particular organization or used for a specific purpose. In practice, a database
consists of a set of related tables and functions.
Each database server contains at least four system-created databases: the sysmaster
database, the sysutils database, the sysuser database, and the sysadmin database. In
addition, the database server might contain user-defined databases.
Before creating a user-defined database, the administrator must determine:
• A database name which is unique to the database server instance
• The appropriate transaction logging mode for the database
• Whether or not the database should be flagged as a MODE ANSI database
• The dbspace in which the database is to be created


Database names
• Maximum of 128 bytes (characters)
• Valid names can consist of:
 Letters: A to Z, a to z
 Digits: 0 - 9
 Underscore ( _ )
• Must be unique among all databases within the database server


Database names
When you select a database name, you can choose any combination of letters, digits,
and the underscore character. If you use a non-default locale, database names can
contain any alphabetic characters that the locale supports. (Note: The first character of
a database name cannot be a digit (0-9) or a dollar sign ($)).
Database names cannot include hyphens, spaces, or other non-alphanumeric
characters. Database names cannot exceed 128 characters in length.
Each database name must be unique within its database server.
Database names are not case sensitive.


Database logging
• Logging involves:
 Recording information about transactions in the logical log buffers in
memory
 Flushing the logical log buffers to logical log files located on disk
• Types of logging:
 No logging
 Unbuffered
 Buffered
 Mode ANSI

(Diagram: a transaction such as "begin work; insert record; update record; commit work" is recorded in the logical log buffer and flushed to the logical log files on disk.)


Database logging
Every database that the database server manages has a logging status. The logging
status indicates whether the database uses transaction logging and, if so, which log-
buffering mechanism the database uses.
The four types of database logging are:
• No logging
• Buffered logging
• Unbuffered logging
• Mode-ANSI


No logging
• UPDATE, INSERT, DELETE records are NOT written to logical logs
• Data definition language (DDL) is written to logical log
Statements executed:
    CREATE TABLE T1;
    INSERT INTO T1;
    UPDATE T1;
    ALTER TABLE T1;
    DELETE FROM T1;
    INSERT INTO T1;
    DROP TABLE T1;

Statements written to the logical log:
    CREATE TABLE T1;
    ALTER TABLE T1;
    DROP TABLE T1;


No logging
Informix allows you to create a database with no transaction logging.
If you create a database with no transaction logging, data manipulation language (DML)
records, such as UPDATE, INSERT, and DELETE, are not written to the logical logs.
Data definition language (DDL) records, such as CREATE, ALTER, and DROP, are
written to the logical log.
While it is recommended that production databases always use transaction logging, you
might want to create your database without logging, load all of its tables, and then turn
on logging. This significantly reduces the time required to load the database and
prevents long transactions.
A database without logging cannot be fully recovered when you have to restore the
system from a backup. Normally, after you apply a backup to recover a system, you
apply the logical-log files to recreate any transactions that committed after the backup.
Since the logs do not contain any record of the data manipulation operations that were
performed after the backup in a no-logging database, these transactions are lost.
You can enable logging for a no-logging database using either the ontape utility or the
ondblog and onbar utilities.


For example, if the stores database currently uses no logging, you can change it to
buffered logging as follows:
ontape -s -B stores
or
ondblog buf stores
onbar -b -F


NO LOGGING still has logging


To emphasize:
• All DDL statements and database administrative activities are written
to the logical logs
• In addition, certain administrative components like table space
management will always get logged regardless of mode
• The system administrator must still plan for log management even if all
databases are NO LOGGING


NO LOGGING still has logging


As shown on the diagram on the previous visual, records for data manipulation
language statements such as UPDATE, INSERT, and DELETE are not written to the
logical logs, but the records for data definition statements such as CREATE TABLE,
ALTER TABLE, and DROP TABLE are logged. In addition, certain administrative
components like table space management will always get logged regardless of mode.
For a system administrator, this means that logging activity still occurs in no-logging
databases, and the administrator must consider:
• backing up log files on disk
• how logging affects checkpoint activity


Unbuffered logging
• All statements are written to the logical logs
• COMMIT flushes the logical log buffer
INSERT;
INSERT;
COMMIT;      (COMMIT flushes the logical log buffer to disk)
UPDATE;
COMMIT;      (COMMIT flushes the logical log buffer to disk)


Unbuffered logging
When a database is created with unbuffered logging, all transaction activity is written to
the logical log buffers in shared memory and then flushed to disk when the COMMIT (or
COMMIT WORK) statement is executed. This ensures that all completed work is saved
on disk and guarantees all committed transactions can be successfully recovered
following any type of system failure.


Buffered logging
• All statements are written to the logical logs
• Full logical log buffer flushes the logical log buffer
INSERT;
INSERT;
COMMIT;
UPDATE;
COMMIT;      (the logical log buffer is flushed only when it becomes full)


Buffered logging
If the database is created with buffered logging, all transaction activity is written to the
logical log buffers in shared memory. These buffers in memory are flushed to logical
logs on disk as they become full.
The advantage of flushing the logical-log buffer when it becomes full is to reduce the
number of physical I/Os performed. Because a physical I/O is a relatively expensive
operation, this can improve database performance. The disadvantage of buffered
logging is, should a system crash occur, whatever is contained in the logical-log buffer
was not written to disk and is lost.
You can change the logging mode for a database from buffered to unbuffered at any
time.
If an instance contains both unbuffered log databases and buffered log databases, the
logical log buffers are flushed each time a COMMIT (or COMMIT WORK) statement is
executed for an unbuffered log database, or when they become full. They are also
flushed to disk whenever a checkpoint occurs, and whenever a connection is closed.


MODE ANSI databases


• Use unbuffered logging
• Logging cannot be disabled or turned off
• Statements are implicitly contained in transactions
• Owner naming is enforced
• The default read isolation level is repeatable read
• Default table and synonym privileges are not granted to user PUBLIC


MODE ANSI databases


Another decision for a DBA to make when creating a database is whether the database
can be created as a MODE ANSI database.
MODE ANSI databases are subject to some restrictions that do not apply to other
Informix databases:
• A MODE ANSI database always uses unbuffered logging and logging cannot be
disabled.
• All SQL statements are implicitly contained in transactions in a MODE ANSI
database.
MODE ANSI databases do not require you to explicitly begin a transaction. An
application is always implicitly in a transaction. All statements following a
COMMIT WORK or ROLLBACK WORK statement are grouped together and
committed or rolled back by the successive COMMIT WORK or ROLLBACK
WORK statement. A BEGIN WORK statement is not required when using a
MODE ANSI database, but if coded must precede any other SQL statement.


• Owner naming is enforced for MODE ANSI databases. Unless you are the table
or synonym owner, you must qualify the owner name in every SQL statement, for
example:
SELECT tabname
FROM 'informix'.systables
WHERE tabid > 99
• The default read isolation level for a MODE ANSI database is repeatable read.
This must be considered carefully by database administrators as repeatable read
requires every row that is read to be locked and can negatively affect concurrent
user access to data.
• No default table or synonym privileges are granted to the user PUBLIC.
Environment variable: NODEFDAC
For databases that are not MODE ANSI, the database server automatically grants table
level SELECT, INSERT, UPDATE, DELETE, and INDEX privileges to group PUBLIC.
To prevent default table privileges from being granted, set the NODEFDAC
environment variable before you create the table. A Korn shell example is shown here:
export NODEFDAC=yes
Isolation levels and database and table-level privileges are discussed in greater detail in
a later unit.


The database dbspace


The dbspace in which the database is created determines:
• Where the system catalog tables are created
• The default location of user tables

(Diagram: database db1 resides in rootdbs; database db2 resides in dbspace1; dbspace2 is also available in the instance.)


The database dbspace


When you create a database, the database server automatically generates a set of
tables within the database called the system catalog tables. These tables track all the
user created tables, indexes, constraints, and details of other objects that are created in
the database.
When you create a database, you can specify a dbspace for the database. This is the
dbspace in which all these system catalog tables are placed. Additionally, this dbspace
is the default dbspace for all other tables created in the database. The ability to create a
database in a specific dbspace allows you to control the space that is available for that
dbspace.
If you do not specify a database dbspace, the default dbspace location for the database
and all its system catalog tables and user-defined tables is the root dbspace. For best
performance and manageability, you should always specify a dbspace when you create
a database.
Dbaccess does not display the system catalog tables in the TABLES option. You can,
however, select data from these tables using standard SELECT statements. You can
also use the INFO command in SQL to display column and index information for the
system catalog tables:
INFO COLUMNS FOR sysindices;


Creating a database
• Create a no-logging database in dbspace db_dbs:
CREATE DATABASE db IN db_dbs;
• Create a database with unbuffered logging in dbspace db_dbs:
CREATE DATABASE db IN db_dbs WITH LOG;
• Create a database with buffered logging in dbspace db_dbs:
CREATE DATABASE db IN db_dbs WITH BUFFERED LOG;
• Create a MODE ANSI database in dbspace db_dbs:
CREATE DATABASE db IN db_dbs WITH LOG MODE ANSI;


Creating a database
The examples show how to create a database using different logging modes.


Creating a table

(Diagram: an instance containing rootdbs, dbspace1, and dbspace2; the orders table of the stores database is created in dbspace1.)


Creating a table
When creating a new table, the administrator must:
• Select a valid table name
• Identify appropriate columns and data types
• Determine first and next extent sizes
• Select the lock mode
• Choose dbspace location
• Determine if the table is to be logged


Select a valid table name


• Maximum length is 128 characters
• Can include alphanumeric characters and “_”
• Cannot include any other special characters
• Supports ANSI delimited identifiers
 Must set DELIMIDENT to use delimited identifiers


Select a valid table name


When you select a table name, you can choose any combination of letters, digits, and
the underscore character. If you use a non-default locale, table names can contain any
alphabetic characters that the locale supports. (Note: The first character of a table
name cannot be a digit (0-9) or the dollar sign ($)).
Table names cannot include hyphens, spaces, or other non-alphanumeric characters.
Table names cannot exceed 128 characters.
Delimited identifiers
Informix allows you to override the standard rules for naming tables and other objects
by employing ANSI-defined delimited identifiers. A delimited identifier is an object name
enclosed in double quotation marks.
A delimited identifier can:
• Be a reserved word
• Contain spaces and special characters
• Be case-sensitive
Environment variable: DELIMIDENT


To use delimited identifiers, Informix requires that you set the DELIMIDENT
environment variable.
For example, using the UNIX Korn shell, execute the command:
export DELIMIDENT=ON


Extents
• An extent is a collection of physically contiguous pages on a disk
• Space for tables is allocated in extents
• Extent sizes for a table are specified when the table is created

page 0: bitmap page     page 1: free page
page 2: free page       page 3: free page
page 4: free page       page 5: free page
page 6: free page       page 7: free page


Extents
Disk space for a table is allocated in units called extents. An extent is an amount of
contiguous space on disk; the amount is specified for each table when the table is
created. Each table has two extent sizes associated with it:
EXTENT SIZE The size of the first extent allocated for the table.
This first extent is allocated when the table is
created. The default is eight pages.

NEXT SIZE The size of each subsequent extent added to the


table. The default is eight pages.

When an extent is added, all pages are flagged as FREE except for one or more
bitmap pages. When the first extent allocated has no more space (that is, all pages
contain data), another extent is allocated for the table; when this extent is filled, another
extent is allocated, and so on.
Regardless of the logging mode, extent allocation, extent merging, and other extent
operations are always logged.
Tblspace


All the extents allocated in a specific dbspace for a given table are logically grouped
together and are referred to as a tblspace. While the space within an extent is
guaranteed to be contiguous, the space represented by the tblspace might not be
contiguous as extents can be spread across a device as space permits.
Extent size
The minimum size for an extent is four pages. There is no maximum size. An extent
size must be an even multiple of the page size for the system.
It is important to calculate extent requirements for your tables.


Estimating row and extent sizes

1. Estimate the total number of rows


2. Calculate the size of each row
3. Calculate the usable space on each page
4. Determine the number of data pages required:
 Determine the number of rows per page
 Determine the number of pages
5. Calculate the total space required in kilobytes


Estimating row and extent sizes


Calculating extent size
To calculate the initial extent size for your table, use the following guidelines:
1. Estimate the total number of rows that you want to store initially.
2. Calculate the size of each row.
- Add the widths of all columns in the table to calculate the row size.
- Add four bytes for the slot table entry. The result is rowsize.
- For each TEXT and BYTE column, whether stored in the table or in a
blobspace, add 56 bytes to the rowsize. Tables that store LVARCHAR,
VARCHAR, TEXT, or BLOB data in the table itself cannot be sized
exactly. You can use the maximum size or an average size, but the
resulting row size is always an estimate.
3. Subtract 28 from the total size of a page, pagesize, to account for the header
and footer timestamp that appears on the page. The result is the usable space
on the page, pageuse.
4. If rowsize is less than or equal to pageuse:
number_of_data_pages = number_of_rows/maxrows
where:
maxrows = min(pageuse/rowsize, 255)


If rowsize is greater than pageuse, the database server divides the row
between pages. The initial portion of the row is the homepage. Subsequent
portions are stored in remainder pages.
The size of the table is calculated as:
number_of_data_pages = number_of_homepages +
number_of_remainder_pages
5. Calculate the total space required in kilobytes:
(number_of_data_pages * pagesize)/1024
To calculate an appropriate NEXT SIZE value for successive extents allocated for the
table, apply steps 1 through 5, but instead of using the initial number of rows, use the
number of rows by which you anticipate the table will grow over time. Also, be sure to
consider how much disk space you have available and whether you plan to add
additional disks to the system at a specific time.
Example
Assume that your table initially has 1,000,000 rows and is expected to grow
between 10 percent and 30 percent per year. Also, assume that you have
budgeted to purchase more disks in 12 months and that you will reload your
database to distribute it over existing and new devices at that time.
Given these assumptions, you might want to size additional extents to hold
100,000 rows. If your table grows at 10 percent per year, the database server
only allocates one extent during the year. If your table grows at 30 percent, the
database server might have to allocate 3 or 4 additional extents. In either
case, the number of extents allocated will be small enough to avoid affecting
performance, or the need to reorganize your table, before the scheduled
maintenance period.
Variable length rows (VARCHAR, LVARCHAR, and NVARCHAR) introduce uncertainty
into the calculations. When Informix allocates space to rows of varying size, it considers
a page to be full when no room exists for an additional row of the maximum size.
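The sizing steps above can be sketched as a quick calculation. This is illustrative arithmetic only, not an Informix utility; the 4-byte slot entry, 28-byte page overhead, and 255-row page limit come from the steps above, while the 2 KB page size and sample row widths are assumed example values:

```python
# Sketch of the extent-sizing steps above (not an Informix utility).
def estimate_table_kb(num_rows, row_data_size, pagesize=2048):
    rowsize = row_data_size + 4            # add 4 bytes for the slot table entry
    pageuse = pagesize - 28                # subtract page header/footer timestamp
    if rowsize <= pageuse:
        maxrows = min(pageuse // rowsize, 255)   # at most 255 rows per page
        data_pages = -(-num_rows // maxrows)     # ceiling division
    else:
        # Row larger than a page: one home page per row plus remainder pages.
        remainder_pages_per_row = -(-(rowsize - pageuse) // pageuse)
        data_pages = num_rows * (1 + remainder_pages_per_row)
    return (data_pages * pagesize) // 1024       # total kilobytes

# 1,000,000 rows of 52 bytes each on 2 KB pages: 36 rows fit per page.
print(estimate_table_kb(1_000_000, 52), "KB")
```

With these sample numbers, each 56-byte row (including the slot entry) fits 36 times into the 2020 usable bytes of a page, so about 27,778 data pages are needed for the first extent.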


Managing extents

Concatenation: existing extent + new contiguous extent = one expanded extent

Doubling: 16th extent (size 16 KB), 17th extent (size 32 KB), 33rd extent (size 64 KB)

Manual modification: ALTER TABLE customer MODIFY NEXT SIZE 32;


Managing extents
Informix implements several features to simplify extent management for the database
administrator.
Concatenation
The first of these features is automatic extent concatenation. When an extent can be
allocated that is next to (contiguous with) the most recently allocated extent, Informix
automatically concatenates the new extent to the existing extent to make one extent.
You most often see automatic extent concatenation when you perform batch loads on
one table at a time. Since each new extent allocated is contiguous to the previous
extent, the two extents are concatenated. This makes it possible to load a very large
table by using a default extent size and end up with the entire table contained in a
single extent.
Doubling
Another feature that database server implements to ease the burden of extent
management is automatic extent doubling. Each time the number of extents allocated
for a particular tblspace reaches a multiple of 16, the database server doubles the size
of each successive extent allocated.
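The doubling progression can be sketched as simple arithmetic. This is illustrative only: the 16 KB starting NEXT SIZE is the example value from the slide, and the sketch ignores extent concatenation and manual NEXT SIZE changes:

```python
# Sketch of automatic extent doubling: each time the number of extents in a
# tblspace reaches a multiple of 16, the size of each subsequent extent doubles.
# The 16 KB starting NEXT SIZE is a hypothetical example value.
def next_extent_size(extents_allocated, next_size_kb=16):
    doublings = extents_allocated // 16    # completed multiples of 16 extents
    return next_size_kb * (2 ** doublings)

# After 16 extents the 17th is 32 KB; after 32 extents the 33rd is 64 KB.
print(next_extent_size(16), next_extent_size(32))
```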


Manual modification
If the Database Administrator (DBA) detects that the initial extent-size specification for a
table was inadequate and that many extents are being allocated, the size of successive
extents can be increased by using the ALTER TABLE command. This does not alter
any existing extents, so if a very large number of extents was allocated it might be wise
to rebuild the table.
Space limitation
When Informix allocates a new extent, it always attempts to locate contiguous free
space equal to or larger than the requested extent size. If no contiguous free space
remains in the dbspace large enough to accommodate the extent size, the database server
allocates the largest remaining segment of contiguous free space, even though it might
be less than the current extent size for the tblspace.
SYSEXTENTS
To collect information about the extents allocated for a given table or index, query the
sysextents table in the sysmaster database.


Table lock modes


• ROW locking only locks the row


 Index locks are placed only on the key values
• PAGE locking locks an entire page of data
 Index locks lock all keys on the page
• Lock mode can be changed at any time
• Default lock mode set with $ONCONFIG parameter
DEF_TABLE_LOCKMODE


Table lock modes


A table can be created with either row-level or page-level locking.
Row-level locking
Row-level locking provides the highest degree of concurrency in a multi-user
environment. Only the row being accessed or modified is locked. Because each lock
requires a small amount of memory (approximately 44 bytes) and no disk access, row
locking is relatively inexpensive. The primary drawback of row-level locking is the
number of locks that might be required for very large transactions. Since locks are held
until a transaction is committed, an update to all rows in a 1,000,000 row table within a
single transaction would require 1,000,000 locks (plus one for the table and one for the
database).
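The lock memory involved works out as a rough back-of-the-envelope calculation, using the approximate 44 bytes per lock quoted above (illustrative arithmetic, not a server measurement):

```python
# Rough cost of row locks for a single large transaction, using the
# ~44 bytes per lock figure quoted above (not a server measurement).
rows = 1_000_000
locks = rows + 2                 # one lock per row, plus table and database locks
bytes_per_lock = 44              # approximate memory per lock
mb = locks * bytes_per_lock / (1024 * 1024)
print(f"{locks} locks use roughly {mb:.1f} MB of lock memory")
```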
To circumvent a situation like this, you have two options. First, any time you need to
modify most of the rows in a single table, you can avoid row locks by placing an
exclusive lock on the table. For example:
LOCK TABLE tabname IN EXCLUSIVE MODE;
When the table is locked in exclusive mode, the database server does not place
individual locks on the rows.
Another option is to divide a large transaction into several smaller transactions.


Page-level locking
Page-level locking locks an entire data page and an entire index page. Page-level
locking can reduce the level of concurrency, but might be beneficial for tables that are
always processed in physical order. For example, a temporary table built for processing
month-end financial data is a likely candidate for page-level locking. If the rows are
processed sequentially by a single batch application, concurrency is not an issue, and
page-level locking reduces the number of locks that must be acquired.
Changing the lock mode
To change the lock mode for a table, execute an ALTER TABLE statement. No physical
change is made to the table, but the locklevel column in systables catalog table is
updated to reflect the new mode.
Setting the default lock mode
You can set a default lock mode to use for all newly created tables by setting either a
configuration parameter or an environment variable.
To override the server default of page-level locking, add a configuration parameter
called DEF_TABLE_LOCKMODE and set it to ROW.
If you only want to override the server settings for the current session, set the
IFX_DEF_TABLE_LOCKMODE environment variable to ROW or PAGE,
as shown by this example:
$ export IFX_DEF_TABLE_LOCKMODE=ROW
The IFX_DEF_TABLE_LOCKMODE environment variable overrides the default set by the
DEF_TABLE_LOCKMODE configuration parameter, and the LOCK MODE option of
the CREATE TABLE or ALTER TABLE commands overrides both the configuration
parameter and the environment variable settings.
Page-level locking is the default mode if no lock mode is specified, either explicitly or by
setting the default lock mode.


Tables and dbspaces

(Diagram: an instance with three dbspaces. Database1 resides in dbspace1 with tables 1a and 1b; Database2 resides in dbspace2 with tables 2a and 2b; tables 1c and 2c reside in dbspace3.)


Tables and dbspaces


The ability to choose the dbspace where a table is created allows you to:
• Manage and balance I/O requirements for the table and database
• Limit the amount of space available to a table
• Logically group tables to increase the granularity of backup and recovery


Creating a table
CREATE TABLE orders(
    order_num    SERIAL NOT NULL,
    customer_num INTEGER,
    order_date   DATE              -- columns and data types
)
IN dbspace1                        -- location of the table
EXTENT SIZE 64                     -- size of the initial extent in kilobytes
NEXT SIZE 32                       -- size of each subsequent extent in kilobytes
LOCK MODE ROW;                     -- lock level


Creating a table
The CREATE TABLE statement:
• Assigns a name to the table that is unique within the database
• Inserts the table and column information into the systables and syscolumns
system catalog tables
• Allocates contiguous storage space, as specified by the EXTENT SIZE clause, in
the database dbspace or the dbspace specified
• Sets the lock level for the table
The example CREATE TABLE statement creates a table named orders with three
columns (order_num of data type SERIAL, customer_num of data type INTEGER, and
order_date of data type DATE). The order_num column is a required value (NOT
NULL); the customer_num column and the order_date column are optional. The table is
placed in the dbspace dbspace1. Initially, 64 kilobytes are allocated for the first extent.
Each successively added extent is 32 kilobytes in size. Table locking is performed at
the row level.


Creating a table: Simple large objects


CREATE TABLE evaluation(
employee_num SERIAL,
manager_num INTEGER,
emp_eval_form TEXT IN blobspace1,
emp_picture BYTE IN blobspace2,
emp_phrase TEXT IN TABLE
)
IN dbspace2;


Creating a table: Simple large objects


Simple large objects can be stored in:
• Blobspaces
• Table extents
In the example shown, the emp_eval_form and emp_picture columns are stored in
blobspaces, while the emp_phrase column is stored in table extents.


Creating a table: Smart large objects


CREATE TABLE movie(
movie_num INTEGER,
movie_title CHAR(50),
video BLOB,
audio BLOB,
description CLOB
)
PUT video IN (sbsp3),
audio IN (sbsp1),
description IN (sbsp5);


Creating a table: Smart large objects


Unless the PUT clause is used, the database server stores smart large objects in
the default sbspace, which is identified by the configuration parameter
SBSPACENAME.
In the example, three smart large object columns have been stored in three different
sbspaces: sbsp1, sbsp3, and sbsp5.


Creating a temporary table


• Temporary tables are created in dbspaces specifically created for
temporary objects
• These dbspaces are designated by:
 The DBSPACETEMP environment variable
 The DBSPACETEMP configuration parameter
• Must set configuration parameter TEMPTAB_NOLOG or use WITH NO
LOG clause in SQL
CREATE TEMP TABLE temp_order
(order_num INTEGER)
WITH NO LOG;
SELECT customer_num, company
FROM customer INTO TEMP cust_temp
WITH NO LOG;


Creating a temporary table


Informix allows a user to create an explicit temporary table that is like a permanent table
in every way, except that it exists only for the duration of the user session that creates it.
Once the user closes the database or terminates the session, the temporary table is
automatically dropped. The user can also drop a temporary table by using the DROP
TABLE statement.
You can create indexes on a temporary table, but you cannot alter the structure of a
temporary table. Instead, you must drop and recreate the temporary table.
You should create temporary tables in a dbspace that is specifically designated for
temporary tables within the database server. A dbspace is designated as temporary at
the time the dbspace is created. A temporary dbspace does not accommodate logging.
If you create temporary tables in logged databases, you should always use the WITH
NO LOG clause or set the configuration parameter TEMPTAB_NOLOG to '1' so that
the updates, inserts and deletes to the temporary tables are not logged. If the
configuration parameter is not set or the WITH NO LOG clause is not used in the SQL,
the temp table is created in the default database dbspace and all transactions to it are
logged.
Environment variable: DBSPACETEMP


The DBSPACETEMP environment variable can be set to one or more of the specifically
designated temporary dbspaces. If the DBSPACETEMP environment variable is not
set, the database server uses the value of the DBSPACETEMP configuration
parameter.
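
For example, in a Unix shell the environment variable can be set before starting a client session. The dbspace names below are illustrative and must name temporary dbspaces that already exist on the server:

```shell
# Colon-separated list of temporary dbspaces; no embedded spaces.
export DBSPACETEMP=tempdbs1:tempdbs2
echo "$DBSPACETEMP"
```

The server searches the listed dbspaces in round-robin fashion for temporary table space.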


DBCENTURY
• Today’s date: 10/31/1998
• Dates stored with different DBCENTURY settings

DBCENTURY    Date Entered    Date Entered    Date Entered
Setting      03/25/01        12/01/98        10/13/98
P            03/25/1901      12/01/1898      10/13/1998
F            03/25/2001      12/01/1998      10/13/2098
C            03/25/2001      12/01/1998      10/13/1998
R            03/25/1901      12/01/1998      10/13/1998


DBCENTURY
The environment variable DBCENTURY allows selection of the appropriate century
for two-digit year DATE and DATETIME values.
Acceptable values for DBCENTURY are: P, F, C, or R.

P    Past. The year is expanded with both the current and past centuries.
     The closest date before today's date is chosen.

F    Future. The year is expanded with both the current and future
     centuries. The closest date after today's date is chosen.

C    Closest. The past, present, and next centuries are used to expand the
     year value. The date closest to today's date is used.

R    Present. The present century is used to expand the year value.

The system default for DBCENTURY is R.


When a DBCENTURY value of P or F is set and today’s date is entered with a
two-digit year, the date expands to the past century or the future century,
respectively.
Today’s century is used when the keyword TODAY is substituted for today’s date.
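
For example, assuming a client environment with DBCENTURY=C, the default DBDATE format of MDY4/, and today's date of 10/31/1998, a two-digit year in a DATE value expands as shown in the table above (a sketch, not run against a live server):

CREATE TEMP TABLE d (dt DATE) WITH NO LOG;
INSERT INTO d VALUES ('12/01/98');
SELECT dt FROM d;    -- with DBCENTURY=C, stored and returned as 12/01/1998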


The DBSCHEMA utility


Example:
dbschema -d stores -t orders -ss

 The -d option specifies the database
 The -t option specifies a table
 The -ss option specifies dbspace, lock mode, and extent information


The DBSCHEMA utility


The dbschema utility displays the SQL statements (the schema) that are necessary to
replicate database objects.
dbschema options
You must specify the name of the database with the -d option.
Additional options, shown below, can also be included.

-t tabname    Only the specified table or view is included. Specify ALL in
              place of tabname for all tables.

-s synname    CREATE SYNONYM statements for the specified user (synname)
              are included. Specify ALL in place of synname for all
              synonyms.

-p username   Print only GRANT statements for the user listed. Specify ALL
              in place of username for all users.

-f stproc     Print the stored procedure listed. Specify ALL in place of
              stproc for all stored procedures.


-hd tabname   Displays distribution information. Specify ALL in place of
              tabname for all tables.

-r rolename   Generates CREATE ROLE and GRANT statements for the specified
              role. Specify ALL in place of rolename for all roles.

-ss           Generates database-server-specific information for the
              specified table, including the lock mode, extent sizes, and
              dbspace name.


If you specify a filename at the end of the command, all output is redirected to that file,
which can then be executed as an SQL script. Otherwise, output is sent to the standard
output destination.
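
For example, to capture the schema of the orders table from the stores database, including the server-specific -ss details, in a script that can later be replayed (a sketch; the file name is arbitrary, and newdb is a hypothetical target database):

$ dbschema -d stores -t orders -ss orders_schema.sql
$ dbaccess newdb orders_schema.sql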


Using ONCHECK and ONSTAT


• To view disk information about extents allocated to databases or
tables, execute:
oncheck -pe

• To view information about tables resident in shared memory, execute:
onstat -t


Using ONCHECK and ONSTAT


Informix provides the oncheck and onstat utilities to query information about your
database server, its disk usage, and memory usage.
Some commands can be executed only by user informix.
For a complete list of oncheck and onstat options, refer to the Informix Knowledge
Center at:
(https://www.ibm.com/support/knowledgecenter/SSGU8G_12.1.0/com.ibm.welcome.doc/welcome.htm)
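
Because oncheck -pe reports every extent in every chunk, its output is often redirected to a file or narrowed to one dbspace; for example (a sketch, using a dbspace name from earlier examples):

$ oncheck -pe > extents.out
$ oncheck -pe dbspace1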


Sysmaster table: sysdatabases


• The sysdatabases system catalog table contains:
 One row for each database created
 Columns of interest include:
− name: Database name
− partnum: Partition number for the systables table for the database
− flags: Type of logging used


Sysmaster table: sysdatabases


The sysmaster table sysdatabases describes each database that has been created in
the database server. Whenever a new database is created, a new entry is automatically
added to the sysdatabases table. The sysdatabases table includes information on each
database's name, partition number, and type of logging used.
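
For example, these columns can be inspected from any session with a query such as the following (a sketch; sysmaster:sysdatabases uses the database:table notation to reach the sysmaster database, and hex() renders the partition number in the form onstat displays):

SELECT name, hex(partnum), flags
FROM sysmaster:sysdatabases;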


System Catalog table: systables


• The systables system catalog table contains:
 One row for each table, view, synonym, and sequence
 Columns of interest include:
− tabname: Name of table
− tabid:Unique numeric table identifier (used in other system catalog tables to
reference the table)
− partnum: physical storage location code
− rowsize: Maximum size of the row in bytes


System Catalog table: systables


Table names are stored only in the systables table. In all other system catalog
tables, tables are referenced by their tabid, or table identifier. Therefore, one
important function of the systables table is to provide a link between table names
and tabids. Tabids are assigned sequentially and are unique. System catalog tables
are numbered 1 - 99; tabid values for user-defined tables are always greater than or
equal to 100.
The partnum field is the tblspace number (tblsnum) referenced in the onstat output.
It is used to uniquely identify the table. To find the tblsnum of a table, you can select
this information from the systables table. Because the tblsnum is shown in onstat
commands in hexadecimal, select the partnum value in hex format. This can be
done with the query:
SELECT tabname, hex(partnum) FROM SYSTABLES;


System Catalog table: syscolumns


• The syscolumns system catalog table contains:
 colname: The column name
 tabid: The unique table identifier
 colno: The sequence number of the column within the table
 coltype: The data type of column
 collength: The physical length


System Catalog table: syscolumns


The database server assigns column numbers sequentially within each table.
The coltype field contains a small integer value that identifies the data type of the
column as the $INFORMIXDIR/incl/esql/sqltypes.h file defines it. If the coltype field
contains a value that is greater than 256, that column does not allow null values. To
determine the data type for this column, subtract 256 from the value. For example, if
a column has a coltype value of 258, you subtract 256 to get 2, which indicates that
the column is an integer column with no nulls.
The colmin and colmax column values hold the second-smallest and second-largest
data values in the column.
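
Putting these columns together, the layout of a single table can be listed by joining syscolumns to systables on tabid; a sketch for the orders table used earlier:

SELECT c.colno, c.colname, c.coltype, c.collength
FROM systables t, syscolumns c
WHERE t.tabid = c.tabid
AND t.tabname = 'orders'
ORDER BY c.colno;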


Exercise 1
Create databases and tables
• create databases in specific locations
• create regular and temporary tables
• create tables for storing large objects
• use the dbschema utility for generating database and table schemas


Exercise 1: Create databases and tables


Exercise 1:
Create databases and tables

Purpose:
In this exercise, you will create databases and tables, including tables with
large object data types. You will also use the dbschema utility for generating
database and table schemas.

Task 1. Access the Informix system.


In this task, you will start the virtual machine and access the Informix 12.10 system.
If you are not using the standard class setup, skip this step and consult your
instructor for information on how to access the Informix system.
The virtual Windows system runs in VMWare.
1. Start the virtual machine by clicking on the run button, which is a right-facing
triangle icon. It may take a few minutes for the VMWare image to start. When
the status of the image is "Running", click on "Access this VM" (the
ix223_win12r2 link).
2. If the "Shutdown Event Tracker" dialog box is displayed, click Cancel.
3. If the "Server Manager" window is opened, click the X in the upper right-hand
corner to close it.
4. Double-click the "Docker Quickstart Terminal" icon on the Windows desktop.
The "Docker Quickstart Terminal" window will open, and docker will be started.
5. At the $ prompt, enter the following:
docker start iif_developer_edition
6. At the $ prompt, enter the following:
exit
7. Double click on the "putty" icon on the Windows desktop. The "PuTTY
Configuration" window will open.
8. Click on "Informix Server" and then click on the open button. A PuTTY window
will open.
9. At the "login as:" prompt, enter the following:
docker


10. At the "docker@localhost's password:" prompt, enter the following:


tcuser
11. At the "docker@default:~$" prompt, enter the following:
docker exec -it iif_developer_edition bash
12. At the "IDS-12.10 dev:" prompt, enter any valid Linux or Informix command. For
example:
dbaccess
13. When you are finished using the terminal window, enter the following at the
"IDS-12.10 dev:" prompt:
exit
Then enter the following at the "docker@default:~$" prompt:
exit
14. When you are finished using the virtual machine, click the X in the upper right-
hand corner. The "VMWare Workstation" dialog box will be displayed. Click on
either the "Suspend" button or the "Power off" button.
Task 2. Using dbaccess.
In this task, you will familiarize yourself with the dbaccess functions for entering,
running, saving, and loading SQL statements.
1. At the UNIX prompt, enter the following command:
dbaccess
This will open the dbaccess editor, with the menu at the top.
2. From the dbaccess menu, choose Query-language. You can either highlight it
and press enter, or just type in the letter q. The SELECT DATABASE prompt
will be displayed.
3. At the SELECT DATABASE prompt, select sysmaster@dev by highlighting it
and pressing Enter. The SQL menu will be displayed.
4. From the SQL menu, choose New. You can either highlight it and press Enter,
or just type in the letter n. The cursor will move to the typing area.
5. In the typing area, type in the following SQL statement:
SELECT * FROM SYSTABLE;
Refer to the top of the screen for editing options (ESC done editing, CTRL-X
delete a character, and so on).
6. After entering the SQL, press the ESC key on the keyboard. The SQL menu will
be displayed.
7. To execute the SQL, choose Run. You can either highlight it and press enter, or
just type in the letter r.


8. The SQL will be executed. Since it includes an error, an error message will be
displayed at the bottom of the screen. (There is no table in the database named
systable.) To correct the error, choose Modify from the menu. You can either
highlight it and press Enter, or just type in the letter m.
9. The cursor will be displayed in the typing area, as close to the error as possible.
Refer to the MODIFY menu at the top of the screen for editing options. Correct
the error so the SQL reads as follows:
SELECT * FROM SYSTABLES;
10. Press the ESC key on the keyboard. The SQL menu will be displayed.
11. Execute the SQL statement by choosing Run. If there are no further errors, the
first page of the results will be displayed.
12. Page through the output by choosing Next from the DISPLAY menu. You can
either highlight it and press enter, or just type in the letter n.
13. Continue paging through the output. When you have seen enough, choose Exit
from the DISPLAY menu. You can either highlight it and press enter, or just
type the letter e. The SQL menu will be displayed.
14. To save the SQL, choose Save from the SQL menu. You can either highlight it
and press Enter, or just type the letter s. The SAVE>> prompt will be displayed.
15. At the SAVE>> prompt, type in the filename into which you want to save the
SQL statement, in this case EXAMPLE, and then press Enter. The SQL menu
will be displayed.
16. To exit out of the editing options, choose the Exit option from the SQL menu.
You can either highlight it and press Enter, or just type the letter e. The
DBACCESS menu will be displayed.
17. To exit out of dbaccess, choose the Exit option from the DBACCESS menu. You
can either highlight it and press Enter, or just type the letter e. The UNIX prompt will
be displayed.
18. Enter dbaccess again. Choose the sysmaster database. Choose the Query-
language option. The SQL menu will be displayed.
19. To open an existing SQL file, select the Choose option from the SQL menu.
You can either highlight it and press Enter, or just type the letter c. A list of your
saved files will be displayed.
20. From the file list, choose the EXAMPLE file by highlighting it and pressing
Enter. The contents of the file will be loaded into the typing area. Use the
previously discussed commands to run the SQL and view the output.
21. Exit dbaccess and return to the UNIX prompt.


Task 3. Create a database.


In this task, you will create a database, identify its default dbspace location using
oncheck, drop the database, and recreate it in a specified dbspace.
1. Create a database called stores_demo and have it use buffered logging.
2. Use oncheck -pe to locate the dbspace where your database has been
created.
Is the database in the best location?
How do you specify the location of your database?
3. Drop your database using the following statement:
DROP DATABASE stores_demo;
4. Recreate your database in the dbspace1 dbspace. Remember to have it use
buffered logging. Hint: Use the IN dbspace_name clause with your CREATE
DATABASE statement.
5. Use oncheck -pe to locate the dbspace where your database has been
created.


Task 4. Create tables.


In this task, you will create and load four tables in your database using specified
dbspaces. You will use the oncheck utility to query the system catalogs to see how
extents were allocated. You will drop all the tables, calculate the proper extent size
usage, and recreate and reload the tables.
1. Make sure that you are in the /home/informix/labs directory. Create the SQL
scripts in individual files so you can save and modify them later.
2. Create a customer table in your database with the following columns:
Column name Description
customer_num Number starting at 101 and increasing by 1 for every customer.
fname Customer first name. Allow for a length of 15.
lname Customer last name. Allow for a length of 15.
company Company name. Allow for a length of 20.
address1 First address line for the customer. Allow for a length of 20.
address2 Second address line for the customer. Allow for a length of 20.
city City where the customer lives. Allow for a length of 15.
state State where the customer lives. Allow for a length of 2.
zipcode Customer zipcode. Allow for a length of 5.
3. Create a stock table in your database with the following columns:
Column name Description
stock_num Manufacturer stock number that identifies the specific item. It
is a number, no greater than 1,000.
manu_code Manufacturer code for the item. The code is 3 characters in
length.
description Item description. Allow for a length of 15.
unit_price Item price per unit. It has a maximum of 6 digits, including 2
decimal places.
unit Unit by which the item is ordered. Allow for a length of 4.
unit_descr Description of the unit. Allow for a length of 15.


4. Create an orders table in your database with the following columns:


Column name Description
order_num Number starting at 1001 and increasing by 1 for every order.
order_date Date the order is placed.
customer_num Customer number this order belongs to. This refers to the
customer number in the customer table.
po_num Customer purchase order number. It can contain letters and
numbers. Allow for a length of 10.
ship_date Date the order is shipped.
ship_weight Shipping weight of the order. It contains a maximum of 8 digits,
including 2 decimal places.
ship_charge Shipping charge amount. It contains a maximum of 6 digits,
including 2 decimal places.
paid_date Date the order is paid.
5. Create an items table in your database with the following columns:
Column name    Description
item_num Numeric identifying the individual line number of this item. It has a
maximum value of 1,000.
order_num Order number this item belongs to. This refers to the order number
in the orders table.
stock_num Stock number for the item. This refers to the stock number in the
stock table.
manu_code Manufacturer code for the item ordered. This refers to the
manufacturer code in the stock table.
quantity Quantity ordered. This has a maximum value of 2,000.
total_price Quantity ordered times unit price (from the stock table). Allow for a
maximum of 8 digits including 2 decimal places.
6. Run the load script load.sql. This script loads the tables you created. Use the
following command to execute the load script:
$ dbaccess stores_demo load.sql


7. Use oncheck -pe to determine the location of the table extents. Save the output
to a file to compare later.
Hint: If you do not specify the dbspace when creating a table, where is the table
stored?
Hint: When using oncheck to create an output file of your tables for a specific
dbspace, use the following command:
$ oncheck -pe dbspace_name > filename
Have the tables been efficiently located and loaded in the best possible
manner?
Since your tables are already created, you can query systables table to get
information that is needed to calculate extent sizes. What information would that
be?
8. Drop each of your tables using the following command:
DROP TABLE table_name;
9. Use the following information to recreate your customer table:
• Calculate the first extent size to initially store 1200 rows.
• Calculate a next extent size assuming that your table will grow approximately
10 percent per year for one year.
• Use row-level locking.
• Locate the table in dbspace2.
10. Use the following information to recreate your orders table:
• Calculate the first extent size to initially store 500 rows.
• Calculate a next extent size assuming that your table will grow approximately
10 percent per year for one year.
• Use row-level locking.
• Locate the table in dbspace3.


11. Use the following information to recreate your items table:


• Calculate the first extent size to initially store 500 rows.
• Calculate a next extent size assuming that your table will grow approximately
10 percent per year for one year.
• Use row-level locking.
• Locate the table in dbspace4.
12. Use the following information to recreate your stock table:
• Calculate the first extent size to initially store 74 rows.
• Calculate a next extent size assuming that your table will grow approximately
10 percent per year for one year.
• Use row-level locking.
• Locate the table in dbspace4.
13. Use the load script to load your tables again.
$ dbaccess stores_demo load.sql
14. Run the following commands on each table:
• Use oncheck -pe to check your table to see how the extents were allocated.
Compare how extents were allocated when defaults were used compared to
their current layout.
• Query the systables system catalog for the customer table for the following:
• First extent size
• Next extent size
• Type of table locking
• Row size
Is the row size correct? Why or why not?


Task 5. Create a table for a simple large object.


You will create and load a table for a simple large object in a specified dbspace.
1. Create a catalog table in your database in the dbspace4 dbspace with the
following columns:
Column Name    Description
catalog_num Number starting at 10001 and increasing by 1 for every catalog
entry.
stock_num Stock number for the item. This refers to the stock number in the
stock table.
cat_descr Description of item. This is a text description and can be large.
cat_picture Picture of item. This is an image.
cat_advert Tag line below the picture. The tag line can be as many as 255
characters long, but is never fewer than 65.
Put the simple large objects in the table. Be sure and calculate the extent sizes
as in the previous task, allowing for 74 rows.
2. Load the catalog table using the load script loadcat.sql.
$ dbaccess stores_demo loadcat.sql
Task 6. Create a temporary table.
In this task, you will be using two windows: one window will be used to create a
temporary table and the other to monitor the temporary table. First, however, you
must make sure that the database engine is configured to recognize the temporary
dbspaces.
1. Bring the engine offline by issuing the following command:
$ onmode -ky
2. Edit the onconfig file and make sure that the DBSPACETEMP parameter has
your two temporary dbspaces listed as shown below. Important: Do not put any
spaces or other white space in the list of dbspace names.
DBSPACETEMP tempdbs1:tempdbs2
3. Bring the engine online by issuing the following command:
$ oninit
4. Open two windows. In the first window, open a dbaccess session and create a
temporary table called cust_temp. Make it a no logging table. The table
contains customer_num, fname, and lname from the customer table.
Important: Do not exit this dbaccess session. You will continue with it in a later
exercise.
5. In the second window, run oncheck -pe to see the temporary table usage.


Task 7. Use dbschema.


In this task, you will create files containing the statements to recreate the tables
from your database. Be sure to save your SQL scripts so you can use them again in
future tasks.
Use the dbschema utility to create the following files:
1. Create a file called customer_schema.sql, which will contain the SQL
statements necessary to recreate the customer table. Be sure to use the -ss
option to capture lock level and dbspace information.
2. Create a file called orders_schema.sql, which will contain the SQL statements
necessary to recreate the orders table.
3. Create a file called items_schema.sql, which will contain the SQL statements
necessary to recreate the items table.
4. Create a file called stock_schema.sql, which will contain the SQL statements
necessary to recreate the stock table.
5. Create a file called catalog_schema.sql, which will contain the SQL
statements necessary to recreate the catalog table.
Results:
In this exercise, you created databases and tables, including tables with large
object data types. You also used the dbschema utility for generating database
and table schemas.


Exercise 1:
Create databases and tables - Solutions

Purpose:
In this exercise, you will create databases and tables, including tables with
large object data types. You will also use the dbschema utility for generating
database and table schemas.

Task 1. Access the Informix system.


In this task, you will start the virtual machine and access the Informix 12.10 system.
If you are not using the standard class setup, skip this step and consult your
instructor for information on how to access the Informix system.
1. The virtual Windows system runs in VMWare. Start the virtual machine by
clicking on "Power on this virtual machine".
2. If the "Shutdown Event Tracker" dialog box is displayed, click Cancel.
3. If the "Server Manager" window is opened, click the X in the upper right-hand
corner to close it.
4. Double click the "Docker Quickstart Terminal" icon on the Windows desktop.
The "Docker Quickstart Terminal" window will open, and docker will be started.
5. At the $ prompt, enter the following:
docker start iif_developer_edition
6. At the $ prompt, enter the following:
exit
7. To open a terminal window on the Informix system, double click on the "putty"
icon on the Windows desktop.
The "PuTTY Configuration" window will open.
8. Click on "Informix Server" and then click on the open button.
A PuTTY window will open.
9. At the "login as:" prompt, enter the following:
docker
10. At the "docker@localhost's password:" prompt, enter the following:
tcuser
11. At the "docker@default:~$" prompt, enter the following:
docker exec -it iif_developer_edition bash


12. At the "IDS-12.10 dev:" prompt, enter any valid Linux or Informix command. For
example,
dbaccess
13. When you are finished using the terminal window, enter the following at the
"IDS-12.10 dev:" prompt:
exit
Then enter the following at the "docker@default:~$" prompt:
exit
14. When you are finished using the virtual machine, click the X in the upper right-
hand corner. The "VMWare Workstation" dialog box will be displayed. Click on
either the "Suspend" button or the "Power off" button.
Task 2. Using dbaccess.
In this task, you will familiarize yourself with the dbaccess functions for entering,
running, saving, and loading SQL statements.
1. At the UNIX prompt, enter the following command:
dbaccess
This will open the dbaccess editor, with the menu at the top.
2. From the dbaccess menu, choose Query-language. You can either highlight it
and press enter, or just type in the letter q. The SELECT DATABASE prompt
will be displayed.
3. At the SELECT DATABASE prompt, select sysmaster@dev by highlighting it
and pressing Enter. The SQL menu will be displayed.
4. From the SQL menu, choose New. You can either highlight it and press Enter,
or just type in the letter n. The cursor will move to the typing area.
5. In the typing area, type in the following SQL statement:
SELECT * FROM SYSTABLE;
Refer to the top of the screen for editing options (ESC done editing, CTRL-X
delete a character, and so on).
6. After entering the SQL, press the ESC key on the keyboard. The SQL menu will
be displayed.
7. To execute the SQL, choose Run. You can either highlight it and press enter, or
just type in the letter r.
8. The SQL will be executed. Since it includes an error, an error message will be
displayed at the bottom of the screen. (There is no table in the database named
systable.) To correct the error, choose Modify from the menu. You can either
highlight it and press Enter, or just type in the letter m.


9. The cursor will be displayed in the typing area, as close to the error as possible.
Refer to the MODIFY menu at the top of the screen for editing options. Correct
the error so the SQL reads as follows:
SELECT * FROM SYSTABLES;
10. Press the ESC key on the keyboard. The SQL menu will be displayed.
11. Execute the SQL statement by choosing Run. If there are no further errors, the
first page of the results will be displayed.
12. Page through the output by choosing Next from the DISPLAY menu. You can
either highlight it and press enter, or just type in the letter n.
13. Continue paging through the output. When you have seen enough, choose Exit
from the DISPLAY menu. You can either highlight it and press enter, or just
type the letter e. The SQL menu will be displayed.
14. To save the SQL, choose Save from the SQL menu. You can either highlight it
and press Enter, or just type the letter s. The SAVE>> prompt will be displayed.
15. At the SAVE>> prompt, type in the filename into which you want to save the
SQL statement, in this case EXAMPLE, and then press Enter. The SQL menu
will be displayed.
16. To exit out of the editing options, choose the Exit option from the SQL menu.
You can either highlight it and press Enter, or just type the letter e. The
DBACCESS menu will be displayed.
17. To exit out of dbaccess, choose the Exit option from the DBACCESS menu. You
can either highlight it and press Enter, or just type the letter e. The UNIX prompt
will be displayed.
18. Enter dbaccess again. Choose the sysmaster database. Choose the Query-
language option. The SQL menu will be displayed.
19. To open an existing SQL file, select the Choose option from the SQL menu.
You can either highlight it and press Enter, or just type the letter c. A list of your
saved files will be displayed.
20. From the file list, choose the EXAMPLE file by highlighting it and pressing
Enter. The contents of the file will be loaded into the typing area. Use the
previously discussed commands to run the SQL and view the output.
21. Exit dbaccess and return to the UNIX prompt.


Task 3. Create a database.


In this task, you will create a database, identify its default dbspace location using
oncheck, drop the database, and recreate it in a specified dbspace.
1. Create a database called stores_demo and have it use buffered logging.
CREATE DATABASE stores_demo WITH BUFFERED LOG
2. Use oncheck -pe to locate the dbspace where your database has been
created.
oncheck -pe | more
Is the database in the best location?
No, the database is located in the rootdbs, where the system databases
used to maintain the database server are stored.
How do you specify the location of your database?
Use the IN dbspace_name clause with your CREATE DATABASE
statement.
3. Drop your database using the following statement (you must have another
database selected first, to remove the ‘stores_demo’ database):
DROP DATABASE stores_demo;
4. Recreate your database in the dbspace1 dbspace. Remember to have it use
buffered logging. Hint: Use the IN dbspace_name clause with your CREATE
DATABASE statement.
CREATE DATABASE stores_demo IN dbspace1 WITH BUFFERED LOG;
5. Use oncheck -pe to locate the dbspace where your database has been
created.
$ oncheck -pe dbspace1 | more


Task 4. Create tables.


In this task, you will create and load four tables in your database using specified
dbspaces. You will use the oncheck utility to query the system catalogs to see how
extents were allocated. You will drop all the tables, calculate the proper extent size
usage, and recreate and reload the tables.
1. Make sure that you are in the /home/informix/labs directory. Create the SQL
scripts in individual files so you can save and modify them later.
$ cd /home/informix/labs
***From this point in the Exercise, coding is done in ‘dbaccess’***
2. Create a customer table (type the code and Run) in your database with the
following columns:
Column name Description
customer_num Number starting at 101 and increasing by 1 for
every customer.
fname Customer first name. Allow for a length of 15.
lname Customer last name. Allow for a length of 15.
company Company name. Allow for a length of 20.
address1 First address line for the customer.
Allow for a length of 20.
address2 Second address line for the customer.
Allow for a length of 20.
city City where the customer lives.
Allow for a length of 15.
state State where the customer lives.
Allow for a length of 2.
zipcode Customer zipcode. Allow for a length of 5.
CREATE TABLE customer (
customer_num SERIAL(101),
fname CHAR(15),
lname CHAR(15),
company CHAR(20),
address1 CHAR(20),
address2 CHAR(20),
city CHAR(15),
state CHAR(2),
zipcode CHAR(5)
)
;


3. Create a stock table (type the code and Run) in your database with the
following columns:
Column Description
name
stock_num Manufacturer stock number that identifies the specific item. It is a
number, no greater than 1,000.
manu_code Manufacturer code for the item. The code is 3 characters in length.
description Item description. Allow for a length of 15.
unit_price Item price per unit. It has a maximum of 6 digits, including 2
decimal places.
unit Unit by which the item is ordered. Allow for a length of 4.
unit_descr Description of the unit. Allow for a length of 15.
CREATE TABLE stock (
stock_num SMALLINT,
manu_code CHAR(3),
description CHAR(15),
unit_price MONEY(6,2),
unit CHAR(4),
unit_descr CHAR(15)
)
;


4. Create an orders table (type the code and Run) in your database with the
following columns:
Column name Description
order_num Number starting at 1001 and increasing by 1 for every order.
order_date Date the order is placed.
customer_num Customer number this order belongs to. This refers to the
customer number in the customer table.
po_num Customer purchase order number. It can contain letters and
numbers. Allow for a length of 10.
ship_date Date the order is shipped.
ship_weight Shipping weight of the order. It contains a maximum of 8 digits,
including 2 decimal places.
ship_charge Shipping charge amount. It contains a maximum of 6 digits,
including 2 decimal places.
paid_date Date the order is paid.
CREATE TABLE orders (
order_num SERIAL(1001),
order_date DATE,
customer_num INTEGER,
po_num CHAR(10),
ship_date DATE,
ship_weight DECIMAL(8,2),
ship_charge MONEY(6,2),
paid_date DATE
)
;


5. Create an items table (type the code and Run) in your database with the
following columns:
Column Description
name
item_num Numeric identifying the individual line number of this item. It has a
maximum value of 1,000.
order_num Order number this item belongs to. This refers to the order number
in the orders table.
stock_num Stock number for the item. This refers to the stock number in the
stock table.
manu_code Manufacturer code for the item ordered. This refers to the
manufacturer code in the stock table.
quantity Quantity ordered. This has a maximum value of 2,000.
total_price Quantity ordered times unit price (from the stock table). Allow for a
maximum of 8 digits including 2 decimal places.
CREATE TABLE items (
item_num SMALLINT,
order_num INTEGER,
stock_num SMALLINT,
manu_code CHAR(3),
quantity SMALLINT,
total_price MONEY(8,2)
)
;
6. Run the load script load.sql. This script loads the tables you created. Use the
following command to execute the load script:
$ dbaccess stores_demo load.sql
Use oncheck -pe to determine the location of the table extents. Save the output
to a file to compare later.
Hint: If you do not specify the dbspace when creating a table, where is the table
stored? It is stored in the same dbspace as the database (in this case,
dbspace1).
Hint: When using oncheck to create an output file of your tables for a specific
dbspace, use the following command:
$ oncheck -pe dbspace_name > filename

For example:
$ oncheck -pe dbspace1 > D1T4S6 (represents Demo 1, Task 4, Step 6)


Have the tables been efficiently located and loaded in the best possible
manner?
No, the tables are all stored in dbspace1, and may have multiple
fragments interleaved with fragments of other tables.
Since your tables are already created, you can query the systables table to get
the information that is needed to calculate extent sizes. What information would
that be? Note: nrows may show up as 0 since UPDATE STATISTICS has not
been run yet.

Rowsize, number of columns, and number of rows help in calculating the
extent sizes.
SELECT rowsize, ncols, nrows
FROM systables
WHERE tabname = 'table_name';

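As a cross-check on the rowsize value that systables reports, the per-column storage sizes used in this exercise's arithmetic can be added up by hand. The sketch below is an illustration only (not part of the course materials); it assumes the sizes used in the worked solutions later in this task: CHAR(n) takes n bytes, SMALLINT 2, INTEGER, SERIAL, and DATE 4 each, DECIMAL/MONEY(p,s) roughly p/2 + 1 bytes, plus 4 bytes per row for the slot-table entry.

```python
# Per-row storage arithmetic for the lab tables, using the byte sizes
# assumed in this exercise's worked solutions.
CHAR = lambda n: n                      # CHAR(n): n bytes
SMALLINT, INTEGER, SERIAL, DATE = 2, 4, 4, 4

def DECIMAL(p):                         # DECIMAL/MONEY(p,s): about p/2 + 1 bytes
    return p // 2 + 1

SLOT = 4                                # slot-table entry added per row

def rowsize(*cols):
    return sum(cols) + SLOT

# customer: SERIAL(101) plus the CHAR columns defined in step 2
customer = rowsize(SERIAL, CHAR(15), CHAR(15), CHAR(20),
                   CHAR(20), CHAR(20), CHAR(15), CHAR(2), CHAR(5))

# items: SMALLINT, INTEGER, SMALLINT, CHAR(3), SMALLINT, MONEY(8,2)
items = rowsize(SMALLINT, INTEGER, SMALLINT, CHAR(3), SMALLINT, DECIMAL(8))

print(customer, items)                  # 120 and 22, matching the solutions below
```

The totals agree with the rowsize figures used in the extent calculations in the steps that follow.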
7. Drop each of your tables using the following command:


DROP TABLE table_name;
8. Use the following information to recreate your customer table:
• Calculate the first extent size to initially store 1200 rows.
• Calculate a next extent size assuming that your table will grow approximately
10 percent per year for one year.
• Use row-level locking.
• Locate the table in dbspace2.


CREATE TABLE customer (
customer_num SERIAL(101),
fname CHAR(15),
lname CHAR(15),
company CHAR(20),
address1 CHAR(20),
address2 CHAR(20),
city CHAR(15),
state CHAR(2),
zipcode CHAR(5)
)
IN dbspace2
EXTENT SIZE 150 NEXT SIZE 16
LOCK MODE ROW;
The name, company, and address columns can be either CHAR or VARCHAR.
If the column is likely to be indexed, it is best to use a CHAR data type.
Assuming the structure given above and a 2-kilobyte page size, the initial extent
calculation is:
-Initial rows = 1200
-Rowsize = 4 + 15 + 15 + 20 + 20 + 20 + 15 + 2 + 5 + 4 (slot) = 120
-Pageuse = 2048 - 28 = 2020
Since rowsize is less than the page size:
-Rows per page = (pageuse/rowsize) = 2020 / 120 = 16.83 ==> 16
-Number of data pages required = 1200 / 16 = 75
-Number of kilobytes required = (75 * 2048) / 1024 = 150
The EXTENT SIZE is 150.
-Assuming that the table will grow by 10 percent per year, the DBA can
calculate the NEXT SIZE by applying the calculations above for growth
rows:
-Number of rows at 10 percent growth = 1200 / 10 = 120
-Number of data pages required = 120 / 16 = 7.5 ==> 8
-Number of kilobytes required = (8 * 2048) / 1024 = 16
The NEXT SIZE is 16.
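The extent arithmetic above follows a fixed recipe: usable space on a 2-kilobyte page is 2048 - 28 = 2020 bytes, rows per page round down, pages round up, and extents have a 4-page minimum. A small helper (an illustration only; the names are made up, not Informix utilities) reproduces the numbers for this exercise:

```python
import math

PAGESIZE = 2048           # 2-kilobyte pages, as on this course's server
PAGEUSE = PAGESIZE - 28   # usable bytes per page after the page header
MIN_PAGES = 4             # minimum extent size enforced by the server

def extent_kb(nrows, rowsize):
    """Kilobytes needed to hold nrows rows of the given rowsize."""
    rows_per_page = PAGEUSE // rowsize        # round down
    pages = math.ceil(nrows / rows_per_page)  # round up
    pages = max(pages, MIN_PAGES)             # apply the 4-page minimum
    return pages * PAGESIZE // 1024

# customer: 1200 initial rows, rowsize 120; next extent covers 10% growth
print(extent_kb(1200, 120), extent_kb(120, 120))  # EXTENT SIZE 150, NEXT SIZE 16

# orders: 500 initial rows, rowsize 43; the next extent hits the 4-page minimum
print(extent_kb(500, 43), extent_kb(50, 43))      # EXTENT SIZE 22, NEXT SIZE 8
```

The same function reproduces the items (12/8) and stock (8/8) results in the steps that follow.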


9. Use the following information to recreate your orders table:


• Calculate the first extent size to initially store 500 rows.
• Calculate a next extent size assuming that your table will grow approximately
10 percent per year for one year.
• Use row-level locking.
• Locate the table in dbspace3.
CREATE TABLE orders (
order_num SERIAL(1001),
order_date DATE,
customer_num INTEGER,
po_num CHAR(10),
ship_date DATE,
ship_weight DECIMAL(8,2),
ship_charge MONEY(6,2),
paid_date DATE
)
IN dbspace3
EXTENT SIZE 22 NEXT SIZE 8
LOCK MODE ROW;
Assuming the structure given above and a 2-kilobyte page size, the initial extent
calculation is:
-Initial rows = 500
-Rowsize = 4 + 4 + 4 + 10 + 4 + 5 + 4 + 4 + 4 (slot) = 43
-Pageuse = 2048 - 28 = 2020
Since rowsize is less than the page size:
-Rows per page = (pageuse/rowsize) = 2020 / 43 = 46.97 ==> 46
-Number of data pages required = 500 / 46 = 10.869 ==> 11
-Number of kilobytes required = (11 * 2048) / 1024 = 22
The EXTENT SIZE is 22.
Assuming that the table will grow by 10 percent per year, the DBA can calculate
the NEXT SIZE by applying the calculations above for growth rows:
-Number of rows at 10 percent growth = 500 / 10 = 50
-Number of data pages required = 50 / 46 = 1.0869 ==> 2
However, the minimum extent size is 4 pages, so the NEXT SIZE must be:
-Number of kilobytes required = (4 * 2048) / 1024 = 8
The NEXT SIZE is 8.


10. Use the following information to recreate your items table:


• Calculate the first extent size to initially store 500 rows.
• Calculate a next extent size assuming that your table will grow
approximately 10 percent per year for one year.
• Use row-level locking.
• Locate the table in dbspace4.
CREATE TABLE items (
item_num SMALLINT,
order_num INTEGER,
stock_num SMALLINT,
manu_code CHAR(3),
quantity SMALLINT,
total_price MONEY(8,2)
)
IN dbspace4
EXTENT SIZE 12 NEXT SIZE 8
LOCK MODE ROW;

Assuming the structure given above and a 2-kilobyte page size, the initial extent
calculation is:
-Initial rows = 500
-Rowsize = 2 + 4 + 2 + 3 + 2 + 5 + 4 (slot) = 22
-Pageuse = 2048 - 28 = 2020
Since rowsize is less than the page size:
-Rows per page = (pageuse/rowsize) = 2020 / 22 = 91.81 ==> 91
-Number of data pages required = 500 / 91 = 5.494 ==> 6
-Number of kilobytes required = (6 * 2048) / 1024 = 12
The EXTENT SIZE is 12.
Assuming that the table will grow by 10 percent per year, the DBA can calculate
the NEXT SIZE by applying the calculations above for growth rows:
-Number of rows at 10 percent growth = 500 / 10 = 50
-Number of data pages required = 50 / 91 = 0.549 ==> 1
However, the minimum extent size is 4 pages, so the NEXT SIZE must be:
-Number of kilobytes required = (4 * 2048) / 1024 = 8
The NEXT SIZE is 8.


11. Use the following information to recreate your stock table:


• Calculate the first extent size to initially store 74 rows.
• Calculate a next extent size assuming that your table will grow
approximately 10 percent per year for one year.
• Use row-level locking.
• Locate the table in dbspace4.
CREATE TABLE stock (
stock_num SMALLINT,
manu_code CHAR(3),
description CHAR(15),
unit_price MONEY(6,2),
unit CHAR(4),
unit_descr CHAR(15)
)
IN dbspace4
LOCK MODE ROW;

Assuming the structure given above and a 2-kilobyte page size, the initial extent
calculation is:
-Initial rows = 74
-Rowsize = 2 + 3 + 15 + 4 + 4 + 15 + 4 (slot) = 47
-Pageuse = 2048 - 28 = 2020
Since rowsize is less than the page size:
-Rows per page = (pageuse/rowsize) = 2020 / 47 = 42.98 ==> 42
-Number of data pages required = 74 / 42 = 1.761 ==> 2
-Number of kilobytes required = (2 * 2048) / 1024 = 4
However, the minimum extent size is 4 pages, so the EXTENT SIZE must be:
-Number of kilobytes required = (4 * 2048) / 1024 = 8
The EXTENT SIZE is 8.
Assuming that the table will grow by 10 percent per year, the DBA can calculate
the NEXT SIZE by applying the calculations above for growth rows:
-Number of rows at 10 percent growth = 74 / 10 = 7.4
-Number of data pages required = 7.4 / 42 = 0.176 ==> 1
However, the minimum extent size is 4 pages, so the NEXT SIZE must be:
-Number of kilobytes required = (4 * 2048) / 1024 = 8


The NEXT SIZE is 8.


Notice that the EXTENT SIZE and NEXT SIZE have not been included in the
CREATE TABLE syntax. Why?
The default size for table extents is 8 pages. Therefore, on a system with a 2-
kilobyte page size, the default sizes for EXTENT SIZE and NEXT SIZE would
be 16 kilobytes each. This is plenty of room to store the rows for the stock table
and, therefore, the explicit inclusion of EXTENT SIZE and NEXT SIZE is not
required.
12. Use the load script to load your tables again.
$ dbaccess stores_demo load.sql
13. Run the following commands on each table:
Use oncheck -pe to check your table to see how the extents were allocated.
Compare how extents were allocated when defaults were used compared to
their current layout.
$ oncheck -pe dbspace_name > filename

For example:
$ oncheck -pe dbspace1 > D1T4S13 (represents Demo 1, Task 4, Step 13)

• Query the systables system catalog for the customer table for the
following:
• First extent size
• Next extent size
• Type of table locking
• Row size
Is the row size correct? Why or why not?
The rowsize is correct for the data columns, but it does not include the
extra four bytes for the slot table entry.
SELECT * FROM systables
WHERE tabname = "customer";


Task 5. Create a table for a simple large object.


You will create and load a table for a simple large object in a specified dbspace.
1. Create a catalog table in your database in the dbspace4 dbspace with the
following columns:
Column Description
Name
catalog_num Number starting at 10001 and increasing by 1 for every catalog
entry.
stock_num Stock number for the item. This refers to the stock number in the
stock table.
cat_descr Description of item. This is a text description and can be large.
cat_picture Picture of item. This is an image.
cat_advert Tag line below the picture. The tag line can be as many as 255
characters long, but is never fewer than 65.
Store the simple large objects in the table. Be sure to calculate the extent sizes
as in the previous task, allowing for 74 rows.
CREATE TABLE catalog (
catalog_num SERIAL(10001),
stock_num SMALLINT,
cat_descr TEXT,
cat_picture BYTE,
cat_advert VARCHAR(255,65)
)
IN dbspace4
EXTENT SIZE 30 NEXT SIZE 8
LOCK MODE ROW;


Assuming the structure given above and a 2-kilobyte page size, the initial extent
calculation is:
Initial rows = 74
Rowsize = 4 + 2 + 56 + 56 + 256 + 4 (slot) = 378
Use the maximum length value for the cat_advert column. In this case it is 255
bytes, plus 1 for the length byte for the VARCHAR.
Pageuse = 2048 - 28 = 2020
Since rowsize is less than the page size:
Rows per page = (pageuse/rowsize) = 2020 / 378 = 5.34 ==> 5
Number of data pages required = 74 / 5 = 14.8 ==> 15
Number of kilobytes required = (15 * 2048) / 1024 = 30
The EXTENT SIZE is 30.
Assuming that the table will grow by 10 percent per year, the DBA can calculate
the NEXT SIZE by applying the calculations above for growth rows:
Number of rows at 10 percent growth = 74 / 10 = 7.4
Number of data pages required = 7.4 / 5 = 1.48 ==> 2
However, the minimum extent size is 4 pages, so the NEXT SIZE must be:
Number of kilobytes required = (4 * 2048) / 1024 = 8
The NEXT SIZE is 8.
This only calculates the extent sizes for the data rows, and does not include the
storage space required for the cat_descr and cat_picture objects themselves.
These objects will be stored in the same tablespace as the data rows, so their
size estimates must also be included.
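The same arithmetic applies here; the only new ingredients are the fixed 56-byte in-row descriptor counted for each TEXT or BYTE column and the maximum length plus one length byte counted for the VARCHAR. A sketch of this task's numbers (an illustration only, using the sizes stated above):

```python
import math

# Per-row bytes for catalog, per this task's assumptions:
# SERIAL 4, SMALLINT 2, TEXT/BYTE 56-byte descriptors each,
# VARCHAR(255,65) counted at max length 255 + 1 length byte, slot entry 4.
rowsize = 4 + 2 + 56 + 56 + (255 + 1) + 4   # = 378

pageuse = 2048 - 28                          # usable bytes per 2 KB page
rows_per_page = pageuse // rowsize           # 2020 // 378 = 5
pages = math.ceil(74 / rows_per_page)        # 15 pages for 74 rows

print(rowsize, pages * 2048 // 1024)         # 378 and EXTENT SIZE 30
```

As the note above says, this covers the data rows only; the TEXT and BYTE objects themselves need additional space in the same tblspace.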
2. Load the catalog table using the load script loadcat.sql.
$ dbaccess stores_demo loadcat.sql


Task 6. Create a temporary table.


In this task, you will be using two windows: one window will be used to create a
temporary table (and write scripts) and the other to monitor the temporary table.
First, however, you must make sure that the database engine is configured to
recognize the temporary dbspaces.
1. Bring the engine offline by issuing the following command:
$ onmode -ky
2. Edit the onconfig file (using the vi editor) and make sure that the
DBSPACETEMP parameter has your two temporary dbspaces listed as shown
below. Important: Do not put any spaces or other white space in the list of
dbspace names.
DBSPACETEMP tempdbs1:tempdbs2
$ vi $INFORMIXDIR/etc/$ONCONFIG
3. Bring the engine online by issuing the following command:
$ oninit
4. Open a second PuTTY session window and log in (user: docker, password:
tcuser). Run the command: docker exec -it iif_developer_edition bash. In the
first window, open a dbaccess session and create a temporary table called
cust_temp. Make it a no-logging table. The table contains customer_num,
fname, and lname from the customer table.
Important: Do not exit this dbaccess session. You will continue with it in a later
exercise.
SELECT customer_num, fname, lname
FROM customer
INTO TEMP cust_temp WITH NO LOG;
5. In the second window, run oncheck -pe to see the temporary table usage.
$ oncheck -pe tempdbs1
The oncheck -pe tempdbs1 output displays the dbspace mapping with the
cust_temp temporary table using 16 pages in the tempdbs1 temporary
dbspace.


Task 7. Use dbschema.


In this task, you will create files containing the statements to recreate the tables from
your database. Be sure to save your SQL scripts so you can use them again in future
tasks.
Use the dbschema utility to create the following files:
1. From the second window, create a file called customer_schema.sql, which will
contain the SQL statements necessary to recreate the customer table. Be sure
to use the -ss option to capture lock level and dbspace information.
$ dbschema -d stores_demo -t customer -ss customer_schema.sql
2. Create a file called orders_schema.sql, which will contain the SQL statements
necessary to recreate the orders table.
$ dbschema -d stores_demo -t orders -ss orders_schema.sql
3. Create a file called items_schema.sql, which will contain the SQL statements
necessary to recreate the items table.
$ dbschema -d stores_demo -t items -ss items_schema.sql
4. Create a file called stock_schema.sql, which will contain the SQL statements
necessary to recreate the stock table.
$ dbschema -d stores_demo -t stock -ss stock_schema.sql
5. Create a file called catalog_schema.sql, which will contain the SQL
statements necessary to recreate the catalog table.
$ dbschema -d stores_demo -t catalog -ss catalog_schema.sql
Results:
In this exercise, you created databases and tables, including accounting for
large object data types. You also used the dbschema utility for generating
database and table schemas.


Unit summary
• Review prerequisites
• Create databases and tables
• Determine database logging and storage requirements
• Locate where the database server stores a table on disk
• Create temporary tables
• Locate where the database server stores temporary tables
• Use the system catalog tables to gather information
• Use the dbschema utility

Creating databases and tables © Copyright IBM Corporation 2017


Unit 2 Altering and dropping databases and tables

Informix (v12.10)

© Copyright IBM Corporation 2017



Unit objectives
• Drop a database
• Drop a table
• Alter a table
• Convert a simple large object to a smart large object

Altering and dropping databases and tables © Copyright IBM Corporation 2017



Altering a table
• The ALTER TABLE statement allows you to:
- Add new columns to the end of a table
- Add new columns before another column in the table
- Add a new column with a NOT NULL constraint or a default value
- Drop columns
- Add and drop integrity constraints
- Modify the definition of an existing column
- Change the size of successive extent allocations
- Change the lock mode for a table
• Must have exclusive access to table:
 Places exclusive lock
 Duration of lock depends on type of ALTER TABLE

Informix offers a robust set of ALTER TABLE capabilities. It uses one of three
sophisticated algorithms to execute ALTER TABLE statements. The three algorithms
are:
• Fast alter
• In-place alter
• Slow alter
All ALTER TABLE statements require an exclusive lock on the table being altered, but
the duration of the lock depends on which alter method is used.


Fast ALTER
A fast alter is performed when the ALTER TABLE command:
- Modifies the lock mode
ALTER TABLE orders
LOCK MODE (ROW);
- Changes the next extent size
ALTER TABLE customer
NEXT SIZE 20;
- Adds or drops a constraint
ALTER TABLE manufact
DROP CONSTRAINT con_name;

When the ALTER TABLE statement performs an alter operation that does not affect
the table data, Informix performs a fast alter. Only the system catalog tables are
updated, since there is no need to modify any data pages. The table is unavailable
to users only for the brief time required to execute the update operation on the
system catalog tables.
Internally, a fast alter is performed when the ALTER TABLE statement modifies the lock
mode of an existing table, changes the next extent size of an existing table, or adds or
drops a constraint. (Adding and dropping a constraint are covered in a future unit.)


In-place ALTER
Generally, an in-place alter is performed when the ALTER TABLE
command:
- Adds a column or list of columns
ALTER TABLE customer
ADD birthday DATE;
- Drops a column
ALTER TABLE customer
DROP birthday;
- Modifies the data type of a column
ALTER TABLE customer
MODIFY birthday DATETIME YEAR TO MINUTE;
- Modifies a column that is part of a fragmentation expression

For most ALTER TABLE statements that actually modify rows and affect data pages,
in-place alter logic is applied. This sophisticated algorithm allows the server to simply
record the alterations in the system catalog tables, delaying the overhead of rewriting
data pages until other modifications necessitate page updates.
Table definition versions
The in-place alter table algorithm accomplishes the alter operation by creating a
new version of the table definition. Each data page is associated with a version.
After the in-place ALTER TABLE statement, new rows are inserted into data pages
with the new version only. When rows on old pages are updated, all the rows on the
data page are updated to the new version, if there is enough room. If there is not
enough room, the row is deleted from the old page and inserted into a page with the
new version. Up to 255 versions of a table definition are allowed by the database
server. Information about versioning is available by using the oncheck utility:
oncheck -pT database:table


Each subsequent in-place ALTER TABLE statement on the same table takes more
time to execute. Informix recommends no more than 50 to 60 outstanding alters on
a table. If you want to eliminate multiple versions of a table, force an immediate
change to all rows. For example, use a dummy UPDATE statement that sets the
value of a column to itself, such as UPDATE customer SET fname = fname.
Logging and in-place alter
The ALTER TABLE statement, like all DDL statements, creates log entries even
when the database is not logged. Using the in-place alter algorithm, each data page
is logged at the time that the change physically takes place (that is, when a row is
inserted or updated).
An in-place alter DOES NOT occur on fragmented tables that use ROWIDs.


Slow ALTER
A slow ALTER is performed when the ALTER TABLE command:
• Adds or drops a column created with the ROWIDS or CRCOLS
keyword
• Drops a column of data type TEXT or BYTE
• Modifies the data type of a column in such a way that possible values
of the old type cannot be converted to the new type
• Modifies the data type of a column used in a FRAGMENT clause where the
required value conversion might cause rows to move to another fragment

There are a number of situations where Informix must perform a slow alter instead
of a fast ALTER or an in-place ALTER, including the ones listed here.
There are important considerations which must be taken into account when doing a
slow ALTER. These considerations are discussed on the following page.


Slow ALTER process


A slow ALTER:
• Locks the table in exclusive mode for the duration of the ALTER
TABLE operation
• Makes a copy of the table with the new definition
- This requires enough free space in the dbspace for the 'new' table
• Copies the data rows to the new table
• Might treat the ALTER TABLE statement as a long transaction and
abort it if the LTXHWM threshold is exceeded



A slow ALTER process requires enough space in the dbspace of the table for two
copies of the data. It also requires sufficient logical-log space to record all of its
changes, without exceeding the limit on log space usage by a single process
(LTXHWM - long transaction highwater mark). If the table is very large, the time
required to rewrite each data page could be excessive, rendering the table unavailable
to users for an extended period.
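Before attempting a large slow ALTER, it can help to check the current long-transaction threshold. One way to do this (an assumption: this requires a configured Informix environment where onstat can reach the server) is to read LTXHWM from the active configuration:

```
$ onstat -c | grep LTXHWM
```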


Data space reclamation: CLUSTER index


Example:
ALTER INDEX item_idx TO CLUSTER;

This rewrites the table, freeing unused space for other tables.


Data space reclamation: CLUSTER index


Once an extent is allocated to a table, that extent is never automatically released for
other tables to use. If an extent should become empty (because of massive deletes
from the table), the extent remains part of the tblspace (segment).
Informix allows you to reclaim extent space by forcing the table to be physically
rewritten. One of the simplest ways to accomplish this is to issue an ALTER INDEX
index_name TO CLUSTER statement.
When you cluster an index in Informix, you are really forcing the data rows to be written
in the order of the index keys. When the ALTER INDEX index_name TO CLUSTER
statement is executed, every data page is rewritten to reorder the data rows even if they
are already in index order. The database server is able to reclaim unused extent space.
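The reverse operation also exists: because each table can carry only one clustered index, you can release the cluster attribute without rewriting the table. A sketch, reusing the index from the example above:

```
-- Drops the cluster attribute only; the data rows are not rewritten.
-- The table can then be clustered on a different index later.
ALTER INDEX item_idx TO NOT CLUSTER;
```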
When a database is created with unbuffered logging, all transaction activity is written to
the logical-log buffers in shared memory and flushed to disk when the COMMIT WORK
statement is executed. This ensures that all completed work is saved on disk and
guarantees all committed transactions can be successfully recovered following any type
of system failure.


Reducing the number of extents


ALTER INDEX creates a copy of the table and reclaims unused extents. The new table
might still have many extents even after having compressed all the rows. To restructure
your table so that you have fewer, larger extents, you must rebuild the table by
specifying an appropriate EXTENT SIZE and NEXT SIZE.
Maximum number of extents
The total number of extents allowed for a table varies depending on the page size,
number of indexes, the number of columns per index, and the type of columns in the
table (for instance, VARCHAR, TEXT, or BYTE). For systems with a 2 kilobyte page
size, the maximum number of extents is approximately 200. Systems with a 4 kilobyte
page size can have approximately 450 extents.
Having many extents can have a performance effect, particularly in a decision support
(DSS) environment where large groups of rows are selected. For tables with indexes or
VARCHAR data types, problems can occur with as few as 60 extents.
Another disadvantage of having too many extents is the possibility of reaching the
maximum number of extents. If a table grows unexpectedly and reaches the maximum
allowed number of extents, you must unload it, find enough contiguous space to
recreate the table with fewer extents, and then reload the data.
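The unload, re-create, and reload procedure above can be sketched as follows. This is illustrative only: the table layout, file name, and extent sizes are assumptions, not values from the course schema, and UNLOAD and LOAD are dbaccess statements rather than general SQL:

```
-- Unload the data, re-create with larger extents, reload.
UNLOAD TO 'tab1.unl' SELECT * FROM tab1;
DROP TABLE tab1;
CREATE TABLE tab1 (
    col1 INTEGER,
    col2 CHAR(20)
) EXTENT SIZE 2000 NEXT SIZE 500;   -- both sizes are in kilobytes
LOAD FROM 'tab1.unl' INSERT INTO tab1;
```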


Data space reclamation: TRUNCATE


• Deletes all rows from table
• Retains table and index structures
• Far more efficient than DELETE:
 Uses less log space
 Automatically updates system catalog and distributions
• Drop extra extents:
 Default action
 Keeps first (initial) extents
TRUNCATE tab1;
TRUNCATE TABLE tab1 DROP STORAGE;
• Keep all existing extents
TRUNCATE TABLE tab1 REUSE STORAGE;


Data space reclamation: TRUNCATE


Another method of reclaiming space is to use the TRUNCATE statement. This
statement deletes all rows from the table and all entries from the indexes associated
with the table, but retains the table and index structures. The default action is to
release any extra extents. The initial extent is retained at its original size.
To keep the existing storage, such as in the case where the table would be emptied
and then reloaded, use the REUSE STORAGE clause to keep the space already
allocated.
The TRUNCATE statement works on local tables and synonyms.


Renaming columns, tables, and databases


• Rename a column:
RENAME COLUMN invoice.paid_date TO date_paid;
• Rename a table:
RENAME TABLE stock TO inventory;
• Rename a database:
RENAME DATABASE stores7 TO stores9;


Renaming columns, tables, and databases


Columns, tables, or databases can be renamed using the RENAME COLUMN,
RENAME TABLE, or RENAME DATABASE statements.
If you rename a column that is referenced by a view in the database, the text of the
view in the sysviews system catalog table is updated with the new name. If the column
is referenced in a check constraint, the text of the check constraint is updated in the
syschecks system catalog table.
To use the ALTER TABLE statement, you must meet one of the following conditions:
• You must have the DBA privilege on the database where the table resides
• You must own the table
• You must have the Alter privilege on the specified table and the Resource
privilege on the database where the table resides
To add a referential constraint, you must have the DBA or References privilege on
either the referenced columns or the referenced table.
To drop a constraint, you must have the DBA privilege or be the owner of the constraint.
If you are the owner of the constraint but not the owner of the table, you must have
ALTER privilege on the specified table. You do not need the REFERENCES privilege to
drop a constraint.


When a table is renamed, references to the table within any views are changed. The
table name is replaced if it appears in a trigger definition. It is not replaced if it is inside
any triggered actions. The RENAME TABLE command operates on synonyms as well
as tables.
Column and table names within the text of routines are not changed by RENAME
COLUMN or RENAME TABLE. The routine returns an error when it references a non-
existent column or table.


Converting simple objects to smart objects


• Text simple large objects (TEXT) can be converted to character smart
large objects (CLOB)
• Byte simple large objects (BYTE) can be converted to byte smart large
objects (BLOB)
• Example:
ALTER TABLE booklist
MODIFY content CLOB,
PUT content IN (sbsp1) (LOG);


Converting simple objects to smart objects


The ALTER TABLE statement provides a method to convert the large objects within a
table. The example shows a TEXT column being converted to a CLOB column.


Dropping tables and databases


• Syntax:
DROP TABLE tablename;
DROP DATABASE databasename;
• Examples:
DROP TABLE customer;
DROP DATABASE stores_demo;


Dropping tables and databases


When you execute a DROP TABLE statement:
• All references to the table in system catalog tables are deleted
• The space occupied by the table is freed
When you execute a DROP DATABASE statement:
• The system catalog tables are dropped
• The space occupied by all tables is freed
You cannot ROLLBACK a DROP DATABASE or DROP TABLE statement. You must
recover from a backup if you want to restore a database or table that was dropped.


Exercise 2
Alter and drop databases and tables
• create and drop databases
• create, drop, and alter tables
• convert simple large objects to smart large objects


Exercise 2: Alter and drop databases and tables


Exercise 2:
Alter and drop databases and tables
Purpose:
In this exercise, you will alter and drop databases and tables. You will also
convert a simple large object to a smart large object.

Task 1. Drop a database.


In this task, you will create a database in the dbspace dbspace4. You will create
the database with buffered logging, identify the database default location using
oncheck, and drop the database.
1. Create a database called mydb with buffered logging in dbspace4.
2. Use the oncheck utility to locate where your database has been created.
3. Drop your database.
What did you have to do before your database could be dropped?
How did you correct this problem?
Task 2. Drop a table.
In this task, you will drop the temporary table that was created in the previous exercise.
1. In the same dbaccess session you used in the previous exercise to create the
cust_temp temporary table, use the DROP TABLE SQL command to drop this
temporary table.
2. Verify that the temporary table was dropped using the oncheck -pe command.
Task 3. Alter tables.
In this task, you will alter the customer, orders, and catalog tables. You will use the
oncheck utility to monitor the version of each table.
1. Using the oncheck -pT command, identify the number of versions of each table
listed above.
2. Alter the customer table to add the following column after the zipcode column:
Column name     Description
phone           The phone number of the customer. Allow for a length of 18.


3. Alter the orders table to add the following columns before the po_num column.
The columns must appear in this order:
Column name     Description
ship_instruct   The special shipping instructions. Allow for a length of 40.
backlog         Flag to indicate whether the order has been filled or not
                (values will be ‘y’ or ‘n’).
4. Alter the catalog table to add the following column after the stock_num
column:
Column name     Description
manu_code       Manufacturer code for the item ordered. This refers to the
                manufacturer code in the stock table.
5. Run the load script called alterload.sql. This script loads data into the new
columns for each table. To execute the load script, run the following command:
$ dbaccess stores_demo alterload.sql
6. Execute the oncheck -pT commands again and notice any differences in the
version information.
Hint: Since your tables are already created, you could query the systables table
to get information that is needed to calculate extent sizes. What information
would that be?
Task 4. Convert a simple large object to a smart large object.
In this task, you alter the catalog table to change the cat_descr column from TEXT
to CLOB and relocate the column in an sbspace.
1. Alter the catalog table and change the cat_descr column from TEXT to CLOB
and relocate the column in the sbspace named s9_sbspc.
2. Verify that the table has been altered using dbaccess.
Results:
In this exercise, you altered and dropped databases and tables. You also
converted a simple large object to a smart large object.


Exercise 2:
Alter and drop databases and tables - Solutions
Purpose:
In this exercise, you will alter and drop databases and tables. You will also
convert a simple large object to a smart large object.

Task 1. Drop a database.


In this task, you will create a database in the dbspace dbspace4. You will create
the database with buffered logging, identify the database default location using
oncheck, and drop the database.
1. Create a database called mydb with buffered logging in dbspace4.
CREATE DATABASE mydb IN dbspace4
WITH BUFFERED LOG;
2. Use the oncheck utility to locate where your database has been created.
$ oncheck -pe dbspace4 | more
3. Drop your database.
DROP DATABASE mydb;
What did you have to do before your database could be dropped?
Connect to a different database.
How did you correct this problem?
In dbaccess, select a different database and either drop the database
using the menu option from dbaccess or use the DROP DATABASE
syntax in an SQL statement.
Task 2. Drop a table.
In this task, you will drop the temporary table that was created in the previous exercise.
1. In the same dbaccess session you used in the previous exercise to create the
cust_temp temporary table, use the DROP TABLE SQL command to drop this
temporary table.
DROP TABLE cust_temp;
2. Verify that the temporary table was dropped using the oncheck -pe command.
$ oncheck -pe tempdbs1


Task 3. Alter tables.


In this task, you will alter the customer, orders, and catalog tables. You will use the
oncheck utility to monitor the version of each table.
1. Using the oncheck -pT command, identify the number of versions of each table
listed above.
$ oncheck -pT stores_demo:customer

The customer table has only one version with all 73 data pages listed in
version 0, the current version.
$ oncheck -pT stores_demo:orders


The orders table has only one version with all 11 data pages listed in
version 0, the current version.

$ oncheck -pT stores_demo:catalog

The catalog table has only one version with all 9 data pages listed in
version 0, the current version.


2. Alter the customer table to add the following column after the zipcode column:
Column name     Description
phone           The phone number of the customer. Allow for a length of 18.
ALTER TABLE customer
ADD phone CHAR(18);
3. Alter the orders table to add the following columns before the po_num column.
The columns must appear in this order:
Column name     Description
ship_instruct   The special shipping instructions. Allow for a length of 40.
backlog         Flag to indicate whether the order has been filled or not
                (values will be ‘y’ or ‘n’).
ALTER TABLE orders
ADD ship_instruct CHAR(40)
BEFORE po_num;
ALTER TABLE orders
ADD backlog CHAR(1)
BEFORE po_num;
4. Alter the catalog table to add the following column after the stock_num
column:
Column name     Description
manu_code       Manufacturer code for the item ordered. This refers to the
                manufacturer code in the stock table.
ALTER TABLE catalog
ADD manu_code CHAR(3)
BEFORE cat_descr;
5. Run the load script called alterload.sql. This script loads data into the new
columns for each table. To execute the load script, run the following command:
$ dbaccess stores_demo alterload.sql
6. Execute the oncheck -pT commands again and notice any differences in the
version information.
customer table:


The customer table now has two versions, but all 84 data pages are listed
in version 1, the current version.

orders table:


Notice that the orders table now has three versions, while the others have
only two. This is because the changes to the orders table were done in
two separate ALTER TABLE steps instead of being combined into one
ALTER TABLE statement, each creating a new version. All 22 data pages
are shown in version 2, the current version.
A much better way of doing this alter, resulting in only one new version,
would have been:
ALTER TABLE orders
ADD (ship_instruct CHAR(40)
BEFORE po_num,
backlog CHAR(1)
BEFORE po_num);

catalog table:

The catalog table now has two versions, but all nine data pages are
listed in version 1, the current version.


Task 4. Convert a simple large object to a smart large object.


In this task, you alter the catalog table to change the cat_descr column from TEXT
to CLOB and relocate the column in an sbspace.
1. Alter the catalog table and change the cat_descr column from TEXT to CLOB
and relocate the column in the sbspace named s9_sbspc.
ALTER TABLE catalog
MODIFY cat_descr CLOB,
PUT cat_descr IN (s9_sbspc);
2. Verify that the table has been altered using dbaccess.
Run dbaccess, select your database, choose Table > Info, and note the
newly modified column.

Results:
In this exercise, you altered and dropped databases and tables. You also
converted a simple large object to a smart large object.


Unit summary
• Drop a database
• Drop a table
• Alter a table
• Convert a simple large object to a smart large object

Altering and dropping databases and tables © Copyright IBM Corporation 2017

Unit summary


Creating, altering, and dropping indexes

Informix (v12.10)

© Copyright IBM Corporation 2017




Unit objectives
• Build an index
• Alter, drop, and rename an index
• Identify the four index characteristics

Creating, altering, and dropping indexes © Copyright IBM Corporation 2017

Unit objectives


B+ tree index structure

(Diagram: a three-level B+ tree. The root node at level 0 holds keys such as 292
and 387, with a pointer for keys greater than 387. Branch nodes at level 1 hold
intermediate keys such as 59 and 292. Leaf nodes at level 2 hold keys such as 56,
57, 59, 293, 294, 393, 394, 397, and 401, and point to the data rows.)


B+ tree index structure


You use an index to find a row in a table quickly, similar to the way you use an index in
a book to find a page. Most Informix indexes are organized in B+ trees. A B+ tree is a
set of nodes that contain keys and pointers that are arranged in a hierarchy. The visual
shows an example of a B+ tree index structure.
The B+ tree is organized into levels. The leaf level contains pointers, or addresses,
to the actual data. The other levels contain pointers to nodes at the next level down
that contain keys less than or equal to the key in the higher-level node.
The leaf level, at the bottom of the tree, is the level that actually points to the data
rows. In this visual, this is level 2.
An index key is the set of column values on which the index is built. The index keys are
sorted in ascending order by default, but the index levels can be scanned in either
direction. An index created in ascending order can also be used to sort data in
descending order.
Example
In the example above, the 292 key has a pointer to the level 2 node with keys less than
or equal to 292 and greater than 59.


If you run oncheck -pT databasename:tablename on a table with an index that has two
levels, you can see that level 1 is the root node and level 2 is the leaf node. Internally
these are levels 0 and 1, but they are displayed as levels 1 and 2.
When you access a row through an index, you read the B+ tree starting at the root
node and follow the nodes down to the lowest level, which contains the pointer to the
data. In the example above, three read operations are required to find the pointer to the
data.
Keep key size to a minimum for two reasons:
• A smaller key size means that one page in memory holds more key values, which
potentially reduces the number of read operations necessary to look up several
rows.
• A smaller key size can reduce the number of B+ tree levels. This is important
from a performance standpoint: an index with a 4-level tree requires one more
read per row than an index with a 3-level tree, so if 100,000 rows are read in an
hour, the 3-level index performs 100,000 fewer reads to obtain the same data.
For Informix, the size of a node is the size of one page.
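A rough worked example with assumed numbers (real entry sizes depend on key type and overhead): with 2-kilobyte pages and roughly 20 bytes per key-plus-pointer entry, one node holds about 100 entries, so a 3-level tree can address about a million rows:

```
entries per node              ≈ 2048 / 20 ≈ 100
rows addressable in 3 levels  ≈ 100 * 100 * 100 = 1,000,000
```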
R-tree indexes
Informix also provides R-tree indexes as a registered secondary access method for
tables. R-tree indexes are useful for searching multidimensional spatial data. They are
not covered in this course.


B+ tree splits

(Diagram: for this example, assume only 4 keys plus their pointers fit on an index
page. Before inserting key 88, one node holds keys 150, 292, 378, and 414. After
the insert, the node splits: keys 88, 150, and 292 go to one node, keys 378 and 414
to another, and key 292 is promoted to the next level up, level 0.)

B+ tree splits
When a new index item is inserted into a full index node, the node must split. B+
trees grow toward the root. Attempting to add a key into a full node forces a split into
two nodes and promotes the middle key value to a node at a higher level. If the key
value that causes the split is greater than the other keys in the node, it is put into a
node by itself during the split. The promotion of a key to the next higher level can
also cause a split in the higher level node. If the full node at this higher level is the
root, it also splits. When the root splits, the tree grows by one level and a new root
node is created.
In the example, key 88 needs to be added, but the node is full. A split forces half the
keys (378 and 414) into one node and half the keys (88, 150, and 292) into the
other node on the same level. Key 292 is promoted to the next highest level.


Indexes: Unique and duplicate

(Diagram: a unique index on customer_num holds one entry per key value (105, 106,
113, 114, 115), each pointing to exactly one data row, such as 105 Anthony Higgins
Play Ball! and 106 Philip Currie Phil's Sports. A duplicate index on lname holds the
entries Albertson, Beatty, Currie, and Higgins; the Higgins entry points to two rows,
105 Anthony Higgins and 115 Alfred Higgins.)


Indexes: Unique and duplicate


The four characteristics associated with indexes are:
• Unique
• Duplicate
• Composite
• Cluster
A unique index allows no more than one occurrence of a value in the indexed
column. Therefore, a unique index prohibits users from entering duplicate data into
the indexed column. For columns serving as a primary key for a table, a unique
index ensures that the key is unique for every row.
A duplicate index allows identical (duplicate) values in different rows of an indexed
column.
Composite and cluster indexes are discussed in the following pages.


Composite index
Index limitations:
 Maximum 16 columns
 Maximum length 380 bytes

(Diagram: a composite index on (stock_num, manu_code) with key entries 1 HSK,
1 SMT, 2 ANZ, 3 HRO, and 4 SMT, each pointing to its data row, for example
1 HSK baseball glove 20.00 and 4 SMT baseball glove 15.00.)


Composite index
An index on two or more columns is called a composite index.
The principal functions of a composite index are to:
• Facilitate multiple column joins
• Increase uniqueness of indexed values
Index limitations
The maximum number of columns that you can use in a composite index is 16.
In addition to the 16-column limit, the maximum size of an Informix index key is 380
bytes. This size is calculated by summing up the lengths of the data types in the index
columns.
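As an illustration of these limits, you can sum the column widths of a proposed key before creating it. Assuming lname and fname are CHAR(15) and company is CHAR(20) (verify the widths in your own schema; the index name here is an example):

```
-- Key size: 15 + 15 + 20 = 50 bytes, well under the 380-byte limit,
-- and using 3 of the 16 allowed columns.
CREATE INDEX ix_cust_name
    ON customer (lname, fname, company);
```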


Using composite indexes


• Joins on customer_num, or customer_num and lname, or
customer_num, lname and fname
• Filters on customer_num, or customer_num and lname, or
customer_num, lname and fname
• ORDER BY on customer_num, or customer_num and lname, or
customer_num, lname, and fname
• Joins on customer_num and filters on lname and fname
• Joins on customer_num and lname, and filter on fname

Index columns: customer_num, lname, fname


Using composite indexes


Composite indexes can be very helpful with improving performance on a query. The
example shows different ways the optimizer can use a composite index.


Cluster indexes

(Diagram: a cluster index on lname, with entries Adams, Jones, Smith, and Wilson;
the data rows on disk are stored in the same order as the index keys, for example
101|John|Adams|Adams Sp... followed by 104|Alex|Jones|All Sports.)

CREATE CLUSTER INDEX ix_cust
    ON customer (lname);


Cluster indexes
When you create a cluster index or alter an existing index to a cluster index, the server
rewrites the data rows in the table to match the order of the index. Since the data is
physically written in the order of the cluster index, each table can have only one cluster
index.
When you use or consider cluster indexes, you need to know that:
• Informix does not maintain clustering of the data rows as new rows are inserted
or as existing key values are updated. Therefore, cluster indexes are most
effectively used on relatively static tables and are less effective on very dynamic
tables.
• You can recluster an index and the data rows at any time with the statement:
ALTER INDEX index_name TO CLUSTER;
ALTER INDEX requirements
When the ALTER INDEX index_name TO CLUSTER statement is executed, the server
makes a copy of the entire table on disk in the order of the index before dropping the
old table. You must have sufficient space available in the dbspace to hold a copy of the
table. ALTER INDEX also requires exclusive access to the table.


The CREATE INDEX statement


Examples:

CREATE INDEX ix_items
    ON items(manu_code, stock_num);

CREATE UNIQUE INDEX ix_orders
    ON orders(order_num)
    IN idx_dbs ONLINE;

CREATE UNIQUE CLUSTER INDEX ix_manufact
    ON manufact(manu_code);

CREATE INDEX ix_man_stk
    ON items(manu_code DESC, stock_num);


The CREATE INDEX statement


The examples show different ways to create an index:
• The first example creates a duplicate composite index on two columns:
manu_code and stock_num.
• The second example creates a unique index called ix_orders on the order_num
column and writes the index to the idx_dbs dbspace.
In addition, it specifies the ONLINE clause, which allows the DBA to create the
index while users are still accessing the table. Without the ONLINE clause, the
CREATE INDEX statement requires an exclusive lock on the table while the
index build is running.
You cannot use the ONLINE clause with a CREATE CLUSTER INDEX
statement.
• The third example creates a unique cluster index on the manu_code column.
• The fourth example creates a duplicate composite index with the manu_code in
descending order (the default is ascending order).


Since Informix B+ tree indexes can be traversed in either direction, you do not need to
specify the ASC (ascending) or DESC (descending) keyword when you create an index
on a single column. However, you might find it useful to use the DESC keyword for
specific columns in multicolumn indexes. For example, perhaps your applications
frequently retrieve order information sorted by order number and order date in
descending order. An index, such as defined in the following example, eliminates
repeated sorts by the database server:
CREATE INDEX order_ix1 ON orders (order_num, order_date desc);


Detached indexes
• A detached index is one that does not follow the table:
 Created in different dbspace from table
 Uses different fragmentation strategy
• Index extents are stored separately from table extents
• By default, index extents are created in the dbspace that holds the
data extents
• An index can be placed in a separate dbspace

CREATE INDEX customer_ix
ON customer (zipcode)
IN cust_ix_dbs;


Detached indexes
The database server automatically determines the extent size for a detached index.
For a detached index, the database server uses the ratio of the index key size, plus
some overhead bytes, to the row size to assign the extent size for the index. The
server-generated index extent size is calculated as follows:
index_extent_size = ((index_key_size + 9) / table_row_size) * table_extent_size
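As a rough illustration, the formula can be evaluated directly. The key size, row size, and extent size below are made-up example values, not figures from the course:

```python
def index_extent_size_kb(index_key_size, table_row_size, table_extent_size_kb):
    # Server-generated estimate: ratio of index key size (+9 overhead bytes)
    # to table row size, scaled by the table's extent size
    return (index_key_size + 9) / table_row_size * table_extent_size_kb

# Hypothetical values: 4-byte key, 134-byte rows, 64 KB table extents
print(round(index_extent_size_kb(4, 134, 64), 1))  # -> 6.2
```

A small key on wide rows therefore yields a proportionally small index extent.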


Index fill factor

The DBA can specify the percentage of each page that the index will fill during
index creation.

CREATE INDEX state_code_idx
ON state(code)
FILLFACTOR 80;


Index fill factor


The index fill factor percentage can be set with the CREATE INDEX statement. If it is
not specified with CREATE INDEX, the default used is the value specified in the
Informix configuration parameter FILLFACTOR. If the fill factor is not specified in either
location, the default is 90 percent.
If you do not anticipate many new inserts into the table after the index is built, set the
FILLFACTOR higher when you create the index. If you are expecting many inserts into
the table, use a lower FILLFACTOR to prevent immediate node splits. If the
FILLFACTOR is set too low, you risk an unnecessary increase in the amount of disk
space that the index uses.
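The disk-space tradeoff can be seen with a back-of-the-envelope calculation. The page size, entry size, and key counts below are assumed values for illustration only:

```python
import math

def leaf_pages(n_keys, entry_bytes, page_bytes=2048, fillfactor=90):
    # Pages needed when each leaf page is filled only to FILLFACTOR
    # percent at index-build time
    entries_per_page = int(page_bytes * fillfactor / 100 // entry_bytes)
    return math.ceil(n_keys / entries_per_page)

# Hypothetical 13-byte index entries on 2 KB pages:
print(leaf_pages(100_000, 13, fillfactor=90))  # -> 710
print(leaf_pages(100_000, 13, fillfactor=50))  # -> 1283
```

Halving the fill factor here nearly doubles the leaf pages, which is the disk-space cost you trade for fewer node splits during later inserts.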
When is the fill factor used?
The fill factor is not kept during the life of the index. It is applied only once when the
index is built. It does not take effect unless the table is fragmented, or there are at least
5,000 rows in the table that occupy at least 100 pages of disk space.
The dbschema utility does not list the fill factor used when an index is created.


Altering, dropping, and renaming indexes

Examples:
ALTER INDEX ix_man_cd TO CLUSTER;

DROP INDEX ix_stock ONLINE;

RENAME INDEX ix_cust TO new_ix_cust;


Altering, dropping, and renaming indexes


The visual shows examples of statements to alter, drop, and rename an index.
It is not possible to alter, drop, or rename an index created by the system (such as an
index created to enforce a constraint) because the index name begins with a space. To
alter, drop, or rename a system-created index, you must first drop the related
constraint. Use the CREATE INDEX statement to create the appropriate index on the
specified column, then recreate the constraint. When a suitable index exists to enforce
the constraint, the server uses the existing index instead of generating a system-
created index.
The ONLINE clause on the DROP INDEX statement allows a user to drop an index
while other users are still accessing the table. It prevents new queries from being
optimized with the index, and the index is removed once the existing sessions whose
queries were optimized with it finish their SQL.


SYSINDICES and SYSINDEXES system catalogs

• The sysindices table describes each index in the database


• sysindexes is a view into sysindices that separates index columns
SELECT sysindexes.*
FROM sysindexes, systables
WHERE tabname = "items"
AND systables.tabid = sysindexes.tabid;


SYSINDICES and SYSINDEXES system catalogs


The sysindices catalog table contains one row for each index defined in the database.
Each row contains, in addition to other information:
• the index name
• the owner
• the tabid of the table
• the index type (unique or duplicate)
• information about whether the index is clustered and the degree of clustering
• the column numbers used in the index
• whether component columns are sorted in ascending or descending order
The sysindexes table contains fields part1 through part16, which identify the columns
on which each index is created. The columns are identified using the column number
from the colno field of the syscolumns table.
If a particular column has been defined to be sorted in descending order in the index,
that column number is displayed as a negative number.
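A small sketch of how a client might interpret those part columns. The decode_index_parts helper is illustrative only, not an Informix API:

```python
def decode_index_parts(parts):
    # sysindexes part1..part16: each value is a colno from syscolumns;
    # a negative value marks a DESC column; 0 ends the key list
    key = []
    for p in parts:
        if p == 0:
            break
        key.append((abs(p), "DESC" if p < 0 else "ASC"))
    return key

# e.g. an index on column 1 ascending and column 3 descending:
print(decode_index_parts([1, -3, 0, 0]))  # -> [(1, 'ASC'), (3, 'DESC')]
```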


Forest of trees index

• A larger B+ tree index divided into smaller subtrees called buckets


• You define which columns are used to hash to a bucket
• You define the number of buckets
• Traditional B+ tree index:
CREATE UNIQUE INDEX security_idx ON
security (s_symb, s_co_id) IN dbs;
• Forest of trees index:
CREATE UNIQUE INDEX security_idx ON
security (s_symb, s_co_id) IN dbs
HASH ON (s_symb) WITH 1000 BUCKETS;


Forest of trees index


While a B+ tree index has proven to be a very efficient way to store and retrieve data,
two potential problems have been identified.
• Root node contention can occur when many sessions are reading the same
index at the same time.
• A large B+ tree index requires more index levels, which requires more buffer
reads.
The forest of trees feature addresses these problems by splitting a B+ tree index into
smaller subtrees. Each of these has a separate root node, so queues are shorter and
quicker because mutex contention has been spread across many root nodes. Each
smaller subtree has fewer levels to navigate.
Use the CREATE INDEX command to create a forest of trees index. In addition to the
index name, table, and columns, you must also provide the hash column and the
number of buckets. The column specified in the HASH ON clause contains the key
values that are assigned to the buckets.
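Conceptually, bucket assignment works like the following sketch. The actual Informix hash function is internal; crc32 here is only a stand-in:

```python
import zlib

def bucket_for(key_value, n_buckets=1000):
    # Hash the HASH ON column value to one of the subtrees (buckets);
    # every lookup of the same key value lands in the same bucket,
    # so each query starts at that bucket's own root node
    return zlib.crc32(str(key_value).encode()) % n_buckets

print(bucket_for("IBM") == bucket_for("IBM"))  # -> True
```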


Comparing B+ tree and forest of trees indexes

(Slide figure: a traditional B+ tree index with a single root node above its branches
and leaves, compared with a forest of trees index divided into buckets 1, 2, and 3,
each bucket having its own root node and leaves.)


Comparing B+ tree and forest of trees indexes


A traditional B+ tree index has one root node. Access to the underlying branch and
leaf pages is through this page, which is where mutex contention can occur.
In a forest of trees index, multiple buckets are allocated, with each bucket having a
different root node. Rows are assigned based on a hash on the index key values.
A forest of trees index is useful when you have very large tables that are accessed
by many concurrent users. By splitting the index into multiple buckets, all queries
can begin their scan at the root node of the appropriate bucket instead of at a single
root node. This reduces the amount of contention for a single root node page.


Exercise 3
Create, alter, and drop indexes
• Create indexes
• Alter indexes
• Drop indexes


Exercise 3: Create, alter, and drop indexes


Exercise 3:
Create, alter, and drop indexes
Purpose:
In this exercise, you will create, alter, and drop indexes.

Task 1. Create indexes on the customer table.


In this task, you will create indexes for the customer table. You will verify that the
indexes were created using the sysindices or sysindexes system catalog tables
and use the oncheck utility to examine the index growth.
1. Run the load script called loadcustomer.sql. This script loads additional
customers into the customer table created in your database. Use the following
command to execute the load script:
$ dbaccess stores_demo loadcustomer.sql
2. The customer table is queried heavily using the customer last name. Create an
index called customer_ix to make this query more efficient.
3. Query the system catalog tables to find information about the customer_ix
index.
4. Use the oncheck utility to examine the index growth. Use the following
command and answer the following questions:
$ oncheck -pT stores_demo:customer | more
• How many indexes are on the customer table?
• How many pages are allocated to the table and index? What do they
contain?
• How many levels and average free bytes?
5. Alter the customer_ix index to change the physical order of the customer data
by customer last name.
6. Create an index called customer_dup on customer last and first names that
allows duplicates.
7. Query the system catalog tables to find information about the customer_dup
index.
8. Run the oncheck -pT report on the customer table again. How many index
pages were allocated for the customer_dup index?


Task 2. Create, drop, and rename a composite index on the orders table.
In this task, you will create, drop, and rename a composite index for the orders
table. You will verify that the index was created using the sysindices system
catalog and use the oncheck utility to examine the index growth.
1. The orders table is queried by the customer number and the date on which the
order was placed. Create an index called orders_ix to make this table more
efficient.
2. Query the system catalog tables to find information about the orders_ix index.
3. Use the oncheck utility to examine the index growth. Use the following
command and answer the following questions:
oncheck -pT stores_demo:orders | more
• How many indexes are on the orders table?
• How many pages are allocated to the table and index? What do they
contain?
• How many levels and average free bytes?
4. To see the most recent orders for a customer first, drop and recreate the
orders_ix index to put the order_date in descending order.
5. Rename the orders_ix index to dateorder_ix.
6. Query the system catalog tables again to find if your index name has changed.
What is the difference between the two system catalog table query results for
the orders table?
Task 3. Create an index on the items table in dbspace3.
In this task, you will create an index for the items table in dbspace3. You will verify
that the index was created using the sysindices system catalog table and use the
oncheck utility to examine the index growth.
1. The items table needs an index on the item_num and order_num columns.
Create an index called items_ix and place it in dbspace3.
2. Query the system catalog tables to find information about the items_ix index.
3. Use the oncheck utility to examine the index growth. Use the following
command to answer the following questions:
$ oncheck -pT stores_demo:items | more
• Where is the items_ix index located?
• How many pages are allocated to the index? What do they contain?
Results:
In this exercise, you created, altered, and dropped indexes.


Exercise 3:
Create, alter, and drop indexes - Solutions
Purpose:
In this exercise, you will create, alter, and drop indexes.

Task 1. Create indexes on the customer table.


In this task, you will create indexes for the customer table. You will verify that the
indexes were created using the sysindices or sysindexes system catalog tables
and use the oncheck utility to examine the index growth.
1. Run the load script called loadcustomer.sql. This script loads additional
customers into the customer table created in your database. Use the following
command to execute the load script:
$ dbaccess stores_demo loadcustomer.sql
2. The customer table is queried heavily using the customer last name. Create an
index called customer_ix to make this query more efficient.
CREATE INDEX customer_ix ON customer(lname);
3. Query the system catalog tables to find information about the customer_ix
index (use one or the other, below - you have 2 options).
SELECT i.* FROM sysindices i, systables t
WHERE tabname = "customer"
AND t.tabid = i.tabid;

or:

SELECT i.* FROM sysindexes i, systables t
WHERE tabname = "customer"
AND t.tabid = i.tabid;


4. Use the oncheck utility to examine the index growth. Use the following
command and answer the following questions:
$ oncheck -pT stores_demo:customer | more
• How many indexes are on the customer table?
Count the number of Index Usage Reports displayed by the
command. You should have only one index on the customer table.

• How many pages are allocated to the table and index? What do they
contain?
Add up the 'Number of pages allocated' values from each TBLSpace
Usage Report. You should have around 489 pages for the customer
table and its index, including data pages, bit-map pages, and index
pages. (Your number may vary.)

• How many levels and average free bytes?


Obtain this information from the Index Usage Report. You should see
two levels and about 868 average free bytes. (Your number may vary.)
5. Alter the customer_ix index to change the physical order of the customer data
by customer last name.
ALTER INDEX customer_ix TO CLUSTER;
6. Create an index called customer_dup on customer last and first names that
allows duplicates.
CREATE INDEX customer_dup
ON customer(lname, fname);


7. Query the system catalog tables to find information about the customer_dup
index (use one or the other, below - you have 2 options).
SELECT i.* FROM sysindices i, systables t
WHERE tabname = "customer"
AND t.tabid = i.tabid;

or:

SELECT ix.* FROM sysindexes ix, systables t
WHERE tabname = "customer"
AND t.tabid = ix.tabid;
8. Run the oncheck -pT report on the customer table again. How many index
pages were allocated for the customer_dup index?
Refer to the number of index pages allocated in the TBLspace Usage
Report for this index from the oncheck -pT command. You should see
around 62 pages allocated. (Your number may vary.)

Task 2. Create, drop, and rename a composite index on the orders table.
In this task, you will create, drop, and rename a composite index for the orders
table. You will verify that the index was created using the sysindices system
catalog and use the oncheck utility to examine the index growth.
1. The orders table is queried by the customer number and the date on which the
order was placed. Create an index called orders_ix to make this table more
efficient.
CREATE INDEX orders_ix
ON orders(customer_num, order_date, order_num);
2. Query the system catalog tables to find information about the orders_ix index.
SELECT ix.* FROM sysindexes ix, systables t
WHERE tabname = "orders"
AND t.tabid = ix.tabid;


3. Use the oncheck utility to examine the index growth. Use the following
command and answer the following questions:
oncheck -pT stores_demo:orders | more
• How many indexes are on the orders table?
Count the number of Index Usage Reports in the oncheck output.
There should be only one index.
• How many pages are allocated to the table and index? What do they
contain?
Add up the "Number of pages allocated" values from each TBLSpace
Usage Report. You should have around 39 pages for the orders table
and its index including data pages, free pages, bit-map pages, and
index pages.
• How many levels and average free bytes?
Obtain this information from the Index Usage Report section. You
should see two levels and about 504 average free bytes. (Your
number may vary.)

4. To see the most recent orders for a customer first, drop and recreate the
orders_ix index to put the order_date in descending order.
DROP INDEX orders_ix;
CREATE INDEX orders_ix
ON orders(customer_num, order_date desc, order_num);
5. Rename the orders_ix index to dateorder_ix.
RENAME INDEX orders_ix TO dateorder_ix;


6. Query the system catalog tables again to find if your index name has changed.
SELECT ix.* FROM sysindexes ix, systables t
WHERE tabname = "orders"
AND t.tabid = ix.tabid;
What is the difference between the two system catalog table query results for
the orders table?
The part2 column value is now a negative value for the descending
order_date column.
Task 3. Create an index on the items table in dbspace3.
In this task, you will create an index for the items table in dbspace3. You will verify
that the index was created using the sysindices system catalog table and use the
oncheck utility to examine the index growth.
1. The items table needs an index on the item_num and order_num columns.
Create an index called items_ix and place it in dbspace3.
CREATE INDEX items_ix
ON items(item_num, order_num)
IN dbspace3;
2. Query the system catalog tables to find information about the items_ix index.
SELECT i.* FROM sysindices i, systables t
WHERE tabname = "items"
AND t.tabid = i.tabid;
3. Use the oncheck utility to examine the index growth. Use the following
command to answer the following questions:
$ oncheck -pT stores_demo:items | more
• Where is the items_ix index located?
The dbspace location is indicated in the following report header:
Index items_ix fragment partition dbspace3 in DBspace dbspace3
• How many pages are allocated to the index? What do they contain?
Note the "Number of pages allocated" value from each TBLSpace
Usage Report. You should have about 23 pages allocated for the table
and 16 pages for the index including bit-map, data, and index pages.
(Your numbers may vary.)
Results:
In this exercise, you created, altered, and dropped indexes.


Unit summary
• Build an index
• Alter, drop, and rename an index
• Identify the four index characteristics


Unit summary


Managing and maintaining indexes

Managing and maintaining indexes

Informix (v12.10)

Unit 4 Managing and maintaining indexes


Unit objectives
• Explain the benefits of indexing
• Evaluate the costs involved when indexing
• Explain the maintenance necessary with indexes
• Describe effective management of indexes
• Enable and disable indexes

Managing and maintaining indexes © Copyright IBM Corporation 2017

Unit objectives


Benefits of indexing
• Use filtering to reduce the number of pages read (I/O)
• Eliminate sorts
• Ensure uniqueness of key values
• Reduce the number of pages read by using key-only reads


Benefits of indexing
Filtering with indexes
An index on a column or columns can be used to filter the data to identify which data
pages must be read to complete the query.
Sorting with indexed reads
An index on a column or columns can be used to retrieve data in sorted order. By
reading the data using the index, the database server can return the data in the order
requested, in either ascending or descending order, while eliminating the need to
perform a sort operation.
Enforcing uniqueness
When you create an index on a column with the UNIQUE keyword, only one row in the
table can have a column with that value. This prevents the need to perform any
uniqueness checking through the application program.
Key-only selects
When all columns listed in the query are part of the same index, Informix does not read
the data rows (pages), as all of the data is already available in the index.


Costs of indexing

(Slide figure: disk space costs - the table's data pages and index pages both
consume disk space; processing time costs - insert, update, and delete operations
on the table must also maintain the index.)


Costs of indexing
Disk space costs
The first cost associated with an index is disk space. An index contains a copy of every
unique data value in the indexed columns and an associated 4-byte slot table entry. It
also contains a 4-byte pointer for every row in the table and a 1-byte delete flag. For
indexes on fragmented tables, the 4-byte pointer is expanded to 8 bytes to
accommodate a fragment ID. This can add many pages to the space requirements of
the table. It is not unusual to have as much disk space dedicated to index data as to
row data.
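Those per-entry sizes can be combined into a rough estimator. The table sizes below are invented for illustration:

```python
def approx_index_bytes(n_rows, n_unique_keys, key_size, fragmented=False):
    # Per the text: one copy of each unique key value plus a 4-byte slot
    # entry, and per row a pointer (4 bytes, or 8 on a fragmented table)
    # plus a 1-byte delete flag
    row_pointer = 8 if fragmented else 4
    return n_unique_keys * (key_size + 4) + n_rows * (row_pointer + 1)

# Hypothetical table: 100,000 rows, all key values unique, 4-byte key
print(approx_index_bytes(100_000, 100_000, 4))  # -> 1300000
```

This omits non-leaf pages and page overhead, but it shows why index space can rival row space on narrow tables.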
Processing time costs
The second cost is the processing time required while the table is modified. Before a
row is inserted, updated, or deleted, the index key must be located in the B+ tree.
Assume that you need an average of two I/O operations to locate an index entry. Some
index nodes might be in shared memory, while other indexes that need modification
might have to be read from disk. Under these assumptions, index maintenance requires
more time to handle different kinds of modifications as the following sections show.


Delete overhead
When a row is deleted from a table, delete flags are set for the keys in all indexes for
the row for later deletion (more overhead). The slot entry in the data page is set to zero.
Insert overhead
When a row is inserted, the related entries are inserted in all indexes. The node for the
inserted row entry is found and rewritten for each index.
Update overhead
When a row is updated, the related entries are located in each index that applies to a
column that was altered. The index entry is rewritten to eliminate the old entry; the new
column value is then located in the same index or a new entry is made.
Many insert and delete operations can also cause a major restructuring of the B+ tree
index, which requires more I/O activity.


B+ tree maintenance
(Slide figure: inserting the key 'Brown' into a B+ tree index. Before: nodes hold
Adams, Downing, Johnson, and Smith. After: the keys are redistributed among the
nodes to make room for Brown.)


B+ tree maintenance
B+ tree maintenance requires that nodes are split, merged, and shuffled to maintain an
efficient tree while accommodating inserts, updates, and deletes of key items. Informix
has built sophisticated node-management techniques to minimize the performance
effect of B+ tree management.
Delete compression
To free index pages and maintain a compact tree, Informix evaluates each node after
physical deletes of index items to determine if the node is a candidate for compression.
If the node is not a root node and it has fewer than three index items, the node
becomes a compression candidate.
Merging
If either the right or left sibling node can accommodate the index keys that remain in the
compression candidate, the index items are merged to the sibling node and the
candidate node is freed for reuse.
Shuffling
If neither sibling can accommodate the index items, the server attempts to balance the
nodes by selecting the sibling node with the most items and shuffling some of those
items to the compression candidate so that both nodes have an equal or nearly equal
number of keys.


Shuffling helps maintain a balanced tree and prevents a node from becoming full if its
adjacent nodes are not. This in turn helps reduce the likelihood that node splitting will
be required.
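The merge-or-shuffle decision described above can be sketched as follows. The thresholds come from the text; the function itself is an illustrative model, not server code:

```python
def compression_action(n_items, is_root, left_free, right_free):
    # A non-root node with fewer than 3 items is a compression candidate;
    # merge into a sibling that can absorb its keys, otherwise shuffle
    # keys over from the fuller sibling to rebalance the nodes
    if is_root or n_items >= 3:
        return "keep"
    if left_free >= n_items or right_free >= n_items:
        return "merge"
    return "shuffle"

print(compression_action(2, False, 5, 0))  # -> merge
print(compression_action(2, False, 1, 1))  # -> shuffle
```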
Splits
When a new key value must be added to a full index node, the node must split to make
room for the new key. To perform a node split, the database server must write three
pages.
Goal of index management
The goal of index management is to minimize splits, merges, and shuffles because:
• Extra processing is required.
• The affected index pages must be locked and written by the database server.
• Increased disk space is required.
• After a split, there are new pages that are not full. The partially full nodes increase
the disk space requirements for storing your index.
• Many splits reduce caching effectiveness. If one full node holds 200 key values;
after a split, it might be necessary to cache three pages in memory to have
access to the same 200 keys.


Indexing guidelines
• Create an index on:
 Join columns
 Selective filter columns
 Columns frequently used for ordering
• Avoid highly duplicate indexes
• Limit the number of indexes on tables used primarily for data entry
• Keep key size small
• Use composite indexes to increase uniqueness
• Use clustered indexes to speed up retrieval
• Disable indexes before large update, delete, or insert operations


Indexing guidelines
The following visuals review some general guidelines that you can apply to help
determine which indexes to create to ensure optimal query performance without placing
unnecessary overhead on the system.


Index join columns

Index columns used to join tables

customer                      orders
customer_num                  order_num    customer_num
104                           1001         104
101                           1002         101
105                           1003         104
106                           1004         106
...                           ...          ...
(unique index)                (duplicate index)


Index join columns


At least one column named in any join expression should have an index.
If there is no index, the database server will either:
• Build a temporary index before the join and use a sort-merge join or nested loop
join.
• Sequentially scan the table using a hash join.
When an index is present on both columns in a join expression, the optimizer has more
options when it constructs the query plan.
OLTP
As a rule, in OLTP environments place an index on any column that is frequently used
in a join expression. Primary and foreign keys are automatically indexed by the system.
If you decide to index only one of the tables in a join, index the table with unique values
for the key corresponding to the join columns. A unique index is preferable to a
duplicate index for implementing joins.
Indexing in DSS environments
As a rule, in decision support (DSS) environments where large amounts of data are
read and sequential table scans are performed, indexes might not play an optimal role
in implementing joins since using hash joins is the preferred method.


Index filter columns


Index selective filter columns

(Slide figure: a mail table with a zipcode column containing values such as 94086,
94117, 94303, 94115, 94062, 92117, and 95086; an index on zipcode stores those
values in sorted order, for example 92117, 94062, 94086, 94115.)


Index filter columns


If a column is often used to filter the rows of a large table, consider placing an index on
it. The optimizer can use the index to pick out the needed rows, avoiding a sequential
scan of the entire table. An example is a table that contains a large mailing list. If you
find that a zipcode column is often used to filter out a subset of rows, you should
consider defining an index for it even though it is not used in joins.
This strategy yields a net savings of time only when the selectivity of the column is high,
that is, only when that column does not contain many duplicate values. Non-sequential
scanning through an index takes more disk I/O operations to retrieve more rows than
sequential access. If a filter expression causes a large percentage of the table to be
returned, the database server might as well read the table sequentially.
Generally, indexing a filter column saves time when:
• The column is used in filter expressions in many queries or in queries of a large
table.
• Relatively few duplicate values occur.
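A sketch of this strategy, using the mailing-list example above (the index name is illustrative):

CREATE INDEX zipcode_ix ON customer(zipcode);

-- The optimizer can now satisfy the filter through the index
-- instead of scanning the whole table:
SELECT * FROM customer WHERE zipcode = '94062';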


Index columns involved in sorting

Index columns frequently used in ORDER BY or GROUP BY

(Slide graphic: the orders table with customer_num and order_date columns in insertion order, alongside an index on order_date holding the dates in sorted order.)


Index columns involved in sorting


When a select on a table includes an ORDER BY or GROUP BY clause, the database
server has to put the rows in order. The database server sorts the selected rows by
using an internal sort routine before it returns them to the front-end application. If,
however, an index is present on the ordering columns, the optimizer can read the rows
in sorted order through the index and avoid the final sort. Whether the index is used or
not depends on the complexity of the query.
Since the keys in an index are in sorted sequence, an index on the ordering column(s)
can eliminate sorts during queries.
The example shows a table whose data is not sorted. Without an index on order_date,
the database server would have to sort the data. With an index on order_date, the
database server only needs to read the index (which is in order) to retrieve the data by
order date.
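A minimal sketch of this technique (the index name is illustrative):

CREATE INDEX order_date_ix ON orders(order_date);

-- The rows can now be read in order through the index, avoiding a sort:
SELECT order_num, customer_num, order_date
FROM orders
ORDER BY order_date;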


Avoid highly duplicate indexes

Avoid indexing columns with many duplicate values

(Slide graphic: a table with a gender column containing only the values m and f.)


Avoid highly duplicate indexes


When duplicate keys are permitted in an index, entries with any single value are
grouped in a list. When the selectivity of the column is high, these lists are short. But
when there are only a few unique values, the lists become quite long.
In an index on a column whose only values are m for male and f for female, all the
index entries are contained in just two lists of duplicates. Such an index is not very
useful.
When an entry has to be deleted from a list of duplicates, the database server must
read the whole list and rewrite some part of it. When it adds an entry, the database
server puts the new row at the end of the list. Neither operation is a problem until the
number of duplicate values becomes very high. The database server is forced to
perform many I/O operations to read all the entries to find the end of the list. When it
deletes an entry, it typically has to update and rewrite half of the entries in the list.
When such an index is used for querying, performance can also degrade because the
rows addressed by a key value might be spread out over the disk. Imagine an index
that addresses rows whose location alternates from one part of the disk to the other. As
the database server tries to access each row with the index, it must perform one I/O
operation for every row it reads. The database server is better off reading the table
sequentially and applying the filter to each row in turn. If it is important to index a highly
duplicate column, consider forming a composite key with another column that has few
duplicate values.
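For example, assuming a hypothetical employee table with an emp_num column that is nearly unique, a composite key shortens the duplicate lists:

CREATE INDEX gender_emp_ix ON employee(gender, emp_num);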


Volatile tables
Avoid heavy indexing of volatile tables

(Slide graphic: a table receiving frequent updates, inserts, and deletes.)

Volatile tables
Because of the extra I/O that must occur when indexes are updated, some
degradation occurs when many indexes are placed on a table that is updated
frequently.
A volatile table should not be heavily indexed unless the amount of querying on the
table outweighs the overhead of maintaining the indexes.
During periods of heavy querying (for example, reports), you can improve
performance by creating an index on the appropriate column. Creating indexes for a
large table, however, can be a time-consuming process. Also, while the index is
being created, the table might be exclusively locked preventing other users from
accessing it.


Keeping key size small

• Key size should be small
• More key values can be stored in a node of a B+ tree if the values are small

Keeping key size small


Because an index can require a substantial amount of disk space to maintain, it is best
to keep the size of the index small. This is because of the way key values are stored in
the index: the larger the key value, the fewer the keys that can fit in a node of the B+
tree. More nodes (pages) require more I/O operations to access the rows indexed.
When the rows are short or the key values are long, it might be more efficient to read
the table sequentially. There is a certain break-even point between the size of a key
and the efficiency of using that index. This varies according to the number of rows in the
table.
An exception is key-only selects. If all the columns selected in the query are in the
index, the table data is not read, thus the efficiency of such an index is increased.


Composite indexes

• Use composite indexes to increase uniqueness
• Partial key search: one composite index on columns a, b, and c can be used to query:
  abc
  ab
  a

Composite indexes
When you create a composite index to improve query performance, queries on leading
subsets of the component columns can also take advantage of this index.
If several columns of one table join with several columns in another table, create a
composite index on the columns of the table with the larger number of rows. If several
columns in a query have filter conditions placed on them regularly, create a composite
index corresponding to filter columns used in the query.
Use a composite index to speed up an INSERT into an indexed column that contains
many duplicate values. Adding a unique (or more unique) column to a column that has
many duplicate values increases the uniqueness of the keys and reduces the length of
the duplicate lists. The query can perform a partial key search by using the first (highly
duplicate) column, which is faster than searching the duplicate lists.
When a table is commonly sorted on several columns, a composite index
corresponding to those columns can help avoid repetitive sorts.
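A sketch of the partial-key behavior described above (the table and column names are illustrative):

CREATE INDEX abc_ix ON mytable(a, b, c);

-- This one index can serve filters on (a), (a, b), or (a, b, c),
-- but not filters on b or c alone.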


Clustered indexes
Clustered indexes can speed up retrieval

(Slide graphic: the customer table in insertion order; after clustering by lname, the rows are stored contiguously on disk in lname order.)

Clustered indexes
Clustering is most useful for relatively static tables.
Clustering and reclustering take a lot of space and time. You can avoid some clustering
by loading data into the table in the desired order. The physical order of the rows is their
insertion order, so if the table is initially loaded with ordered data, no clustering is
needed.
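For example (the index name is illustrative):

CREATE CLUSTER INDEX lname_ix ON customer(lname);

If the physical order degrades after many updates, the table can be re-sorted by reclustering the index:

ALTER INDEX lname_ix TO NOT CLUSTER;
ALTER INDEX lname_ix TO CLUSTER;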


Drop versus disable indexes

Step 1: Disable index
Step 2: Load or update rows
Step 3: Enable index

Use for massive updates or large loads

Drop versus disable indexes


When you insert or update a large percentage of the rows in a table, consider disabling
or dropping the indexes.
This can have two positive effects:
• First, because there are fewer indexes to update, the process is likely to run
faster. Frequently, the improved performance of the batch process offsets the
time required to rebuild the index after the DISABLE or DROP command.
• Second, newly created indexes are typically more compact and more efficient.
To disable the indexes for a table before a large batch or load operation, use the
syntax:
SET INDEXES FOR tablename DISABLED;
To recreate the index once your operation is complete, use the syntax:
SET INDEXES FOR tablename ENABLED;
The primary benefit of disabling rather than dropping indexes is that entries do not need
to be deleted and reinserted into the system catalog tables.
As another time-saving measure, if you choose not to disable or drop an index, be sure
that the batch operation processes rows in the sequence defined by the primary key
(unique) index. This allows the database server to read the index pages in order and
read each index page only once.
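The steps above can be combined into a simple batch-load pattern (the file name is illustrative; the LOAD statement is dbaccess syntax, as used in the exercises):

SET INDEXES FOR customer DISABLED;
LOAD FROM "cust.unl" INSERT INTO customer;
SET INDEXES FOR customer ENABLED;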


Parallel index builds

(Slide diagram, bottom to top: Chunks -> Parallel Scan -> Exchange -> Parallel Sort -> B-Tree Appender.)


Parallel index builds


Informix implements a sophisticated design to enable fast index builds. This design
divides the index build process into three subtasks. This division of duties is referred to
as vertical parallelism.
• First, scan threads read the data from disk. A scan thread is spawned for each
chunk the table data resides in.
• Next, the data is passed to the sort threads.
• Finally, the sorted sets are appended into a single index tree by the B+ tree
appender thread.
If PDQPRIORITY is set to 0, the maximum memory allocated for a sort is 128 KB. This
can prevent the server from spawning multiple sort threads for the index build process.
If PDQ is turned on (PDQPRIORITY > 0), the performance gain can be significant.
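For example, to allow parallel sorts during an index build for the current session (the value 50 is illustrative):

SET PDQPRIORITY 50;
CREATE INDEX order_date_ix ON orders(order_date);
SET PDQPRIORITY 0;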


Calculating index size


• Sum up lengths of data types in index columns
• Add 4 for key slot table entry
• Determine number of unique values
• Estimate number of entries per key
• Calculate entry size:
  - Detached index on fragmented table: keysize * propunique + 9
  - Attached index or non-fragmented table: keysize * propunique + 5
• Estimate number of keys per page
• Calculate number of leaf nodes
• Calculate number of branch nodes


Calculating index size


Calculating index storage requirements
All indexes are stored in extents separate from the table’s data extents. The database
server calculates an appropriate extent size, but the DBA is responsible for ensuring
that adequate disk space is allocated.
To calculate a conservative estimate of the number of pages required to store an index,
you can use the following formula:
1. Add up the sizes in bytes of all the columns in the index. This value is
referred to as colsize.
2. Add 4 to colsize to obtain keysize, the actual size of a key in the index.
3. Calculate the expected proportion of unique entries to the total number of
rows. This value is referred to as propunique. If the index is unique or has few
duplicate values, use 1 for propunique. If the index has a significant proportion
of duplicate entries, divide the number of unique index entries by the number
of rows in the table to obtain a fractional value for propunique. If the resulting
value for propunique is less than 0.01, use 0.01 in the calculations that follow.


4. Estimate the size of a typical index entry, entrysize, with one of the following
formulas, depending on whether the table is fragmented or not:
- For non-fragmented tables or fragmented tables with attached indexes,
use the following formula:
entrysize = keysize * propunique + 5
- For fragmented tables with detached indexes, use the following formula:
entrysize = keysize * propunique + 9
5. Estimate the number of entries per index page with the following formula:
pagents = trunc(pagefree / entrysize)
The trunc function notation indicates that you should round down to the
nearest integer value; pagefree is the pagesize minus the page header (2,020
for a 2-kilobyte pagesize).
6. Estimate the number of leaf pages with the following formula:
leaves = ceiling(rows/pagents)
The ceiling function notation indicates that you should round up to the nearest
integer value; rows is the number of rows that you expect to be in the table.
7. Estimate the number of branch pages at the second level of the index with the
following formula:
branches_0 = ceiling(leaves / pagents)
If the value of branches_0 is greater than 1, more levels remain in the index.
To calculate the number of pages contained in the next level of the index, use
the following formula:
branches_n+1 = ceiling(branches_n / pagents)
where:
- branches_n is the number of branches for the last index level that you
calculated.
- branches_n+1 is the number of branches in the next level.
8. Repeat the calculation in step 7 for each level of the index until the value of
branches_n+1 equals 1.
9. Add the total number of pages for all branch levels calculated in steps 7
through 8. This sum is called the branchtotal.


10. Use the following formula to calculate the number of pages in the compact
index:
compactpages = (leaves + branchtotal)
11. If necessary, incorporate the fill factor into your estimate for index pages:
indexpages = ceiling(100 * compactpages / FILLFACTOR)
The default fill factor for indexes is determined by the value of the FILLFACTOR
configuration parameter. The fill factor for a specific index can be modified by
including the FILLFACTOR clause in the CREATE INDEX statement in SQL.
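For example, to leave 10 percent of each index page free for future inserts (the index name is illustrative):

CREATE INDEX zipcode_ix ON customer(zipcode) FILLFACTOR 90;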


Exercise 4
Managing and maintaining indexes
• manage database indexes
• decide which columns to use for indexes
• create and drop indexes


Exercise 4: Managing and maintaining indexes


Exercise 4:
Managing and maintaining indexes

Purpose:
In this exercise, you will learn how to manage and maintain indexes.

Task 1. Deciding what to index.


In this task, you will decide what columns should be indexed in your database
and calculate the storage requirement for the customer table.
1. Examine each table in your database and find a unique value that should be
indexed for that table.
Customer table:
Catalog table:
Items table:
Orders table:
Stock table:
2. Calculate the storage requirements for the unique index in the customer table
assuming a 2-kilobyte page size.
3. Examine each table in your database and find the duplicate values that should
be indexed for that table.
Customer table:
Catalog table:
Items table:
Orders table:
Stock table:
4. Examine the following tables in your database and find the values that should
be indexed for reordering the table.
Customer table:
Catalog table:
Items table:
Orders table:
Stock table:
5. For each table, consider the following questions. If in a classroom, answer these
questions as a group:
• Did I ensure uniqueness of key values?
• Did I index columns that will most likely be included in a SELECT
statement?


• Would a cluster index help with retrieval of data on the table?


• Did I index columns that will most likely be used for ordering the data?
Task 2. Drop and create indexes.
In this task, you will drop the existing indexes and create the new indexes for your
database.

1. Drop any existing indexes on your tables.


2. Create your indexes for each table. For the customer table, create the unique
index on customer_num in dbspace4.
3. Examine the indexes using the sysindices or sysindexes system catalog
tables, and examine the index location using either oncheck -pe or oncheck -pT.

Task 3. Enable and disable indexes.


In this task, you will unload and reload the customer table.
1. Unload the customer table to a file using the following command:
UNLOAD TO "cust.unl"
SELECT * FROM customer;
2. Delete all the rows in the customer table using the following command:
TRUNCATE TABLE customer;
3. Reload the customer table using the following command:
LOAD FROM "cust.unl"
INSERT INTO customer;
What should you do before running the load statement on the customer
table?
4. Delete all the rows in the customer table using the following command:
TRUNCATE TABLE customer;
5. Disable the indexes in the customer table.
6. Reload the customer table using the following command:
LOAD FROM "cust.unl"
INSERT INTO customer;
7. Enable the indexes in the customer table.
Results:
In this exercise, you learned how to manage and maintain indexes.


Exercise 4:
Managing and maintaining indexes - Solutions
Purpose:
In this exercise, you will learn how to manage and maintain indexes.

Task 1. Deciding what to index.


In this task, you will decide what columns should be indexed in your database
and calculate the storage requirement for the customer table.
1. Examine each table in your database and find a unique value that should be
indexed for that table.
Customer table: customer_num
Catalog table: catalog_num
Items table: item_num, order_num
Orders table: order_num
Stock table: stock_num, manu_code
2. Calculate the storage requirements for the unique index in the customer table
assuming a 2-kilobyte page size.
4 (customer_num) + 4 = 8 keysize
1 = propunique
8(keysize) * 1 (propunique) + 5 = 13 (entrysize)
trunc(2020/13) = 155 pagents
ceiling(4612/155) = 30 leaves
ceiling(30/155) = 1 branches
1 = branchtotal
30 + 1 = 31 compact pages
indexpages = ceiling(100*31/90) = 35

3. Examine each table in your database and find the duplicate values that should
be indexed for that table.
Customer table: zipcode
Catalog table: stock_num, manu_code
Items table: order_num; stock_num, manu_code
Orders table: customer_num
Stock table: manu_code


4. Examine the following tables in your database and find the values that should
be indexed for reordering the table.
The following answers will vary depending on how the data is accessed.
• Customer table: cluster by lname, fname or by zipcode
• Catalog table: cluster by catalog_num
• Items table: cluster by order_num
• Orders table: cluster by order_date or customer_num
• Stock table: cluster by stock_num, manu_code
5. For each table, consider the following questions. If in a classroom, answer these
questions as a group:
• Did I ensure uniqueness of key values?
• Did I index columns that will most likely be included in a SELECT
statement?
• Would a cluster index help with retrieval of data on the table?
• Did I index columns that will most likely be used for ordering the data?
Task 2. Drop and create indexes
In this task, you will drop the existing indexes and create the new indexes for your
database.
1. Drop any existing indexes on your tables.
DROP INDEX index_name;
Hint You can get a list of index names for your tables by running the
following query:
SELECT idxname FROM sysindexes WHERE tabid > 99;
2. Create your indexes for each table. For the customer table, create the unique
index on customer_num in dbspace4.
Customer table: This index was created to ensure unique values in the
customer_num column.
CREATE UNIQUE INDEX customer_num_ix
ON customer(customer_num) IN dbspace4;
Customer table: This index was created as the clustering index.
CREATE CLUSTER INDEX zipcode_ix
ON customer(zipcode);


Catalog table: This index was created to ensure unique values in the
catalog_num column, since the SERIAL data type will allow duplicates into the
column. It is also the clustering index.
CREATE UNIQUE CLUSTER INDEX catalog_num_ix
ON catalog (catalog_num);
Catalog table: This index was created to improve performance on queries.
CREATE INDEX catalog_stock_manu_ix
ON catalog(stock_num, manu_code);
Items table: This index was created to ensure unique values. Because the
item_num column can contain duplicates, you need the order_num column to
ensure a unique row is returned to the query.
CREATE UNIQUE INDEX item_num_ix
ON items(item_num,order_num);
Items table: This index was created as the clustering index.
CREATE CLUSTER INDEX item_order_num_ix
ON items(order_num);
Items table: This index was created to improve query performance.
CREATE INDEX item_stock_num_ix
ON items(stock_num, manu_code);
Orders table: This index was created to ensure unique values in the order_num
column.
CREATE UNIQUE CLUSTER INDEX ordernum_ix
ON orders(order_num);
Orders table: This index was created to improve query performance.
CREATE INDEX order_custnum_ix
ON orders(customer_num);
Stock table: This index was created to ensure unique values. Because the
stock_num column can contain duplicates, you need the manu_code
column to ensure a unique row is returned to the query. It is also the
clustering index.
CREATE UNIQUE CLUSTER INDEX stocknum_manucode_ix
ON stock(stock_num, manu_code);
Stock table: This index was created to improve query performance.
CREATE INDEX manu_code_ix
ON stock(manu_code);


3. Examine the indexes using the sysindices or sysindexes system catalog
tables, and examine the index location using either oncheck -pe or oncheck -pT.
Output will vary. Results should be similar to the following:
Customer table:
The following customer_num_ix index output is from oncheck -pT.


The following zipcode_ix index output is from oncheck -pT.


Catalog table:
The following catalog_num_ix index output is from oncheck -pT.


The following catalog_stock_manu_ix index output is from oncheck -pT.


Items table:
The following item_num_ix index output is from oncheck -pT.


The following item_order_num_ix index output is from oncheck -pT.


The following item_stock_num_ix index output is from oncheck -pT.


Orders table:

The following order_custnum_ix index output is from oncheck -pT.


The following ordernum_ix index output is from oncheck -pT.


Stock table:

The following manu_code_ix index output is from oncheck -pT.


The following stocknum_manucode_ix index output is from oncheck -pT.


Task 3. Enable and disable indexes.


In this task, you will unload and reload the customer table.
1. Unload the customer table to a file using the following command:
UNLOAD TO "cust.unl"
SELECT * FROM customer;
2. Delete all the rows in the customer table using the following command:
TRUNCATE TABLE customer;
3. Reload the customer table using the following command:
LOAD FROM "cust.unl"
INSERT INTO customer;
What should you do before running the load statement on the customer
table?
Disable the indexes on the customer table.
4. Delete all the rows in the customer table using the following command:
TRUNCATE TABLE customer;
5. Disable the indexes in the customer table.
SET INDEXES FOR customer DISABLED;
6. Reload the customer table using the following command:
LOAD FROM "cust.unl"
INSERT INTO customer;
7. Enable the indexes in the customer table.
SET INDEXES FOR customer ENABLED;
Results:
In this exercise, you learned how to manage and maintain indexes.


Unit summary
• Explain the benefits of indexing
• Evaluate the costs involved when indexing
• Explain the maintenance necessary with indexes
• Describe effective management of indexes
• Enable or disable indexes


Unit summary

Table and index partitioning

Table and index partitioning

Informix (v12.10)

© Copyright IBM Corporation 2017


Unit 5 Table and index partitioning


Unit objectives
• List the ways to fragment a table
• Create a fragmented table
• Create a detached fragmented index
• Describe temporary fragmented table and index usage

Table and index partitioning © Copyright IBM Corporation 2017

Unit objectives


What is fragmentation?
Fragmentation is the distribution of data from one table across separate dbspaces.

(Diagram: a single table, mytable, distributed across multiple dbspaces.)


What is fragmentation?
Informix supports intelligent horizontal table and index partitioning, referring to it as table
and index fragmentation. Fragmentation allows you to create a table that is treated as a
single table in SQL statements, but consists of multiple tblspaces.
Normal fragmentation calls for one fragment per dbspace. This effectively breaks the larger table up into multiple smaller tblspaces, since a tblspace cannot span dbspaces.
The feature called partitioning allows multiple fragments from a fragmented table to co-
exist in the same dbspace. With partitioning, the dbspace name can no longer
represent the fragment since more than one fragment can be in the same dbspace.
Therefore, a partition name is added.
All fragments/partitions of a table must exist in dbspaces with the same pagesize.


Fragments and extents

Each fragment is stored in its own tblspace.

(Diagram: tblspace1, containing extent 1 and extent 2, resides in dbspace_1; tblspace2, containing its own extent 1 and extent 2, resides in dbspace_2.)


Fragments and extents


Table fragments and index fragments are placed in designated dbspaces. Each
fragment has a separate tblspace ID. The tblspace ID is also known as the fragment ID.
Each tblspace contains separate extents.
Extent sizes
You need to recalculate extent sizes for a fragmented table. When you create
fragmented tables, you specify the extent size for the fragment. In a non-fragmented
table, the extent size is specified for the entire table. You do not specify an extent size
for fragmented indexes. The extent size used for the index fragment is proportional to
the size of the data fragment based on the ratio of index-key data size (including
internal-key overhead) to row size.


Advantages of fragmentation
• Advantages of fragmentation include:
 Parallel scans and other parallel operations
 Balanced I/O
 Finer granularity of archives and restores
 Higher availability
 Increased security
 Joins, sorts, aggregates, groups, and inserts


Advantages of fragmentation
The primary advantages of fragmentation include:

Parallel scans: If you are in a decision support (DSS) environment and use the parallel database query (PDQ) features in Informix, the database server can read multiple fragments in parallel. This is advantageous for DSS queries, where large amounts of data are read.

Balanced I/O: By balancing I/O across disks, you can reduce disk contention and eliminate bottlenecks. This is advantageous in OLTP systems, where a high degree of throughput is critical.

Archive and restore: Fragmentation provides for a finer granularity of archives and restores. You can perform an archive and restore at the dbspace level. Since a fragment resides in a dbspace, this means that you can perform an archive and restore at the fragment level.


Higher availability: You can specify whether to skip unavailable fragments in a table. This is advantageous in DSS, where large amounts of data are read and processing should not be interrupted if a particular fragment is unavailable.

Increased security: You can grant different permissions to each fragment, thereby increasing security.

Other operations: Other parallelized operations include joins, sorts, aggregates, groups, and inserts.


Parallel scans and fragmentation

(Diagram: three scan threads reading fragments 1, 2, and 3 simultaneously, one thread per fragment.)


Parallel scans and fragmentation


One of the benefits of fragmentation is that it enables parallel scans. A parallel scan is
the simultaneous access of multiple fragments from the same table. In Informix, a single
query can have multiple threads of execution and each thread can potentially access a
different fragment. A query can have multiple threads on a single processor computer
but only one thread executes at a time. The optimum situation is to execute parallel
queries on a multiprocessor computer where many threads can execute
simultaneously.


Parallel scans (PDQ queries)


• PDQPRIORITY:
 Enables parallelism for a process or user
 Set the PDQPRIORITY environment variable
export PDQPRIORITY=40

• To enable parallelism for one or more queries or data manipulation language statements, use the Informix SQL extension:
 SET PDQPRIORITY
SET PDQPRIORITY 40;


Parallel scans (PDQ queries)


Parallel database query, or PDQ, is the feature of Informix that permits queries to be parallelized. Informix allows you to turn parallelism on and off by using the SET PDQPRIORITY SQL statement or the PDQPRIORITY environment variable.
To enable parallelism for a group of SQL statements, use the SET PDQPRIORITY
statement. The degree of parallelism specified remains in effect until the next SET
PDQPRIORITY statement is executed or until the end of the process.
To enable parallelism for all statements that a user or process executes, set the
PDQPRIORITY environment variable. The SET PDQPRIORITY statement overrides
the environment variable setting.
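A typical session pattern, sketched below, raises the priority for a heavy query and then switches parallelism off again; the table and column names are placeholders, not from the course database:

```sql
-- Raise the PDQ priority for the DSS query that follows;
-- the setting stays in effect until the next SET PDQPRIORITY.
SET PDQPRIORITY 40;

SELECT state, SUM(total_price)   -- placeholder DSS query
  FROM big_orders                -- hypothetical fragmented table
 GROUP BY state;

-- Turn parallelism back off for subsequent OLTP work.
SET PDQPRIORITY 0;
```

Because SET PDQPRIORITY overrides the environment variable, this pattern lets a single process mix DSS and OLTP work without changing its environment.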


DSS queries
• Read many rows resulting in little or no transaction activity
• Read data sequentially
• Execute complex SQL operations
• Create large temporary files
• Measure response times in hours and minutes
• Relatively few concurrent queries


DSS queries
Since PDQ can take up more resources than non-PDQ, this feature should generally be
reserved for decision support (DSS) queries with the characteristics listed in the visual.
Always enable parallelism for DSS queries by using the SET PDQPRIORITY statement
or the PDQPRIORITY environment variable.


Balanced I/O and fragmentation

(Diagram: concurrent users accessing fragments 1, 2, and 3 on separate disks.)


Balanced I/O and fragmentation


You can use fragmentation to balance I/O across disk drives. Individual users can
access different fragments of the same table and not be in contention. Balanced I/O is
more important than parallelism in online transaction processing (OLTP) environments
because maximum throughput of many concurrent queries is critical.


OLTP queries
• Relatively few tables are accessed and few rows are read
• Transaction activity (inserts, updates, and deletes)
• Data is accessed by using indexes
• Simple SQL operations
• Response times measured in seconds and fractions of a second
• Many concurrent queries


OLTP queries
OLTP queries are characterized by:
• Relatively few rows and tables are read
• Transaction activity (inserts, updates, and deletes)
• Data is accessed by using indexes
• Simple SQL operations
• Response times measured in seconds and fractions of a second
• Many concurrent queries
Be sure to use a PDQPRIORITY value of 0 for OLTP queries. This ensures that OLTP
queries are not limited by PDQ resource allocations. The database server is still able to
perform fragmentation elimination for these queries, which is the primary benefit of table
and index partitioning in an OLTP environment.


Types of distribution schemes

• Round robin
INSERT INTO t1 VALUES (…);
INSERT INTO t1 VALUES (…);
INSERT INTO t1 VALUES (…);

• Expression-based
INSERT INTO t1 (col1, col2)
VALUES (800,"Active");
INSERT INTO t1 (col1, col2)
VALUES (220,"Active");
INSERT INTO t1 (col1, col2)
VALUES (240,"Active");
Fragment expressions (one per fragment):
col1 <= 100 AND col2 = "Active"
col1 > 100 AND col1 < 500 AND col2 = "Active"
remainder


Types of distribution schemes


Informix provides two types of distribution schemes:
Round robin
• Round robin fragmentation creates even data distributions by randomly
placing rows in fragments.
• For insert statements, the database server uses a hash function on a random
number to determine in which fragment to place the row.
• For insert cursors, the database server places the first row in a random
fragment, the second in the next fragment, and so on, in a true round robin
fashion.
Expression-based
• Expression-based fragmentation allows you to use any WHERE condition as a
fragmentation expression for your table or index. The expression can be any
valid expression that the database server recognizes. You can specify a
fragmentation expression based on a range condition for one column and
equality conditions for another column. There is literally no limit to the possible
expressions. Informix also allows you to specify a remainder fragment to store
all rows that do not match criteria specified for any other fragment.


In the visual example, the row where col1 = 800 is put in the remainder fragment because it does not match the criteria for the first (col1 <= 100) or second (col1 > 100 and col1 < 500) fragment.
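The distribution shown in the visual could be written as the following fragment clause; this is a sketch, and the dbspace names are placeholders:

```sql
CREATE TABLE t1(
    col1 INTEGER,
    col2 CHAR(10))
FRAGMENT BY EXPRESSION
    col1 <= 100 AND col2 = "Active" IN dbspace1,
    col1 > 100 AND col1 < 500 AND col2 = "Active" IN dbspace2,
    REMAINDER IN dbspace3;
```

With this clause, the row (800, "Active") fails both explicit expressions and lands in the remainder fragment in dbspace3.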


Round robin fragmentation


CREATE TABLE table1(
col_1 SERIAL,
col_2 CHAR(20))
FRAGMENT BY ROUND ROBIN
IN dbspace1, dbspace2
EXTENT SIZE 10000
NEXT SIZE 3000;

• Rows are placed randomly among the listed dbspaces


• EXTENT SIZE and NEXT SIZE refer to the size of each table
fragment, not the size of the entire table


Round robin fragmentation


If used, the FRAGMENT BY option is placed before the EXTENT or LOCK MODE
options. The FRAGMENT BY ROUND ROBIN option must specify at least two
dbspaces where the fragments are to be placed.
When a table is created, one extent of the size specified by EXTENT SIZE is reserved
in each dbspace listed. Calculate EXTENT SIZE and NEXT SIZE based on an average
size fragment.
Advantages and disadvantages
The major advantage of the round robin strategy is that you do not need knowledge of
the data to achieve an even distribution among the fragments. Also, when column
values are updated, rows are not moved to other fragments because the distribution
does not depend on column values.
A disadvantage of the round robin strategy is that the query optimizer is not able to
eliminate fragments when it evaluates a query.


When to use round robin


Use the round robin distribution strategy when your queries perform sequential scans
and you have little information about the data being stored. For example, consider using
round robin when the data access method or the data distribution is unknown. Round
robin can also be useful when your application is update intensive, or when loading
data quickly is important.


Round robin for smart large objects

CREATE TABLE movie(


movie_num INTEGER,
movie_title CHAR(50),
video BLOB,
audio BLOB,
description CLOB)
PUT video IN (sbsp3, sbsp6, sbsp7),
audio IN (sbsp1, sbsp2, sbsp4),
description IN (sbsp5);


Round robin for smart large objects


Unless the PUT clause is used, the database server stores smart large objects in the default sbspace (identified by the SBSPACENAME configuration parameter).
It is possible to fragment smart large objects over multiple sbspaces. Although smart
large objects might be distributed among multiple sbspaces, each individual smart large
object is stored entirely within one sbspace. The only method of fragmentation available
is round robin. If your business needs require expression-based fragmentation, it needs
to be implemented at the DataBlade level where you have direct control over where
each smart large object is stored.
In the example, three smart large object columns are fragmented over seven sbspaces.
The order in which the sbspaces are used is not guaranteed, but the distribution is
relatively even. Each time the database server is restarted, the distribution starts over
again from the first sbspace in the list.
The fragmentation for smart large objects is also independent of the storage of the data
rows associated with them. The traditional columns associated with the smart large
objects (here, movie_num and movie_title) can be fragmented by expression, round
robin, or not fragmented at all. In this example, they are not fragmented at all.


Expression-based fragmentation

CREATE TABLE table1(


col_1 SERIAL,
col_2 CHAR(20),
...)
FRAGMENT BY EXPRESSION
col_1 <= 10000 AND col_1 >= 1 IN dbspace1,
col_1 <= 20000 AND col_1 > 10000 IN dbspace2,
REMAINDER IN dbspace3;

Each expression is evaluated in order. The row is placed in the first fragment where
the expression evaluates to true and the rest of the expressions are skipped.


Expression-based fragmentation
The FRAGMENT BY EXPRESSION option provides control in placing rows in
fragments. You specify a series of SQL expressions and designated dbspaces. If the
expression is evaluated to true, the row is placed in the corresponding dbspace.
A row should only evaluate to true for one expression. If a row evaluates to true for
more than one expression, it is placed in the dbspace for the first expression that is
true.
The REMAINDER IN clause specifies a dbspace that holds rows that do not evaluate
into any of the expressions.
You can use any column in the table as part of the expression. Columns in other local
or remote tables are not allowed. No subqueries or stored procedures are allowed as
part of the expression.
Using the syntax shown on the visual, only one fragment can exist in a dbspace,
meaning that a dbspace can only be listed once in the fragmentation scheme.


Using PARTITIONING
• Allows multiple PARTITIONs to be stored in same dbspace:
 Use PARTITION keyword
 Name the partition

• Example:
CREATE TABLE tab1(a int)
PARTITION BY EXPRESSION
PARTITION part1 (a >=0 AND a < 5)
IN dbspace1,
PARTITION part2 (a >=5 AND a < 10)
IN dbspace1,
... ;


Using PARTITIONING
Partitioning, an enhancement to fragmentation, allows multiple fragments to be stored
in the same dbspace. In order to use this feature, use the PARTITION keyword and
give the partition a name.
Partition information is stored in the sysfragments table. If a fragmented table is created
with partitions, each row in the sysfragments catalog contains a partition name in the
partition column. If a regular fragmented table without partitions is created, the name of
the dbspace appears in the partition column.
You can also use the syntax FRAGMENT BY instead of PARTITION BY when creating
a partitioned table.
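One way to inspect this metadata is to join sysfragments to systables; the query below is a hedged sketch, using the tab1 table from the example above and the partition column described in the text:

```sql
-- List the fragments of tab1 with their partition names and dbspaces.
SELECT f.fragtype, f.partition, f.dbspace
  FROM sysfragments f, systables t
 WHERE f.tabid = t.tabid
   AND t.tabname = 'tab1';
```

For the partitioned table above, both rows would report the same dbspace (dbspace1) but distinct partition names (part1, part2).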


Logical and relational operators


CREATE TABLE table1(
customer_num SERIAL,
col_2 CHAR(20), ...)
FRAGMENT BY EXPRESSION
customer_num IN (101,7924,9324,3288)
IN dbs1,
customer_num = 4983 OR zipcode = 01803
IN dbs2,
customer_num < 10000
IN dbs3,
customer_num BETWEEN 10000 AND 20000
IN dbs4,
REMAINDER IN dbs5;


Logical and relational operators


An expression-based distribution scheme uses an expression or rule to define which
rows are inserted into specific fragments. Each condition in the rule determines the
contents of one fragment. Up to 2048 fragments and their associated conditions can be
in one table.
You can use the following relational and logical operators in a rule:
• >, <, >=, <=, IN, BETWEEN
• AND, OR
Rules that use these operators are sometimes called range or arbitrary rules.
A single condition can use multiple operators and can reference multiple columns. It is
suggested, however, that you keep conditions as simple as possible to minimize CPU
use and to promote fragment elimination from query plans.


Advantages and disadvantages


Distributing data by expression has many potential advantages:
• Fragments can be eliminated from query scans.
• Data can be segregated to support a particular archiving strategy.
• Users can be granted privileges at the fragment level.
• Unequal data distributions can be created to offset an unequal frequency of
access.
A disadvantage of distributing data by expressions is that CPU resources are required
for rule evaluation. As the rule becomes more complex, more CPU time is consumed.
Also, more administrative work is required with expression-based fragmentation than
with round robin. Finding the optimum rule can be an iterative process. Once found, it
might need to be monitored and modified over time.
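The fragment-level privileges mentioned above are granted with the GRANT FRAGMENT statement; the following is a minimal sketch, where the dbspace name and the user name mario are placeholders:

```sql
-- Allow user mario to insert and update rows only in the
-- fragment stored in dbspace1.
GRANT FRAGMENT INSERT, UPDATE ON table1 (dbspace1) TO mario;
```

GRANT FRAGMENT applies only to tables fragmented by expression, which is one reason expression-based fragmentation enables finer-grained security than round robin.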
When to use expression-based fragmentation
The goal of expression-based fragmentation is to increase I/O throughput and fragment
elimination during query optimization. The optimum situation for fragment elimination is
when expression conditions involve a single column and do not overlap.
Consider using an expression strategy when:
• Non-overlapping fragments on a single column can be created.
• The table is accessed with a high degree of selectivity.
• The data access is not evenly distributed.
• Overlapping fragments on single or multiple columns can be created.


Fragmentation by expression guidelines


• Avoid REMAINDER IN clauses
• Attempt to balance I/O across disks
• Keep fragmentation expressions simple
• Arrange the conditions so the most restrictive part comes first
• Avoid any expression that must perform a conversion
• Optimize data loads by placing the most frequently accessed fragment
first in your fragmentation statement
• If a significant benefit is not expected, do not fragment the table


Fragmentation by expression guidelines


Once you have determined that fragmenting by expression is the optimal fragmentation
strategy for you, additional guidelines can help you maximize your strategy:
• A remainder fragment is always scanned.
• Distribute data so that I/O activity is balanced across disks. This does not
necessarily mean an even distribution of data.
• Fragmentation expressions can be as complex as you want. However, complex
expressions take more CPU time to evaluate and can prevent the database
server from eliminating fragments.
• In a logical AND operation, if the first clause is false, then the rest of the condition
for that dbspace is not evaluated.
For example, to insert the value 25, 6 evaluations are performed:
x >= 2 and x <= 10 in dbspace1,
x > 12 and x <= 19 in dbspace2,
x > 21 and x <= 29 in dbspace3,
remainder in dbspace4


In the rearranged condition, only four evaluations are required:


x <= 10 and x >= 2 in dbspace1,
x <= 19 and x > 12 in dbspace2,
x <= 29 and x > 21 in dbspace3,
remainder in dbspace4
• A data type conversion causes an increase in the time it takes to evaluate the
condition. For example, a date data type is converted internally to an integer.
• If data loads are one of your primary performance objectives, you might be able
to optimize your data loads by placing the most commonly accessed fragment
first in your fragmentation statement or by using round robin fragmentation.
• If a significant benefit is not expected, do not fragment the table.
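Applied to a CREATE TABLE statement, the rearranged conditions above look like this (a sketch; the column and dbspace names follow the example):

```sql
CREATE TABLE t2 (x INTEGER)
FRAGMENT BY EXPRESSION
    -- Most restrictive clause first: for x = 25, the first test of
    -- each AND fails quickly until the matching fragment is reached.
    x <= 10 AND x >= 2 IN dbspace1,
    x <= 19 AND x > 12 IN dbspace2,
    x <= 29 AND x > 21 IN dbspace3,
    REMAINDER IN dbspace4;
```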


Using hash functions


CREATE TABLE table1(
customer_num SERIAL,
lname CHAR(20),
...)
FRAGMENT BY EXPRESSION
MOD(customer_num, 3) = 0 IN dbspace1,
MOD(customer_num, 3) = 1 IN dbspace2,
MOD(customer_num, 3) = 2 IN dbspace3;


Using hash functions


You can use a hash function to evenly distribute data across fragments, especially
when the column value might not divide commonly accessed data evenly across
fragments.
The example shows one way that you can create a hash function. The SQL algebraic
function MOD returns the modulus or remainder value for two numeric expressions.
You provide integer expressions for the dividend and the divisor. The value returned is
an integer.
An expression-based distribution scheme that uses a hash function is also referred to
as a hash rule.
Advantages and disadvantages
A hash expression yields an even distribution of data. It also permits fragment
elimination during query optimization when an equality search (including inserts and
deletes) occurs. Fragment elimination does not occur during a range search.
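For example, with the MOD rule above, an equality search lets the optimizer read a single fragment, while a range search must read all three; a sketch using the table1 definition from the visual:

```sql
-- Equality search: MOD(1500, 3) = 0, so only the fragment in
-- dbspace1 needs to be scanned.
SELECT * FROM table1 WHERE customer_num = 1500;

-- Range search: the hash value cannot be derived from the range,
-- so no fragment can be eliminated and all three are scanned.
SELECT * FROM table1 WHERE customer_num BETWEEN 1000 AND 2000;
```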


When to use hash expressions


Use a hash expression if data access is from a particular column but the distribution of
values within the column is unknown or unpredictable.
If the goal is to simply distribute data evenly, round robin might be a better choice for
fragmentation strategy since a hash function requires some overhead in performing the
calculation.


Fragmentation based on a list


• Use when fragment key has finite set of values
• Example: Fragmentation based on selected states
CREATE TABLE customer (cust_id INT, name CHAR(128),
street CHAR(1024), state CHAR(2), zipcode CHAR(5),phone CHAR(12))
FRAGMENT BY LIST(state)
PARTITION p0 VALUES ('KS','IL') IN dbs0,
PARTITION p1 VALUES ('CA','OR') IN dbs1,
PARTITION p2 VALUES ('NY','MN') IN dbs2,
PARTITION p3 VALUES (NULL) IN dbs3,
PARTITION p4 REMAINDER IN dbs3;

 Table is fragmented on the state column, also known as the fragment key or partitioning key
 The null fragment is the fragment that holds rows that have NULL
values for the fragment key column
 The remainder fragment holds rows that do not fit in the explicitly
defined fragments; it must be the last fragment

Fragmentation based on a list


To fragment based on a list of values, use the following syntax for the fragment
expression:
FRAGMENT BY LIST (column_name)
[PARTITION partn_name] VALUES (list_of_values) IN dbspace_name,


Fragmentation based on an interval


• Fragment data based on an interval value
• Interval examples:
 Fragment for every month
 Fragment for every million customer records
• Example:
CREATE TABLE product (prod_id INT, prod_desc INT,
prod_price MONEY(10,2))
FRAGMENT BY RANGE(ROUND(prod_price))
INTERVAL(200) STORE IN (dbs1, dbs2, dbs3, dbs4)
PARTITION p0 VALUES < 100 IN dbs0;


Fragmentation based on an interval


Use an interval fragmentation strategy when you want to test columns or column
expressions based on an interval of numeric values or dates.
The FRAGMENT BY RANGE key words are used to indicate an interval fragmentation.
Here is one form of the FRAGMENT BY RANGE syntax:
FRAGMENT BY RANGE (column_expression)
INTERVAL (interval_value) STORE IN (dbspace_list)
[PARTITION partn_name] VALUES range_expr IN dbspace_name,

• The column_expression must be a single column or column expression of DATE,
DATETIME, or numeric type.
• The interval_value must be a literal non-zero INT or INTERVAL value.
• The dbspace_list values are the target dbspaces for columns that fall into the
specified interval.
• The range_expr specifies a less than operator and a base value that defines the
range for the first fragment. The base value is used to calculate the range used in
subsequent fragments.
• The dbspace_name is the name of the first fragment used and is based on the
specified VALUES range.


The interval fragments are created in round-robin fashion in the dbspaces specified in
the STORE IN clause. If the dbspace selected for the interval fragment is full or down,
the system skips that dbspace and selects the next one in the list.
In the example, rows with a prod_price (after rounding) that is less than 100 are
assigned to a fragment in dbspace dbs0. If the next row inserted has a prod_price of
102, an interval fragment is created in dbs1 for rows with a prod_price in the range of
100–300 (base value + interval_value). If the next row has a prod_price of 502, then an
interval fragment is created in dbs2 for rows with a prod_price in the range of 500–700.


Fragmented/partitioned indexes

(Diagram: two index placement options.)

Non-fragmented index: A dbspace location is specified instead of a fragmentation strategy. The entire index is placed in one dbspace (dbspace1).

Fragmented index: An expression-based fragmentation scheme is specified. Each index fragment occupies a separate tblspace (fragment 1 in dbspace1, fragment 2 in dbspace2).


Fragmented/partitioned indexes
You can decide whether or not you want to fragment indexes. If you fragment your
indexes, you must use an expression-based fragmentation scheme. You cannot use
round robin fragmentation for indexes.
Non-fragmented indexes
If you do not fragment the index, you can put the entire index in a single dbspace. In
this strategy, the resulting index and data pages are separate.
When to use fragmented indexes
Since OLTP applications frequently use indexed access instead of sequential access, it
can be beneficial to fragment indexes in an OLTP environment. DSS applications
generally access data sequentially. Therefore, it is generally not recommended to
fragment indexes in a DSS environment.
System indexes, created to support constraints, remain unfragmented and are created
in the dbspace where the database is created.


If you do not specify a dbspace (nonfragmented) or list of dbspaces (fragmented), then the index defaults to the same fragmentation strategy as the table. This scenario is not desirable if your table is fragmented by round robin. It is recommended that you specify a dbspace for your indexes in this scenario.
All fragments/partitions of an index must exist in dbspaces that have the same
pagesize.
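For a round robin table, the recommendation above amounts to the following pattern; this is a sketch, and all object names are placeholders:

```sql
-- Data is spread round robin across two dbspaces, but the index is
-- kept whole in its own dbspace rather than inheriting the table's
-- round robin strategy.
CREATE TABLE orders2 (
    order_num SERIAL,
    cust_num  INTEGER)
FRAGMENT BY ROUND ROBIN IN dbs1, dbs2;

CREATE INDEX orders2_ix ON orders2 (cust_num) IN idx_dbs;
```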


CREATE INDEX statement


• By expression
CREATE INDEX idx1 ON table1(col_1)
FRAGMENT BY EXPRESSION
col_1 < 10000 IN dbspace1,
col_1 >= 10000 IN dbspace2;
• No fragmentation scheme is specified
CREATE INDEX idx1 ON table1(col_1) IN dbspace1;
• Partitioned
CREATE INDEX idx1 ON table1(col_1)
PARTITION BY EXPRESSION
PARTITION ix_part1 col_1 < 10000 IN dbspace1,
PARTITION ix_part2 col_1 >= 10000 IN dbspace1;


CREATE INDEX statement


A FRAGMENT BY EXPRESSION option is available for the CREATE INDEX
statement. If you do not want to fragment your indexes, specify the dbspace where you
want the entire index to be located.
The index fragments are created in a separate tblspace with their own extents.
The PARTITION BY EXPRESSION and FRAGMENT BY EXPRESSION clauses are
interchangeable.


ROWIDS
• Fragmented tables do not contain unique rowids
• To access a fragmented table by rowid, you must explicitly create a
rowid column:
CREATE TABLE orders(
order_num SERIAL,
customer_num INTEGER,
part_num CHAR(20))
WITH ROWIDS
FRAGMENT BY ROUND ROBIN IN dbs1,dbs2;

ALTER TABLE items ADD ROWIDS;

ALTER TABLE items DROP ROWIDS;


ROWIDS
Rowid in a nonfragmented table is an implicit column that uniquely identifies a row in
the table. In a fragmented table, rowids are no longer unique because they can be
duplicated in different fragments, so rowid support requires an explicit 4-byte rowid
column.
When you add rowids to a fragmented table, the database server creates an index that
maps the internal unique row address to the new rowid. Access to the table using rowid
is always through the index.
If your application uses rowids, performance can be affected when it accesses
fragmented tables because an index is required for rowid mapping. It is recommended
that you use primary keys instead of rowids for unique row access.
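As an illustration of the point above, rowid access might look like this (the table is the one from the slide example; the literal rowid value is hypothetical, and the table must have been created or altered WITH ROWIDS):

```sql
-- Fetch the rowid along with the data
SELECT rowid, order_num FROM orders;

-- Reuse a previously fetched rowid; the lookup goes through the rowid index
SELECT * FROM orders WHERE rowid = 257;
```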


Selecting a fragmentation strategy


• Identify the tables being accessed
• Analyze how the tables are being accessed (selectivity, filters)
• Determine whether the environment is DSS or OLTP
• Answer the questions:
 How many CPUs and disks are available?
 Is data loading an important factor?
 Are fragment permissions an important factor?
• Evaluate I/O and adjust the distribution strategy


Selecting a fragmentation strategy


The following guidelines can help you determine how to build the most beneficial
fragmentation schemes for your tables.
• Identify the tables being accessed. Examine your critical SELECT statements
and identify the tables.
• Analyze how the tables are accessed.
• Identify whether the tables are accessed sequentially or by index.
• Determine what filters and join columns are used in the SELECT statements.
• Attempt to use one or more of these columns in an expression strategy. If no
suitable column is present for an expression strategy, or if the table is always
read sequentially, use a round robin distribution scheme.


• Determine whether the environment is DSS or OLTP.


• DSS performance increases linearly with the addition of fragments up to the
number of CPUs. In a DSS environment where many of the reads performed
are sequential table scans, you typically do not fragment indexes.
• OLTP performance might not improve linearly with the addition of fragments.
The primary goal in OLTP environments is to reduce the amount of data read
(as opposed to reading data in parallel). You can accomplish this by
fragmenting the table according to frequently used equality and range
conditions to eliminate fragments that must be scanned. OLTP environments
often benefit from the use of fragmented indexes.
• How many CPUs and disks are available?
• Is data loading an important factor?
• If data loading is an important, ongoing factor, a round robin distribution scheme
can provide the best performance.
• Are fragment permissions an important factor?
• Permissions can be granted on a fragment basis. If this is a desired feature,
then distribution by expression must be used.
• Evaluate I/O and adjust the distribution strategy.
• Creating the optimum fragmentation strategy is an iterative process. After you
create your fragments, evaluate the I/O pattern and attempt to achieve
balanced I/O by adjusting the fragmentation rule. You might want to switch
from an expression strategy to a round robin strategy or vice versa.
Monitoring tools that are available include: SET EXPLAIN, onstat -D, onstat -g ppf, and
onstat -g iof.
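For example, SET EXPLAIN can show whether a query eliminates fragments (the table and column names below are hypothetical; the query plan is written to the sqexplain.out file):

```sql
SET EXPLAIN ON;
-- With an expression strategy on col_1, this equality filter should be
-- restricted to a single fragment; check the fragments listed in the plan.
SELECT * FROM table1 WHERE col_1 = 15000;
SET EXPLAIN OFF;
```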


Fragmentation of temporary tables

SELECT * FROM table1
INTO TEMP temp_table
WITH NO LOG;

CREATE TEMP TABLE temp_table (
    column1 INTEGER,
    column2 CHAR(10)
) WITH NO LOG;

(Slide graphic: table1 is the source; temp_table fragments are spread across tempdbs1, tempdbs2, tempdbs3, ... tempdbsn)


Fragmentation of temporary tables


Informix determines whether to fragment temporary tables based on the command
used to create the table and on what temporary dbspaces are listed in the
DBSPACETEMP configuration parameter. You can override the DBSPACETEMP
configuration parameter by setting the DBSPACETEMP environment variable before
running your query. For example, to enable your application to create temporary table
fragments in tempdbs1, tempdbs2, and tempdbs3, run the following command:
export DBSPACETEMP=tempdbs1,tempdbs2,tempdbs3
In the first example above, the resulting temporary table is fragmented by round robin
into all of the temporary dbspaces available to the session based on the setting of the
DBSPACETEMP configuration parameter or environment variable.
In the second example, the temporary table is not fragmented, but is placed in one of
the available temporary dbspaces. If multiple temporary dbspaces are configured, the
next temporary table that is created is placed in a different temporary dbspace to help
balance I/O across all of the temporary dbspaces.


Creating fragmented temporary tables


Round robin fragmentation:

CREATE TEMP TABLE temp_table (
    column1 INTEGER,
    column2 CHAR(10))
WITH NO LOG
FRAGMENT BY ROUND ROBIN
IN tempdbs1, tempdbs2, tempdbs3;

Expression-based fragmentation:

CREATE TEMP TABLE temp_table (
    column1 INTEGER,
    column2 CHAR(10))
WITH NO LOG
FRAGMENT BY EXPRESSION
column1 < 1000 IN tempdbs1,
column1 < 2000 IN tempdbs2,
column1 >= 2000 IN tempdbs3;

Creating fragmented temporary tables


You can also fragment temporary tables using round robin or expression-based
fragmentation, as shown in the examples.


Fragmenting an index

Discussion: Is this statement valid?

CREATE UNIQUE INDEX idx1 ON table1(col1)
FRAGMENT BY EXPRESSION
col2 <= 10 IN dbsp1,
col2 > 10 AND col2 <= 100 IN dbsp2,
col2 > 100 IN dbsp3;


Fragmenting an index
The database server has the following restrictions on index fragmentation:
• You cannot fragment indexes by round robin.
• You cannot fragment unique indexes by an expression that contains columns that
are not in the index key.
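Given these restrictions, the statement in the discussion is not valid: the unique index key is col1, but the fragmentation expression uses col2. A sketch of a valid alternative, fragmenting on the index key column itself (table and dbspace names are hypothetical):

```sql
-- Valid: the fragmentation expression references only the index key column
CREATE UNIQUE INDEX idx1 ON table1(col1)
FRAGMENT BY EXPRESSION
    col1 <= 10 IN dbsp1,
    col1 > 10 AND col1 <= 100 IN dbsp2,
    col1 > 100 IN dbsp3;
```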


System Catalog: sysfragments

• The sysfragments table contains one row for each fragment


• Important columns:
 fragtype: Table or index.
 tabid: The unique table ID of the table.
 partn: The unique partition number for this fragment. This is the tablespace
identifier for the fragment. A fragmented table has multiple table spaces -
one for each fragment.
 strategy: Expression or round robin.
 partition: The partition name or dbspace name.


System Catalog: sysfragments


The sysfragments system catalog table contains information about each partition of
each fragmented table, and on each partition of every index.
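A quick way to inspect these columns for a given table (this mirrors the query used later in the exercises; the table name is an example):

```sql
SELECT f.fragtype, f.partn, f.strategy, f.partition
FROM systables t, sysfragments f
WHERE t.tabname = 'customer'
  AND t.tabid = f.tabid;
```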


Optional discussion: Case study



Optional discussion: Case study


This example is from a case study published in Informix Tech Notes, 1995, Volume 5,
Issue 1; Planning and Tuning of Fragmentation Strategies by Subhash Bhatia.
A customer wants to use fragmentation to optimize their queries against their inventory
table. The inventory table row size is 43 bytes; the table has 10 columns and a
maximum of 450 million rows. They expect to accommodate 20 gigabytes of data.
The database server is used mostly for OLTP queries. The system configuration
consists of an SMP computer with six processors and 30 single-ended SCSI disk drives
(2 gigabytes each) attached to 10 disk controllers (three disk drives per controller).
The analysis of the queries against the inventory table yielded the following information:
Query   Characteristics   Filters            Scan type
1       None              business_id, qty   Index
2       None              business_id        Index
3       None              business_id        Index
4       None              family_id          Sequential
5       None              family_id          Sequential
6       GROUP BY, AVG     None               Sequential
7       GROUP BY, AVG     business_id        Sequential


Queries 1 and 2 are most often used.


Twelve fragments were selected because, with six hardware CPUs and a parallel scan
of two disks per CPU, there was no I/O bottleneck. Each fragment was placed in a
dbspace located on a separate drive.
The business_id column was selected as the column on which to base the expression-
based fragmentation because it was used in most of the queries.
The index was also built using the same fragmentation strategy except that two
expressions used for the table were collapsed into one fragment for the index, and the
dbspaces selected were spaces not used for the table data.
After the fragmentation was completed, the queries were run with the following results:
Query   Filters            Scan type    Fragments scanned
1       business_id, qty   Index        1
2       business_id        Index        1
3       business_id        Index        1
4       family_id          Sequential   All
5       None               Sequential   All
6       None               Sequential   All
7       business_id        Sequential   1


Exercise 5
Table and index partitioning
• Work with fragmentation strategies on tables
• Work with fragmentation strategies on indexes


Exercise 5: Table and index partitioning


Exercise 5:
Table and index partitioning

Purpose:
In this exercise, you will learn how to partition tables and indexes.

Task 1. Expression fragmentation on the customer table.


In this task, you will run dbschema, and then unload and drop the customer table.
You will create fragmentation on the table using fragment by expression, and then
reload it. You will check to see that the table has been fragmented using information
from the system catalog tables.
1. Use the dbschema utility to generate a file named customer.sql, which
contains the SQL statements necessary to recreate the customer table. Make
sure you include the extent sizes and dbspace location. Important: Be sure to
save the index information in a separate file to use later in this exercise.
2. Unload the customer table to a file named customer.unl.
3. Drop the customer table.
4. Edit the customer.sql schema file:
• Make sure to start your customer_num column at 101.
• Create a fragmentation strategy on the customer table. The data will be
fragmented by expression on the lname column using dbspaces
dbspace2, dbspace3, and dbspace4. Use the following distribution
scheme for the lname column:
• A–I
• J–Q
• R–Z
• Save any CREATE INDEX statements into a separate file and delete
them from the schema file. The indexes will be created later.
Should the extent size be changed?
5. Create the new customer table by running your customer.sql file in dbaccess.
6. Query the sysfragments system catalog table to verify the information about
your fragmentation on the customer table.
7. Load the customer table using the customer.unl file. Use the following
command:
LOAD FROM "customer.unl"
INSERT INTO customer;


Task 2. Round robin fragmentation on the orders table.


In this task, you will run dbschema, and then unload and drop the orders table. You
will create a round robin fragmentation on the table and reload it. You will check to
see that the table has been fragmented using information from the oncheck utility.
1. Use the dbschema utility to generate a file named orders.sql, which contains
the SQL statements necessary to recreate the table. Make sure you include the
extent sizes and dbspace locations. Important: Be sure to save the index
information in a separate file to use later in this exercise.
2. Unload the orders table to a file named orders.unl.
3. Drop the orders table.
4. Edit the orders.sql schema file.
• Make sure to start your order_num column in the orders table at 1001.
• Create a round robin fragmentation strategy using dbspaces dbspace3
and dbspace4.
• Save any CREATE INDEX statements into a separate file and delete
them from the schema file. The indexes will be created later.
5. Create the new orders table by running your orders.sql files in dbaccess.
6. Load the orders table using the orders.unl file.
7. Use the oncheck -pt command to verify the information about your
fragmentation on the orders table.
Task 3. Create indexes on the customer table.
In this task, you will create the indexes for the customer table from the file you
saved in the previous exercise. The indexes will be created in dbspace2 and
dbspace3. You will use the oncheck utility to verify the location of the indexes on
the customer table.
1. Using the file with the index statements you saved in the previous exercise,
create the customer_num_ix index in dbspace2 and the zipcode_ix index in
dbspace3 for the customer table.
2. Use the oncheck -pt command to verify the location of the customer_num_ix
and zipcode_ix indexes.


Task 4. Create indexes on the orders table.


1. Create the indexes for the orders table using your saved index script file from
the previous exercise.
2. Why did Step 1 fail?
3. Drop the order_custnum_ix index.
4. Create the indexes for the orders table in dbspace3 and dbspace4 using
named partitions with the following distribution scheme:
Index order_custnum_ix (customer_num):
• customer_num < 500 in a partition named ordercust1 in dbspace3
• customer_num >= 500 in a partition named ordercust2 in dbspace3
Unique index ordernum_ix (order_num):
• order_num < 1250 in a partition named ordernum1 in dbspace4
• order_num >= 1250 in a partition named ordernum2 in dbspace4
5. Use the oncheck -pt command to verify the location of the
order_custnum_ix and ordernum_ix indexes on the orders table.
Task 5. Challenge task.
If you have completed the previous tasks, try this challenge task.
1. Examine the following fragment scheme.
CREATE TABLE ....
FRAGMENT BY EXPRESSION
lname [1,1] < "I" IN dbspace2,
lname [1,1] < "R" AND lname [1,1] >= "I" IN dbspace3,
lname [1,1] >= "R" IN dbspace4
EXTENT SIZE 60
...
Why might this fragmentation scheme be faster and more robust?
Results:
In this exercise, you learned how to partition tables and indexes.


Exercise 5:
Table and index partitioning - Solutions
Purpose:
In this exercise, you will learn how to partition tables and indexes.

Task 1. Expression fragmentation on the customer table.


In this task, you will run dbschema, and then unload and drop the customer table.
You will create fragmentation on the table using fragment by expression, and then
reload it. You will check to see that the table has been fragmented using information
from the system catalog tables.
1. Use the dbschema utility to generate a file named customer.sql, which
contains the SQL statements necessary to recreate the customer table. Make
sure you include the extent sizes and dbspace location. Important: Be sure to
save the index information in a separate file to use later in this exercise.
$ dbschema -d stores_demo -t customer -ss customer.sql
2. Unload the customer table to a file named customer.unl.
UNLOAD TO customer.unl
SELECT * FROM customer;
3. Drop the customer table.
DROP TABLE customer;
4. Edit the customer.sql schema file, and be sure to save (do not run) the file once
it has been modified. Take note of the following when you modify it:
• Make sure to start your customer_num column at 101.
• Create a fragmentation strategy on the customer table. The data will be
fragmented by expression on the lname column using dbspaces
dbspace2, dbspace3, and dbspace4. Use the following distribution
scheme for the lname column:
• A–I
• J–Q
• R–Z
• Save any CREATE INDEX statements into a separate file and delete
them from the schema file. The indexes will be created later.


Should the extent size be changed?


Yes, the extent sizes now apply to each fragment. You can calculate the
extent size using the percentage of records contained in each fragment
of the distribution scheme. Sometimes this might vary per fragment
depending on the distribution of the data.

CREATE TABLE customer (
    customer_num SERIAL(101),
    fname CHAR(15),
    lname CHAR(15),
    company CHAR(20),
    address1 CHAR(20),
    address2 CHAR(20),
    city CHAR(15),
    state CHAR(2),
    zipcode CHAR(5),
    phone CHAR(18)
)
FRAGMENT BY EXPRESSION
lname [1,1] < "J" AND lname [1,1] >= "A"
    IN dbspace2,
lname [1,1] < "R" AND lname [1,1] >= "J"
    IN dbspace3,
lname [1,1] <= "Z" AND lname [1,1] >= "R"
    IN dbspace4
EXTENT SIZE 60 NEXT SIZE 8 LOCK MODE ROW;
5. Create the new customer table by running your customer.sql file in dbaccess.
$ dbaccess stores_demo customer.sql
6. Query the sysfragments system catalog table to verify the information about
your fragmentation on the customer table.
SELECT t.tabname, f.*
FROM systables t, sysfragments f
WHERE t.tabname = "customer"
AND t.tabid = f.tabid;
7. Load the customer table using the customer.unl file. Use the following
command:
LOAD FROM "customer.unl"
INSERT INTO customer;


Task 2. Round robin fragmentation on the orders table.


In this task, you will run dbschema, and then unload and drop the orders table. You
will create a round robin fragmentation on the table and reload it. You will check to
see that the table has been fragmented using information from the oncheck utility.
1. Use the dbschema utility to generate a file named orders.sql, which contains
the SQL statements necessary to recreate the table. Make sure you include the
extent sizes and dbspace locations. Important: Be sure to save the index
information in a separate file to use later in this exercise.
$ dbschema -d stores_demo -t orders -ss orders.sql
2. Unload the orders table to a file named orders.unl.
UNLOAD TO "orders.unl" SELECT * FROM orders;
3. Drop the orders table.
DROP TABLE orders;
4. Edit the orders.sql schema file, and be sure to save (do not run) the file once it
has been modified. Take note of the following when you modify it:
• Make sure to start your order_num column in the orders table at 1001.
• Create a round robin fragmentation strategy using dbspaces dbspace3
and dbspace4.
• Save any CREATE INDEX statements into a separate file and delete
them from the schema file. The indexes will be created later.
CREATE TABLE orders (
order_num SERIAL(1001),
order_date DATE,
customer_num INTEGER,
ship_instruct CHAR(20),
backlog CHAR(1),
po_num CHAR(10),
ship_date DATE,
ship_weight DECIMAL(8,2),
ship_charge MONEY(6,2),
paid_date DATE
)
FRAGMENT BY ROUND ROBIN
IN dbspace3, dbspace4;


5. Create the new orders table by running your orders.sql files in dbaccess.
$ dbaccess stores_demo orders.sql
6. Load the orders table using the orders.unl file.
LOAD FROM "orders.unl" INSERT INTO orders;
7. Use the oncheck -pt command to verify the information about your
fragmentation on the orders table.
$ oncheck -pt stores_demo:orders | more

Table orders:



Task 3. Create indexes on the customer table.


In this task, you will create the indexes for the customer table from the file you
saved in a previous task. The indexes will be created in dbspace2 and dbspace3.
You will use the oncheck utility to verify the location of the indexes on the
customer table.
1. Using the file with the index statements you saved in the previous exercise,
create the customer_num_ix index in dbspace2 and the zipcode_ix index in
dbspace3 for the customer table.
CREATE UNIQUE INDEX customer_num_ix
ON customer (customer_num)
IN dbspace2;
CREATE INDEX zipcode_ix ON customer (zipcode)
IN dbspace3;
2. Use the oncheck -pt command to verify the location of the customer_num_ix
and zipcode_ix indexes.
$ oncheck -pt stores_demo:customer

The location of the customer_num_ix index:


The location of the zipcode_ix index:

Task 4. Create indexes on the orders table.


1. Create the indexes for the orders table using your saved index script file from
the previous exercise.
CREATE INDEX order_custnum_ix
ON orders (customer_num);
CREATE UNIQUE INDEX ordernum_ix
ON orders (order_num);

Attempts to execute these commands should result in the following error on the
second statement:
872: Invalid fragment strategy or expression for the unique index.


2. Why did Step 1 fail?


The orders table uses round robin fragmentation. Creating a unique index
without specifying a dbspace results in the error shown above.
Indexes cannot be explicitly fragmented by round robin, so a dbspace must be
specified when the indexes are created, or the indexes must be fragmented by
expression.
Nonunique (duplicate) indexes, however, can become fragmented by round
robin if the table is fragmented by round robin and no dbspace or
fragmentation expression is specified in the CREATE INDEX statement. The
duplicate index was therefore created, but it is a round robin index.
3. Drop the order_custnum_ix index.
DROP INDEX order_custnum_ix;
4. Create the indexes for the orders table in dbspace3 and dbspace4 using
named partitions with the following distribution scheme:
Index order_custnum_ix (customer_num):
• customer_num < 500 in a partition named ordercust1 in dbspace3
• customer_num >= 500 in a partition named ordercust2 in dbspace3
Unique index ordernum_ix (order_num):
• order_num < 1250 in a partition named ordernum1 in dbspace4
• order_num >= 1250 in a partition named ordernum2 in dbspace4
CREATE INDEX order_custnum_ix ON orders(customer_num)
PARTITION BY EXPRESSION
PARTITION ordercust1 (customer_num < 500) IN dbspace3,
PARTITION ordercust2 (customer_num >= 500) IN dbspace3;

CREATE UNIQUE INDEX ordernum_ix ON orders(order_num)
PARTITION BY EXPRESSION
PARTITION ordernum1 (order_num < 1250) IN dbspace4,
PARTITION ordernum2 (order_num >= 1250) IN dbspace4;


5. Use the oncheck -pt command to verify the location of the order_custnum_ix
and ordernum_ix indexes on the orders table.
$ oncheck -pt stores_demo:orders | more

Index order_custnum_ix:



Index ordernum_ix:



Task 5. Challenge task.


If you have completed the previous tasks, try this challenge task.
1. Examine the following fragment scheme.
CREATE TABLE ....
FRAGMENT BY EXPRESSION
lname [1,1] < "I" IN dbspace2,
lname [1,1] < "R" AND lname [1,1] >= "I" IN dbspace3,
lname [1,1] >= "R" IN dbspace4
EXTENT SIZE 60
...
Why might this fragmentation scheme be faster and more robust?
This fragmentation scheme covers all possible values of the first
character position of lname (punctuation and nulls included), not just
the values that you are expecting. Also, the fragmentation scheme for
dbspace3 puts the more restrictive condition in front of the AND
keyword, so the second condition after the AND keyword is not evaluated
as often.
Results:
In this exercise, you learned how to partition tables and indexes.


Unit summary
• List the ways to fragment a table
• Create a fragmented table
• Create a detached fragmented index
• Describe temporary fragmented table and index usage


Unit summary

Maintaining table and index partitioning

Informix (v12.10)

Unit 6 Maintaining table and index partitioning


Unit objectives
• Change the fragmentation strategy of a table
• Change the fragmentation strategy of an index
• Explain how to skip inaccessible fragments

Maintaining table and index partitioning © Copyright IBM Corporation 2017

Unit objectives


The ALTER FRAGMENT statement


• ALTER FRAGMENT ... INIT
 Initialize a new fragmentation scheme
• ALTER FRAGMENT ... ADD
 Add an additional fragment
• ALTER FRAGMENT ... DROP
 Drop a fragment
• ALTER FRAGMENT ... MODIFY
 Modify a fragmentation expression or dbspace
• ALTER FRAGMENT ... ATTACH or DETACH:
 Combine tables into a single fragmented table, or
 Move a fragment into a separate table


The ALTER FRAGMENT statement


Use the ALTER FRAGMENT statement if you want to change your fragmentation
strategy. For example, if by monitoring the I/O on your table fragments you determine
that there is a bottleneck caused by unbalanced I/O, you should consider modifying
your original fragmentation strategy.
The individual options are discussed in detail on the following pages.
The ALTER FRAGMENT statement works the same way for partitioned tables. However,
the SQL syntax changes slightly to include the PARTITION keyword and the partition
name.


Initializing a new fragmentation strategy


• Make a fragmented table non-fragmented
ALTER FRAGMENT ON TABLE table1
INIT IN dbspace2;
• Make a non-fragmented table fragmented
ALTER FRAGMENT ON TABLE table1
INIT FRAGMENT BY ROUND ROBIN
IN dbspace1, dbspace2;
• Completely change the fragmentation strategy
ALTER FRAGMENT ON TABLE table1
INIT FRAGMENT BY EXPRESSION
col_1 <= 10000 AND col_1 >= 1
IN dbspace1,
col_1 <= 20000 AND col_1 > 10000
IN dbspace2;


Initializing a new fragmentation strategy


The INIT clause of the ALTER FRAGMENT statement can do any of the following:
• Make a fragmented table non-fragmented. Include the dbspace name where the
non-fragmented table is to be placed.
• Make a non-fragmented table fragmented. Include the fragmentation scheme.
• Completely change the fragmentation strategy. The INIT clause is an easy way to
change the entire fragmentation scheme; for example, changing from round robin
to expression based.
Here is an example using a partitioned table:
ALTER FRAGMENT ON TABLE table1
INIT PARTITION BY EXPRESSION
PARTITION part1 col_1 BETWEEN 0 AND 5000 IN dbspace1,
PARTITION part2 col_1 > 5000 IN dbspace2;


Adding an additional fragment


• Use the ADD clause to add additional fragments
• Expression:
ALTER FRAGMENT ON TABLE orders
ADD note_code > 3000 IN dbspace4;
ALTER FRAGMENT ON TABLE orders
ADD note_code <= 3000 OR note_code = 3500
IN dbspace3 BEFORE dbspace4;
• Round robin:
ALTER FRAGMENT ON TABLE customer
ADD dbspace3;


Adding an additional fragment


During execution of the ADD clause, rows are redistributed as necessary to comply
with the new distribution scheme.
Example
The first two examples add new dbspaces for expression-based fragmentation. The
BEFORE or AFTER clause is used to insert the new condition either before or after
existing conditions. This can be important because conditions within an expression are
evaluated sequentially. New conditions cannot be added after a remainder clause. If
BEFORE or AFTER is not specified, the dbspace is added at the end of the expression
but before any remainder clause.
The third example adds another dbspace for round robin fragmentation.
Add the PARTITION keyword and the partition name after the ADD keyword to add a
new partition to an existing table.
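For example, a new partition could be added to an already partitioned table like this (the partition name, expression, and dbspace are illustrative):

```sql
ALTER FRAGMENT ON TABLE table1
    ADD PARTITION part3 col_1 > 20000 IN dbspace3;
```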


Dropping a fragment
• Use the DROP clause to drop a fragment and move all the rows (or
index keys) in the dropped fragment to another fragment.

ALTER FRAGMENT ON TABLE table1


DROP dbspace1;

ALTER FRAGMENT ON TABLE table1


DROP PARTITION part1;

ALTER FRAGMENT ON INDEX table1_idx1


DROP dbspace4;


Dropping a fragment
When dropping a fragment, make sure the other fragments have enough space to hold
the rows that are to be moved there. For example, in an expression-based
fragmentation scheme, the rows in the dropped fragment are most likely to go to the
remainder fragment.
Dropping the number of fragments to less than two is not allowed.
To drop a partition, use the partition name instead of the dbspace in which it resides.


Modifying an existing fragment

• Use the MODIFY clause to change the expression or dbspace for a


fragment
ALTER FRAGMENT ON TABLE table1
MODIFY dbspace1 TO col_1 > 30000
IN dbspace1;
ALTER FRAGMENT ON TABLE table1
MODIFY part1 TO
PARTITION part1 col_1 > 30000
IN dbspace1;
ALTER FRAGMENT ON TABLE table1
MODIFY dbspace3 TO
REMAINDER IN dbspace5;


Modifying an existing fragment


If you change the expression, rows in the existing fragment that no longer match the
expression are moved to the appropriate fragment. If a row must be moved but matches
no fragment expression, an error is returned and the ALTER FRAGMENT fails.
To modify a partition, use the partition name instead of the dbspace in which it resides.
Remember to include the PARTITION keyword and the partition name in the TO
clause, or a regular fragment is created if one does not already exist in the dbspace
specified.


Attaching and detaching fragments


• Use the ATTACH clause to combine two tables with identical schemas
into one fragmented table.
ALTER FRAGMENT ON TABLE table1
ATTACH table1, table2;
• Use the DETACH clause to separate a part of a fragmented table into
a non-fragmented table.
ALTER FRAGMENT ON TABLE table1
DETACH dbspace2 table2;
ALTER FRAGMENT ON TABLE table1
DETACH PARTITION part1 table2;


Attaching and detaching fragments


The ATTACH and DETACH clauses provide additional flexibility in modifying
fragmented tables.
ATTACH
You can combine two identical tables into one table that is fragmented by using the
ATTACH clause. In the example above, table1 and table2 are two tables that are
combined into one table, named table1. Both tables must have identical schemas and
must be in different dbspaces. No referential, primary key, or unique constraints are
allowed in either table. The consumed table cannot have serial columns and the
surviving table cannot have check constraints.
Index rebuilds with ATTACH
Rebuilding indexes can be avoided if the newly added fragment is symmetric to the
fragmentation of the table.
• There can be no data overlap of the newly added fragment with the existing table
fragments.
• The index for the newly added fragment must be on the same set of columns as
the index of the target table.


• The index must have the same properties (unique, duplicate) as the index of the
target table.
• The index for the newly added fragment cannot reside in any of the dbspaces
used by the index fragments of the target table.
DETACH
You can extract a fragment to create a separate table by using the DETACH clause.
Once a fragment is detached, the table that is created can be dropped. This is useful in
situations where a rolling set of fragments is maintained over time and new fragments
are added and old fragments removed.
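As a sketch of the rolling pattern just described (the table, dbspace, and new table names are illustrative):

```sql
-- Detach the oldest fragment into its own table...
ALTER FRAGMENT ON TABLE sales
    DETACH dbspace1 old_sales;

-- ...then drop the detached table to remove the old data
DROP TABLE old_sales;
```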
Index rebuilds with DETACH
Index rebuilds on the original table are not necessary if the index fragmentation strategy
of the detached fragment is identical to, or exactly parallels, the table fragmentation. In
that case, the index fragments corresponding to the detached fragment are simply
dropped.
The DETACH command does not work on tables created WITH ROWIDS.
Use the same syntax to attach and detach partitions, except you would use the
PARTITION partition_name syntax instead of dbspace_name syntax, as shown in the
last example on the visual.


How is ALTER FRAGMENT executed?


• Databases with logging:
 ALTER FRAGMENT executes as a single transaction.
 The entire table is locked during the statement.
 If a row must move to another fragment, it is deleted from the old location and
inserted into the new fragment.
• Databases without logging:
 The old fragments are kept intact until the ALTER FRAGMENT operation
completes.
 The entire table is locked during the statement.


How is ALTER FRAGMENT executed?


For databases with logging, ALTER FRAGMENT is executed as follows:
• The statement executes as a single transaction, with the movement of each row
added as an entry in the logical log. Because of the potentially large number of
log entries, you might encounter a long transaction. For very large tables,
consider turning off logging during the alter, or separating the operation into
smaller ALTER FRAGMENT statements.
• The entire table is locked exclusively during execution of the statement.
• If a row needs to be moved to another fragment, it is deleted from the old location
and added to the new fragment. The disk space for the old row is freed as soon
as the row is moved, but the extent is still allocated until it is entirely emptied.
Make sure that you have enough disk space to accommodate the fragment from
which the row is deleted as well as the fragment to which it is added.
For databases without logging, ALTER FRAGMENT is executed as follows:
• The fragment is kept intact until the ALTER FRAGMENT statement completes.
Make sure that you have enough disk space to accommodate both the old and
new fragments.
• The entire table is locked during execution of the statement.


Skipping inaccessible fragments


• Informix allows you to ignore unavailable fragments in a table using the
SET DATASKIP statement in SQL or the DATASKIP parameter:
 To turn on dataskip:
SET DATASKIP ON;
 To turn off dataskip:
SET DATASKIP OFF;
 To skip specific fragments:
SET DATASKIP ON dbspace1;
 Follow the skip strategy set by the configuration parameter:
SET DATASKIP DEFAULT;


Skipping inaccessible fragments


You can use the SQL statement SET DATASKIP or the DATASKIP configuration
parameter to choose whether to skip unavailable fragments during a SELECT
operation.
Whenever a fragment is skipped, the sqlca.sqlwarn.sqlwarn7 flag is set to W (Informix
ESQL/C and ESQL/COBOL).
An unavailable fragment cannot be skipped under the following circumstances:
• Referential integrity: In order to delete a parent row, the child rows must also be
available for deletion. In order to insert a child row, the parent row must be
available.
• Updates: An update that must move a row from one fragment to another requires
that both fragments be available.
• Inserts or deletes: A row that must be put in a specific fragment (because of
expression-based fragmentation) requires that the fragment be available.
• Indexes: An index key must be available if an INSERT, UPDATE, or DELETE
affects that key.
• Serial keys: The first fragment stores the current serial key value. An INSERT that
requires the next serial value requires the first fragment.


Defragmenting partitions
• After appending data to partitions, you might end up with many
extents; mapping a logical page number to a physical address
becomes slow
• Chunk allocation (allocating space from a chunk) is also much slower if
you have many small extents, and this is a common operation
• Defragment by table name (call task or admin):
EXECUTE FUNCTION task("defragment",
"database:[owner.]table");
• Defragment by partition number (call task or admin)
EXECUTE FUNCTION task("defragment partnum",
partition_number_list);
• Example
EXECUTE FUNCTION task("defragment","oltr:tab1");


Defragmenting partitions
As rows are inserted into tables over time, the number of extents allocated for the table
increases and may become fragmented within a partition. This could lead to a decrease
in server performance.
The Partition Defragmenter feature addresses this problem by reorganizing the table
into fewer, larger extents. Defragmentation can be performed while the database
server is online, so no downtime is required.
The Partition Defragmenter is initiated by executing an administrative function (task or
admin) and specifying defragment as the first argument. If you specify a table name as
the next argument, then all partitions for the table are defragmented. To defragment just
one partition of a table, or to defragment index partitions, specify the partition number of
the table or index partition.
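For example, you might look up a fragment's partition number in the sysmaster database and pass it to the defragment partnum function (the database name, table name, and returned partition number shown here are illustrative):

```sql
-- Find the partition numbers for the table's fragments
SELECT partnum
  FROM sysmaster:systabnames
 WHERE dbsname = 'oltr' AND tabname = 'tab1';

-- Defragment one partition by its partition number
EXECUTE FUNCTION task("defragment partnum", "1048612");
```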


Exercise 6
Maintaining table and index partitioning
 Modify fragmentation schemes


Exercise 6: Maintaining table and index partitioning


Exercise 6:
Maintaining table and index partitioning

Purpose:
In this exercise, you will learn how to maintain table and index partitioning.

Task 1. Altering the customer table.


In this task, you will balance the data in the dbspaces of the customer table by
altering the distribution scheme. You will use the oncheck utility to determine the
correct distribution.
1. Using the oncheck command on the customer table, determine the number of
data pages used in each dbspace.
2. Is the data distributed evenly for the balance of I/O across disks?
3. Modify the existing fragments using the ALTER FRAGMENT statement to
create an even distribution of data in the customer table. Hint: You can get an
idea of the distribution of data by using the following SQL statement:
SELECT lname[1,1], count(*)
FROM customer
GROUP BY 1
ORDER BY 1;
4. Use the oncheck command to verify the data distribution is even.
Task 2. Adding a fragment to the orders table.
In this task, you will add a fragment to the orders table, load the table with
additional data, and check to see how the data was distributed using the oncheck
utility.
1. Using the oncheck command, how many rows are in each dbspace of the
orders table?
2. Add a fragment to the orders table fragmentation schema using dbspace2.
3. Run the load script called loadorders2.sql. This script loads additional rows
into the orders table. Use the following command to execute the load script:
$ dbaccess stores_demo loadorders2.sql
4. Using the oncheck command again, how many rows are now in each dbspace
of the orders table?
5. Is the data distributed evenly for the balance of I/O across disks?
Why or why not?


Task 3. Alter fragmentation of an index on the customer table.


1. Initialize a fragmentation strategy on the customer_num_ix index of the
customer table using dbspaces dbspace2 and dbspace3. Use the following
distribution for the customer_num column:
• 100 - 2499 in dbspace2
• 2500 and greater in dbspace3
2. Using the oncheck command, how many used pages are in each dbspace of
the customer_num_ix index?
Results: In this exercise, you learned how to maintain table and index
partitions.


Exercise 6:
Maintaining table and index partitioning - Solutions

Purpose:
In this exercise, you will learn how to maintain table and index partitioning.

Task 1. Altering the customer table.


In this task, you will balance the data in the dbspaces of the customer table by
altering the distribution scheme. You will use the oncheck utility to determine the
correct distribution.
1. Using the oncheck command on the customer table, determine the number of
data pages used in each dbspace.
$ oncheck -pt stores_demo:customer


2. Is the data distributed evenly for the balance of I/O across disks? No.
3. Modify the existing fragments using the ALTER FRAGMENT statement to
create an even distribution of data in the customer table. Hint: You can get an
idea of the distribution of data by using the following SQL statement:
SELECT lname[1,1], count(*)
FROM customer
GROUP BY 1
ORDER BY 1;

ALTER FRAGMENT ON TABLE customer


INIT FRAGMENT BY EXPRESSION
lname[1,1] < "H" and lname[1,1] >= "A"
IN dbspace2,
lname[1,1] < "T" and lname[1,1] >= "H"
IN dbspace3,
lname[1,1] <= "Z" and lname[1,1] >= "T"
IN dbspace4;
4. Use the oncheck command to verify the data distribution is even.
$ oncheck -pt stores_demo:customer


Task 2. Adding a fragment to the orders table.


In this task, you will add a fragment to the orders table, load the table with
additional data, and check to see how the data was distributed using the oncheck
utility.
1. Using the oncheck command, how many rows are in each dbspace of the
orders table?
$ oncheck -pt stores_demo:orders


2. Add a fragment to the orders table fragmentation schema using dbspace2.


ALTER FRAGMENT ON TABLE orders
ADD dbspace2;
3. Run the load script called loadorders2.sql. This script loads additional rows
into the orders table. Use the following command to execute the load script:
$ dbaccess stores_demo loadorders2.sql
4. Using the oncheck command again, how many rows are now in each dbspace
of the orders table?
$ oncheck -pt stores_demo:orders


5. Is the data distributed evenly for the balance of I/O across disks?
Why or why not? No. The data is not redistributed in round robin
fragmentation when you alter the schema to add another dbspace.
The newly added fragment contains only its portion of the data just
inserted.
Task 3. Alter fragmentation of an index on the customer table.
1. Initialize a fragmentation strategy on the customer_num_ix index of the
customer table using dbspaces dbspace2 and dbspace3. Use the following
distribution for the customer_num column:
• 100 - 2499 in dbspace2
• 2500 and greater in dbspace3
ALTER FRAGMENT ON INDEX customer_num_ix
INIT FRAGMENT BY EXPRESSION
customer_num < 2500 AND customer_num >= 100
IN dbspace2,
customer_num >= 2500
IN dbspace3;
2. Using the oncheck command, how many used pages are in each dbspace of
the customer_num_ix index?
$ oncheck -pt stores_demo:customer


Results: In this exercise, you learned how to maintain table and index
partitions.


Unit summary
• Change the fragmentation strategy of a table
• Change the fragmentation strategy of an index
• Explain how to skip inaccessible fragments


Unit summary

The Informix query optimizer

The Informix query optimizer

Informix (v12.10)

© Copyright IBM Corporation 2017



Unit objectives
• Understand query plans, access plans, and join plans
• Write queries that produce various index scans

The Informix query optimizer © Copyright IBM Corporation 2017

Unit objectives


The query plan


• The query plan is the combination of plans that the optimizer chooses,
including:
 Access plan
 Join plan
• An optimal query plan:
 Minimizes the number of pages read
 Eliminates unnecessary sorts


The query plan


A query plan is the road map that the query optimizer chooses for retrieving the
requested data. The optimizer evaluates the different ways in which a query could be
performed and selects the best way to access the requested data. The result is the
query plan, which is used to process the query.
A good query plan allows the Informix server to read the fewest pages possible, and
eliminates sorts for ORDER BY and GROUP BY clauses by reading data in the
required order. The query plan includes a plan for accessing data and a plan for
performing joins between tables.


Access plan
• Which access method to use:
 Index scan
 Sequential scan
• Which order to process tables

Table B Table A Table C

Index B-03 Index A-01 Index C-02


Access plan
Informix can choose from various methods to retrieve data from disk. The method
chosen for retrieving data is called the access plan. Informix uses the following access
plans:
• Sequential scan: The database server reads all the rows of the table in physical
order.
• Index scan: The database server reads the index pages, applying filters where
possible, and uses the ROWID values stored in the index to retrieve qualifying
rows.
• Key-only index scan: When all of the data required to satisfy the query is
contained within the index, the database server retrieves the data requested from
the index, eliminating the need to read the associated data pages.
• Key-first index scan: A key-first index scan uses index-key filters, in addition to
upper and lower filters, to reduce the number of rows that a query must read.


• Auto-index scan: The auto-index scan is a feature that allows the database server
to automatically build a temporary index on one or more columns used as query
filters. The database server reads this temporary index to locate the required
data. The index is only available for the duration of the query. The auto-index
feature of Informix can be beneficial to some OLTP batch activities. It allows the
query to benefit from the index without the overhead of index maintenance during
insert, update, and delete activity. In active OLTP environments, the overhead of
modifying indexes when a table has rows inserted, updated, or deleted can be
significant.


Join plan
• Which join method to use:
 Nested-loop join
 Dynamic hash join
• Which order to join tables

Table B Table A


Join plan
When a query contains more than one table, the query optimizer must determine how and
in which order to join the tables using filters in the query. The way that the optimizer
chooses to join the tables is the join plan. Informix uses the following join plans:
• Nested-loop join: In a nested-loop join, the database server scans the first, or
outer table, and then joins each of the rows that pass table filters to the rows
found in the second, or inner table.
• Dynamic hash join: In a dynamic hash join, the rows of the first table, or build
table, are processed through an internal hash algorithm, and then the hashed
rows are stored in a hash table. The second table, or probe table, is then read
sequentially, and each row is checked against the hash table to see if there are
matches from the build table.


Join method: Nested-loop join

Table 1 Table 2


Join method: Nested-loop join


In a nested-loop join, a row from the first table, called the outer table, is read, and the
value in the join column is used to select corresponding rows from the second table,
called the inner table.
The database server accesses the outer table using either an index or a table scan,
applying any table filters. For each row that satisfies the filters on the outer table, the
database server reads the inner table to find a match (or matches). Thus, the database
server accesses the inner table once for every row in the outer table that satisfies the
filters. If the inner table is indexed, an index search can be used. If the inner table does
not have an index, the database server might construct an auto-index when the query is
executed, or it may choose to scan the inner table for each qualifying row in the outer
table.
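For example, a query like the following might be processed with a nested-loop join: customer is scanned as the outer table with the state filter applied, and an index on orders.customer_num, if one exists, is used to probe the inner table. The table and column names are from the stores_demo schema used in the exercises:

```sql
SELECT c.lname, o.order_num
  FROM customer c, orders o
 WHERE c.customer_num = o.customer_num
   AND c.state = 'CA';
```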


Join method: Dynamic hash join

Table 1 (build table)


hash
algorithm

hash
table
Table 2 (probe table)

hash
algorithm


Join method: Dynamic hash join


In a dynamic hash join, the rows of the first table, the build table, are processed through
a hash algorithm, and the hashed rows are then stored in a hash table.
The second table, the probe table, is then read sequentially, and each row is checked
against its potential hash location to see if there are matches from the build table.
This is best used when there are tables of vastly different sizes, or when neither of the
two join tables has an index on the join column. No index and no sorting is required
when a dynamic hash join is performed. The smaller table should be the build table for
best performance, as the hash table might be stored in memory.
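A query of the following shape, joining two large tables on a column that is not indexed in either one, is a typical candidate for a dynamic hash join (the table and column names are illustrative):

```sql
SELECT a.acct_id, b.txn_total
  FROM accounts a, txn_summary b
 WHERE a.region_code = b.region_code;
```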


Evaluate information for each table


• Examine selectivity of every filter
• Determine if indexes can be used for:
 Filters
 ORDER BY or GROUP BY clauses
• Find the best way to access a table:
 Sequentially
 With an index


Evaluate information for each table


The optimizer compares the cost of each possible plan to determine which will be the
least costly. The cost is derived from estimates of the number of I/O operations
required, the number and types of calculations to be performed, the number of rows to
be accessed, the amount of sorting to be performed, and so on.
Since query costs are largely determined by the number of rows that must be read from
each table, the optimizer begins building a query plan by considering the conditional
expressions in the WHERE clause to determine how many rows qualify. These
conditional expressions are often referred to as filters.
Each filter is examined to determine its selectivity. Selectivity is a number between 0 and 1
that indicates the fraction of rows the optimizer estimates will satisfy the filter. If all
rows satisfy the filter, the selectivity is 1. A very selective filter has a selectivity near 0.
To determine the selectivity for a filter, the optimizer analyzes the data distributions for
the column. If distributions are not available, the optimizer must apply a function to
calculate the anticipated selectivity. A list of filter expressions and the associated
selectivity assigned to these expressions is provided on the following page for your
reference.
Next, the optimizer determines whether an existing index can be used to facilitate
retrieval based on a filter condition, ORDER BY, or GROUP BY clause.


Finally, the optimizer decides whether the table is to be scanned sequentially or with an
index.
FILTER SELECTIVITY ASSIGNMENTS

Filter Expression Selectivity ( F )


indexed-col = literal value F=1/(number of distinct keys in index)
indexed-col = host-variable "
indexed-col is NULL "
tab1.indexed-col = tab2.indexed-col F=1/ (number of distinct keys in the larger index)
indexed-col > literal-value F = (2nd-max - literal-value)/(2nd-max - 2nd-min)
indexed-col < literal-value F = (literal-value - 2nd-min)/(2nd-max - 2nd-min)
any-col is NULL F=1/10
any-col = any-expression "
any-col > any-expression F=1/3
any-col < any-expression "
any-col MATCHES any-expression F=1/5
any-col LIKE any-expression "
EXISTS subquery F=1 if subquery estimated to return >0 rows, else 0
NOT expression F=1 - F(expression)
expr1 AND expr2 F=F(expr1) x F(expr2)
expr1 OR expr2 F=F(expr1) + F(expr2) - (F(expr1) x F(expr2))
any-col IN list Treated as any-col = item_1 OR ... OR any-col = item_n
any-col relational_operator ANY subquery Treated as any-col relational_operator value_1
OR ... OR any-col relational_operator value_n, for estimated subquery size n
KEY:
indexed-col: First or only column in an index.
2nd-max, 2nd-min: Second largest and second smallest key values in indexed column.
any-col: Any column not covered by a preceding formula.
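As a worked example using the formulas above (the column names, index, and distinct-key count are illustrative):

```sql
-- Filter: state = 'CA' AND order_date > '01/01/2017'
--
-- state is the first column of an index with 50 distinct keys:
--     F(state = 'CA') = 1/50 = 0.02
-- order_date is not indexed, so the range filter is assigned:
--     F(order_date > value) = 1/3
-- The AND combination multiplies the selectivities:
--     F = 0.02 x (1/3) = 0.0067
-- so about 0.7% of the rows are estimated to qualify.
```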


Determining the query plan


• The optimizer first constructs all the possible join pairs by applying a
bottom-up, breadth-first search
• Next, the optimizer:
 Considers the I/O and CPU costs associated with each access path and join
pair by evaluating:
− Table information
− Index information
− Distribution data, if available
 Eliminates the more expensive of any two equivalent join pairs
 Selects the lowest cost of the remaining join pairs

ab ac ad bc ....
abc abd acb acd adb adc

abcd abdc acbd acdb adbc adcb



Determining the query plan


In determining a query plan, the optimizer considers the possible access paths for each
table and the possible join methods for each pair of tables. Next, the optimizer might
eliminate the more expensive of any two equivalent join pairs.
If the query joins three or more tables, this process is continued. The costs of a join are
considered for each remaining join pair. Equivalent paths are eliminated.
Finally, the optimizer selects the remaining join sequence with the lowest estimated
cost.


Generating the query plan

• Generate the query plan:


− SET EXPLAIN ON;
− SELECT ....;
− UPDATE ....; Executes query and
− SELECT ....; produces a text file that
contains the selected
− INSERT ....; query plans
− DELETE ....;
− SET EXPLAIN OFF;

• Generate the query plan without executing SQL


− SET EXPLAIN ON AVOID_EXECUTE;


Generating the query plan


To capture the query plan and join plan information that the optimizer chooses, you
must enable the SET EXPLAIN feature.
After SET EXPLAIN ON is executed by a process or user, the database server
produces a text file that contains:
• An estimated cost for the query plan
• The order in which the tables are accessed
• Whether temporary tables are needed to process the query
• The access method for each table
• The join method for each join pair
When the optimizer has completed writing the query plan, the server executes the
query using this plan.
AVOID_EXECUTE
You can use the AVOID_EXECUTE option with SET EXPLAIN to generate a query
plan without actually executing the query, as shown here:
SET EXPLAIN ON AVOID_EXECUTE;


The explain file


• Default locations and name:
 UNIX:
− ${PWD}/sqexplain.out
− ${HOME}/sqexplain.out

 Windows:
− %INFORMIXDIR%\sqexpln\username.out

• Can specify output file location and name using


SET EXPLAIN FILE TO
 Turns Explain on
 Sets location and name
− SET EXPLAIN FILE TO 'sqexpl.out';
− SET EXPLAIN FILE TO '/tmp/expl/sqexpl.out';
− SET EXPLAIN FILE TO 'C:\tmp\sqexpl.out';
• Appends output to an already existing file

The explain file


On UNIX and Linux systems, the default text file created is named sqexplain.out.
If the client application that issues the SET EXPLAIN ON statement is located on the
same computer as the database instance, the sqexplain.out file is created in the current
working directory. If the client application is located on a different computer, the
sqexplain.out file is created in the home directory on the database server computer.
You can specify an explicit location and name for the output file using the SET
EXPLAIN FILE TO syntax.
The output is always appended to any existing file.
The output file is written on the database server. The user must have write permissions
into the target directory or the file is not written and an error is returned.
The FILE TO option cannot be used with the AVOID_EXECUTE option in the same
SET EXPLAIN statement.
This functionality can be accomplished by issuing two consecutive SET EXPLAIN
statements:
SET EXPLAIN FILE TO '/tmp/sqexpl.out';
SET EXPLAIN ON AVOID_EXECUTE;


The query plan


The query plan


The query plan includes the query being optimized, as well as the estimated cost and
number of rows returned. Additionally, it describes how each table will be accessed and
any joins that might be used.


Query statistics
• Query statistics:
 Statistics aggregated and printed out at iterator level
 Only available after query completes
• Query statistics ON by default:
 Set with ONCONFIG parameter EXPLAIN_STAT:
− 1 = on (Default)
− 0 = off
 Configure instance using onmode -wf and onmode -wm
$ onmode -wf EXPLAIN_STAT=0
 Configure session using onmode -Y <session_id> [0|1|2]
− Dynamic explain feature
− Prints out query statistics by default when enabled:
• 0 = turn off dynamic explain
• 1 = turn on dynamic explain, query plan and statistics
• 2 = turn on dynamic explain, query plan only

Query statistics
The output from SET EXPLAIN ON can include detailed information about the tables
scanned in a query. The output displays the estimated and actual number of rows
produced and scanned, and the amount of time it took for each scan and join.
These statistics are provided at the iterator level (each process/scan/join), but the
values are only available after the query completes.
Query statistics are on by default. They can be disabled by setting the onconfig
parameter EXPLAIN_STAT to 0. Query stats can also be configured dynamically at the
instance level by using the onmode -wf and onmode -wm commands with the
parameter EXPLAIN_STAT with the values 0 (to turn off) or 1 (to turn on).
Query stats for an individual session can be dynamically managed using the
onmode -Y command with a session ID and value.
0 – Turns off Dynamic Explain
1 – Enables Dynamic Explain with a query plan and query statistics
2 – Enables Dynamic Explain with a query plan only


EXPLAIN query statistics output


QUERY:
------
SELECT * FROM customer, orders
WHERE customer.customer_num = orders.customer_num;
Query statistics:
-----------------


EXPLAIN query statistics output


The output from SET EXPLAIN can display scan information in addition to the query
plan. This output shows the additional information provided.
The data includes the estimated and actual number of rows produced and scanned,
and the amount of time it took for each scan and join.
The query must complete for the fields to be completely populated. If you use the
AVOID_EXECUTE option, you do not get this output in the explain plan.


Analyzing query plans


 Sequential scan with a temporary file
 Sequential scan with filter
 Key-only index scan - Aggregate
 Index Scan: lower index filters
 Index Scan: lower and upper index filters
 Dynamic Hash Join
 Hash Join: parallel scan and sort threads
 Nested loop join
 Key-first index scan
 Key-only index scan
 Skip duplicate index scan
 Index self joins
 Multi-index scan


Analyzing query plans


In the next pages, we will look at some of the query plans that can be chosen by the
optimizer, including those shown above.


Sequential scan with temporary file


SELECT * FROM stock ORDER BY description;

Estimated cost: 20
Estimated # of Rows Returned: 74
Temporary Files Required For Order by

1) informix.stock: SEQUENTIAL SCAN


Sequential scan with temporary file


Even the simplest query is optimized to find the best access strategy. When the query
plan is chosen, before the query is run, the optimizer's estimates are written to
the sqexplain.out file.
Estimated cost
The estimated cost of each query is included in the query report. In the example above,
the estimated cost is 20. The units are relevant only in the context of comparison with
other possible paths for the same query. The estimated cost does not reflect either how
long the query will take or what the cost in resources will be.
Estimated rows returned
The optimizer also writes the estimated number of rows to be returned to the
sqexplain.out file. This is only an estimate, but usually comes reasonably close to the
actual number of rows returned. This estimate is most accurate when all filter and join
conditions are associated with indexes, and when the statistics for the tables involved in
the query are up-to-date.


Temporary file
When a temporary table or file is created for the query, the reason for the temporary file
or table is listed in the sqexplain.out file. In the example, you can see that a sort was
required to process the ORDER BY clause. The sort requires space to hold
intermediate files. No temporary file is created if you can use an index to order the
tuples (rows).
Only the selected path is reported by the SET EXPLAIN command. You cannot
determine what alternate paths were considered.
Table access strategy
Finally, the access strategy for each table in the query is shown in the report. In the
example on the slide, the table is accessed by a SEQUENTIAL SCAN, which means
that the entire table is read from beginning to end.
Query statistics
If the EXPLAIN_STAT configuration parameter is enabled, a query statistics section is
also included in the explain output file. You can use this information to debug possible
performance problems with the SQL statement. For this query, the query statistics
section appears as follows:


Sequential scan with filter


SELECT * FROM stock WHERE unit_price > 20;

Estimated cost: 5
Estimated # of Rows Returned: 25

1) informix.stock: SEQUENTIAL SCAN


Filters: informix.stock.unit_price > $20.00;


Sequential scan with filter


When the query includes a WHERE clause, the expressions in that clause can either
specify join conditions (indicating the columns that join the tables in the query) or filter
conditions.
Filter
A filter condition is an expression that specifies a rule that is a condition to be placed on
the row to satisfy the query. When the optimizer chooses to access a table with a
SEQUENTIAL SCAN, and filter conditions are placed on columns in that table, the filter
is listed in the Filters section of the output. These conditions are applied to the
rows after they are read from the database but before they are returned to the user's
program.
In the example shown, all the rows in the table are read during the SEQUENTIAL
SCAN. Before any row is returned to the user's program, however, the filter condition
stock.unit_price > 20 is applied to the row. If it meets this condition, the row is returned
to the user. If not, the row is not considered for any further processing.


Query statistics
The query statistics for this query are as follows:


Key-only index scan (aggregate)


SELECT max(order_num) FROM orders;

Estimated cost: 2
Estimated # Rows Returned: 1

1) informix.orders: INDEX PATH


(1) Index Name: informix.ordernum_ix
Index Keys: order_num (Key-Only)
(Aggregate)
(Serial, fragments: ALL)


Key-only index scan (aggregate)


When a query can take advantage of an index on one or more of the tables, the
optimizer chooses to use one or more of these indexes to retrieve rows from the table.
Index path
This type of access is known as an INDEX PATH. Generally, an INDEX PATH is the
fastest access method. The database server only has to look at rows that satisfy one or
more filter conditions. The fewer the number of rows to be read, the faster the query
runs.
When the optimizer chooses to use an index to access a table, it writes to the
sqexplain.out file keys that each index uses. This information is written as Index Keys
for the table.
Key-only select
In some cases, all the data you want to retrieve from a table is contained in the index. In
these situations, reading the data pages is unnecessary. In such a case, Informix
performs a key-only index scan. A key-only scan allows the database server to read
only the index pages. A key-only scan eliminates the I/O and CPU overhead associated
with an unnecessary read of data pages.
In the example, the optimizer has chosen a key-only index scan to return a value from
an aggregate.


The query statistics for this query are as follows:


Index scan with lower index filter


SELECT * FROM stock, items
WHERE stock.stock_num = items.stock_num
AND items.quantity > 1;
Estimated cost: 105
Estimated # Rows Returned: 611
1) informix.stock: SEQUENTIAL SCAN
2) informix.items: INDEX PATH
Filters: informix.items.quantity > 1
(1) Index Name: informix.item_stock_num_ix
Index Keys: stock_num manu_code (Serial, fragments: ALL)
Lower Index Filter: informix.stock.stock_num =
informix.items.stock_num
NESTED LOOP JOIN


Index scan with lower index filter


When a query accesses several tables, the explain output lists the tables in the order in
which they are accessed.
Lower index filter
To perform an indexed read, the optimizer begins by locating the first key value. Once
this position is found, the index can be read sequentially until the key value no longer
meets the condition set. The condition that defines the initial position in the index is
called a lower index filter. The Explain output includes the lower index filter for each
index used, when appropriate.
In the example above, the condition stock.stock_num = items.stock_num is used to
start the index search on the items table. For each stock.stock_num value retrieved
from the sequential scan of the stock table, an index search of the items table is
performed by using that value as a key value in a nested-loop join.


The query statistics for this query are as follows:


Index scan: Lower and upper index filters


SELECT * FROM customer
WHERE customer_num BETWEEN 104 AND 111;

Estimated cost: 2
Estimated # of Rows Returned: 8
1) informix.customer: INDEX PATH
(1) Index Name: informix.customer_num_ix
Index Keys: customer_num (Serial, fragments: 0)
Fragments scanned: (0) dbspace2
Lower Index Filter:
(informix.customer.customer_num >= 104)
Upper Index Filter:
(informix.customer.customer_num <= 111)


Index scan: Lower and upper index filters


Index read start and stop points
When you perform an indexed search of a table, generally two conditions define the
indexed search: one that determines the start point and one that determines the stop
point of the index search.
In some cases, the search is started at the very beginning, that is, at the first position in
the index, called the start point. The index is then searched in order up to a particular
point, called the stop point.
Upper index filter
When an index has a stop point associated with it, the SET EXPLAIN output shows that
condition as an upper index filter.
Some queries have both lower and upper index filters for an indexed read. In such a
case, one condition defines where to start the index search, and another condition
defines where to stop the search.
A typical example of a query that uses both upper and lower index filters is a query with
a BETWEEN clause on an indexed column. Such a query is shown in the example.
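The start-point/stop-point mechanics can be sketched with a sorted list standing in for the index leaf entries (hypothetical key values); Python's bisect locates the two boundary positions:

```python
import bisect

# Hypothetical sorted key values from an index on customer_num.
index_keys = [101, 104, 106, 110, 111, 115, 120]

def range_scan(keys, lower, upper):
    # Lower index filter: position at the first key >= lower (start point).
    start = bisect.bisect_left(keys, lower)
    # Upper index filter: stop just past the last key <= upper (stop point).
    stop = bisect.bisect_right(keys, upper)
    # Only the keys between the two positions are read sequentially.
    return keys[start:stop]

print(range_scan(index_keys, 104, 111))   # [104, 106, 110, 111]
```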


The query statistics for this query are as follows:


Dynamic hash join


SELECT * FROM items, stock
WHERE items.total_price = stock.unit_price;

Estimated cost: 220


Estimated # Rows Returned: 740
1) informix.items: SEQUENTIAL SCAN
2) informix.stock: SEQUENTIAL SCAN

DYNAMIC HASH JOIN


Dynamic Hash Filters: informix.items.total_price =
informix.stock.unit_price


Dynamic hash join


The DYNAMIC HASH JOIN keywords indicate that a hash table is built on one table
and a dynamic hash join is performed. It includes the filter that is used for the join. By
default, the hash table is built on the second table listed in the SET EXPLAIN output. If
the term Build Outer is listed, the hash table is built on the first table listed.
In the example shown, a hash table is built on the unit_price column of the stock table.
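A minimal sketch of the build-and-probe behavior (hypothetical rows; the real server builds the hash table in memory and can overflow partitions to disk):

```python
from collections import defaultdict

def hash_join(build_rows, probe_rows, build_key, probe_key):
    # Build phase: hash every row of the build input (by default, the
    # second table listed in the SET EXPLAIN output) on its join column.
    table = defaultdict(list)
    for row in build_rows:
        table[row[build_key]].append(row)
    # Probe phase: for each row of the other input, look up matches.
    out = []
    for row in probe_rows:
        for match in table.get(row[probe_key], []):
            out.append({**row, **match})
    return out

# Hypothetical rows standing in for stock (build) and items (probe).
stock = [{"stock_num": 1, "unit_price": 20}, {"stock_num": 2, "unit_price": 35}]
items = [{"item_num": 10, "total_price": 35}, {"item_num": 11, "total_price": 99}]
print(hash_join(stock, items, "unit_price", "total_price"))
```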


The query statistics for this query are as follows:


Hash join: Parallel scan and sort threads


SELECT sales_cd, product.prod_cd, manufact, company
FROM product, sales
WHERE sales_cd = 'new'
AND product.prod_cd = sales.prod_cd
GROUP BY 2,3,1,4;
Estimated cost: 6
Estimated # Rows Returned: 1
Temporary files required for: Group By
1) informix.product: SEQUENTIAL SCAN (Parallel, fragments: ALL)
2) informix.sales: SEQUENTIAL SCAN
Filters:
Table Scan Filters: informix.sales.sales_cd = 'new'
DYNAMIC HASH JOIN
Dynamic Hash Filters: informix.product.prod_cd = informix.sales.prod_cd

Hash join: parallel scan and sort threads


Particularly in DSS environments, the optimizer does not assume that index lookups
are better than sequential table scans.
Dynamic hash join
Typically in DSS environments, large amounts of data are read and full table scans are
required. Dynamic hash joins can provide significant performance advantages over the
other join methods, especially when join tables are large. The DYNAMIC HASH JOIN
keywords indicate that a hash join is used. The output further displays the tables and
filters that are used in the join.
Sequential scan (parallel, fragments: all)
The SET EXPLAIN output indicates whether a sequential scan of a fragmented table is
performed in parallel, and which fragments are read. The ability to read table fragments
in parallel can greatly increase query performance.


The query statistics for this query are as follows:

Note: The product and sales tables were created for this and following queries.


Nested loop join


SELECT customer.customer_num
FROM customer, orders
WHERE customer.customer_num = orders.customer_num;

1) informix.orders: INDEX PATH


(1) Index Name: informix.order_custnum_ix
Index Keys: customer_num (Key-Only) (Serial, fragments: ALL)
2) informix.customer: INDEX PATH
(1) Index Name: informix.customer_num_ix
Index Keys: customer_num (Key-Only) (Serial, fragments: ALL)
Lower Index Filter: informix.orders.customer_num =
informix.customer.customer_num
NESTED LOOP JOIN


Nested loop join


In a nested-loop join, the database server scans the first, or outer table, and then joins
each of the rows that pass table filters to the rows found in the second, or inner table.
The database server accesses the outer table by an index or by a table scan. The
database server applies any table filters first. For each row that satisfies the filters on
the outer table, the database server reads the inner table to find a match. Thus, the
database server reads the inner table once for every row in the outer table that fulfills
the table filters. Because of the potentially large number of times that the inner table can
be read, the database server usually accesses the inner table by an index.
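The access pattern can be sketched as follows (hypothetical rows; the dict stands in for the index on the inner table's join column):

```python
def nested_loop_join(outer_rows, inner_index, join_col):
    out = []
    # Scan the outer table once; rows are assumed to have already
    # passed any table filters.
    for outer in outer_rows:
        # Probe the inner table once per qualifying outer row,
        # via an index lookup on the join column.
        for inner in inner_index.get(outer[join_col], []):
            out.append({**outer, **inner})
    return out

# Hypothetical data: orders as the outer table, an index on
# customer.customer_num as the inner access path.
orders = [{"order_num": 1001, "customer_num": 104},
          {"order_num": 1002, "customer_num": 110}]
customer_index = {104: [{"customer_num": 104, "fname": "Anthony"}],
                  110: [{"customer_num": 110, "fname": "Roy"}]}
print(nested_loop_join(orders, customer_index, "customer_num"))
```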


The query statistics for this query are as follows:


Key-first index scan


SELECT * FROM customer
WHERE (customer_num > 120)
AND (zipcode = '27406' OR zipcode = '05001')

Estimated Cost: 190


Estimated # of Rows Returned: 9

1) informix.customer: INDEX PATH


(1) Index Name: customer_num_zip_ix
Index Keys: customer_num zipcode (Key-First) (Serial, fragments:
ALL)
Lower Index Filter: informix.customer.customer_num > 120
Index Key Filters: (informix.customer.zipcode = '27406' OR
informix.customer.zipcode = '05001')

Key-first index scan


A key-first index scan uses other key filters, in addition to lower and upper filters, to
reduce the number of rows that a query reads. Although the server must eventually
read some rows of data from the table, it attempts to reduce the number of possible
rows by applying additional filters first.
The query statistics for this query are as follows:

Note: An index containing the customer_num and zipcode columns was created on the
customer table for this and following queries, as follows:
CREATE INDEX customer_num_zip_ix on customer(customer_num, zipcode);
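The benefit of evaluating key filters on index entries before touching data pages can be sketched like this (hypothetical index entries of the form (customer_num, zipcode, rowid)):

```python
def key_first_scan(index_entries, lower, zip_codes):
    # Walk index entries from the lower-filter position; apply the
    # zipcode key filters on the index keys alone, so only surviving
    # rowids ever require a data-page read.
    rowids = []
    for customer_num, zipcode, rowid in index_entries:
        if customer_num > lower and zipcode in zip_codes:
            rowids.append(rowid)
    return rowids

entries = [(119, "27406", 7), (121, "27406", 8),
           (125, "10001", 9), (130, "05001", 10)]
print(key_first_scan(entries, 120, {"27406", "05001"}))   # [8, 10]
```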


Key-only index scan


SELECT customer_num, zipcode
FROM customer
WHERE customer_num > 300;

Estimated Cost: 182


Estimated # of Rows Returned: 4412

1) informix.customer: INDEX PATH


(1) Index Name: customer_num_zip_ix
Index Keys: customer_num zipcode (Key-Only)
(Serial, fragments: ALL)
Lower Index Filter: informix.customer.customer_num > 300


Key-only index scan


When all of the columns needed to execute a query are contained in a single index, the
Informix optimizer can use a key-only index scan. Because the query can be resolved
using only the index, there is no need to read any of the data pages from the table.
The query statistics for this query are as follows:


Skip duplicate index scan


UPDATE customer SET cust_status = "C" WHERE EXISTS
(SELECT customer_num FROM orders
WHERE customer.customer_num = orders.customer_num )
Estimated Cost: 1207
Estimated # of Rows Returned: 825
1) informix.orders: INDEX PATH (Skip Duplicate)
(1) Index Name: informix.order_custnum_ix
Index Keys: customer_num (Key-Only)(Serial, fragments: ALL)
2) informix.customer: INDEX PATH
(1) Index Name: informix.customer_num_ix
Index Keys: customer_num (Serial, fragments: ALL)
Lower Index Filter: informix.customer.customer_num =
informix.orders.customer_num
NESTED LOOP JOIN

Skip duplicate index scan


The skip duplicate index scan prevents multiple index lookups for the same value on a
secondary table. Consider the query shown in the visual. Multiple order records might
exist for the same customer. This could result in repeated searches into the orders table
for the same value and repeated updates to the same row.
The skip duplicate index scan ensures that only unique customer_num values are
returned, eliminating repeat scans and updates for a single row.
The combination of a key-only scan and skip duplicate scan can result in a significant
reduction in I/O.
Note: The cust_status column was added to the customer table for this query.
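The deduplication step can be sketched as a single pass over the ordered index keys (hypothetical customer_num values):

```python
def skip_duplicate_scan(index_keys):
    # Index keys arrive in sorted order, so duplicates are adjacent;
    # each distinct value triggers only one lookup/update downstream.
    previous, unique = object(), []
    for key in index_keys:
        if key != previous:
            unique.append(key)
            previous = key
    return unique

# Many orders per customer, but each customer is probed only once.
print(skip_duplicate_scan([101, 101, 104, 104, 104, 110]))   # [101, 104, 110]
```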


Index self joins


SELECT * FROM tab1
WHERE col1 >= 1 AND col1 <= 3
AND col2 >= 30 AND col2 <= 31
AND col3 >= 40 AND col3 <= 62;
Estimated Cost: 160
Estimated # of Rows Returned: 828
1) informix.tab1: INDEX PATH
(1) Index Name: informix.tab1_ix
Index Keys: col1 col2 col3 (Key-First) (Serial, fragments: ALL)
Index Self Join Keys (col1)
Lower bound: informix.tab1.col1 >=1
Upper bound: informix.tab1.col1 <= 3
Lower Index Filter: informix.tab1.col1 = informix.tab1.col1 AND informix.tab1.col2 >= 30
AND (informix.tab1.col3 >=40)
Upper Index Filter: informix.tab1.col2 <=31 AND (informix.tab1.col3 <= 62)
Index Key Filters: (informix.tab1.col3 >=40) AND (informix.tab1.col3 <= 62)


Index self joins


The optimizer in Informix has been enhanced to deal with range-based index scans
where the lead columns of an index have highly duplicate values. It does this by using
index self-joins, rather than by scanning the index tree. An index self-join can eliminate
reading a significant portion of the index tree, resulting in much less I/O and greatly
improved performance.
The example depicts a query that uses an index with three columns, the first two of
which have highly duplicate values.
The query statistics for this query are as follows:


Optimizing subqueries
Subquery:
SELECT * FROM customer
WHERE customer_num IN (
SELECT customer_num FROM orders
WHERE order_date = TODAY);
Correlated subquery:
SELECT * FROM customer c
WHERE exists (
SELECT customer_num FROM orders
WHERE orders.customer_num = c.customer_num
AND order_date = TODAY);
Join:
SELECT customer.* FROM customer, orders
WHERE customer.customer_num = orders.customer_num
AND orders.order_date = TODAY;


Optimizing subqueries
A subquery is a SELECT statement that is contained in the WHERE clause of a
SELECT, INSERT, UPDATE, or DELETE statement. It is executed once and the
results passed back to the outer query.


The EXPLAIN output for the subquery would be as follows:


A special type of subquery is a correlated subquery. A correlated subquery is a


subquery that receives a value from the outer SQL statement. The subquery or inner
SQL statement must be executed once for each value retrieved and passed to it by the
outer SQL statement. Correlated subqueries are generally expensive in optimizer
terms.
The EXPLAIN output for the correlated subquery would be as follows:


Most subqueries can also be written as join statements, which are much more efficient.
The join query shown on the visual retrieves the same data as the two subquery
examples.
The EXPLAIN output for the join query would be as follows:


Multi-index scan
• Access method that allows optimizer to use multiple indexes
• Suppose two indexes are defined on table tab1
CREATE INDEX tab1_ix1 ON tab1(col1);
CREATE INDEX tab1_ix2 ON tab1(col2);
• Execute the following query on tab1
SELECT * FROM tab1
WHERE col1 = 1 and col2 BETWEEN 20 AND 30;
• Server performs index scans on tab1_ix1 and tab1_ix2, combines results, then
accesses only data rows that satisfy both index scans
• Use skip scan to access data rows; this looks like a sequential scan,
but it only retrieves necessary data rows


Multi-index scan
Queries with AND and OR predicates may benefit from a multi-index scan, which can
choose indexes on individual columns.
Suppose two indexes are defined on table tab1. Index tab1_ix1 is defined on column
col1, and index tab1_ix2 on column col2. Execute the following query on table tab1:
SELECT ... FROM tab1 WHERE col1 = 1 AND col2 BETWEEN 20 AND 30;
The server performs index scans on both tab1_ix1 and tab1_ix2, and gets rowid lists
for each index. It then merges these rowid lists based on the Boolean expression
(AND/OR) in the WHERE clause. It fetches data using the rowids from the merged and
sorted rowid list using a skip scan. Skip scan is a table scan method similar to a
sequential scan, but uses the rowid lists.


Skip scan
• Index scan retrieves rowids from the index and fetches data using the
rowid
• Skip scan is implemented on sorted rowid list
• Random I/O can be avoided; page is only read once
• Non-essential pages are skipped
• Significant CPU savings over a full-table scan
• Can be used to retrieve rows from a single index or from multiple
indexes


Skip scan
The skip scan works like this:
• An index scan is performed on the index(es) involved in the query.
• For each index, a list of matching rowids is created.
• The resulting lists are sorted into rowid order.
• The lists are then logically joined together (merged with an OR, intersected with
an AND) to produce one list of rowids for rows that match the query.
• The rows are then accessed from the table in rowid order using the resulting
single list. The rows are fetched in sequential order; but since some intervening
rows may not be needed, a skip scan is performed.
A skip scan implements functionality similar to sequential scan, but uses a sorted rowid
list. Since the values in the rowid list are sorted, random I/O can be avoided and a page
does not need to be read more than once. Non-essential (that is, unneeded) pages are
skipped. This results in significant CPU savings over a full-table scan, because fewer
data rows are retrieved and less costly expression evaluation is performed.
A skip scan can be used to retrieve rows corresponding to any rowid list, either one
constructed from a single index or from multiple indexes.
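The page-skipping behavior can be sketched like this. It is an illustrative Python model, assuming a rowid encodes a (page, slot) pair and that `read_page` stands in for a buffer-pool read; it is not the server's actual implementation:

```python
# Illustrative sketch (not Informix internals): fetch rows for a sorted
# rowid list, reading each data page at most once and skipping pages
# that hold no needed rows.

def skip_scan(sorted_rowids, read_page):
    rows, pages_read = [], 0
    current_page, page_data = None, None
    for page, slot in sorted_rowids:
        if page != current_page:          # new page needed: read it once
            current_page, page_data = page, read_page(page)
            pages_read += 1               # intervening pages are never read
        rows.append(page_data[slot])
    return rows, pages_read

# hypothetical table of 5 pages where pages 2-4 hold no matching rows
table = {1: ["a", "b"], 2: ["c", "d"], 5: ["e", "f"]}
rows, n = skip_scan([(1, 0), (1, 1), (5, 0)], table.__getitem__)
print(rows, n)  # ['a', 'b', 'e'] 2
```

Because the rowids are sorted, page 1 is read once for two rows and pages between 1 and 5 are skipped entirely, which is the source of the I/O and CPU savings described above.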

© Copyright IBM Corp. 2001, 2017 7-44



Multi-index scan in sqexplain


Multi-index scan in sqexplain


A multi-index scan (skip scan) is indicated in a sqexplain.out report as MULTI INDEX
PATH. All indexes used are listed along with the filter that was able to use the index.

© Copyright IBM Corp. 2001, 2017 7-45



Current SQL information


$ onstat -g sql

$ onstat -g sql 338


Current SQL information
The onstat -g sql command includes summary information about the last SQL
statement that each session executed. The fields included in onstat -g sql are:
• Session ID: Session ID of the user who executes the SQL statement.
• Statement type: Statement type, such as SELECT, UPDATE, DELETE, or INSERT.
• Current database: Name of the current database for the session.
• Isolation level: Current isolation level.
• Lock mode: Current lock mode.
• SQL ERR: Last SQL error.
• ISAM ERR: Last ISAM error.
• F.E. Vers: Informix version of the client application.
• Explain: Status of EXPLAIN mode.

© Copyright IBM Corp. 2001, 2017 7-46



To retrieve more detailed information about a specific session, use:


onstat -g sql session_id
You can also retrieve a listing of active SQL sessions by executing the command:
SELECT * FROM sysmaster:syssqlcurses;
or
SELECT * FROM sysmaster:syssqlcurses
WHERE scs_session = session_id;

© Copyright IBM Corp. 2001, 2017 7-47



Exercise 7
The Informix query optimizer
• Use the SET EXPLAIN feature
• Write SQL queries to illustrate various optimizer access and join plans


Exercise 7: The Informix query optimizer

© Copyright IBM Corp. 2001, 2017 7-48



Exercise 7:
The Informix query optimizer

Purpose:
In this exercise, you will learn how to use the SET EXPLAIN feature of the
optimizer to determine query access and join plans.

Task 1. Sequential scan with a temporary table.


In this task, you will turn on the SET EXPLAIN feature and run a query that will create a
temporary file. You will examine the output file for an explanation of the optimizer
choices.
First, make sure the EXPLAIN_STAT parameter in your ONCONFIG file is set to 1.
Determine EXPLAIN_STAT setting.
1. Run the following Informix command from the command line prompt:
onstat -c | grep EXPLAIN_STAT
2. If EXPLAIN_STAT is set to 1, skip the following set of instructions (Enable the
EXPLAIN_STAT feature) and continue to the next set of instructions (Use SET
EXPLAIN to examine the query plan). If EXPLAIN_STAT is set to 0, perform
the following instructions (Enable the EXPLAIN_STAT feature) and then
continue with the next set of instructions (Use SET EXPLAIN to examine the
query plan).

Enable the EXPLAIN_STAT feature.

1. Edit the $ONCONFIG file.


2. Set the EXPLAIN_STAT parameter to 1.
3. Save the $ONCONFIG file.
4. Cycle the engine (bring it offline and back online).
Use SET EXPLAIN to examine the query plan
1. In a dbaccess session, run the following set of SQL statements:
SET EXPLAIN FILE TO 'ex7.expl';
UNLOAD TO /dev/null
SELECT * FROM orders
ORDER BY customer_num;
SET EXPLAIN OFF;
2. Examine the ex7.expl file and notice the temporary file used for the ORDER BY
clause. Also notice the query statistics reported with the Explain plan.

© Copyright IBM Corp. 2001, 2017 7-49



Task 2. Sequential scan with a filter.


In this task, you will run a query that will create a sequential scan using a filter
condition. You will examine the ex7.expl file for an explanation of the optimizer
choices.
1. To demonstrate that sequential scans can be used with a filter condition,
execute the following SQL statement.
SET EXPLAIN FILE TO 'ex7.expl';
UNLOAD TO /dev/null
SELECT * FROM orders
WHERE order_date < '06/01/17';
SET EXPLAIN OFF;
2. Examine the ex7.expl file and notice that a sequential scan was used with the
filter condition.
Task 3. Key-only index scan.
In this task, you will run a query that will create a key-only index scan. You will
examine the ex7.expl file for an explanation of the optimizer choices.
1. To demonstrate that a key-only index scan can be used when taking advantage
of an index, execute the following SQL statement:
SET EXPLAIN FILE TO 'ex7.expl';
UNLOAD TO /dev/null
SELECT order_num FROM orders
WHERE order_num > 1008;
SET EXPLAIN OFF;
2. Examine the ex7.expl file and notice that a key-only index scan was used on
the order_num column.
3. Write a SELECT statement that selects the minimum value in the zipcode
column of the customer table.
4. Examine the ex7.expl file and notice that a key-only index scan was used on
the zipcode column. Note also the Aggregate tag.

© Copyright IBM Corp. 2001, 2017 7-50



Task 4. Index scan with lower and upper filters.


In this task, you will run queries that will create an index scan with upper and lower
filters. You will examine the output from the ex7.expl file.
1. Run the following SQL statement. We will discuss UPDATE STATISTICS in the
next exercise:
UPDATE STATISTICS;
2. To demonstrate an index scan using a lower filter, execute the following SQL
statement:
SET EXPLAIN FILE TO 'ex7.expl';
UNLOAD TO /dev/null
SELECT * FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND o.order_num > 1300;
SET EXPLAIN OFF;
3. Examine the ex7.expl file and notice that the lower index filter was used on
the order_num and customer_num columns. Notice also the fragment
elimination on the index. Only fragment 1 was scanned.
4. Write a SELECT statement that selects customer_num > 2000 from the
customer table and where the zipcode is between "09*" and "6*".
5. Examine the ex7.expl file and notice that the index on the customer table was
used.
What steps did the optimizer take to reduce the number of rows first?
What was the order of index usage and why was this order used?
6. To demonstrate that an index scan is using a lower and upper filter, execute the
following SQL statement:
SET EXPLAIN FILE TO 'ex7.expl';
OUTPUT TO /dev/null
SELECT * FROM catalog
WHERE catalog_num BETWEEN 10010 AND 10040;
SET EXPLAIN OFF;
7. Examine the ex7.expl file and notice that the lower and upper index filter was
used on the catalog_num column.
8. Write a SELECT statement that selects the fname, lname, and company
columns from the customer table where the company name starts with "Golf"
and the customer number is between 110 and 115.
9. Examine the ex7.expl file and notice the lower and upper index filters that were
used on the customer_num column.

© Copyright IBM Corp. 2001, 2017 7-51



Task 5. Joins.
In this task, you will run queries that will perform a join. You will examine the output
from the ex7.expl file. You will use the onstat -g sql utility to monitor your SQL
session.
1. To demonstrate an index scan using nested loop join, execute the following
SQL statement:
SET EXPLAIN FILE TO 'ex7.expl';
UNLOAD TO /dev/null
SELECT * FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND c.lname MATCHES "Ed*";
SET EXPLAIN OFF;
2. Examine the ex7.expl file and notice the nested loop join that was used on the
customer and orders tables. Which table was used as the outer table and which
was used as the inner table?
Task 6. Key-first index scan.
In this task, you will run queries that will use the key first index scan. You will then
examine the ex7.expl file generated by the query.
1. To demonstrate a key first index scan, execute the following SQL statement:
SET EXPLAIN FILE TO 'ex7.expl';
UNLOAD TO /dev/null
SELECT * FROM customer
WHERE (zipcode > "111")
AND ((zipcode = "20744")
OR (zipcode = "30127"));
SET EXPLAIN OFF;
2. Examine the ex7.expl file and notice that the key first index scan was
performed on the zipcode column.
What steps did the optimizer take to reduce the number of rows read?
Results:
In this demonstration, you learned how to use the SET EXPLAIN feature of the
optimizer to determine query access and join plans.

© Copyright IBM Corp. 2001, 2017 7-52



Exercise 7:
The Informix query optimizer - Solutions

Purpose:
In this exercise, you will learn how to use the SET EXPLAIN feature of the
optimizer to determine query access and join plans.

Task 1. Sequential scan with a temporary table.


In this task, you will turn on the SET EXPLAIN feature and run a query that will create a
temporary file. You will examine the output file for an explanation of the optimizer
choices.
The command is run at the command line prompt. Exit out of dbaccess before
beginning.
First, make sure the EXPLAIN_STAT parameter in your ONCONFIG file is set to 1.
Determine EXPLAIN_STAT setting
1. Run the following Informix command from the command line prompt:
onstat -c | grep EXPLAIN_STAT
Results:
EXPLAIN_STAT 1
-or-
EXPLAIN_STAT 0
2. If EXPLAIN_STAT is set to 1, skip the following set of instructions (Enable the
EXPLAIN_STAT feature) and continue to the next set of instructions (Use SET
EXPLAIN to examine the query plan). If EXPLAIN_STAT is set to 0, perform
the following instructions (Enable the EXPLAIN_STAT feature) and then
continue with the next set of instructions (Use SET EXPLAIN to examine the
query plan).
Enable the EXPLAIN_STAT feature
1. Edit the $ONCONFIG file.
vi $INFORMIXDIR/etc/$ONCONFIG
2. Set the EXPLAIN_STAT parameter to 1.
EXPLAIN_STAT 1
3. Save the $ONCONFIG file.
4. Cycle the engine (bring it offline and back online).
onmode -ky
oninit -v

© Copyright IBM Corp. 2001, 2017 7-53



Use SET EXPLAIN to examine the query plan


1. In a dbaccess session, run the following set of SQL statements:
SET EXPLAIN FILE TO 'ex7.expl';
UNLOAD TO /dev/null
SELECT * FROM orders
ORDER BY customer_num;
SET EXPLAIN OFF;
2. Examine the ex7.expl file (at the command prompt, type: ‘more ex7.expl’ (no
quotes)) and notice the temporary file used for the ORDER BY clause. Also
notice the query statistics reported with the Explain plan.

© Copyright IBM Corp. 2001, 2017 7-54



Task 2. Sequential scan with a filter.


In this task, you will run a query that will create a sequential scan using a filter
condition. You will examine the ex7.expl file for an explanation of the optimizer
choices.
1. To demonstrate that sequential scans can be used with a filter condition,
execute the following SQL statement:
SET EXPLAIN FILE TO 'ex7.expl';
UNLOAD TO /dev/null
SELECT * FROM orders
WHERE order_date < '06/01/17';
SET EXPLAIN OFF;
2. Examine the ex7.expl file and notice that a sequential scan was used with the
filter condition.

© Copyright IBM Corp. 2001, 2017 7-55



Task 3. Key-only index scan.


In this task, you will run a query that will create a key-only index scan. You will
examine the ex7.expl file for an explanation of the optimizer choices.
1. To demonstrate that a key-only index scan can be used when taking advantage
of an index, execute the following SQL statement:
SET EXPLAIN FILE TO 'ex7.expl';
UNLOAD TO /dev/null
SELECT order_num FROM orders
WHERE order_num > 1008;
SET EXPLAIN OFF;
2. Examine the ex7.expl file and notice that a key-only index scan was used on
the order_num column.

3. Write a SELECT statement that selects the minimum value in the zipcode
column of the customer table.
SET EXPLAIN FILE TO 'ex7.expl';
SELECT min(zipcode) FROM customer;
SET EXPLAIN OFF;

© Copyright IBM Corp. 2001, 2017 7-56



4. Examine the ex7.expl file and notice that a key-only index scan was used on
the zipcode column.

Task 4. Index scan with lower and upper filters.


In this task, you will run queries that will create an index scan with upper and lower
filters. You will examine the output from the ex7.expl file.
1. Run the following SQL statement. We will discuss UPDATE STATISTICS in the
next exercise:
UPDATE STATISTICS;
2. To demonstrate an index scan using a lower filter, execute the following SQL
statement:
SET EXPLAIN FILE TO 'ex7.expl';
UNLOAD TO /dev/null
SELECT * FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND o.order_num > 1300;
SET EXPLAIN OFF;

© Copyright IBM Corp. 2001, 2017 7-57



3. Examine the ex7.expl file and notice that the lower index filter was used on
the order_num and customer_num columns. Notice also the fragment
elimination on the index. Only fragment 1 was scanned.

© Copyright IBM Corp. 2001, 2017 7-58



4. Write a SELECT statement that selects customer_num > 2000 from the
customer table and where the zipcode is between "09*" and "6*".
SET EXPLAIN FILE TO 'ex7.expl';
UNLOAD TO /dev/null
SELECT * FROM customer
WHERE customer_num > 2000
AND zipcode BETWEEN "09" AND "6";
SET EXPLAIN OFF;
5. Examine the ex7.expl file and notice that an index path on the customer table
was used, with filters applied.

6. To demonstrate that an index scan is using a lower and upper filter, execute the
following SQL statement:
SET EXPLAIN FILE TO 'ex7.expl';
OUTPUT TO /dev/null
SELECT * FROM catalog
WHERE catalog_num BETWEEN 10010 AND 10040;
SET EXPLAIN OFF;

© Copyright IBM Corp. 2001, 2017 7-59



7. Examine the ex7.expl file and notice that the lower and upper index filter was
used on the catalog_num column.

8. Write a SELECT statement that selects the fname, lname, and company
columns from the customer table where the company name starts with "Golf"
and the customer number is between 110 and 115.
SET EXPLAIN FILE TO 'ex7.expl';
UNLOAD TO /dev/null
SELECT fname, lname, company FROM customer
WHERE company MATCHES "Golf*"
AND customer_num BETWEEN 110 AND 115
ORDER BY lname;
SET EXPLAIN OFF;

© Copyright IBM Corp. 2001, 2017 7-60



9. Examine the ex7.expl file and notice the lower and upper index filters that were
used on the customer_num column.

Task 5. Joins.
In this task, you will run queries that will perform a nested loop join. You will
examine the output from the ex7.expl file. You will use the onstat -g sql utility to
monitor your SQL session.
1. To demonstrate an index scan using a nested loop join, execute the following
SQL statement:
SET EXPLAIN FILE TO 'ex7.expl';
UNLOAD TO /dev/null
SELECT * FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND c.lname MATCHES "Ed*";
SET EXPLAIN OFF;

© Copyright IBM Corp. 2001, 2017 7-61



2. Examine the ex7.expl file and notice the nested loop join that was used on the
customer and orders tables. Which table was used as the outer table and which
was used as the inner table?

The customer table was the outer table and the orders table was the
inner table.

© Copyright IBM Corp. 2001, 2017 7-62



Task 6. Key-first index scan.


In this task, you will run queries that will use the key first index scan. You will then
examine the ex7.expl file generated by the query.
1. To demonstrate a key first index scan, execute the following SQL statement:
SET EXPLAIN FILE TO 'ex7.expl';
UNLOAD TO /dev/null
SELECT * FROM customer
WHERE (zipcode > "111")
AND ((zipcode = "20744")
OR (zipcode = "30127"));
SET EXPLAIN OFF;
2. Examine the ex7.expl file and notice that the key first index scan was
performed on the zipcode column.

© Copyright IBM Corp. 2001, 2017 7-63



What steps did the optimizer take to reduce the number of rows read?
The optimizer first applies the filter conditions zipcode > "111" and
zipcode = "20744". It then filters zipcode where the values are > "111"
and equal to "30127". Applying a key-first index filter ensures that the
fewest number of data pages are read.
Results:
In this demonstration, you learned how to use the SET EXPLAIN feature of the
optimizer to determine query access and join plans.

© Copyright IBM Corp. 2001, 2017 7-64



Unit summary
• Understand query plans, access plans, and join plans
• Write queries that produce various index scans


Unit summary

© Copyright IBM Corp. 2001, 2017 7-65



© Copyright IBM Corp. 2001, 2017 7-66



Updating statistics and data distributions

Informix (v12.10)

© Copyright IBM Corporation 2017


Unit 8 Updating statistics and data distributions

© Copyright IBM Corp. 2001, 2017 8-2



Unit objectives
• Execute the UPDATE STATISTICS statement and explain the results
• Use the system catalog tables to monitor data distributions

Updating statistics and data distributions © Copyright IBM Corporation 2017

Unit objectives

© Copyright IBM Corp. 2001, 2017 8-3



What does UPDATE STATISTICS do?

systables sysindices

sysdistrib

syscolumns


What does UPDATE STATISTICS do?


When you update the statistics for an Informix database, table, column, or stored
procedure, you populate the system catalog tables with all the information the query
optimizer needs to determine lowest-cost access paths for retrieving data.
Without the information that the UPDATE STATISTICS statement provides, the
optimizer cannot make accurate decisions.

© Copyright IBM Corp. 2001, 2017 8-4



UPDATE STATISTICS modes


• LOW: Collects table, index, and column statistics but does not build
distributions
• MEDIUM: Collects table, index, and column statistics and builds
distribution bins representing an 85 - 99 percent accurate sampling of
values
• HIGH: Collects table, index, and column statistics and builds data
distribution bins that represent exact data values


UPDATE STATISTICS modes


The UPDATE STATISTICS statement offers three modes for gathering system
information. The default is LOW mode. HIGH and MEDIUM modes collect the same
statistics that LOW mode collects and, in addition, build data distributions that detail the
various data values stored in each column.
To view the distributions, use the following command:
dbschema -d database -hd tablename

© Copyright IBM Corp. 2001, 2017 8-5



UPDATE STATISTICS statement


• Update statistics for the entire database.
UPDATE STATISTICS [LOW|MEDIUM|HIGH];

• Update statistics for a specific table and its indexes.


UPDATE STATISTICS [LOW|MEDIUM|HIGH]
FOR TABLE [tabname];

• Update statistics for a specific column.


UPDATE STATISTICS [LOW|MEDIUM|HIGH]
FOR TABLE tabname (colname);


UPDATE STATISTICS statement


The system catalog columns that the UPDATE STATISTICS statement updates are
generally not automatically updated. Because of the performance effect (memory, CPU,
and I/O utilization) required to collect statistics, Informix allows the DBA to choose how
and when information about the database data is collected.
Distributions
The HIGH and MEDIUM modes allow UPDATE STATISTICS to create data distribution
information about the range of values stored in each column.
The optimizer uses this information to make more informed decisions about:
• Selectivity of filter columns
• Access method for filter columns and tables
• Best join technique
By using distributions, you can significantly improve the execution time of your queries.

© Copyright IBM Corp. 2001, 2017 8-6



Statistics available with LOW mode


• systables:
 nrows: Number of rows in the table
 npused: Number of pages on disk used for table
• sysindices:
 leaves: Number of pages at level 0 (the leaf level) of the B+ tree
 levels: Number of B+ tree levels
 nunique: Number of unique key values
 clust: Degree of clustering
• syscolumns:
 colmin: Second minimum value of column
 colmax: Second maximum value of column


Statistics available with LOW mode


When you execute an UPDATE STATISTICS statement by using LOW (default) mode,
the systables, syscolumns, and sysindices tables are populated.
The systables columns are used to estimate the cost of a physical read of the table.
The sysindices columns are used to estimate the cost of performing an indexed read of
a table. The nunique column is used to estimate the selectivity of equality filters. The
nunique column only applies to the first key of a composite index, so the nunique value
for an index created on (a, b, c) only reflects the number of unique values of column a.
The value in the clust column specifies the extent to which the rows in the table are in
the same order as the index. Smaller numbers correspond to greater clustering.
The syscolumns columns are used to estimate the selectivity of inequality (greater
than/less than) filters. If data distributions are not available, the optimizer assumes that
data is evenly distributed between colmin and colmax.
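The even-distribution assumption implies a simple selectivity estimate for inequality filters: the fraction of the (colmin, colmax) range that the filter covers. The Python sketch below mirrors that idea as an approximation of the optimizer's behavior, not its exact code; the function name is hypothetical:

```python
# Sketch of the uniform-distribution assumption used when no data
# distributions exist: selectivity of an inequality filter is the
# fraction of the colmin..colmax range that the filter admits.

def inequality_selectivity(colmin, colmax, value, op):
    """Estimated fraction of rows passing 'col op value', assuming
    values are evenly spread between colmin and colmax."""
    span = colmax - colmin
    if op == ">":
        frac = (colmax - value) / span
    elif op == "<":
        frac = (value - colmin) / span
    else:
        raise ValueError("op must be '>' or '<'")
    return min(max(frac, 0.0), 1.0)   # clamp to the valid range

# colmin=1000, colmax=2000: 'col > 1300' is estimated to pass 70% of rows
print(inequality_selectivity(1000, 2000, 1300, ">"))  # 0.7
```

When the real data is skewed, this estimate can be badly wrong, which is exactly why MEDIUM and HIGH distributions improve the optimizer's decisions.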

© Copyright IBM Corp. 2001, 2017 8-7



UPDATE STATISTICS LOW information


• UPDATE STATISTICS LOW information is stored in systables.ustlowts
• Indicates timestamp when table, row, and page-count statistics were
last recorded

UPDATE STATISTICS LOW FOR TABLE tab1;


SELECT ustlowts FROM systables
WHERE tabname = "tab1";
ustlowts 2017-06-28 16:31:40.00000


UPDATE STATISTICS LOW information


The timestamp of the last time that UPDATE STATISTICS LOW was run is stored in
the ustlowts column of the systables table and can be accessed via SQL.

© Copyright IBM Corp. 2001, 2017 8-8



MEDIUM and HIGH modes


• The MEDIUM and HIGH mode options cause update statistics to
compile data distribution information for the columns in the database or
table specified.
• MEDIUM:
 All rows are read
 Distributions built on a sample of rows
• HIGH:
 All rows are read
 Distribution data accounts for every value


MEDIUM and HIGH modes


The UPDATE STATISTICS MEDIUM and UPDATE STATISTICS HIGH statements
cause the database server to read every value that the column contains. The MEDIUM
statement sorts a sample set of the column values and populates distribution data in the
sysdistrib table that represents this sample. The HIGH statement sorts all the column
values and populates the sysdistrib table with exact (100%) distribution information
about all values in the table at the time of execution.
For large tables, HIGH mode uses more resources and takes more time during
UPDATE STATISTICS than the sampling method of MEDIUM mode. However,
MEDIUM mode can be less accurate than HIGH mode.
Distributions are not created on TEXT or BYTE columns.

© Copyright IBM Corp. 2001, 2017 8-9



UPDATE STATISTICS MEDIUM


• User-configured sampling size:
 UPDATE STATISTICS MEDIUM SAMPLING SIZE <number>
− Number <= 1.0 interpreted as percent of rows to be sampled
− Number > 1.0 interpreted as number of rows to be sampled
 Sampling size must be greater than the sampling size calculated from the
default resolution (2.5) and confidence (0.95)
• SAMPLING SIZE value stored in sysdistrib.smplsize
• Actual number of rows sampled stored in sysdistrib.rowssmpld
• Keyword SAMPLING SIZE
 Alternate way of indicating sample size
 Table name SAMPLING cannot be used in UPDATE STATISTICS
statement


UPDATE STATISTICS MEDIUM


UPDATE STATISTICS MEDIUM has been enhanced to allow the user to specify a
sampling size.
One way that the size of the sample that is taken for UPDATE STATISTICS MEDIUM
is calculated is based on the resolution and confidence, and does not consider the size
(population) of the table. By increasing the confidence or decreasing the resolution
value, the sample size increases, but for very large tables the sample size might be too
small to optimally represent the true data distribution.
The size of the sample can also be included in the UPDATE STATISTICS MEDIUM
statement using the SAMPLING SIZE parameter. The number provided might
represent either a percentage of the rows or an absolute number of rows.
• A value of 1 or less is interpreted as a percentage of the rows to be sampled.
• A value of greater than 1 is interpreted as an absolute value for the number of
rows to sample.
The number of rows specified must be equal to or greater than the sampling size
calculated from the default resolution and confidence values. If the number of rows
specified is fewer than the default calculated value, the default value is used for the
sampling size.
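The rules above can be summarized in a short sketch. This is a model of the rules as described, not the server's exact computation, and the function and parameter names are hypothetical:

```python
# Sketch of how a SAMPLING SIZE value is resolved:
#   value <= 1.0 -> fraction of the table's rows
#   value >  1.0 -> absolute number of rows
# and the result is never below the size computed from the default
# resolution/confidence (passed here as default_min).

def effective_sample_size(sampling_size, nrows, default_min):
    requested = nrows * sampling_size if sampling_size <= 1.0 else sampling_size
    return max(int(requested), default_min)

# SAMPLING SIZE 1000 where the default calculation requires 1832 rows:
print(effective_sample_size(1000, 100000, 1832))  # 1832
# SAMPLING SIZE 0.5 on a 100,000-row table:
print(effective_sample_size(0.5, 100000, 1832))   # 50000
```

The first call matches the behavior shown later in this unit, where a requested SAMPLING SIZE of 1000 results in 1832 rows actually being sampled.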

© Copyright IBM Corp. 2001, 2017 8-10



The sysdistrib table has columns for storing distribution and sampling information.
If SAMPLING SIZE is specified, either as a percentage or a number of rows, the
value is stored in the smplsize column.
The actual number of rows sampled is stored in the rowssmpld column.
Because SAMPLING is now a keyword, a table named sampling cannot be used in an
UPDATE STATISTICS statement.

© Copyright IBM Corp. 2001, 2017 8-11



SAMPLING SIZE examples


• UPDATE STATISTICS MEDIUM
FOR TABLE tab1 (col1)
SAMPLING SIZE 0.5;
 Creates sample size of 50% of rows
 sysdistrib.smplsize = 0.5
• UPDATE STATISTICS MEDIUM
FOR TABLE tab1 (col1)
SAMPLING SIZE 30;
 Creates sample size of 30 rows
 sysdistrib.smplsize = 30


SAMPLING SIZE examples


The examples illustrate the syntax for creating distributions using the SAMPLING SIZE
parameter.
In the first example, SAMPLING SIZE is defined as a percentage of the table (in this
case 50%), and in the second example as an absolute number of rows.
The visual also indicates the values stored in the smplsize column of the sysdistrib table
for each of these statements.

© Copyright IBM Corp. 2001, 2017 8-12



UPDATE STATISTICS HIGH/MEDIUM information


• UPDATE STATISTICS HIGH/MEDIUM info stored in sysdistrib table:
 constr_time: Timestamp stored
 smplsize: User-specified sample size
 rowssmpld: Number of rows sampled

UPDATE STATISTICS MEDIUM FOR TABLE tab1 SAMPLING SIZE 1000;


SELECT constr_time, smplsize, rowssmpld FROM sysdistrib
WHERE colno = 1
AND tabid = (SELECT tabid FROM systables
WHERE tabname='tab1');

constr_time smplsize rowssmpld


2017-06-28 17:48:16.00000 1000.0000000000 1832.0000000000


UPDATE STATISTICS HIGH/MEDIUM information


In addition to the values stored in the smplsize and rowssmpld columns, the timestamp
(DATETIME YEAR TO FRACTION(5)) of when the UPDATE STATISTICS MEDIUM
command was run is stored in the column constr_time.
The result of a select on these three columns is shown in the example.
Note that the number of rows sampled is greater than the number of rows specified in
the SAMPLING SIZE parameter. This is because the number of rows specified is fewer
than the default number of rows calculated from the resolution and confidence values.


How distributions are created

1. Read rows from table.
2. Sort rows.
3. Divide rows into bins.

[Figure: sample column A data (147, 86, 123, 32, 90, and repeated 20s) is sorted and
divided into Bin 1: 1-50, Bin 2: 51-90, and Bin 3: 91-150, plus an overflow bin]


How distributions are created


To build distribution data for a column, Informix:
1. Reads the column value for every row in the table.
The server reads only index pages and eliminates data page reads when the
following conditions are met:
• The column is the first column in an existing index.
• The UPDATE STATISTICS statement is run for a single column.
Otherwise, Informix reads the entire data page to extract the column value for
each row.
2. Sends columns values to the sort routine. For MEDIUM mode, only a sample of
the column values is sent to the sort routine. The resolution and confidence (or
the value for SAMPLING SIZE) specified by the statement determines the size of
the sample set. For HIGH mode, all values are sent to the sort routine.
3. Uses the sort routine to sort the values. A sort pool is allocated in the database
server virtual shared-memory segment to hold the values being sorted. For large
tables, temporary sort space can also be allocated on disk to hold intermediate
sort runs.
If the columns are read in order using an index, a sort is not necessary.


4. Scans each sorted value and retrieves the first value, last value, and every Nth
value where N is:
resolution / 100 * number_of_values.
This information is used to divide the data into bins with each bin containing an
equal number of values. If 10 bins are created, each bin holds one tenth of the
rows in the set. If the database server finds a particular value that has many
duplicates, that value is placed in an overflow bin.
The first and last values are always obtained from the true data, not from the
sample.


What information is kept?


• The sysdistrib system catalog table stores:
 The maximum value
 The number of distinct values
 Timestamp of when the distributions were created
 User-specified sample size
 Actual number of rows sampled
• For the distribution, the following information is kept:
 The number of rows that each bin represents
 The minimum and maximum value for the column
 The last bin size


What information is kept?


To prevent skewing the number of distinct values in the bins, statistics for any highly
duplicate values are kept separately. A highly duplicate value is defined as a value
that occurs in more instances (rows) than 25 percent of the number of rows in a bin.
The data kept for these values include:
• The column value
• The number of rows that contain the column value


The sysdistrib System Catalog table

Column Name   Type       Description

tabid         integer    Table ID found in systables
colno         smallint   Column number found in syscolumns
sequence      integer    Sequence number for multiple entries
constructed   date       Date the distribution was created
mode          char(1)    L=Low, M=Medium, H=High
resolution    float      Resolution used to create distribution
confidence    float      Confidence used to create distribution
encdat        char(256)  Encoded histogram information


The sysdistrib System Catalog table


The columns of the sysdistrib table that store distribution information are shown in the
visual. There can be several rows for a single column. Each row represents a range of
values and contains such information as the number of distinct values within the range
and the number of rows that each range represents. The sequence column tracks
sysdistrib entries for a single column.
You can query this table directly, but the distribution information is stored in an encoded
format. To make it simple to extract and interpret the distribution information, use the
dbschema utility.
dbschema -d database -hd tablename


dbschema -hd display

• dbschema output

dbschema -hd account -d bank

DBSCHEMA Schema Utility INFORMIX-SQL Version 12.10.FC8DE

Distribution for informix.account.branch_nbr

Constructed on 2017-06-28 17:03:52.00000


Medium Mode, 0.7000 Sampling Size, 10.0000 Resolution, 0.8000 confidence


dbschema -hd display


Run the dbschema -hd command to display the distribution information stored in the
sysdistrib table for the columns in the specified tables.
For each table, the mode used to gather the distribution information, the sampling size,
resolution, and confidence settings are displayed. This is followed by the distribution
information held for the columns in the table, as shown in the example on the next
page.


Distribution output dbschema -hd account –d bank


--- DISTRIBUTION ---

    (        0)
 1: (     6005,       16,       16)
 2: (     6005,        9,       26)
 3: (     6005,        4,       31)
 4: (     6005,        4,       36)
 5: (     6005,        4,       41)
 6: (     6005,        4,       46)
 7: (     6005,        4,       51)
 8: (     6005,        4,       56)
 9: (     6005,       13,       69)
10: (     5957,       29,       99)

--- OVERFLOW ---

 1: (     1946,       80)

Each distribution line shows (count, distinct, high value) for a bin of the branch_nbr
column; each overflow line shows (count, value).


Distribution output
The -hd option of dbschema displays the information kept for each bin, as well as the
overflow values and their frequency. The -hd option requires a table name or ALL for all
tables.
Only the owner of the table, users that have SELECT permission on the column, or the
DBA can list distributions for a column with dbschema.
The sample dbschema output shown has two sections: the distribution and the overflow
section.
The distribution section shows the values in each bin.
In the example, bin 1 represents 6005 instances of values between 0 and 16. Within
this interval are 16 unique values.
The overflow section shows each value that has many duplicates. For example, value
80 has 1946 duplicates.


Resolution

Resolution: The percentage of the data that is put in each bin.


Determines the number of bins that are created (excluding overflow).

UPDATE STATISTICS HIGH FOR TABLE tabname
RESOLUTION 10;


Resolution
You can use resolution to specify the number of bins to use.
The formula for calculating the number of bins is:
100/resolution = number of bins
A resolution of 1 means that one percent of the data goes into each bin (100/1 = 100
bins). A resolution of 10 means that 10 percent of the data goes into each bin
(100/10 = 10 bins). The resolution can be a number between 0.005 and 10. However,
you cannot specify a number less than 1/(rows in the table).
The lower the resolution value, the more bins are created. The more bins you have, the
more accurately the optimizer can estimate the number of rows that satisfy the SELECT
filter. However, if too many bins are allocated, the optimization time can increase
slightly because the system catalog pages that hold the distribution must be read (from
memory if they are in the cache or from disk if they are not).
In actuality, the number of bins allocated in a data distribution can vary slightly from the
results of the formula in the example due to highly duplicate values in a column and the
degree to which column values are clustered.
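
For example, a lower resolution value such as 0.5 yields about 200 bins (100/0.5 = 200). The table and column names here are illustrative:

```sql
-- 100 / 0.5 = about 200 bins (excluding any overflow bins)
UPDATE STATISTICS HIGH FOR TABLE tab1 (col1) RESOLUTION 0.5;
```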


The following are statistics for a bin for column x:


count: 1000
distinct: 100
high value: 10000
Suppose this is the first bin and you know the bin contains a count for the number of
rows between 1 and 10,000. There are 1000 rows in this range but only 100 rows have
unique values. The optimizer assumes that the duplicates are evenly spread among the
distinct values. This means that each column value in this bin has 1000/100 = 10 rows.
A SELECT statement with an equality filter, such as:
SELECT * FROM tab WHERE x = 250
would return 10 rows, according to the estimate of the optimizer.
Now suppose you decrease the resolution value so that there are more bins and each
bin represents less data. As an example, suppose the first bin contained the following
statistics:
count: 300
distinct: 60
high value: 5000
Now the optimizer can estimate that the SELECT statement will return (300/60) = 5
rows.
Default resolution
The default resolution for HIGH mode is 0.5. The default resolution for MEDIUM mode
is 2.5.
Are more bins better?
Even though the optimizer can estimate the number of rows to be returned more
accurately, increasing the number of bins might not obtain a better or different path.
This means that the SELECT statement cannot run any faster with a better estimate.
The optimizer depends entirely on the distribution and the SELECT statement that is
retrieving the data.


Confidence

The resolution and confidence determine the sample size.

UPDATE STATISTICS MEDIUM
FOR TABLE tabname RESOLUTION 1 .99

In this statement, 1 is the resolution and .99 is the confidence.

Confidence is a statistical measure of the reliability of the sample (if UPDATE
STATISTICS MEDIUM is used).


Confidence
Confidence is an estimate of the probability that you will stay within the resolution you
choose.
In the example, with a confidence value of 99 percent (confidence .99), you can be
highly confident that the results (that is, the number of rows per bin) of a sample taken
to create the distribution are roughly equivalent to what you would get if all the rows
were examined.
Default confidence
The confidence is expressed as a value between 0.80 and 0.99. The default value is
0.95. Confidence is used only when sampling data for a medium distribution (UPDATE
STATISTICS MEDIUM). The resolution and confidence are used to determine the
sample size for medium distributions.
Default sample size
By default, the size of the sample that is taken for UPDATE STATISTICS MEDIUM
depends on the resolution and confidence. By increasing the confidence or decreasing
the resolution value, the sample size increases.
The sample size does not depend on the size (population) of the table, so for larger
tables it might not be truly representative of the data.
For larger tables, the sample size can be specified using the SAMPLING SIZE
parameter.
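
For example, on a large table you might combine an explicit sample size with the resolution and confidence settings; the table name, column name, and values below are illustrative:

```sql
-- Sample at least 100,000 rows when building the distribution;
-- the server uses the larger of this value and the sample size
-- calculated from the resolution and confidence.
UPDATE STATISTICS MEDIUM FOR TABLE big_tab (col1)
    SAMPLING SIZE 100000 RESOLUTION 1.0 0.99;
```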


Updating distributions only

• Use the DISTRIBUTIONS ONLY clause to compile new data distributions without
collecting systables and sysindices information.

UPDATE STATISTICS MEDIUM FOR TABLE customer
DISTRIBUTIONS ONLY;


Updating distributions only


In some situations, it is beneficial to create new distribution data without collecting table
and index information. You can accomplish this by using the DISTRIBUTIONS ONLY
clause of the UPDATE STATISTICS statement.


Create index/build distribution process


• Distributions created automatically when index created:
 Either implicit or explicit index
 UPDATE STATISTICS LOW created for all indexes
 UPDATE STATISTICS HIGH created for lead column of most indexes
• Sample taken from existing data
• Uses sorted data produced by create index
• Data passed to mini-bin process
• Feature enabled by default
 No ONCONFIG parameter to switch feature on/off


Create index/build distribution process


Statistics are usually generated automatically whenever an index is created.
An UPDATE STATISTICS LOW is generated for all indexes. UPDATE STATISTICS
LOW updates the system catalog tables systables, sysindexes, and syscolumns.
An UPDATE STATISTICS HIGH is also generated for the lead column on most
indexes, the exception being indexes having a lead column with an opaque data type.
The resolution is 1.0 if the table has fewer than a million rows, and 0.5 for larger table
sizes.
This feature of running UPDATE STATISTICS LOW and creating distributions during
the create index process is automatically enabled, and cannot be turned off.


Create index/build distribution architecture

[Figure: scan threads feed sort threads; each sort thread passes values to a mini-bin
builder; the mini-bins flow through a queue to a mini-bin sorter and then a mini-bin
merger, while BT appender threads and a BT merger build the index B-tree]

Create index/build distribution architecture


The chart is a graphical depiction of the create index and create distribution process.
The mini-bin process has been inserted into the index build process to create
distributions and statistics at the time the index is created.
While the index is being created, the data values are passed to sort threads, which
handle the basic index build process.
Each sort process creates miniature distribution bins (“mini-bins”) from the values
scanned.
These mini-bins are then shipped via a queue to a mini-bin collector thread, which sorts
the data values received from the mini-bins.
Then a mini-bin merger thread merges the sorted data into a final distribution bin,
creating the actual data distribution information in the sysdistrib table.
The performance impact of this extra step is minimal, and this approach takes
significantly less time than running UPDATE STATISTICS manually as a stand-alone
process.


Automatic UPDATE STATISTICS exceptions


• This feature is disabled when:
 Environment variable NOSORTINDEX set:
− Undocumented variable
− Forces top-down index build
 UPDATE STATISTICS MEDIUM and:
− Lead column of index is UDT (built-in or non-built-in)
− Lead column of index is opaque data type
− Index is of type Functional
− Index is of type VII (virtual-index interface)
− Number of rows in table < 2
− Index is attached index


Automatic UPDATE STATISTICS exceptions


There are some situations where the automatic distribution feature is disabled.
One is when the undocumented environment variable NOSORTINDEX is set. When
this variable is set, the index is forced into a top-down build process instead of building
the index on pre-sorted key values.
When creating an index, there are a few conditions where the UPDATE STATISTICS
MEDIUM distributions are not created:
• The lead column of the index is a user-defined type (UDT)
• The lead column of the index is an opaque data type (including BOOLEAN and
LVARCHAR)
• The index is a functional index
• The index is a VII (virtual-index interface) index
• There are fewer than two rows in the table
• The index is an attached index


How to update statistics (1 of 2)

It is recommended that you use these guidelines for updating table statistics:
1. For each table, execute:
UPDATE STATISTICS MEDIUM FOR TABLE
table_name DISTRIBUTIONS ONLY;
 If the table is large, you might want to specify a resolution of 1.0 and a
confidence of .99 or use SAMPLING SIZE
2. For the first column in each index, run
UPDATE STATISTICS HIGH FOR TABLE
table_name ( column );
 Execute one statement for each column


How to update statistics


To ensure the best statistics for optimum query performance and to incur the least
overhead when executing the UPDATE STATISTICS command, follow these
guidelines:
1. Run UPDATE STATISTICS MEDIUM... DISTRIBUTIONS ONLY for each table.
2. Run UPDATE STATISTICS HIGH for the first column in each index.
Do not forget that primary and foreign key constraints are implemented with
indexes. Execute a separate UPDATE STATISTICS statement for each column.
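
Applied to a concrete table, the first two steps might look like this (the table and column names are illustrative):

```sql
-- Step 1: medium distributions for all columns of the table
UPDATE STATISTICS MEDIUM FOR TABLE customer DISTRIBUTIONS ONLY;
-- Step 2: high distributions for the lead column of each index
UPDATE STATISTICS HIGH FOR TABLE customer (customer_num);
```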


How to update statistics (2 of 2)

3. If two multicolumn indexes begin with the same subset of columns, run UPDATE
STATISTICS HIGH for the first column that differs.
For example, given the index definitions:
CREATE INDEX ix1 ON tab1 (a, b, c, d);
CREATE INDEX ix2 ON tab1 (a, b, e, f);
Execute these statements:
UPDATE STATISTICS HIGH FOR TABLE tab1 (c);
UPDATE STATISTICS HIGH FOR TABLE tab1 (e);
4. For each multicolumn index, execute UPDATE STATISTICS LOW
statement for all of its columns.


The next step is required to distinguish multicolumn indexes.


3. If two multicolumn indexes begin with the same subset of columns, run UPDATE
STATISTICS HIGH for the first column that differs.
As in Step 2, issue one UPDATE STATISTICS statement for each column.
4. For each multicolumn index, execute UPDATE STATISTICS LOW for all of its
columns. Include all columns in the multicolumn index in a single update statistics
low statement to update sysindexes for each index, as the following examples
show for the two indexes in Step 3:
• UPDATE STATISTICS FOR TABLE tab1(a,b,c,d);
• UPDATE STATISTICS FOR TABLE tab1(a,b,e,f);
Remember, the order of these statements is important. If you run an UPDATE
STATISTICS MEDIUM statement after an UPDATE STATISTICS HIGH
statement, the medium distributions will overwrite the high distributions.


Updating statistics on small tables


• For small tables, it is acceptable to execute a HIGH statement for the
table:

UPDATE STATISTICS HIGH FOR TABLE small_table;


Updating statistics on small tables


The steps provided for performing UPDATE STATISTICS are designed to achieve
the fastest performance by using computer resources as efficiently as possible.
Because the overhead to update small tables is much less, performance and
resource use is less of a concern. For relatively small tables, you might find it
acceptable to execute a single UPDATE STATISTICS HIGH statement.


UPDATE STATISTICS on temporary tables


• Users are not required to run UPDATE STATISTICS LOW on temp
tables:
 Number of rows and pages is updated every time the temp table's data
dictionary entry is accessed
 Adding indexes to temp tables will automatically create distributions and
statistics for the temp table
• Temporary table statistics and distribution information retained


UPDATE STATISTICS on temporary tables


UPDATE STATISTICS is run automatically on temporary tables. The information is
automatically updated every time the temp table data dictionary is accessed.
Also, as in the case of regular tables, statistics and distributions are created
automatically whenever an index is created on the temporary table.
In past releases, any time the temporary table was altered, the statistics which
might have been created were lost. Now the statistics and distribution information is
retained, allowing for significant improvement in performance.
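
For example, creating an index on a temporary table triggers the same automatic statistics collection; the table, column, and index names here are illustrative:

```sql
SELECT customer_num, order_date FROM orders
    INTO TEMP t_orders;
-- Creating the index automatically builds statistics and
-- distributions for the temporary table.
CREATE INDEX t_ix ON t_orders (customer_num);
```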


When to update statistics


• Always be sure that you update catalog statistics after:
 Loading data into a table
 Running update operations that significantly change the distribution of
values that a column contains
 Running delete or insert statements that change the number of rows that a
table contains
 Dropping or altering an index on a table


When to update statistics


Good database administration practices require routinely updating catalog statistics.


Problem queries (1 of 2)
• To resolve a problem query:
 Run the query with SET EXPLAIN ON to record the query plan.
 Run UPDATE STATISTICS HIGH for the columns listed in the WHERE
clause.
 Run the query with SET EXPLAIN ON. Was the estimated cost less?
 Compare the query plans.
 Compare runtime statistics stored in syssqlcurses.



Problem queries
In most cases, the recommended strategy should yield a good enough sample size for
the optimizer to pick the correct path for most queries.
However, if a query is a problem (one that you perceive to be running slower than it
should), then take the following steps:
• Run the query with SET EXPLAIN ON to record the query plan.
• Run UPDATE STATISTICS HIGH for the columns listed in the WHERE clause.
• Run the query with SET EXPLAIN ON. Was the estimated cost less?
• Compare the query plans. If UPDATE STATISTICS HIGH produced a different
query plan and the estimated cost is less, the optimizer made a better choice and
the SELECT statement benefited from having more data available to the
optimizer.
• In addition to comparing before and after query plans, you might want to compare
the corresponding runtime statistics. To do so, set the SQLSTATS environment
variable to 2, run the query, and then use your session ID to query the
syssqlcurses table in the sysmaster database.


Problem queries (2 of 2)
• If UPDATE STATISTICS HIGH produced improved query
performance:
 Run UPDATE STATISTICS MEDIUM with CONFIDENCE of .99 and an
increased RESOLUTION.
 Rerun the query with SET EXPLAIN ON.
 Check the query plan to see if it produced the same results as with
UPDATE STATISTICS HIGH.



You might have received better results with UPDATE STATISTICS HIGH. However,
it might not be feasible for you to take the extra time each day to run HIGH mode on
these columns. Instead, you can move back to UPDATE STATISTICS MEDIUM for
the columns involved in the query, but this time set the confidence to 0.99 and
adjust the resolution value slightly lower so that the sample size is higher. Then
rerun the query and check the query plan to see if it returned the same results as
HIGH mode. You can repeat this process until the query plan matches the query
plan of HIGH mode.


Dropping distributions
• To drop distributions while you update other statistics:

UPDATE STATISTICS LOW DROP DISTRIBUTIONS;

UPDATE STATISTICS LOW FOR TABLE customer
DROP DISTRIBUTIONS;

UPDATE STATISTICS LOW FOR TABLE orders(order_num)
DROP DISTRIBUTIONS;


Dropping distributions
You can use the DROP DISTRIBUTIONS clause in an UPDATE STATISTICS LOW
statement to drop the existing distributions while you update other statistics such as
the number of levels of the B+ tree, the number of pages that the index uses, and so
on.
When you run UPDATE STATISTICS LOW without the DROP DISTRIBUTIONS
clause, only the statistics in systables, sysindexes, and syscolumns are updated.
The distributions are not dropped or altered in any way.
When you run UPDATE STATISTICS LOW on a table or specific column with the
DROP DISTRIBUTIONS clause, the statistics in systables, sysindexes, and
syscolumns for that table or specific column are updated and any distributions for
the table or specific column listed are dropped.
Who can drop distributions?
Only a DBA-privileged user or the owner of a table can drop distribution information.


When table changes affect distributions


• If the column data type or size is altered, the distribution is dropped
• If the column is dropped, the distribution is also dropped



When table changes affect distributions


If the ALTER TABLE statement alters the column data type or size, the distribution is
dropped and must be recreated with UPDATE STATISTICS. If the ALTER TABLE
drops the column, the distribution is also dropped.


Space utilization

• DBUPSPACE
 Limits the amount of disk space used for sorts during UPDATE STATISTICS
− Minimum used is 5 MB
 Also limits amount of memory used for sorts to 4 MB

export DBUPSPACE=max_disk_space:max_memory

where max_disk_space is the maximum amount of disk space that can be used to sort
values, and max_memory is the maximum amount of memory to use without using PDQ.


Space utilization
The UPDATE STATISTICS statement attempts to construct distributions
simultaneously for as many columns as possible. This minimizes the number of scans
needed for a table, and makes UPDATE STATISTICS run more efficiently. However,
with more distributions being created at once, the need for temporary disk space
increases.
Environment variable DBUPSPACE
You can set the DBUPSPACE environment variable before you run UPDATE
STATISTICS to constrain the amount of temporary disk space used for sorts. The
database server calculates how much disk space is needed for each sort and starts as
many distributions at once as can fit in the space allocated. At least one distribution is
created at one time, even if DBUPSPACE is set too low to accommodate it. If
DBUPSPACE is set to any value less than 1000 kilobytes, it is ignored and the value of
5000 kilobytes is used.
Memory utilization
In addition to limiting the amount of disk space used for sorts during UPDATE
STATISTICS, the database server limits the amount of memory used to 4 megabytes.
However, at least one distribution is created at one time, even if more than 4
megabytes are needed.
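
For example, to allow up to roughly 1 GB of temporary sort disk space and 50 MB of sort memory before running UPDATE STATISTICS, following the max_disk_space:max_memory syntax shown above (the values, and the assumption that disk space is given in kilobytes and memory in megabytes, are illustrative; check your server's documentation for the exact units):

```shell
# Limit UPDATE STATISTICS sorts to ~1 GB of disk and 50 MB of memory
export DBUPSPACE=1000000:50
```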


Thread use
A sort can occur for every column for which you are building a distribution. If
PSORT_NPROCS is set, each sort can use up to PSORT_NPROCS threads.


Update statistics tools


• Programs available at www.iiug.org
• Shell script variations:
− output to "updstats.sql" without headings
SELECT UNIQUE "UPDATE STATISTICS HIGH FOR TABLE "
|| trim(tabname) || " ( " || trim(colname) ||
" )", "distributions only;"
FROM systables, sysindexes, syscolumns
WHERE systables.tabid = sysindexes.tabid
AND systables.tabid = syscolumns.tabid
AND sysindexes.part1 = syscolumns.colno;


Update statistics tools


Obviously, implementing the recommendations for updating statistics when you
have several hundred or even thousands of tables in your database is not a simple
matter. To assist you, a number of tools are available on the Informix International
Users Group website. Alternatively, you can develop your own script by using
variations of the SQL statement in the example to collect the appropriate column
names and build the HIGH, MEDIUM, or LOW statements you need.
The script shown does not work against the sysindices table. It only works against
the sysindexes view.
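
To produce the updstats.sql file named in the slide's comment, the statement can be wrapped with the dbaccess OUTPUT clause; a sketch, in which the output file name is taken from the slide and everything else matches the original query:

```sql
OUTPUT TO "updstats.sql" WITHOUT HEADINGS
SELECT UNIQUE "UPDATE STATISTICS HIGH FOR TABLE "
    || trim(tabname) || " ( " || trim(colname) ||
    " )", "distributions only;"
FROM systables, sysindexes, syscolumns
WHERE systables.tabid = sysindexes.tabid
AND systables.tabid = syscolumns.tabid
AND sysindexes.part1 = syscolumns.colno;
```

The generated file can then be executed with dbaccess against your database.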


Fragment-level statistics
• Store statistics at the fragment level and aggregate table-level
statistics from the statistics of the constituent fragments
• Catalog table sysfragdist stores statistics for each fragment for each
table and column
• Fragment-level statistics of constituent fragments are merged to form
table-level statistics and stored in sysdistrib
• Fragment-level statistics are encrypted and stored in an sbspace
defined by SYSSBSPACENAME in the ONCONFIG file. Column
encdist in the sysfragdist catalog stores the large object specifications
• Controlled by table property STATLEVEL


Fragment-level statistics
When tables are stored across multiple partitions, statistics for the table can be
maintained at the fragment level. Fragment-level statistics are stored in the
sysfragdist catalog table.
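To verify that fragment-level distributions have been stored for a table, you can
query sysfragdist directly. The join on tabid below is an assumption, made by
analogy with the sysdistrib queries used elsewhere in this unit; check the system
catalog documentation for the exact sysfragdist column names:

SELECT f.*
FROM sysfragdist f, systables t
WHERE t.tabname = 'customer'
AND t.tabid = f.tabid;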


STATLEVEL property
• STATLEVEL defines the granularity of statistics for a table
• Set using CREATE or ALTER TABLE
• Syntax:
CREATE TABLE...STATLEVEL {TABLE | FRAGMENT | AUTO}
 TABLE – the entire table dataset is read and table-level statistics are stored
in sysdistrib catalog
 FRAGMENT – the dataset of each fragment is read and fragment-level
statistics are stored in sysfragdist catalog (only allowed for fragmented
tables)
 AUTO (default) – the system determines whether TABLE or FRAGMENT
statistics are created when UPDATE STATISTICS is run


STATLEVEL property
When you create or alter a table, you can set the granularity of statistics that are
maintained by specifying the STATLEVEL. You can specify a STATLEVEL of TABLE,
FRAGMENT, or AUTO.
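As a sketch of the syntax on the slide, the statements below create a fragmented
table that keeps fragment-level statistics and later switch it back to AUTO. The
table, column, and dbspace names (sales, dbspace1, dbspace2) are illustrative
only, and the placement of the STATLEVEL clause after the fragmentation scheme
should be verified against the CREATE TABLE syntax documentation:

CREATE TABLE sales (
    sale_id   INTEGER,
    region    CHAR(2)
)
FRAGMENT BY EXPRESSION
    region = 'CA' IN dbspace1,
    region = 'NY' IN dbspace2
STATLEVEL FRAGMENT;

ALTER TABLE sales STATLEVEL AUTO;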


Update statistics extensions


• Automatic mode of update statistics aims to improve update statistics
turnaround time by "skipping" the rebuild of statistics not considered
stale for optimization purposes
• Controlled by ONCONFIG parameter AUTO_STAT_MODE, which is
set to 1 (ON) by default. Set AUTO_STAT_MODE to 0 (OFF) to
disable the automatic mode
• Syntax
UPDATE STATISTICS FOR TABLE ... [FORCE | AUTO]
 FORCE instructs UPDATE STATISTICS to collect distribution for all
fragments
 AUTO collects distributions only if they are considered to be stale
 If neither is specified, the behavior is that set by AUTO_STAT_MODE


Update statistics extensions


Set the AUTO_STAT_MODE configuration parameter to 1 to enable UPDATE
STATISTICS to skip rebuilding of statistics that are not considered stale. You can
also set the update statistics mode for a single session by setting the
AUTO_STAT_MODE environment variable to 1.
Set the update statistics mode in the application by including the AUTO or FORCE
keywords. For example:
UPDATE STATISTICS FOR TABLE tab1 AUTO;
UPDATE STATISTICS HIGH FOR TABLE tab1 FORCE;
UPDATE STATISTICS MEDIUM FOR TABLE tab1
SAMPLING SIZE 0.8 RESOLUTION 1.0 AUTO;
The mode specified in the UPDATE STATISTICS statement overrides the
AUTO_STAT_MODE environment variable, and the environment variable overrides
the AUTO_STAT_MODE parameter in ONCONFIG.
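The three levels of control can be illustrated as follows. The ONCONFIG line
matches the parameter described above; the SET ENVIRONMENT form and its quoted
value are assumptions patterned after SET ENVIRONMENT OPTCOMPIND, so verify them
against your server's documentation:

# ONCONFIG: server-wide default
AUTO_STAT_MODE 1

-- Session-level override (assumed syntax):
SET ENVIRONMENT AUTO_STAT_MODE '0';

-- Statement-level keyword overrides both of the above:
UPDATE STATISTICS HIGH FOR TABLE tab1 FORCE;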


Exercise 8
Updating statistics and data distributions
• create and run UPDATE STATISTICS commands
• compare optimizer query plans based on generated statistics


Exercise 8: Updating statistics and data distributions


Exercise 8:
Updating statistics and data distributions
Purpose:
In this exercise, you will learn how to use the UPDATE STATISTICS statement
and how to create data distributions using UPDATE STATISTICS.

Task 1. Using UPDATE STATISTICS.


In this task, you will decide the update statistics statements to be executed on the
customer, orders, items, stock, and catalog tables in your database, run these
statements, and monitor the changes using the sysdistrib system catalog and
dbschema.
1. Write UPDATE STATISTICS statements for the customer, orders, items,
stock, and catalog tables in your database. You will be entering these
statements into SQL files. Due to the SQL Edit buffer capacity, the statements
cannot all be saved in one file. Instead, you will create two files and run each
one to execute all of the required statements.
2. Save the first set of statements as: upd_stats1.sql
3. Write the remaining UPDATE STATISTICS statements in the second file, and
save the second file as: upd_stats2.sql.
4. Create a duplicate index on the lname and fname columns of the customer
table. Name this index cust_name_idx.
5. Drop all data distributions. Since data distributions were automatically built when
you created your indexes, they must be dropped in order to see the effect of
distributions on the optimizer. Execute the following SQL statement in your
stores_demo database to drop the existing distributions:
UPDATE STATISTICS LOW DROP DISTRIBUTIONS;
6. Execute the following query and capture the Explain plan:
SET EXPLAIN FILE TO 'ex08.expl';
UNLOAD TO /dev/null
SELECT * FROM customer
WHERE FNAME = 'Douglas';
SET EXPLAIN OFF;
7. Examine the ex08.expl file and make a note of the optimizer’s choices for
access method and cost.
8. Execute the UPDATE STATISTICS statements you’ve saved in both the
upd_stats1.sql and upd_stats2.sql files, in your stores_demo database.
$ dbaccess stores_demo upd_stats1.sql
$ dbaccess stores_demo upd_stats2.sql


9. Query the sysdistrib system catalog table and gather the following information
for column 1 on the customer table.
SELECT *
FROM sysdistrib d, systables t
WHERE t.tabname = "customer"
AND t.tabid = d.tabid
AND d.colno = 1;

Table customer:
What is the resolution used to create the distribution?
What is the confidence used to create the distribution?
What is the mode?
10. Use dbschema to generate distribution information to a file for the items and
customer tables and answer the following questions about the stock_num
column from the items table and lname column from the customer table:

Table items.stock_num:
How many bins were created?
How many distinct values are represented by each regular bin?
How many overflow bins are created?
Do all the overflow bins contain the same number of values?

Table customer.lname:
How many bins were created?
How many distinct values are represented by each regular bin?
How many overflow bins are created?
Do all the overflow bins contain the same number of values?
Task 2. Understanding the optimizer choice.
1. Using the following query from the previous task, capture the Explain plan and
compare the optimizer’s choices with the choices from the previous exercise.
SET EXPLAIN FILE TO 'ex08.expl';
UNLOAD TO /dev/null
SELECT * FROM customer
WHERE fname = 'Douglas';
SET EXPLAIN OFF;
How have the optimizer’s choices changed?
Result:
In this exercise, you learned how to use the UPDATE STATISTICS statement
and how to create data distributions using UPDATE STATISTICS.


Exercise 8:
Updating statistics and data distributions - Solutions
Purpose:
In this exercise, you will learn how to use the UPDATE STATISTICS statement
and how to create data distributions using UPDATE STATISTICS.

Task 1. Using UPDATE STATISTICS.


In this task, you will decide the update statistics statements to be executed on the
customer, orders, items, stock, and catalog tables in your database, run these
statements, and monitor the changes using the sysdistrib system catalog and
dbschema.
1. Write UPDATE STATISTICS statements for the customer, orders, items,
stock, and catalog tables in your database.
Due to the SQL Edit buffer capacity, the statements cannot all be saved in one
file. Instead, you will create two files and run each one to execute all of the
required statements.

--MEDIUM for each table, DISTRIBUTIONS ONLY

UPDATE STATISTICS MEDIUM FOR table customer
RESOLUTION 10
DISTRIBUTIONS ONLY;
UPDATE STATISTICS MEDIUM FOR table orders
RESOLUTION 10
DISTRIBUTIONS ONLY;
UPDATE STATISTICS MEDIUM FOR table items
RESOLUTION 10
DISTRIBUTIONS ONLY;
UPDATE STATISTICS MEDIUM FOR table stock
RESOLUTION 10
DISTRIBUTIONS ONLY;
UPDATE STATISTICS MEDIUM FOR table catalog
RESOLUTION 10
DISTRIBUTIONS ONLY;

--HIGH for all indexed columns in the customer table

UPDATE STATISTICS HIGH FOR table customer(customer_num)
RESOLUTION 10;
UPDATE STATISTICS HIGH FOR table customer(zipcode)
RESOLUTION 10;

--HIGH for all indexed columns in the orders table


UPDATE STATISTICS HIGH FOR table orders(order_num)
RESOLUTION 10;
UPDATE STATISTICS HIGH FOR table orders(customer_num)
RESOLUTION 10;
2. Save the first set of statements as: upd_stats1.sql
3. Write the remaining UPDATE STATISTICS statements in the second file, and
save the file as: upd_stats2.sql

--HIGH for the indexed columns in the items table (or for
--those with a composite index, the first column in the
--index)
UPDATE STATISTICS HIGH FOR table items(item_num)
RESOLUTION 10;
UPDATE STATISTICS HIGH FOR table items(order_num)
RESOLUTION 10;
UPDATE STATISTICS HIGH FOR table items(manu_code)
RESOLUTION 10;

UPDATE STATISTICS HIGH FOR table stock(stock_num)
RESOLUTION 10;
UPDATE STATISTICS HIGH FOR table stock(manu_code)
RESOLUTION 10;

-- stock_num and manu_code are both contained in a
-- multi-column index
UPDATE STATISTICS LOW FOR table stock
(stock_num,manu_code);

--HIGH for all indexed columns in the catalog table

UPDATE STATISTICS HIGH FOR table catalog(catalog_num)
RESOLUTION 10;

4. Create a duplicate index on the lname and fname columns of the customer
table. Name this index cust_name_idx.
CREATE INDEX cust_name_idx ON customer(lname, fname);
5. Drop all data distributions. Since data distributions were automatically built when
you created your indexes, they must be dropped in order to see the effect of
distributions on the optimizer. Execute the following SQL statement in your
stores_demo database to drop the existing distributions:
UPDATE STATISTICS LOW DROP DISTRIBUTIONS;


6. Execute the following query and capture the Explain plan:


SET EXPLAIN FILE TO 'ex08.expl';
UNLOAD TO /dev/null
SELECT * FROM customer
WHERE FNAME = 'Douglas';
SET EXPLAIN OFF;
7. Examine the ex08.expl file and make a note of the optimizer’s choices for
access method and cost.
Sequential scan with an estimated cost of 717.

8. Execute the UPDATE STATISTICS statements you’ve saved in the
upd_stats1.sql and upd_stats2.sql files, in your stores_demo database.
$ dbaccess stores_demo upd_stats1.sql
$ dbaccess stores_demo upd_stats2.sql
9. Query the sysdistrib system catalog table and gather the following information
for column 1 on the customer table.
SELECT *
FROM sysdistrib d, systables t
WHERE t.tabname = "customer"
AND t.tabid = d.tabid
AND d.colno = 1;

Table customer:
What is the resolution used to create the distribution? 10.0
What is the confidence used to create the distribution? 0
What is the mode? H for HIGH


10. Use dbschema to generate distribution information to a file for the items and
customer tables and answer the following questions about the stock_num
column from the items table and lname column from the customer table:
Table items.stock_num:
$ dbschema -hd items -d stores_demo
How many bins were created? 5

How many distinct values are represented by each regular bin?


The number of distinct values per bin is not the same across all bins.
They range from 4 to 8 distinct values per bin. But the total number of
values in each bin (with the exception of the last bin) is the same.

How many overflow bins are created? 12

Do all the overflow bins contain the same number of values?


No, the overflow bins have from 27 - 49 rows represented, depending
on the bin’s distinct value.


Table customer.lname:
$ dbschema -hd customer -d stores_demo
How many bins were created? 4

How many distinct values are represented by each regular bin?


The number of distinct values per bin is not the same across all bins.
They range from 7 to 11 distinct values per bin. But the total number of
values in each bin (with the exception of the last bin) is the same.

How many overflow bins are created? 10


Do all the overflow bins contain the same number of values?
No, the overflow bins have from 151 - 479 rows represented, depending
on the bin’s distinct value.


Task 2. Understanding the optimizer choice.


1. Using the following query from the previous task, capture the Explain plan and
compare the optimizer’s choices with the choices from the previous exercise.
SET EXPLAIN FILE TO 'ex08.expl';
UNLOAD TO /dev/null
SELECT * FROM customer
WHERE fname = 'Douglas';
SET EXPLAIN OFF;

How have the optimizer’s choices changed?


Before UPDATE STATISTICS:


After UPDATE STATISTICS:

How has the optimizer’s choice changed?


The optimizer now uses the index on the lname and fname columns
instead of a sequential scan. Notice also that the cost for the query has
gone from 717 down to 230. In other words, this query has improved.
Result:
In this exercise, you learned how to use the UPDATE STATISTICS statement
and how to create data distributions using UPDATE STATISTICS.


Unit summary
• Execute the UPDATE STATISTICS statement and explain the results
• Use the system catalog tables to monitor data distributions


Unit summary

Managing the optimizer

Managing the optimizer

Informix (v12.10)

© Copyright IBM Corporation 2017


Unit 9 Managing the optimizer


Unit objectives
• Describe the effect on the engine of the different values of
OPTCOMPIND
• Describe the effects of setting the OPT_GOAL parameter
• Write optimizer directives to improve performance

Managing the optimizer © Copyright IBM Corporation 2017

Unit objectives


Influencing the optimizer


• OPTCOMPIND:
 Configuration parameter
 Environment variable
 Session-level SQL statement
• SET OPTIMIZATION:
 High/low
 All rows/first rows
• OPTIMIZER DIRECTIVES


Influencing the optimizer


The Informix optimizer is a sophisticated process that generally chooses the most
efficient query plan available. However, a few tools are also available for influencing
the optimizer's choices.


OPTCOMPIND

OPTCOMPIND Optimizer Preference


0 Access path: Index scan, when available
Join plan: Prefer nested loop join
1 If isolation is Repeatable Read, behave as if
OPTCOMPIND=0
Otherwise, OPTCOMPIND=2
2 Normal behavior: Choose lowest cost query plan

• The SET ENVIRONMENT OPTCOMPIND SQL statement allows the


user to specify it at the user session level.
• For example:
SET ENVIRONMENT OPTCOMPIND '2';

OPTCOMPIND
The OPTCOMPIND configuration parameter allows you to influence the behavior of the
optimizer for all queries executed against the database server. You can override this
global behavior by setting the OPTCOMPIND environment variable in a session
environment.
The OPTCOMPIND value influences both the access plan and join plan chosen by the
optimizer. An OPTCOMPIND setting of 0, or 1 with an active isolation level of
repeatable read, instructs the optimizer to consider:
• Only index-scan access paths when an index is available
• A sequential-scan access path only when no index is available
• Nested-loop joins
When OPTCOMPIND is 2, or OPTCOMPIND is 1 and the isolation level is not
repeatable read, the optimizer chooses the lowest cost path. An OPTCOMPIND value
of 2 is the default onconfig.std setting.


The value entered using the SET ENVIRONMENT OPTCOMPIND command takes
precedence over the current setting specified in the configuration file. The default
setting of the OPTCOMPIND environment variable is restored when the current session
terminates. No other user sessions are affected by SET ENVIRONMENT
OPTCOMPIND statements that a session executes.
The quotes shown around the number are required.
Running SET ENVIRONMENT OPTCOMPIND DEFAULT reverts the session back to
the values specified by the configuration parameter.
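For example, a session could favor index access for a series of queries and then
revert to the server default:

-- Prefer index scans and nested-loop joins for this session
SET ENVIRONMENT OPTCOMPIND '0';

-- Later, revert to the value from the configuration file
SET ENVIRONMENT OPTCOMPIND DEFAULT;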


SET OPTIMIZATION

• Default values are HIGH and ALL_ROWS


• Duration is for the process or until next SET OPTIMIZATION statement
• Set one option per statement, use two statements if needed


SET OPTIMIZATION
The SET OPTIMIZATION SQL statement allows you to specify the optimization goal
and time that the optimizer spends considering alternative query paths.
FIRST_ROWS versus ALL_ROWS
The goal in optimizing the query might be to retrieve the first buffer of rows as quickly as
possible, or to retrieve all rows in the quickest manner. If the application is an end-user
query tool, you might choose FIRST_ROWS optimization.
Suppose the end user is a financial analyst performing a series of what-if scenarios. To
run each scenario, they submit a query that retrieves many rows. After viewing just a
few rows, they realize that this scenario does not produce the needed result and they
move on to the next query. By choosing FIRST_ROWS optimization, the user might
receive a quicker response from the database server, which in turn, allows them to be
more productive.
For a batch application that processes payroll updates for several thousand employees,
however, ALL_ROWS optimization is probably the most desirable optimization method
and will likely produce the best performance.


HIGH versus LOW


The HIGH and LOW options influence how much time the optimizer spends analyzing
query paths. The HIGH option instructs the optimizer to use a sophisticated algorithm to
examine all reasonable query plans and select the best alternative. The LOW option
uses a less sophisticated but faster algorithm. This algorithm eliminates more of the join
options earlier in the optimization phase, reducing the time spent analyzing possible
paths.
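Because each SET OPTIMIZATION statement sets only one option, use two statements
to change both the optimization level and the goal. For example:

SET OPTIMIZATION LOW;
SET OPTIMIZATION FIRST_ROWS;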


OPTIMIZATION LOW

SELECT * FROM a,b,c,d WHERE a.a = b.a
AND b.b = c.b AND c.c = d.c;

ab   ac   ad   bc   ...
     acb  acd
     acdb
(Examine only the lowest cost paths at each level)


OPTIMIZATION LOW
The example shows how the optimizer might choose a query path when optimization is
set to LOW. At each level (two-way join, three-way join, four-way join), the lowest-cost
join is chosen and the other paths are not examined further.
In the SET OPTIMIZATION LOW example, ac is chosen as the least-cost two-way join.
The other two-way joins are not examined any further. Next, the three-way joins
possible from the ac join are examined. Again, only the least-cost join is followed down
to the next level. As you can see, the number of joins that must be examined is
drastically reduced.
The SET OPTIMIZATION LOW statement reduces the time spent optimizing the query,
but increases the risk of a less efficient query plan being chosen.


When to try SET OPTIMIZATION LOW


• If the query time is unacceptable
• Queries that join five or more tables
• All join columns are indexed
• One table in the query is joined to all other tables in the query


When to try SET OPTIMIZATION LOW


You should only try SET OPTIMIZATION LOW when your query time is unacceptable.
Since there is no way to determine the percentage of query time that was spent on
query optimization, you have to make best-guess decisions.
Some unique types of queries that improve the chance that using the SET
OPTIMIZATION LOW algorithm will choose the best query plan include:
• A SELECT statement that includes several tables and the number of possible join
combinations is very high.
• When all join columns are indexed, the optimizer has a better chance of choosing
an appropriate path even when using the LOW algorithm.
There is no guaranteed formula for knowing when SET OPTIMIZATION LOW will
give you better results, except by testing the query with both the LOW and HIGH
settings.
The SET OPTIMIZATION LOW statement remains in effect until the session
disconnects from the database server, or until you execute a SET OPTIMIZATION
HIGH command. The optimization is set at the session level, so disconnecting from one
database and connecting to another database when both databases reside on the
same Informix database server does not reset optimization.


FIRST_ROWS
• Useful for decision support environments and online query and
reporting activities
• Instructs the optimizer to choose the query path that returns the first
buffer of rows quickest (Note: Total query time might be longer!)
• To use statement by statement:
 SQL statement:
SET OPTIMIZATION FIRST_ROWS;
 Use FIRST_ROWS optimizer directive
• OPT_GOAL (-1 = ALL_ROWS(Default), 0 = FIRST_ROWS)
 Place in ONCONFIG file to set as default for the database server


FIRST_ROWS
The SET OPTIMIZATION FIRST_ROWS statement is useful for decision support
environments and online query and reporting activities. It instructs the optimizer to
choose a query path that returns the first buffer of rows most quickly, even if the
time for retrieving all rows increases.
When you use the SQL statement:
SET OPTIMIZATION FIRST_ROWS;
the optimization goal remains in effect until the end of the process or until you
execute a statement to set ALL_ROWS optimization.
You can also specify FIRST ROWS optimization as the default for your instance by
setting the OPT_GOAL parameter in your configuration file. For example:
# Optimization goal: -1 = ALL_ROWS(Default), 0 = FIRST_ROWS
OPT_GOAL -1
Alternatively, you can set an optimization goal for a particular user environment
using the environment variable OPT_GOAL.
# ksh
export OPT_GOAL=0


If you only want to use FIRST_ROWS optimization for a single query, you can use
the optimizer directive FIRST_ROWS.
SELECT --+ FIRST_ROWS
fname, lname FROM customer
ORDER BY lname;
Optimizer directives are covered next in this unit.


Using directives

• Positive directives
• Negative directives

Directive


Using directives
Informix provides both positive and negative optimizer directives. A positive directive
instructs the optimizer to limit its choice to a certain set of paths. A negative
directive instructs the optimizer to avoid certain less-than-optimal paths. The
optimizer is still free to consider all other paths. If a new index provides an improved
path, the directive does not need to be changed. The optimizer is automatically free
to choose the new path.
If the outer circle is the set of all possible query paths, a positive directive instructs
the optimizer to limit considered paths to paths that are in the inner circle. A
negative directive allows the optimizer to choose any path in the set defined by the
outer circle, but not the inner circle.
Use negative directives rather than positive directives whenever possible. This
ensures that directives require less maintenance, because as indexes are added
and removed or data distributions change, the optimizer is free to select a more
optimal path whenever one is available.
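For example, rather than forcing a particular index with a positive INDEX
directive, a negative directive can simply rule out the full-table scan and
leave the optimizer free to use any index that exists now or is added later (the
filter value shown is illustrative):

SELECT --+AVOID_FULL(c)
*
FROM customer c
WHERE c.lname = 'Pauli';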


Types of optimizer directives


• Directives support control in the following areas of the optimization
process:
 Access-method directives: index versus scans
 Join-order directive: specifies order in which tables are joined
 Join-method directives: force hash joins or nested-loop joins
 Optimization-goal directives: for faster response time versus throughput
 Explain directive: generates query plan output


Types of optimizer directives


Optimizer directives allow the query developer to influence the optimizer in the
creation of a query plan for an SQL statement.
Optimizer directives provide the flexibility to direct the optimizer to follow specific
paths rather than choose a plan through its analysis. This can result in reducing time
required for correcting performance problems. Rewriting queries can be very time
consuming, and directives provide a quick way to alter a plan and test it.


Identifying directives
• Directives are identified by using comment notation followed by a plus
sign:
SELECT --+ORDERED,INDEX(c cust_idx)
*
FROM customer c, orders o
WHERE c.customer_num BETWEEN 101 AND 1001
AND c.customer_num = o.customer_num;
• SQEXPLAIN output
...
DIRECTIVES FOLLOWED:
ORDERED
DIRECTIVES NOT FOLLOWED:
INDEX ( c cust_idx ) Invalid Index Name Specified.

Estimated cost: 4
− ...

Identifying directives
Directives are identified by using comment notation followed by a plus sign (+). Valid
syntax includes:
--+directive text
{+directive text}
/*+directive text */
In order for directives to be allowed in Informix ESQL products, the ESQL
preprocessor has been modified to pass comments containing directives to the
database server instead of stripping them out.
Directives can also tell the optimizer what to AVOID, rather than what to choose.
This allows you to write a directive to avoid certain actions that are known to cause
performance problems for a query. The optimizer can still explore using any new
indexes or table attributes as they are added.
When directives are used, two headings are added to the SET EXPLAIN output:
DIRECTIVES FOLLOWED:
DIRECTIVES NOT FOLLOWED:
If the directive is followed, it is listed under the DIRECTIVES FOLLOWED heading.


If a directive is not followed, it is listed under the DIRECTIVES NOT FOLLOWED


heading along with a reason for not following the directive. In the example on the
slide, the index specified for the customer table does not exist.
If the syntax for a directive is incorrect, the error is ignored (since the directive is
contained within a comment) and no entry for the directive is made under either
heading. If you do not see your directive under either of the two headings, then that
is a clue that the syntax is incorrect.


Access method directives


• INDEX: Force all indexes (or a specified index) to be used
• AVOID_INDEX: Do not use indexes (or a specified index)
• INDEX_ALL or MULTI_INDEX: Access table using the specified indexes
• AVOID_MULTI_INDEX: Do not use a multi-index scan path for the
specified indexes
• FULL: Perform a full table sequential scan
• AVOID_FULL: Avoid a full table sequential scan (use index if available)
• INDEX_SJ: Use a self-join on the specified index
• AVOID_INDEX_SJ: Do not do a self-join on the specified index
SELECT --+AVOID_FULL(e),INDEX (e salary_indx)
name, salary
FROM emp e
WHERE e.department_num = 1
AND e.salary > 5000;
Managing the optimizer © Copyright IBM Corporation 2017

Access method directives


Use the access-method directives to specify the manner in which the optimizer should
search the tables. The nine access method directives provided by Informix are:
• INDEX: If one index is specified, that index is used. If more than one index is
specified, the index used is chosen from that list based on the least cost. If no
specific indexes are named, then all indexes are considered. A sequential scan is
never considered unless the table has no indexes. For example:
SELECT --+ INDEX (e salary_indx)
name, salary
FROM emp e WHERE e.dno = 1 AND e.salary > 50000;
• AVOID_INDEX: This allows the optimizer to consider indexes that are added
after the query is written while avoiding known indexes that slow down the query.
A full table scan is also possible.
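For example, reusing the sample emp table and salary_indx index from the INDEX example above, a directive that steers the optimizer away from that index might look like this:
SELECT --+ AVOID_INDEX (e salary_indx)
name, salary
FROM emp e WHERE e.dno = 1 AND e.salary > 50000;
The optimizer remains free to use any other index on emp, including one created after the query was written, or to fall back to a full table scan.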
• INDEX_ALL or MULTI_INDEX: These keywords are interchangeable synonyms.
They allow the optimizer to consider all available indexes and choose the best
combination of indexes to use (when no index list is supplied), or to choose a
multi-index access path using ALL of the indexes supplied.
• AVOID_MULTI_INDEX: The optimizer does not consider a multi-index scan path
for the specified table.
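For example, a sketch that requests a multi-index path over two indexes on the sample emp table (dno_ix is a hypothetical index name):
SELECT --+ MULTI_INDEX (e dno_ix salary_indx)
name, salary
FROM emp e WHERE e.dno = 1 AND e.salary > 50000;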


• FULL (tablename): This forces the optimizer to perform a sequential scan on the
table specified, even if an index exists on a column in the table. For example:
SELECT --+FULL(e)
name, salary
FROM emp e;
• AVOID_FULL (tablename): The optimizer considers the various indexes it can
scan. If no indexes exist, the optimizer performs a full-table scan. This directive
might be used with REPEATABLE READ isolation level, for example, to avoid the
full-table scan and subsequent locking.
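For example, using the sample emp table again:
SELECT --+ AVOID_FULL(e)
name, salary
FROM emp e WHERE e.salary > 50000;
If any usable index exists on emp, the optimizer chooses one of them instead of a sequential scan.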
• INDEX_SJ: This directive forces an index self-join path using the specified index,
or choosing the least costly index in a list of indexes, even if data distribution
statistics are not available for the leading index key columns of the index.
• AVOID_INDEX_SJ: Tells the optimizer not to use an index self-join path for the
specified index or indexes.
The optimizer automatically considers the index self-join path if you specify the INDEX
or AVOID_FULL directive. Use the INDEX_SJ directive only to force an index self-join
path using the specified index (or the least costly index in a comma-separated list of
indexes). The INDEX_SJ directive can improve performance when a multicolumn index
includes columns that provide only low selectivity as index key filters.
Specifying the INDEX_SJ directive circumvents the usual optimizer requirement for
data distribution statistics on the lead keys of the index. This directive causes the
optimizer to consider an index self-join path, even if data distribution statistics are not
available for the leading index key columns. In this case, the optimizer only includes the
minimum number of index key columns as lead keys to satisfy the directive.
For example, if an index is defined on columns c1, c2, c3, c4, and the query specifies
filters on all four of these columns but no data distributions are available on any column,
specifying INDEX_SJ on this index results in column c1 being used as the lead key in
an index self-join path.
If you want the optimizer to use an index but not to consider the index self-join path,
then you must specify an INDEX or AVOID_FULL directive to choose the index, and
you must also specify an AVOID_INDEX_SJ directive to prevent the optimizer from
considering any other index self-join path.
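For example, the following sketch (again using the sample emp table and salary_indx index) forces the index while ruling out an index self-join path on it:
SELECT --+INDEX (e salary_indx),AVOID_INDEX_SJ (e salary_indx)
name, salary
FROM emp e WHERE e.dno = 1 AND e.salary > 50000;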
Multiple directives can be used as long as they are in the same comment block.


Access method directives in combination


• Can generally specify only one access-method directive per table
• The following combinations are valid for the same table in the same
query:
 INDEX,AVOID_INDEX_SJ
 AVOID_FULL,AVOID_INDEX
 AVOID_FULL,AVOID_INDEX_SJ
 AVOID_INDEX,AVOID_INDEX_SJ
 AVOID_FULL,AVOID_INDEX,AVOID_INDEX_SJ
 AVOID_FULL,AVOID_MULTI_INDEX
 AVOID_INDEX,AVOID_MULTI_INDEX
 AVOID_INDEX_SJ,AVOID_MULTI_INDEX
 AVOID_FULL,AVOID_INDEX_SJ,AVOID_MULTI_INDEX
 AVOID_INDEX,AVOID_INDEX_SJ,AVOID_MULTI_INDEX


Access method directives in combination


In general, you can specify only one access-method directive per table. Only the
combinations of access method directives shown in the visual are valid for the same
table in the same query.
If AVOID_INDEX_SJ is used together with the INDEX directive, either as an explicit
INDEX directive or as the equivalent AVOID_FULL and AVOID_INDEX combination,
the indexes specified in the AVOID_INDEX_SJ directive must be a subset of the
indexes specified in the INDEX directive.


Join order directive


• Use the ORDERED directive to access tables in FROM clause order.

SELECT --+ORDERED
c.customer_num, sum(total_price)
FROM customer c, orders o, items i
WHERE c.customer_num = o.customer_num
AND o.order_num = i.order_num
GROUP BY 1;


Join order directive


The ORDERED directive is especially beneficial for applications originally written for
rules-based optimizers. Because a rules-based optimizer typically joins tables
according to their order in the FROM clause, developers familiar with rules-based
optimizers can build SQL statements accordingly.


Join method directive


• The join method directives allow you to influence whether the optimizer
performs a nested-loop or hash join:
 USE_NL
 AVOID_NL
 USE_HASH
− /BUILD
− /PROBE

 AVOID_HASH


Join method directive


The join method directives allow you to influence whether the optimizer performs a
nested-loop or a hash join.
The join method directives allow you to determine which tables are inner tables in the
nested-loop joins and which tables are hashed or probed when hash joins are used.
The join-method directives include:
• USE_NL (tablename)
The USE_NL directive can be used to force a nested-loop join. In a nested-loop
join, each row in the outer table is used to probe the inner table to find matching
rows. The joined rows are returned as the result. Access to the inner table can be
a scan, an existing index, or a dynamically built index.
USE_NL takes table names as arguments. The maximum number of tables you
can specify is one less than the total number of tables because one table has to
be used as the outer table for the sequence of nested loops.
Example:
SELECT --+USE_NL (dept)
name, title, salary, dname
FROM emp, dept, job
WHERE loc = "Palo Alto"


AND emp.department_num = dept.department_num
AND emp.job = job.job;
This causes the optimizer to use a nested-loop join to join the dept table with the
other tables in the query. The dept table is the inner table of the join.
• AVOID_NL (tablename)
AVOID_NL can be used to force the optimizer to avoid a nested loop join on the
specified tables.
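For example, to keep dept out of any nested-loop join while leaving the other join methods open:
SELECT --+AVOID_NL (dept)
name, title, salary, dname
FROM emp, dept, job
WHERE loc = "Palo Alto"
AND emp.department_num = dept.department_num
AND emp.job = job.job;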
• USE_HASH (tablename)
Use the USE_HASH directive to force a hash join. In a hash join, the rows of one
of the tables are used to build a hash table. The second table is then processed by
probing: the join-key value of each row is hashed directly into the hash table and
checked for a join pair.
The USE_HASH directive without any arguments directs the optimizer to join
tables using a hash join, determining the order of the join by cost. Within the
directive, you can specify which table is to be the build table or the probe table.
Example:
SELECT --+USE_HASH (dept/BUILD)
name, title, salary, dname
FROM emp, dept, job
WHERE loc = "Palo Alto"
AND emp.department_num = dept.department_num
AND emp.job = job.job;
This directive causes the optimizer to join the dept table using a hash join and to
build the hash table on the dept table.
• AVOID_HASH (tablename)
AVOID_HASH forces the optimizer to avoid hash joins of the listed table. You can
optionally restrict the table from being the probe table or the build table.
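For example, this sketch lets dept participate in a hash join only as the build table, never as the probe table:
SELECT --+AVOID_HASH (dept/PROBE)
name, title, salary, dname
FROM emp, dept, job
WHERE loc = "Palo Alto"
AND emp.department_num = dept.department_num
AND emp.job = job.job;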


Optimizing goal directives


• FIRST_ROWS:
 Allows you to instruct the optimizer to choose a plan to retrieve the first
screen of rows as quickly as possible
 Useful for interactive reporting:
SELECT --+FIRST_ROWS
FIRST 50 *
FROM customer;
• ALL_ROWS:
 Default value
 Optimizer selects the quickest path for retrieving all rows in the result set


Optimizing goal directives


The FIRST_ROWS directive is useful for developers and users alike. For
developers, it is a simple tool that can be used with the SQL FIRST keyword to
quickly access a small sample set of data from a large table.
Additionally, if developers add the FIRST_ROWS optimization directive to screen
populating queries in end-user reporting tools, users perceive a quicker response
from the system. Quicker system response means happier users!


EXPLAIN directive
SELECT --+ORDERED, EXPLAIN
c.customer_num, sum(total_price)
FROM customer c, orders o, items i
WHERE c.customer_num = o.customer_num
AND o.order_num = i.order_num
GROUP BY 1;

DIRECTIVES FOLLOWED:
ORDERED
EXPLAIN
DIRECTIVES NOT FOLLOWED:
Estimated Cost: 17
Estimated # of Rows Returned: 1
1) informix.c: INDEX PATH ...

EXPLAIN directive
The EXPLAIN directive allows you to turn on the SQL explain feature directly from
the query. The explain feature is very helpful for testing directives.


Directives
• CAN be used:
 In SELECT, UPDATE, and DELETE statements
 In SELECT statements embedded in INSERT statements
 In views, stored procedures and triggers
• CANNOT be used:
 In distributed queries that access remote tables
 For UPDATE/DELETE WHERE CURRENT OF statements
• Multiple compatible directives can be used within the same comment
block


Directives
The visual describes when directives can and cannot be used.


Tips for using directives


• Apply the following guidelines when using directives in your queries:
 Frequently examine the effectiveness of a directive.
− Changes to data values or indexes may alter the best path for a query.
 Use negative directives whenever possible:
− Negative directives are less limiting.
− If changes to data values or indexes favor a different path, the query is still
free to choose this path.
 Disable processing of all directives by setting:
− The DIRECTIVES configuration parameter to 0 (zero)
− The IFX_DIRECTIVES environment variable to OFF


Tips for using directives


These guidelines might be helpful when determining how and where to use directives in
your queries:
• Frequently re-evaluate the effectiveness of a directive. Changes to the table
structure, available indexes, data distribution, and so on, can change how the
query is optimized.
• Whenever possible, use negative directives as opposed to positive directives. A
positive directive limits the optimizer to one choice, such as always perform an
index scan on this table. A negative directive only excludes a poor choice, leaving
all other options available to the optimizer.
• Directives are processed by default. The DIRECTIVES configuration parameter
can be set to 0 (OFF) to disable the processing of directives server-wide. The
IFX_DIRECTIVES environment variable can also be set to ON or OFF to control
whether directives are processed for a client session.
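For example, either of the following settings turns directive processing off (a sketch; the shell syntax assumes a POSIX shell on the client):
# onconfig entry (server-wide):
DIRECTIVES 0
# client environment variable (one session):
export IFX_DIRECTIVES=OFF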


External directives
• Allow dynamic substitution of SQL with directive in packaged
applications
• Use SAVE EXTERNAL DIRECTIVE SQL statement
• Requires the EXT_DIRECTIVES configuration parameter and, for a client
session, the IFX_EXTDIRECTIVES environment variable
• Saved in sysdirectives system catalog table
• Example:
SAVE EXTERNAL DIRECTIVES /*+ avoid_full(table1) */
ACTIVE
FOR SELECT * FROM table1 WHERE col_1 > 5000;


External directives
The external directive feature of Informix allows the dynamic rewrite of an SQL
statement to add directives. External optimizer directives are useful as a
short-term solution to a problem when it is not feasible to rewrite the query
itself, such as when a query suddenly starts to perform poorly.
For external directives to work, the EXT_DIRECTIVES configuration parameter
must be set to a value greater than zero at server initialization time, and the
IFX_EXTDIRECTIVES variable might need to be set on the client.
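For example, after saving a directive you can confirm that it was recorded by querying the system catalog of the current database:
SELECT * FROM sysdirectives;
A client that needs to opt in to external directives would typically set IFX_EXTDIRECTIVES=1 in its environment before connecting (the value 1 shown here is illustrative).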


Exercise 9
Managing the optimizer
• Examine the effect on the optimizer of using optimizer directives


Exercise 9: Managing the optimizer


Exercise 9:
Managing the optimizer
Purpose:
In this exercise, you will learn how to manage the optimizer using optimizer
parameters and directives.

Task 1. Using access method directives.


In this task, you will demonstrate the use of access method optimizer directives and
analyze the query plans developed for the SQL statements. Use SET EXPLAIN to
generate the ex09.expl explain file for the entire set of queries.
1. Run the following queries and analyze the query plan developed for the SQL
statements generated in the ex09.expl file.
SET EXPLAIN FILE TO "ex09.expl";

UNLOAD TO /dev/null
SELECT --+INDEX(c zipcode_ix)
* FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND o.customer_num > 122
AND c.zipcode > "65";

UNLOAD TO /dev/null
SELECT --+AVOID_INDEX(c zipcode_ix)
* FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND o.customer_num > 122
AND c.zipcode > "65";

UNLOAD TO /dev/null
SELECT --+FULL(customer)
* FROM customer
WHERE zipcode > "65";

UNLOAD TO /dev/null
SELECT --+AVOID_FULL(customer)
* FROM customer
WHERE zipcode > "65";

SET EXPLAIN OFF;

Were your directives followed?


Task 2. Using the join order directive.


In this task, you will demonstrate the use of the join-order optimizer directive and
analyze the query plan developed for an SQL statement. Use SET EXPLAIN to
generate an ex09.expl Explain plan file for the query.
1. Run the following query and analyze the query plan developed for the SQL
statement generated in the ex09.expl file.
SET EXPLAIN FILE TO "ex09.expl";

UNLOAD TO /dev/null
SELECT --+ORDERED
fname, lname, order_num, o.customer_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num;

SET EXPLAIN OFF;


Task 3. Using join method directives.
In this task, you will demonstrate the use of the join-method optimizer directives and
analyze the query plans developed for the following SQL statements. Use SET
EXPLAIN to generate an ex09.expl Explain plan file for the queries.
1. Run the following queries and compare the query plans developed for the SQL
statements generated in the ex09.expl file. Make sure you include the “SET
EXPLAIN OFF” statement.
SET EXPLAIN FILE TO "ex09.expl";
UNLOAD TO /dev/null
SELECT --+USE_HASH(orders)
fname, lname, order_num, o.customer_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num;

UNLOAD TO /dev/null
SELECT --+AVOID_HASH(orders)
fname, lname, order_num, o.customer_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num;


UNLOAD TO /dev/null
SELECT --+USE_NL(orders)
fname, lname, order_num, o.customer_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND o.customer_num > 122
AND c.zipcode > "65";

UNLOAD TO /dev/null
SELECT --+USE_NL(customer)
fname, lname, order_num, o.customer_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND o.customer_num > 122
AND c.zipcode > "65";

UNLOAD TO /dev/null
SELECT --+AVOID_NL(orders)
fname, lname, order_num, o.customer_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND o.customer_num > 122
AND c.zipcode > "65";
SET EXPLAIN OFF;
Task 4. Using optimizer goal directives.
In this task, you will demonstrate the use of optimization goal directives and analyze
the query plan developed for the following SQL statement. Use the EXPLAIN
directive to generate an Explain plan file for the query.
1. Run the following query and analyze the query plan developed for the SQL
statement generated in the sqexplain.out file.
SELECT --+ORDERED, EXPLAIN
fname, lname, order_num, o.customer_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num;
Results:
In this exercise, you learned how to manage the optimizer using optimizer
parameters and directives.


Exercise 9:
Managing the optimizer - Solutions
Purpose:
In this exercise, you will learn how to manage the optimizer using optimizer
parameters and directives.

Task 1. Using access method directives.


In this task, you will demonstrate the use of access method optimizer directives and
analyze the query plans developed for the SQL statements. Use SET EXPLAIN to
generate the ex09.expl explain file for the entire set of queries.
1. Run the following queries and analyze the query plan developed for the SQL
statements generated in the ex09.expl file.
SET EXPLAIN FILE TO "ex09.expl";

UNLOAD TO /dev/null
SELECT --+INDEX(c zipcode_ix)
* FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND o.customer_num > 122
AND c.zipcode > "65";

UNLOAD TO /dev/null
SELECT --+AVOID_INDEX(c zipcode_ix)
* FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND o.customer_num > 122
AND c.zipcode > "65";

UNLOAD TO /dev/null
SELECT --+FULL(customer)
* FROM customer
WHERE zipcode > "65";

UNLOAD TO /dev/null
SELECT --+AVOID_FULL(customer)
* FROM customer
WHERE zipcode > "65";

SET EXPLAIN OFF;

Were your directives followed?


In the above query, the optimizer used the zipcode_ix index on the
customer table as directed.


In the above query, the optimizer was directed to AVOID using the
zipcode_ix index on the customer table, so it did a sequential scan. Even
though there was a filter on customer_num, which is indexed, the
optimizer would have had to read such a large part of the table that
reading the index would have added more I/O than simply doing the
sequential scan on the table.


In this query, the optimizer performed a full-table (sequential) scan as directed.


In this query, the optimizer was directed to avoid performing a full-table
scan, so it chose the index on zipcode.
Task 2. Using the join order directive.
In this task, you will demonstrate the use of the join-order optimizer directive and
analyze the query plan developed for an SQL statement. Use SET EXPLAIN to
generate an ex09.expl Explain plan file for the query.
1. Run the following query and analyze the query plan developed for the SQL
statement generated in the ex09.expl file.
SET EXPLAIN FILE TO "ex09.expl";

UNLOAD TO /dev/null
SELECT --+ORDERED
fname, lname, order_num, o.customer_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num;

SET EXPLAIN OFF;


In this query, the optimizer was directed to join the tables based on the
order of the tables listed in the FROM clause. The customer table was
processed first. Notice also that the optimizer chose to do a sequential
scan on both tables instead of using an index, and performed a dynamic
hash join.
Task 3. Using join method directives.
In this task, you will demonstrate the use of the join-method optimizer directives and
analyze the query plans developed for the following SQL statements. Use SET
EXPLAIN to generate an ex09.expl Explain plan file for the queries.
1. Run the following queries and compare the query plans developed for the SQL
statements generated in the ex09.expl file. Make sure you include the “SET
EXPLAIN OFF” statement.
SET EXPLAIN FILE TO "ex09.expl";
UNLOAD TO /dev/null
SELECT --+USE_HASH(orders)
fname, lname, order_num, o.customer_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num;

UNLOAD TO /dev/null
SELECT --+AVOID_HASH(orders)
fname, lname, order_num, o.customer_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num;


UNLOAD TO /dev/null
SELECT --+USE_NL(orders)
fname, lname, order_num, o.customer_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND o.customer_num > 122
AND c.zipcode > "65";

UNLOAD TO /dev/null
SELECT --+USE_NL(customer)
fname, lname, order_num, o.customer_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND o.customer_num > 122
AND c.zipcode > "65";

UNLOAD TO /dev/null
SELECT --+AVOID_NL(orders)
fname, lname, order_num, o.customer_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND o.customer_num > 122
AND c.zipcode > "65";
SET EXPLAIN OFF;

In this query, the optimizer was directed to use a hash join. Since there
is no Build Outer notation, we know that the orders table was used as
the build table and the customer table was used as the probe table.


In this query, the optimizer was directed to avoid using a hash join on
the orders table, so it used a nested loop join.


In the first query, the optimizer was directed to use a nested-loop join on
the orders table. In the second one, it was directed to use a nested-loop
join on the customer table. Notice the difference in the cost of the two
queries. Specifying which table to do the nested-loop join on can make a
significant difference in performance.


In this query, the optimizer was directed to avoid using a nested-loop
join. It performed a dynamic hash join instead. Notice the Build Outer
notation.


Task 4. Using optimizer goal directives.


In this task, you will demonstrate the use of optimization goal directives and analyze
the query plan developed for the following SQL statement. Use the EXPLAIN
directive to generate an Explain plan file for the query.
1. Run the following query and analyze the query plan developed for the SQL
statement generated in the sqexplain.out file.
SELECT --+ORDERED, EXPLAIN
fname, lname, order_num, o.customer_num
FROM customer c, orders o
WHERE c.customer_num = o.customer_num;

In this query, the optimizer automatically turned on the SQL explain
feature because of the EXPLAIN directive, and additionally processed
the tables in the order listed in the FROM clause.
Results:
In this exercise, you learned how to manage the optimizer using optimizer
parameters and directives.


Unit summary
Having completed this unit, you should be able to:
• Describe the effect on the engine of the different values of
OPTCOMPIND
• Describe the effects of setting the OPT_GOAL parameter
• Write optimizer directives to improve performance


Unit summary



Referential and entity integrity

Informix (v12.10)



Unit objectives
• Explain the benefits of referential and entity integrity
• Specify default values, check constraints, and referential constraints


Unit objectives


Definitions
• Referential integrity: Are the relationships between tables enforced?
• Entity integrity: Does each row in the table have a unique identifier?
• Semantic integrity: Does the data in the columns properly reflect the
types of information the column was designed to hold?


Definitions
Integrity is the accuracy or correctness of the data in the database. More definitions of
terms used in this and the next unit are:
• Referential integrity
Referential integrity enforces the primary and foreign key relationships between
tables. For example, a customer record must exist before an order can be placed
for that customer.
• Entity integrity
Entity integrity is enforced by creating a primary key that uniquely identifies each
row in a table.
• Semantic integrity
Semantic integrity is enforced by using the following:
• Data types: The data type defines the type of values that you can store in a
column. For example, the data type smallint allows you to enter values from
-32,767 to 32,767.
• Default values: The default value is the value inserted in a column when an
explicit value is not specified. For example, the user_id column of a table might
default to the login name of the user if a name is not entered.


• Check constraints: Check constraints specify conditions on the data that is
inserted or updated in a column. Each row inserted into a table must meet
those conditions. For example, the quantity column of a table might check for
quantities greater than or equal to 1. Check constraints can also be used to
enforce relationships within a table. For example, in an order table, the
ship_date must be greater than the order_date.
• NOT NULL Constraints: The NOT NULL constraint ensures that a column
contains a value during insert and update operations.
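The ship_date rule described above can be implemented as a check constraint, for example (a sketch that assumes the orders table contains order_date and ship_date columns):
ALTER TABLE orders
ADD CONSTRAINT CHECK (ship_date > order_date);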


Integrity at the application level

[Diagram: an Informix 4GL program (prog.4gl) enforces state in ("CA", "NY"),
while two forms enforce different rules: form1.per has state in ("CA", "AZ") and
form2.per has state in ("CA", "WA"). All three access the same data through the
IBM Informix database server.]


Integrity at the application level


You can perform integrity checking within the application. To validate a state value,
for example, you can specify the valid states by using the INCLUDE attribute in a
form, or you can check for the value in a table by using a SELECT statement in an
Informix 4GL program. Although this second method is flexible, it can lead to
inconsistent checking. Furthermore, if the constraint values change, all affected
application code might need to be modified.
In the diagram above, the two forms and the Informix 4GL program are not
consistent.


Integrity at the database server level

[Diagram: prog.4gl, form1.per, and form2.per all rely on a single constraint,
state in ("CA", "AZ"), enforced on the data by the IBM Informix database server.]

Referential and entity integrity © Copyright IBM Corporation 2017

Integrity at the database server level


By placing integrity constraint checking at the database server level, consistency
throughout all applications is ensured.


Types of integrity constraints


• Default value: Automatically provides values for columns omitted in an
INSERT statement.
• NOT NULL constraint: Requires a value for a column during an insert
(if there is no default value) or an update.
• Check constraint: All inserted and updated rows must meet the
conditions defined by this constraint.
• Referential constraint: Enforces Primary-Foreign Key relationships.
• Unique constraint: Every row inserted or updated must have a unique
value for the key specified.


Types of integrity constraints


The types of referential, entity, and semantic integrity (except data types) that are
implemented in Informix products are listed in the visual. CHECK, UNIQUE, and
NOT NULL constraints apply integrity checks within a single row, whereas
referential constraints apply integrity checks between rows.


UNIQUE, NOT NULL, and DEFAULT


CREATE TABLE orders (
order_num INTEGER UNIQUE,
order_date DATE NOT NULL DEFAULT TODAY);

ALTER TABLE orders


MODIFY order_num INTEGER NOT NULL;


UNIQUE, NOT NULL, and DEFAULT


Here are examples of UNIQUE, NOT NULL, and DEFAULT constraints.


Constraint names
CREATE TABLE orders (
order_num INTEGER UNIQUE
CONSTRAINT order_num_uq,
order_date DATE NOT NULL
CONSTRAINT order_date_nn
DEFAULT TODAY);

ALTER TABLE orders


MODIFY order_num INTEGER NOT NULL
CONSTRAINT order_num_nn;


Constraint names
A constraint is identified by its name. You can assign a name or use the default name
that the database server assigns. The names must be unique within a database.
Names are assigned to all constraints: primary key, foreign key, unique, check, and
NOT NULL. Constraint names are stored in the sysconstraints system catalog table.
If you want to change the mode (for example, enabled, disabled, or filtering) of a
specific constraint, you must know its name (see the Modes and Violation Detection
unit):
SET CONSTRAINTS pk_items, fk1_items TO DISABLED;
You can also set constraints for an entire table without having to know the constraint
name. For example:
SET CONSTRAINTS FOR TABLE orders TO DISABLED;
To drop a constraint without altering the table in any other way, use the DROP clause
with the ALTER TABLE command:
ALTER TABLE items DROP CONSTRAINT pk_items;
System default names are a composite of a constraint ID code, a table ID, and a unique
constraint ID. Name your constraints using a naming convention instead of taking the
system default. This makes identifying a constraint and its purpose easier.


CHECK constraint
ALTER TABLE items ADD CONSTRAINT
CHECK (quantity >= 1 AND quantity <= 10)
CONSTRAINT ck_items_qty;

ALTER TABLE items MODIFY quantity SMALLINT


CHECK (quantity >= 1 AND quantity <= 10)
CONSTRAINT ck_items_qty;

ALTER TABLE orders MODIFY paid_date DATE


CHECK (paid_date > ship_date)
CONSTRAINT ck_paid_date;
# ^
#676: Invalid check constraint column.

ALTER TABLE orders ADD CONSTRAINT


CHECK (paid_date > ship_date)
CONSTRAINT ck_paid_date;

CHECK constraint
You can add check constraints at both the table level and the column level.
Examples
• The first example adds a table-level constraint.
• The second example adds an equivalent constraint at the column level. If you
add a constraint to a column, you cannot reference other columns.
• The third example illustrates the error that occurs when a column-level
check constraint references a column other than the one being modified.
• The fourth example shows how a constraint that references multiple columns
can be added at the table level. The columns must be from the same table.
When you modify a column, you modify everything about that column, which is why
the MODIFY clause must include the data type. If you do not list all constraints with
the MODIFY clause, any constraints not listed are dropped.
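For example, to add a DEFAULT to the quantity column from the slide above without losing its check constraint, the MODIFY clause must repeat that constraint (a sketch; it assumes the ck_items_qty constraint already exists):

```sql
-- Re-list ck_items_qty; omitting it from the MODIFY clause would drop it.
ALTER TABLE items MODIFY quantity SMALLINT DEFAULT 1
    CHECK (quantity >= 1 AND quantity <= 10)
    CONSTRAINT ck_items_qty;
```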


What is a referential constraint?

• Parent: Primary Key
• Child: Foreign Key
• A child must have a parent
• A parent must have a unique Primary Key


What is a referential constraint?


Referential constraints allow users to specify primary and foreign keys to enforce
parent-child (master-detail) relationships. To define a referential constraint, a user
must have the REFERENCES privilege or be the table owner. Referential constraints
let you place the restrictions described below on updates, deletes, and inserts.
For now, assume that referential constraint checking is performed as each row is
inserted, updated, or deleted.
Some rules that are enforced with referential constraints:
• If a user deletes a PRIMARY KEY and corresponding FOREIGN KEYS are
present, the delete fails. Use cascading deletes to circumvent this limitation.
• If a user updates a PRIMARY KEY and FOREIGN KEYS corresponding to the
original values of the PRIMARY KEY are present, the update fails.
• There are no restrictions associated with deleting FOREIGN keys.
• If a user updates a FOREIGN KEY and no PRIMARY KEY is present that
corresponds to the new, non-NULL value of the FOREIGN KEY, the update fails.
• All values within a PRIMARY KEY must be unique.


• When a user inserts a row into a child table, if all FOREIGN KEYS are non-NULL
and no corresponding PRIMARY KEY is present, the insert fails.
To fully enforce referential integrity, do not allow NULL values in the primary and foreign
key columns.


Types of referential constraints

• Cyclic referential constraints enforce parent-child relationships
between tables. (Parent → Child)
• Multiple-path constraints refer to a Primary Key that can have several
Foreign Keys. (Parent → Child, Child)
• Self-referencing constraints enforce a parent-child relationship
within a table. (Parent/Child)

Types of referential constraints


You can define several types of referential constraints, including cyclic,
multiple-path, and self-referencing constraints.


Cyclic referential constraints


Parent:
CREATE TABLE customer (
customer_num SERIAL,
fname CHAR(20),
PRIMARY KEY (customer_num)
CONSTRAINT pk_cnum
);

Child:
CREATE TABLE orders (
order_num SERIAL,
customer_num INTEGER,
FOREIGN KEY (customer_num)
REFERENCES customer
CONSTRAINT fk_cnum
);

Cyclic referential constraints


Cyclic referential constraints enforce parent-child (master-detail) relationships
between tables. An example of a cyclic referential constraint is shown in the visual.
To enforce a referential constraint, you must specify a primary key in the parent
table and a corresponding foreign key in the child table. There can only be one
primary key per table. The REFERENCES clause specifies the parent table.
Because only one primary key is allowed in that table, you do not need to list the
column name in the REFERENCES clause.
The example enforces a parent-child relationship between the customer and orders
tables. It ensures that orders are added only for customers that exist and that
customers cannot be deleted if they have orders. The parent-child relationship is
one to many - one customer can have many orders.


Cyclic referential constraints: Example


INSERT INTO customer VALUES (1, "Smith");
INSERT INTO orders VALUES (0, 1);
INSERT INTO orders VALUES (0, 2);
#
#691: Missing key in referenced table for
#referential constraint (informix.fk_cnum).
#111:ISAM error: no record found.

DELETE FROM customer WHERE customer_num = 1;


# ^
#692:Key value for constraint (informix.pk_cnum)
#is still being referenced.


Cyclic referential constraints: Example


In the example, an order cannot be added to the orders table for customer number 2
because customer number 2 does not exist in the customer table.
Customer number 1 cannot be deleted from the customer table because orders are in
the orders table for customer number 1. If the customer record is missing, who would
you bill for the order?


Cascading deletes
Parent:
CREATE TABLE customer (
customer_num INT,
PRIMARY KEY(customer_num));

Child:
CREATE TABLE orders (
order_num INT,
customer_num INT,
PRIMARY KEY(order_num),
FOREIGN KEY(customer_num) REFERENCES customer
ON DELETE CASCADE);
---------------------------------------------------
DELETE FROM customer WHERE customer_num = 1;

All rows in orders table for customer 1 are automatically deleted.



Cascading deletes
Cascading deletes let you define a referential constraint in which the database server
automatically deletes child rows when the corresponding parent row is deleted. This
feature is useful in simplifying application code and logic.
Cascading deletes also provide a performance enhancement: because the database
server deletes the child rows automatically, rather than requiring the application to
delete them first, fewer SQL statements are processed and the per-statement
overhead is avoided.
If for any reason the original DELETE statement fails or the resulting DELETE
statements on the child rows fail, the entire DELETE statement is rolled back (parent
and child rows are rolled back).
To invoke cascading deletes, add the ON DELETE CASCADE clause after the
REFERENCES clause in the CREATE TABLE statement for the child table.


Restrictions
• Restrictions on using cascading deletes:
 The database must have logging
 The child table cannot be referenced in a correlated subquery of a
DELETE statement on the parent table


Restrictions
Here are some of the restrictions in using cascading deletes:
• Referential integrity with cascading deletes can be created with logging off.
However, the cascading deletes are not activated. If logging is turned off,
cascading deletes are deactivated (you receive a referential integrity error). Once
you turn on logging, cascading deletes are automatically reactivated; no action is
necessary by the administrator.
• A correlated subquery that uses the child table in a DELETE statement for the
parent table does not use cascading deletes. Instead, you receive the error:
735: Cannot reference table that participates in a cascaded delete.
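A sketch of a statement that would raise this error, assuming the customer and orders tables with the cascading delete defined earlier:

```sql
-- The correlated subquery references the child table (orders), so this
-- DELETE on the parent table fails with error 735 instead of cascading.
DELETE FROM customer
WHERE EXISTS (SELECT 1 FROM orders
              WHERE orders.customer_num = customer.customer_num);
```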


Adding a cascading delete


• If the column has a Foreign Key constraint and you want to add a
cascading delete, drop the constraint and add it with the ON DELETE
CASCADE clause:

ALTER TABLE orders


DROP CONSTRAINT orders_fk1,
ADD CONSTRAINT (FOREIGN KEY (customer_num)
REFERENCES customer
ON DELETE CASCADE
CONSTRAINT orders_fk1);


Adding a cascading delete


If the column has a Foreign Key constraint and you want to add a cascading delete,
drop the constraint and add it with the ON DELETE CASCADE clause. You can
drop the constraint and add it with the same ALTER TABLE statement. When you
perform both operations in the same ALTER TABLE statement, the index is not
dropped, so the overhead involved with dropping and readding the constraint is
minimal.


Multiple-path referential constraints


Parent:
CREATE TABLE stock (
stock_num SMALLINT, manu_code CHAR(3),...,
PRIMARY KEY (stock_num, manu_code) CONSTRAINT pk_stock);

Child:
CREATE TABLE items (
item_num SMALLINT,
stock_num SMALLINT,
manu_code CHAR(3),...,
FOREIGN KEY (stock_num, manu_code)
REFERENCES stock
CONSTRAINT fk1_stock);

Child:
CREATE TABLE catalog (
catalog_num SERIAL,
stock_num SMALLINT,
manu_code CHAR(3),...,
FOREIGN KEY (stock_num, manu_code)
REFERENCES stock
CONSTRAINT fk2_stock);


Multiple-path referential constraints


Multiple-path referential constraints refer to a primary key that can have several
foreign keys.
The example shows a parent table, stock, that has two different children, the items
table and the catalog table. Other columns were left out of the example for clarity.


Self-referencing referential constraints


Parent and Child:
CREATE TABLE emp (
enum SERIAL,
mnum INTEGER,
PRIMARY KEY (enum) CONSTRAINT pk_enum,
FOREIGN KEY (mnum) REFERENCES emp
CONSTRAINT fk_enum);

INSERT INTO emp VALUES (1, 1);


INSERT INTO emp VALUES (2, 1);
INSERT INTO emp VALUES (3, 10);
#691: Missing key in referenced table for
#referential constraint (informix.fk_enum).
#111: ISAM error: no record found


Self-referencing referential constraints


Self-referencing referential constraints enforce parent-child (master-detail)
relationships within a single table.
The example assumes the scenario in which an employee table is used to track all
employees and the manager to whom they are assigned. A self-referencing
constraint is used to ensure that the manager assigned to each employee exists in
the employee table. In other words, you cannot have a manager who is not also an
employee. The enum (employee number) is a primary key that must exist for the set
of values stored in the mnum column (manager number). In the example, the emp
table requires that the value entered in the enum (employee number) column exist
before it can be added to the mnum (manager number) column. A manager number
of 1 is allowed, but a manager number of 10 fails.
Self-referencing referential constraints can present application and performance
problems because queries searching different levels of the hierarchy need to be
rewritten for each level searched. This means that the server must make multiple
passes through the table.
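For example, against the emp table above, each additional level of the hierarchy requires another self-join (a sketch):

```sql
-- Direct reports of employee 1:
SELECT e.enum FROM emp e WHERE e.mnum = 1;

-- One level deeper (reports of those reports) needs a rewritten query:
SELECT e2.enum
FROM emp e1, emp e2
WHERE e1.mnum = 1
  AND e2.mnum = e1.enum;
```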
The Node DataBlade Module alleviates that problem. See the appendix on the Node
DataBlade Module for more information.


Creating primary key constraints


• Two methods exist for creating a table with a Primary Key constraint:
 At the end of the column list:
CREATE TABLE customer (
customer_num SERIAL,
fname CHAR(20),
PRIMARY KEY(customer_num)
CONSTRAINT pk_cnum);
 At the end of the column definition:
CREATE TABLE customer (
customer_num SERIAL
PRIMARY KEY CONSTRAINT pk_cnum,
fname CHAR(20));


Creating primary key constraints


You can add primary key referential constraints in a CREATE TABLE statement in
two ways. The methods shown above accomplish the same thing. However, you can
use only the first method to create a composite primary key.


Creating foreign key constraints


• Two methods exist for creating a table with a Foreign Key constraint:
 At the end of the column list:
CREATE TABLE orders (
order_num SERIAL,
customer_num INTEGER,
FOREIGN KEY(customer_num)
REFERENCES customer CONSTRAINT fk_cnum);
 At the end of the column definition:
CREATE TABLE orders (
order_num SERIAL,
customer_num INTEGER
REFERENCES customer CONSTRAINT fk_cnum);


Creating foreign key constraints


You can add foreign key referential constraints in a CREATE TABLE statement in
two ways. The methods shown accomplish the same thing. However, only the first
method can be used to create a composite foreign key.


Adding a primary key constraint


• Two methods exist for adding a Primary Key constraint to an existing
table:
 Add a constraint to a table:
ALTER TABLE customer ADD CONSTRAINT
PRIMARY KEY(customer_num)
CONSTRAINT pk_cnum;
 Modify the column definition:
ALTER TABLE customer
MODIFY customer_num SERIAL
PRIMARY KEY CONSTRAINT pk_cnum;


Adding a primary key constraint


You can add primary key referential constraints in an ALTER TABLE statement in
two ways. The methods shown accomplish the same thing. However, only the first
method creates a composite primary key (more than one column).
Also, the second method modifies everything about the column, so you must be
careful to include all constraints, as any constraints not listed for that column are
dropped.


Adding a foreign key constraint


• Two methods exist for adding a Foreign Key constraint to an existing
table:
 Add a constraint to a table:
ALTER TABLE orders ADD CONSTRAINT
FOREIGN KEY (customer_num)
REFERENCES customer CONSTRAINT fk_cnum;
 Modify the column definition:
ALTER TABLE orders
MODIFY customer_num INTEGER
REFERENCES customer CONSTRAINT fk_cnum;


Adding a foreign key constraint


There are two ways to add foreign key referential constraints in an ALTER TABLE
statement. The methods shown accomplish the same thing. However, only the first
method can be used to create a composite foreign key (more than one column).
Also, the second method modifies everything about the column, so you must be
careful to include all constraints, as any constraints not listed for that column are
dropped.


System Catalog tables

syschecks, sysdefaults, sysreferences, sysconstraints, syscoldepend


System Catalog tables


The following system catalog tables are used to enforce referential and entity
integrity:

• syschecks: Contains the text of the check constraint.
• sysdefaults: Keeps track of every column that has a user-specified default value.
• sysreferences: Lists the referential constraints placed on the columns in the database.
• sysconstraints: Stores primary key, check, and referential constraints in addition to unique constraints.
• syscoldepend: Tracks the table columns specified in each check constraint.

Exercise 10
Referential and entity integrity
• Work with referential and check constraints


Exercise 10: Referential and entity integrity


Exercise 10:
Referential and entity integrity

Purpose:
In this exercise, you will learn how to create and maintain referential and
entity integrity using constraints.

Task 1. Implement referential integrity.


In this task, you will add Primary and Foreign Key constraints on all the tables in
your database, and verify that each constraint was added by querying the
sysconstraints system catalog table.
1. Create Primary Keys for the customer, orders, and items tables. Name them
pk_cust, pk_orders, and pk_items, respectively.
2. Ensure that when an order is added, the customer_num value exists in the
customer table.
3. Ensure that when an item is added to an order, the order_num value exists in
the orders table.
4. Query the sysconstraints system catalog table for each constraint added.
5. Did you notice additional constraints? If so, what columns are they for?
Task 2. Create the manufact table with referential integrity.
In this task, you will create a new table manufact and include the referential
constraints. You will use dbaccess to gather information about the manufact table.
1. Create a manufact table in your database in the dbspace dbspace4 with the
following columns and assign a Primary Key. Name the Primary Key pk_manu.
Column Description
name
manu_code Manufacturer code. This will be a 3-character identifier.
manu_name Name of manufacturer. Allow for a length of 15.
lead_time The interval of time in days to allow for delivery of orders
from this manufacturer.


Execute the following steps:


• Run the load script called manufactload.sql provided by your instructor.
This script loads the manufact table. Use the following command to execute
the load script:
$ dbaccess stores_demo manufactload.sql
• Ensure that when a row is added to the items table, the manu_code value
exists in the manufact table.
• Ensure that when a row is added to the stock table, the manu_code value
exists in the manufact table.
2. Use the Table > Info > manufact > constraints options of dbaccess to gather
information for each constraint added.
Task 3. Add table check constraints.
1. Alter the items table so that the quantity column only accepts values that are
greater than zero.
2. Alter the items table so that the quantity value defaults to 1.
3. Insert a couple of rows into the items table to verify the ALTER TABLE
statement in Step 2.
Results:
In this exercise, you learned how to create and maintain referential and entity
integrity using constraints.


Exercise 10:
Referential and entity integrity - Solutions

Purpose:
In this exercise, you will learn how to create and maintain referential and
entity integrity using constraints.

Task 1. Implement referential integrity.


In this task, you will add Primary and Foreign Key constraints on all the tables in
your database, and verify that each constraint was added by querying the
sysconstraints system catalog table.
1. Create Primary Keys for the customer, orders, and items tables. Name them
pk_cust, pk_orders, and pk_items, respectively.
ALTER TABLE customer
ADD CONSTRAINT PRIMARY KEY(customer_num)
CONSTRAINT pk_cust;

ALTER TABLE orders


ADD CONSTRAINT PRIMARY KEY(order_num)
CONSTRAINT pk_orders;

ALTER TABLE items


ADD CONSTRAINT PRIMARY KEY(item_num, order_num)
CONSTRAINT pk_items;
2. Ensure that when an order is added, the customer_num value exists in the
customer table.
ALTER TABLE orders
ADD CONSTRAINT FOREIGN KEY(customer_num)
REFERENCES customer
CONSTRAINT fk_orders_cust;
3. Ensure that when an item is added to an order, the order_num value exists in
the orders table.
ALTER TABLE items
ADD CONSTRAINT FOREIGN KEY(order_num)
REFERENCES orders
CONSTRAINT fk_items_orders;


4. Query the sysconstraints system catalog table for each constraint added.
Table customer:
SELECT constrid, constrname, c.owner, c.tabid,
constrtype, idxname
FROM sysconstraints c, systables t
WHERE t.tabname = "customer" AND c.tabid = t.tabid;

Table orders:
SELECT constrid, constrname, c.owner, c.tabid,
constrtype, idxname
FROM sysconstraints c, systables t
WHERE t.tabname = "orders" AND c.tabid = t.tabid;


Table items:
SELECT constrid, constrname, c.owner, c.tabid,
constrtype, idxname
FROM sysconstraints c, systables t
WHERE t.tabname = "items" AND c.tabid = t.tabid;

5. Did you notice additional constraints? If so, what columns are they for? Yes.
The additional constraints are in the customer and orders tables. These
are the NOT NULL constraints (constrtype ‘N’) automatically created for
the serial columns.
Task 2. Create the manufact table with referential integrity.
In this task, you will create a new table manufact and include the referential
constraints. You will use dbaccess to gather information about the manufact table.
1. Create a manufact table in your database in the dbspace dbspace4 with the
following columns and assign a Primary Key. Name the Primary Key pk_manu.
Column Description
name
manu_code Manufacturer code. This will be a 3-character identifier.
manu_name Name of manufacturer. Allow for a length of 15.
lead_time The interval of time in days to allow for delivery of orders
from this manufacturer.

CREATE TABLE manufact (


manu_code char(3) primary key constraint pk_manu,
manu_name char(15),
lead_time interval day to day)
in dbspace4;


Execute the following steps:

• Run the load script called manufactload.sql provided by your instructor.


This script loads the manufact table. Use the following command to
execute the load script:
$ dbaccess stores_demo manufactload.sql
• Ensure that when a row is added to the items table, the manu_code value
exists in the manufact table.
ALTER TABLE items
ADD CONSTRAINT FOREIGN KEY(manu_code)
REFERENCES manufact
CONSTRAINT fk_items_manu;
• Ensure that when a row is added to the stock table, the manu_code value
exists in the manufact table.
ALTER TABLE stock
ADD CONSTRAINT FOREIGN KEY(manu_code)
REFERENCES manufact
CONSTRAINT fk_stock_manu;
2. Use the Table > Info > manufact > cOnstraints options of dbaccess to gather
information for each constraint added, as shown.


Task 3. Add table check constraints.


1. Alter the items table so that the quantity column only accepts values that are
greater than zero.
ALTER TABLE items
MODIFY quantity SMALLINT
CHECK (quantity > 0);

2. Alter the items table so that the quantity value defaults to 1.


ALTER TABLE items
MODIFY quantity SMALLINT DEFAULT 1
CHECK (quantity > 0);

3. Insert a couple of rows into the items table to verify the ALTER TABLE
statement in Step 2.
INSERT INTO items VALUES (1,1004,110,"ANZ",1,840);
INSERT INTO items VALUES (1,1005,201,"ANZ",0,200);
In the second INSERT statement above, the value of 0 for quantity violates
the check constraint (quantity > 0) on the quantity column, which returns
the following error (your constraint name might be different):
530: Check constraint (informix.c121_28) failed.
Results:
In this exercise, you learned how to create and maintain referential and entity
integrity using constraints.


Unit summary
• Explain the benefits of referential and entity integrity
• Specify default values, check constraints, and referential constraints


Unit summary

Managing constraints

Managing constraints

Informix (v12.10)

© Copyright IBM Corporation 2017


Unit 11 Managing constraints


Unit objectives
• Determine when constraint checking occurs
• Drop a constraint
• Delete and update a parent row
• Insert and update a child row

Managing constraints © Copyright IBM Corporation 2017

Unit objectives


Constraint transaction modes


• Constraint transaction modes determine when checking for referential
violations occurs:
 For databases with logging:
− Immediate constraint checking checks for violations after each statement is
finished executing
− Deferred constraint checking checks for violations at COMMIT time
 For databases with no logging:
− Detached constraint checking checks for violations with each row processed in the
table as the statement is executing. It is the only mode available for databases
with no logging and is not available for databases with logging.


Constraint transaction modes


Constraint transaction modes determine when checking for referential violations
occurs.
You can change the transaction mode by using the SET CONSTRAINTS statement.
For example:
SET CONSTRAINTS pk_orders,fk_orders DEFERRED;
You can identify the transaction mode format of the SET CONSTRAINTS statement
by the DEFERRED or IMMEDIATE clause. The transaction mode set this way lasts
only for the transaction in which the statement is executed; you cannot execute
the SET CONSTRAINTS statement to set the transaction mode outside of a
transaction. Once a COMMIT WORK or ROLLBACK WORK statement is executed, the
transaction mode reverts to IMMEDIATE.
The SET CONSTRAINTS command is also used to change the object mode of a
constraint. The object mode format of the SET CONSTRAINTS command is
identified by the ENABLED, DISABLED, or FILTERING clause. See the Modes and
Violation Detection unit for more detail.
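As a sketch of the duration rule (assuming a logged database in which the constraints pk_orders and fk_orders already exist):

```sql
-- The transaction mode can only be set inside a transaction
BEGIN WORK;
SET CONSTRAINTS pk_orders, fk_orders DEFERRED;
-- ...statements that may temporarily violate the constraints...
COMMIT WORK;  -- deferred constraints are checked here
-- After COMMIT WORK (or ROLLBACK WORK), the mode reverts to IMMEDIATE
```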


Immediate constraint checking


CREATE TABLE test (current_no INTEGER UNIQUE);
UPDATE test SET current_no = current_no + 1;

Before     After first row     After statement
update     is updated          completes
  1              2                   2
  2              2                   3
  3              3                   4
  4              4                   5
  5              5                   6

An index violation exists after the first row is updated (two rows hold the
value 2), but no violation remains at the end of the update.


Immediate constraint checking


Immediate (or effective) checking specifies that constraint checking appears to
occur at the end of each statement. If some constraint is not satisfied, then the
statement appears to have not been executed. Immediate checking is the default
mode.
In the example, after the first row is updated but before the second row is updated,
a duplicate value exists that violates the unique-index constraint. However, after
all the rows are updated, all the values are unique, so the statement executes
successfully.
How immediate constraint checking is implemented
A change that violates a constraint is allowed to succeed but is recorded as a
violation. Later, at the end of the statement, checks are made to see if the violation
still exists. If violations still exist, an error is returned and the statement is undone.
Savepoints are used to allow the database server to undo the effects of a single
statement without undoing earlier changes made within the same transaction. By
establishing a savepoint at the beginning of a statement, the database server can
roll back to that savepoint if a constraint violation occurs during effective checking.


For referential constraints, a memory buffer or temp table records the violations.
One temp table records violations for each referential pair. The temp table contains
key values that were violated. As rows are inserted, deleted, and updated, the temp
tables are updated to reflect new violations and removal of old ones. Later, when
checking is done, the temp tables are scanned and, for those keys that are still valid,
the violations are revalidated. As violations are resolved, records are removed from
the temp table.
For check constraints, a memory buffer or temp table is also used. However, this
time the temp table records only the rowids of the violating rows. As rows are
updated, rows that now pass the check constraints are removed. When checking is
done, the temp table should now be empty.
For unique indexes, the checking is done on a row-by-row basis instead of at the
end of the statement. If you want to perform effective checking, use unique
constraints rather than creating unique indexes.
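The difference can be sketched with two hypothetical tables (the table and index names are illustrative only):

```sql
CREATE TABLE t_constraint (n INTEGER UNIQUE);   -- unique CONSTRAINT
CREATE TABLE t_index (n INTEGER);
CREATE UNIQUE INDEX ix_n ON t_index (n);        -- bare unique INDEX

-- Assume both tables contain the values 1 through 5.
UPDATE t_constraint SET n = n + 1;  -- succeeds: checked at end of statement
UPDATE t_index SET n = n + 1;       -- may fail: checked row by row
```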


Deferred constraint checking


ALTER TABLE orders ADD CONSTRAINT
PRIMARY KEY (order_num);
ALTER TABLE items ADD CONSTRAINT
FOREIGN KEY (order_num) REFERENCES orders;

If deferred constraint checking is set, this transaction is committed; if not, it fails.

BEGIN WORK;
SET CONSTRAINTS ALL DEFERRED;
UPDATE orders SET order_num = 1006
WHERE order_num = 1001;
UPDATE items SET order_num = 1006
WHERE order_num = 1001;
COMMIT WORK;


Deferred constraint checking


In the example, assume that the data type for the order_num column is an integer
and not a serial column. Serial columns cannot be updated. Deferred checking is
used when the primary and foreign key values in a parent and child pair are
changed to a new value. For example, you would use deferred checking if you
wanted to change an order number in both the orders and items tables.
Deferred checking is also used when you need to switch the primary key values for
two or more rows in a table.
For example, if order number 1005 should really be 1006, and order number 1006
should be 1005, deferred constraint checking should be specified during the
transaction used to switch the values.
How deferred checking is implemented:
Deferred checking specifies that constraint checking does not occur until
immediately before the transaction is committed or the user changes the mode to
immediate. If a constraint error occurs at commit time, then the transaction is rolled
back.


In the example, if constraint mode is not set to deferred, the statement fails. This
happens because the referential constraint enforces the rule that all items must
have an order. The failure would occur at the first update statement because there
would be items that did not have an order (order number 1001 no longer exists in
the orders table).
Deferred checking is implemented similarly to immediate checking. However, the
checks for violations are made at the end of the transaction as opposed to the end
of the statement.
You must put the SET CONSTRAINTS ALL DEFERRED statement within a
transaction. It is valid from the time that it is set until the end of the transaction.
You can also replace the keyword ALL with the constraint name to defer only a
specific constraint. For example:
SET CONSTRAINTS uniq_ord DEFERRED;
Unique indexes (that is, indexes created using the UNIQUE keyword) are not used
by deferred checking. If you want the checking done at commit time, use unique
constraints rather than creating unique indexes.
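The key-swap scenario described above might be coded as follows (9999 is an arbitrary placeholder value assumed not to be an existing order number):

```sql
BEGIN WORK;
SET CONSTRAINTS ALL DEFERRED;
-- Swap order numbers 1005 and 1006 in the parent and child tables
UPDATE orders SET order_num = 9999 WHERE order_num = 1005;
UPDATE items  SET order_num = 9999 WHERE order_num = 1005;
UPDATE orders SET order_num = 1005 WHERE order_num = 1006;
UPDATE items  SET order_num = 1005 WHERE order_num = 1006;
UPDATE orders SET order_num = 1006 WHERE order_num = 9999;
UPDATE items  SET order_num = 1006 WHERE order_num = 9999;
COMMIT WORK;  -- all constraint checks happen only at this point
```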


Detached constraint checking

CREATE TABLE test (current_no INTEGER UNIQUE);
UPDATE test SET current_no = current_no + 1;

Before     Attempted     After the
update     value         statement fails
  1            2               1
  2            2               2
  3            3               3
  4            4               4
  5            5               5

An index violation occurs as soon as the first row is updated (the new value 2
duplicates an existing key), and the statement fails.


Detached constraint checking


Detached checking is the only mode available in databases created without logging
and for temporary tables that were created WITH NO LOG. If logging is not on, it is
not possible to perform the rollbacks that effective checking requires.
How it is implemented
Detached constraint checking is done on a row-by-row basis. Once a constraint
error occurs, an error is returned immediately to the user and the rest of the
statement is not executed.
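A minimal sketch of the situations in which detached checking applies:

```sql
-- A database created without a log uses detached checking only
CREATE DATABASE nolog_db;   -- no WITH LOG clause

-- A temporary table created WITH NO LOG also uses detached checking,
-- even in a logged database
CREATE TEMP TABLE t1 (n INTEGER UNIQUE) WITH NO LOG;
```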


Performance effect
• Referential constraints are implemented by using indexes on both
Primary and Foreign Keys.
• Indexes must be updated on UPDATE, INSERT, and DELETE.
• Index lookups on each UPDATE, INSERT, and DELETE.
• Numerous locks are required. Shared locks are held on all indexes
being used.


Performance effect
Unique constraints are enforced by internally creating unique indexes on
appropriate columns.
When a referential constraint is created, a non-unique index is built on the foreign
key columns. If an index exists, then that index is used.
A column can have both a referential and a unique constraint. A column can also
have multiple different referential constraints. In these situations, a single index is
used to enforce the multiple constraints.
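You can see which index enforces each constraint by querying the system catalog. A sketch (the idxname column of sysconstraints is NULL for check and NOT NULL constraints):

```sql
SELECT c.constrname, c.constrtype, c.idxname
FROM systables t, sysconstraints c
WHERE t.tabname = "items"
AND t.tabid = c.tabid;
```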


Dropping a constraint (1 of 2)

ALTER TABLE orders
DROP CONSTRAINT pk_orders;

The ALTER TABLE statement drops the primary key constraint and any
corresponding foreign key constraints.


Dropping a constraint
When a primary key constraint is dropped, any corresponding foreign key
constraints are automatically dropped as well. When a foreign key constraint is
dropped, the corresponding primary key constraint is not affected.


Dropping a constraint (2 of 2)
CREATE TABLE orders (
order_num SERIAL,
order_date DATE,
PRIMARY KEY (order_num)
CONSTRAINT pk_orders);

CREATE TABLE items (


item_num SMALLINT,
order_num INTEGER,
FOREIGN KEY (order_num) REFERENCES orders
CONSTRAINT fk_orders);

ALTER TABLE orders DROP order_num;

The ALTER TABLE statement drops the order_num column, constraint pk_orders,
and constraint fk_orders.


When a column that has a constraint is dropped, the action can affect more than just
the table that is mentioned in the ALTER TABLE statement. Any constraints that
reference the dropped column are also dropped.
In the example, dropping the primary key column in table orders requires table items
to be locked while the constraint is dropped. Also, the index that was used to
implement the constraint is dropped only if it was built for this constraint.


Deleting and updating a parent row


DELETE FROM orders WHERE order_num = 1004;

A DELETE of a row in the orders table causes shared locks to be placed on the
items keys in the index.

orders      items
 1001        1001
 1002        1001
 1003        ...
 1004        1004
 1005        1004
 ...         ...

Because rows for order 1004 exist in the child table, the DELETE fails.


Deleting and updating a parent row


Indexes are used to support referential integrity when deleting or updating a row in a
parent table.
Before deleting or updating a row in a parent (master) table, the database server
looks up any foreign keys that correspond to the primary key of the row being
updated or deleted. When a corresponding foreign key is found, a shared lock is
placed on the foreign keys in the index. The lock is required to test for the existence
of a key that is being removed or a newly inserted foreign key that is not committed
yet.


Inserting and updating a child row


INSERT INTO items VALUES (1004);

An INSERT of a row in the items table causes a lock to be placed on the
orders index key.

orders      items
 1001        1001
 1002        1001
 1003        ...
 1004        1004
 1005        1004
 ...         1004

Because a row for order 1004 exists in the parent table, the INSERT succeeds.


Inserting and updating a child row


Indexes are used in the following ways to support referential integrity when inserting
or updating a row into a child (detail) table.
Before inserting or updating a row, the database server looks through all foreign
keys on this table that are set to non-NULL values by the update. For each of these
foreign keys, the database server uses the unique index that corresponds to the
primary key and performs a lookup on the parent table. If rows are found, the
database server places a shared lock on the index key to ensure that the row is not
deleted before the child row is inserted or updated. The lock is held until the
referencing row is inserted or updated.


Exercise 11
Managing constraints
• Investigate the effects of constraints and constraint checking


Exercise 11: Managing constraints


Exercise 11:
Managing constraints

Purpose:
In this exercise, you will learn how to maintain referential and entity integrity
constraints.

Task 1. Immediate constraint checking.


In this task, you will create a new table parent_tab with one column that will be of
integer data type and insert ten rows into the parent_tab table. You will try to
update the parent_tab table to increment the values inserted, and determine why or
why not the statement executed.

1. Create a parent_tab table in your database with the following column:


Column name Description
col1 Unique integer.
2. Insert the following rows of data into the table:
1, 2, 3, 4, 5, 6, 7, 8, 9, 10
3. Increment the col1 column by 1. Did the update execute? Why or why not?
Task 2. Deferred constraint checking.
In this task, you will change the manufacturer code for the PRC distributor to PRO by
updating all the PRC values to PRO in the manufact, stock, and items tables. You
will defer checking until all the tables have been updated.
1. Update the manu_code column in manufact, stock and items tables where
manu_code is PRC to PRO.
Why did the statement fail?
How would you correct this problem?
2. Update the tables again using the correct syntax to allow deferred constraint
checking.
3. Query each table for the manu_code "PRC" to verify the update was
successful.


Task 3. Delete and update of a parent row.


In this task, you will alter the parent_tab table and add a Primary Key constraint.
You will create a new table called child_tab with an integer column and a Foreign
Key. You will execute update and delete statements on the parent table parent_tab.
1. Alter the parent_tab table and add a Primary Key constraint called pk_parent.
What happened?
2. Fix the problem.
3. Create a table called child_tab in your database with the following column and
include a Foreign Key called fk_child_parent.
Column Name Description
c_col1 Integer.
4. Insert into the child_tab table the following values:
2, 3, 4, 5, 6, 7, 8, 9, 10, 11
5. Update the parent table parent_tab and change the col1 column value 5 to 25.
Why did the statement fail?
6. Delete from the parent table parent_tab where the col1 column value is 5.
Why did the statement fail?
Task 4. Insert and update of a child row.
In this task, you will insert and update a row into the child table child_tab.
1. Insert the value 12 into the child_tab table.
Why did the statement fail?
2. Update the child table child_tab and change the c_col1 column value 5 to 25.
Why did the statement fail?
Task 5. Cascading delete.
In this task, you will drop the Foreign Key constraint on the child_tab table. You will
alter the child_tab table to allow cascade deletes. You will delete a row from the
parent table parent_tab, and query the child table child_tab for the deleted row to
verify the cascade delete.
1. Drop the Foreign Key constraint on the child_tab table. Alter the child_tab
table to add a Foreign Key constraint with cascading deletes.
2. Delete from the parent_tab parent table where col1 = 5.
3. Query the child table child_tab to verify the value 5 is missing.
Results:
In this exercise, you learned how to maintain referential and entity integrity
constraints.


Exercise 11:
Managing constraints - Solutions

Purpose:
In this exercise, you will learn how to maintain referential and entity integrity
constraints.

Task 1. Immediate constraint checking.


In this task, you will create a new table parent_tab with one column that will be of
integer data type and insert ten rows into the parent_tab table. You will try to
update the parent_tab table to increment the values inserted, and determine why or
why not the statement executed.
1. Create a parent_tab table in your database with the following column:
Column name Description
col1 Unique integer
CREATE TABLE parent_tab (
col1 INTEGER UNIQUE
);
2. Insert the following rows of data into the table:
1, 2, 3, 4, 5, 6, 7, 8, 9, 10
INSERT INTO parent_tab VALUES (1);
INSERT INTO parent_tab VALUES (2);
INSERT INTO parent_tab VALUES (3);
INSERT INTO parent_tab VALUES (4);
INSERT INTO parent_tab VALUES (5);
INSERT INTO parent_tab VALUES (6);
INSERT INTO parent_tab VALUES (7);
INSERT INTO parent_tab VALUES (8);
INSERT INTO parent_tab VALUES (9);
INSERT INTO parent_tab VALUES (10);

3. Increment the col1 column by 1. Did the update execute? Why or why not?
UPDATE parent_tab SET col1 = col1 + 1;
Yes
Because with immediate constraint checking, the constraint
checking occurs at the end of the statement. When all the rows were
updated, a duplicate value did not exist.


Task 2. Deferred constraint checking.


In this task, you will change the manufacturer code for the PRC distributor to PRO by
updating all the PRC values to PRO in the manufact, stock, and items tables. You
will defer checking until all the tables have been updated.
1. Update the manu_code column in manufact, stock and items tables where
manu_code is PRC to PRO.
UPDATE manufact SET manu_code = "PRO"
WHERE manu_code = "PRC";

UPDATE stock SET manu_code = "PRO"
WHERE manu_code = "PRC";

UPDATE items SET manu_code = "PRO"
WHERE manu_code = "PRC";

The above statements result in the following error:

692: Key value for constraint (informix.pk_manu) is still being referenced.

Why did the statement fail?


The constraint was checked after each statement. Updating the parent table
first violated the relationship between the Primary Key values in the parent
table and the Foreign Key values in the child tables.

How would you correct this problem?


Make the update of all tables one statement by using BEGIN and
COMMIT WORK and defer constraint checking.


2. Update the tables again using the correct syntax to allow deferred constraint
checking.
BEGIN WORK;
SET CONSTRAINTS ALL DEFERRED;

UPDATE manufact SET manu_code = "PRO"
WHERE manu_code = "PRC";

UPDATE stock SET manu_code = "PRO"
WHERE manu_code = "PRC";

UPDATE items SET manu_code = "PRO"
WHERE manu_code = "PRC";

COMMIT WORK;

3. Query each table for the manu_code "PRC" to verify the update was
successful.
Table manufact:
SELECT count(*) FROM manufact
WHERE manu_code = "PRC";
Table stock:
SELECT count(*) FROM stock
WHERE manu_code = "PRC";
Table items:
SELECT count(*) FROM items
WHERE manu_code = "PRC";

All count values should be 0.


Task 3. Delete and update of a parent row.


In this task, you will alter the parent_tab table and add a Primary Key constraint.
You will create a new table called child_tab with an integer column and a Foreign
Key. You will execute update and delete statements on the parent table parent_tab.
1. Alter the parent_tab table and add a Primary Key constraint called pk_parent.
What happened?
ALTER TABLE parent_tab
ADD CONSTRAINT PRIMARY KEY(col1)
CONSTRAINT pk_parent;

This returns the following error:

577: A constraint of the same type already exists on the column set.
There is already a unique constraint on col1. In order to change the
unique constraint into a Primary Key constraint, the unique
constraint must be dropped first.

2. Fix the problem.

To find the unique constraint name:


SELECT constrname
FROM systables t, sysconstraints c
WHERE tabname = "parent_tab"
AND t.tabid = c.tabid;

Drop the unique constraint:


ALTER TABLE parent_tab
DROP CONSTRAINT constraint_name;

Add the Primary Key constraint:


ALTER TABLE parent_tab
ADD CONSTRAINT PRIMARY KEY(col1)
CONSTRAINT pk_parent;


3. Create a table called child_tab in your database with the following column and
include a Foreign Key called fk_child_parent.
Column Name Description
c_col1 Integer.
CREATE TABLE child_tab (
c_col1 INTEGER,
FOREIGN KEY (c_col1)
REFERENCES parent_tab
CONSTRAINT fk_child_parent
);

4. Insert into the child_tab table the following values:


2, 3, 4, 5, 6, 7, 8, 9, 10, 11
INSERT INTO child_tab VALUES(2);
INSERT INTO child_tab VALUES(3);
INSERT INTO child_tab VALUES(4);
INSERT INTO child_tab VALUES(5);
INSERT INTO child_tab VALUES(6);
INSERT INTO child_tab VALUES(7);
INSERT INTO child_tab VALUES(8);
INSERT INTO child_tab VALUES(9);
INSERT INTO child_tab VALUES(10);
INSERT INTO child_tab VALUES(11);

5. Update the parent table parent_tab and change the col1 column value 5 to 25.
Why did the statement fail?
UPDATE parent_tab
SET col1 = 25
WHERE col1 = 5;

This statement returns the following error:

692: Key value for constraint (informix.pk_parent) is still being referenced.


You have violated a referential constraint. You were trying to
change a value in a Primary Key column that a row in the child
table is referencing via the Foreign Key.


6. Delete from the parent table parent_tab where the col1 column value is 5.
Why did the statement fail?
DELETE FROM parent_tab
WHERE col1 = 5;

This statement returns the following error:

692: Key value for constraint (informix.pk_parent) is still being referenced.
You have violated a referential constraint. You were trying to delete
a row in the parent table that a row in the child table is referencing
via the Foreign Key.

Task 4. Insert and update of a child row.


In this task, you will insert and update a row into the child table child_tab.
1. Insert the value 12 into the child_tab table.
Why did the statement fail?
INSERT INTO child_tab VALUES(12);

This command returns the following errors:

691: Missing key in referenced table for referential constraint
111: ISAM error: no record found.
You have violated a referential constraint. You were trying to insert a
value into a column that is part of a referential constraint (Foreign
Key). The value you are trying to enter does not exist in the referenced
parent table column.

2. Update the child table child_tab and change the c_col1 column value 5 to 25.
Why did the statement fail?
UPDATE child_tab
SET c_col1 = 25
WHERE c_col1 = 5;

This statement returns the following errors:

691: Missing key in referenced table for referential constraint
111: ISAM error: no record found.


You have violated a referential constraint. You were trying to update a
value in a column that is part of a referential constraint (Foreign Key).
The value you are trying to enter does not exist in the referenced
parent table column.

Task 5. Cascading delete.


In this task, you will drop the Foreign Key constraint on the child_tab table. You will
alter the child_tab table to allow cascade deletes. You will delete a row from the
parent table parent_tab, and query the child table child_tab for the deleted row to
verify the cascade delete.
1. Drop the Foreign Key constraint on the child_tab table. Alter the child_tab
table to add a Foreign Key constraint with cascading deletes.
ALTER TABLE child_tab DROP CONSTRAINT fk_child_parent;

ALTER TABLE child_tab ADD CONSTRAINT
FOREIGN KEY (c_col1)
REFERENCES parent_tab
ON DELETE CASCADE
CONSTRAINT fk_child_parent;

2. Delete from the parent_tab parent table where col1 = 5.


DELETE FROM parent_tab
WHERE col1 = 5;

3. Query the child table child_tab to verify the value 5 is missing.


SELECT * FROM child_tab
WHERE c_col1 = 5;
No rows should be found. The delete of parent row 5 cascaded down to
delete the child row 5 as well.
Results:
In this exercise, you learned how to maintain referential and entity integrity
constraints.


Unit summary
• Determine when constraint checking occurs
• Drop a constraint
• Delete and update a parent row
• Insert and update a child row


Unit summary


Unit 12 Modes and violation detection

Modes and violation detection

Informix (v12.10)

© Copyright IBM Corporation 2017




Unit objectives
• Enable and disable constraints and indexes
• Use the filtering mode for constraints and the indexes
• Reconcile the violations recorded in the database

Modes and violation detection © Copyright IBM Corporation 2017

Unit objectives


Types of database objects


• Database objects include:
 Constraints
− Unique constraints
− Referential constraints
− Check constraints
− NOT NULL constraints
 Indexes
 Triggers


Types of database objects


Database objects include constraints, indexes, and triggers.


Database object modes


• A database object can have one of the following modes:
 Enabled: normal state
 Disabled: not enforced (constraints) or used (triggers or indexes)
 Filtering (except triggers): enabled, but errors are logged. Transaction is not
rolled back.


Database object modes


A database object can have one of the following modes:

Enabled: The default state or mode of an object. An enabled constraint is
active and its rules are enforced. An enabled index is available for queries
and contains all entries. An enabled trigger is fired when the trigger event
occurs.

Disabled: A disabled constraint is not checked or enforced. With a disabled
index, contents are not updated when a row is inserted, deleted, or updated.
A disabled trigger is not fired and is ignored by the database server. Even
though the constraint, trigger, or index entry is disabled, it remains in the
system catalog tables.

Filtering: A filtering constraint is checked and enforced, just as if it were
an enabled constraint. However, any errors are placed in an error-log table.
One important effect of a filtering object is that, if a constraint is
violated, then the rest of the transaction is not rolled back. Triggers cannot
be in filtering mode. The only type of index that can be in filtering mode is
a unique index.
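For example, assuming the customer table already has an enabled constraint, filtering mode might be set up as follows (violations tables are covered later in this unit):

```sql
-- Create the violations and diagnostics tables (customer_vio and
-- customer_dia by default) that will capture the failing rows
START VIOLATIONS TABLE FOR customer;

-- Constraints are still checked and enforced, but failures are logged
-- instead of rolling back the transaction
SET CONSTRAINTS FOR customer FILTERING;
```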


Why use object modes?


• Database loads are faster without constraints, triggers, or indexes
• Re-enabling an object is easier and more efficient than
recreating it
• Finding a constraint violation is easy when the object is in filtering
mode
• In filtering mode, you can skip statements in a transaction that violate
constraints instead of causing the statement to roll back


Why use object modes?


Object modes are convenient for the following reasons:
• Inserting or updating many rows is dramatically slower when the database server
must check constraints or insert keys into indexes. For fastest performance,
disable the objects, load the data, and re-enable the objects. However, if you are
unsure about the integrity of the data you are loading, set constraints to filtering
mode and disable indexes.
• In previous database releases, you had to delete an object if you did not want it
enabled. After a data load, you had to recreate constraints and indexes. By
disabling an object instead of removing it, you simply re-enable it when needed.
One SQL statement can re-enable all objects in a table.
• Sometimes it can be difficult to find which row violated a constraint when a single
UPDATE statement, DELETE statement, or INSERT cursor affects many rows.
By changing the object to filtering mode, the rows that contained the error are
placed in a violation table.
• Without object modes, any SQL error in a transaction causes the statement to
automatically roll back (databases with logging only). In filtering mode, you can
skip any statements that cause a violation error while still allowing the transaction
to continue.


Disabling an object
• Disabling individual objects:
SET CONSTRAINTS c117_11, c117_12 DISABLED;
SET INDEXES idx_x1 DISABLED;
SET TRIGGERS upd_cust DISABLED;
• Disabling all objects for a table:
SET CONSTRAINTS, INDEXES, TRIGGERS
FOR customer DISABLED;
• Results:
 Constraints are not checked
 Triggers do not fire
 Indexes are not updated or used for queries


Disabling an object
The two methods to disable objects are:
• Individually: To disable a constraint, index, or trigger, specify the object name.
You can find the constraint and index names with the dbschema utility or
dbaccess tool, or by querying the system catalog.
• By table: You can disable all constraints, triggers, and indexes for a table with
one SQL statement, as shown in the example. Any trigger that names the table in
the trigger event is disabled.
A disabled constraint is not checked and a disabled trigger does not execute. A
disabled index is neither updated nor used by the optimizer when it chooses query
paths.
Indexes created by constraints
If an index is created as a result of adding a referential or unique constraint, the
index is always enabled as long as the constraint is enabled.
The SET CONSTRAINTS statement places an exclusive table lock on the target
table for the duration of the statement.
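As an alternative to dbschema, the constraint names and types for a table can be read from the system catalog. A sketch, assuming the customer table from the stores demonstration database:

```sql
-- List constraint names, types, and backing index names for a table
SELECT c.constrname, c.constrtype, c.idxname
FROM sysconstraints c, systables t
WHERE t.tabname = "customer"
  AND c.tabid = t.tabid;
```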


Creating a disabled object


• Index:
CREATE UNIQUE INDEX idx1 ON employee(emp_no)
DISABLED;
• Constraint:
CREATE TABLE customer (
customer_num SERIAL,
state CHAR(2)
CHECK(state IN("CA","AZ"))
DISABLED);
• Trigger:
CREATE TRIGGER t1 UPDATE ON orders
BEFORE(EXECUTE PROCEDURE x1())
DISABLED;


Creating a disabled object


You can specify that an object is disabled when you create it. The DISABLED
keyword is added to the end of the CREATE UNIQUE INDEX statement, CREATE
TRIGGER statement, or the column or table level constraint definition within the
CREATE TABLE statement.
The ALTER TABLE statement can also be used to create a filtered or disabled
constraint:
ALTER TABLE employee ADD CONSTRAINT
CHECK (age<100) CONSTRAINT agelimit FILTERING;


Enabling a constraint
• Enabling individual objects:
SET CONSTRAINTS c117_11, c117_12 ENABLED;
SET INDEXES idx_x1 ENABLED;
SET TRIGGERS upd_cust ENABLED;
• Enabling all objects for a table:
SET CONSTRAINTS, INDEXES, TRIGGERS
FOR customer ENABLED;
If a constraint that is set
to disabled is enabled, all
existing rows must satisfy
the constraint. If some
rows violate the
constraint, an error is
returned.


Enabling a constraint
Two methods can be used to enable objects that exist in a database:
• Individually: To enable a constraint, index, or trigger, you must specify the
object name. The constraint and index names can be found by using
dbschema or dbaccess, or by querying the system catalog.
• By table: You can enable all constraints, triggers, and indexes for a table with
one SQL statement, as shown in the example. Any trigger that names the
table in the trigger event is enabled.
When a constraint is enabled from a disabled mode, all existing rows are checked to
see if they satisfy the constraint. If any rows do not satisfy the constraint, an error is
returned and the constraint remains disabled.
When a constraint is enabled from filtering, the existing rows are not rechecked
because they already satisfy the constraint.
When an index is enabled from disabled mode, the entire index is effectively rebuilt.


Recording violations

SET CONSTRAINTS ... ENABLED;

(Diagram: the row that causes a violation is written to the violations table;
the constraints that were violated are written to the diagnostics table.
One row can have multiple violations.)


Recording violations
When the constraint mode is changed to filtering or enabled, you can record any
subsequent violations in two tables: the violations table and the diagnostics table. All
violations for constraints or indexes on a table are placed in its corresponding
violations table and diagnostics table. Only one pair of these tables can be present
for each database table.
The violations table holds information about the row where the violation occurred.
The diagnostics table contains one row for every violation that occurred. In some
cases, one row can have multiple violations.
Example
An inserted row might violate the NOT NULL constraint, a referential constraint, and
a primary key constraint. In this case, three rows are placed in the diagnostics table
and one row is placed in the violations table.
In addition to the violation recorded in the violations table, you might also receive an
error. If the constraint is enabled, you receive an error when a violation occurs. Error
handling for violations in filtering mode is discussed later in this unit.
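Because the two tables share the informix_tupleid column, the number of violations per failing row can be counted with a join. A sketch, assuming the default customer_vio and customer_dia table names:

```sql
-- One violations row can join to several diagnostics rows
SELECT v.informix_tupleid, COUNT(*) num_violations
FROM customer_vio v, customer_dia d
WHERE v.informix_tupleid = d.informix_tupleid
GROUP BY v.informix_tupleid;
```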


Violation tables setup


• To create the violations and diagnostics tables:
START VIOLATIONS TABLE FOR tab_name;
• To explicitly name the violation tables:
START VIOLATIONS TABLE FOR tab_name
USING vio_tabname, diag_tabname;
• To cap the number of rows in the diagnostics table:
START VIOLATIONS TABLE FOR tab_name
MAX ROWS x;
(where x is the maximum number of rows)
• Must have RESOURCE privilege on database


Violation tables setup


To start violation logging, run the START VIOLATIONS TABLE statement.
This statement does two things:
• It creates the violations and diagnostics tables.
• It causes the database server to log violations if they occur.
You can restrict the number of rows inserted in the diagnostics table as a result of a
single data row with the MAX ROWS clause (up to the limit of 2,147,483,647 rows).
Remember that MAX ROWS only restricts the number of rows that are created per
data row, not the total number of rows inserted into the diagnostics table.
Only the table owner (with resource privileges for the database) or the DBA can
execute the START VIOLATIONS TABLE statement for a table.
Table names
If you do not explicitly name the tables with the USING clause, they are named
tabname_vio and tabname_dia, where tabname is the name of the table for which
violations are being recorded.


Table permissions
In general, if a user has INSERT, UPDATE, or DELETE permissions on the target
table, the user also has permissions to INSERT into the violations tables.
Table extents
The extent sizes for the violations and diagnostics table are set at the default value.
The violations table and the diagnostics table are placed in the same dbspace as
the target table. If the target table is fragmented, the violations table is fragmented
in the same manner. The diagnostics table is fragmented in a round robin fashion
over the same dbspaces on which the target table is fragmented.


Violations table schema

• Violations table columns:
   All columns of the target table
   informix_tupleid
   informix_optype: I = Insert, D = Delete,
    O = Update (original values), N = Update (new values),
    S = Created by the SET command
   informix_recowner: user who caused the violation
• Diagnostics table columns:
   informix_tupleid
   objtype: C = Constraint violation, D = Unique index violation
   objowner: owner of the constraint
   objname: name of the constraint

Violations table schema


The violations table contains the same columns as the target table (the table for
which the violations are being recorded). In addition, three more columns store a
unique serial ID, the operation type that caused the error, and the login of the user
that submitted the SQL statement that caused the violation to occur. If the target
table (the table for which the violations are being recorded) contains a serial value, it
is stored as an integer in the violations table.
The violations table receives one row every time an SQL statement causes one or
more violations to occur. The diagnostics table contains one row for every constraint
or unique index violation that occurs. This table stores information about the
constraint that was violated.
The informix_tupleid column can be joined with the column of the same name in the
violations table to associate the violation entries with their corresponding
diagnostics.
To determine what constraint was violated, run the dbschema utility and look for the
constraint name that matches the objname in the diagnostics table.


Filtering mode
• Set individual objects to filtering:
SET CONSTRAINTS c117_11, c117_12 FILTERING;
SET INDEXES idx_x1 FILTERING;
• Set all objects for a table to filtering:
SET CONSTRAINTS,INDEXES
FOR customer FILTERING;
• Cause an error to be returned to the application if a violation occurs:
SET CONSTRAINTS,INDEXES
FOR customer FILTERING WITH ERROR;


Filtering mode
You can set constraints and unique indexes to filtering mode. In filtering mode, any
constraint or unique index violations are recorded in the violations table as they
occur.
Before you set an object to filtering mode, make sure that violation logging is
enabled for the table with the START VIOLATIONS TABLE statement.
Triggers cannot be set to filtering mode; neither can indexes other than unique indexes.
Error handling
In filtering mode, a violation does not cause the statement to roll back. In the default
filtering mode (WITHOUT ERROR), the application tool is not informed that a
violation occurred. Be careful with this mode, as you could incorrectly assume that
the transaction was completed in full when, in fact, it might not have been.
If the WITH ERROR clause is included in the SET statement, an error is returned to
the user:
971: Integrity violations detected.
However, unlike enabled mode, the error does not automatically roll back the
statement. To roll back the transaction, explicitly execute ROLLBACK WORK.


The INSERT statements that add the errors to the diagnostics and violations table
are a part of the current transaction. If you roll back the transaction, the rows in the
violations and diagnostics tables are also rolled back.
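A sketch of the WITH ERROR behavior inside an explicit transaction; the inserted values are illustrative:

```sql
BEGIN WORK;
SET CONSTRAINTS FOR orders FILTERING WITH ERROR;

-- A violating insert returns error 971 but is not rolled back automatically
INSERT INTO orders (order_num, customer_num, ship_instruct)
VALUES (0, 9999, "no such customer");

-- The application decides whether to keep the work or undo it;
-- a rollback also removes the rows written to the violation tables
ROLLBACK WORK;
```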


Turning off violation logging

• To turn off logging:


STOP VIOLATIONS TABLE FOR tab_name;


Turning off violation logging


You can turn off violation logging for a table with the STOP VIOLATIONS statement.
This statement does not remove the violations table. The administrator must do that
after violation logging is stopped by using the DROP TABLE statement.
Violation logging must be on if the target table has any constraints in filtering mode.
If logging is off, any filtering constraint violations produce an error because they
cannot be logged.
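A typical cleanup sequence, assuming the default tabname_vio and tabname_dia table names:

```sql
-- Stop violation logging, then remove the violation tables by hand
STOP VIOLATIONS TABLE FOR customer;
DROP TABLE customer_vio;
DROP TABLE customer_dia;
```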


System catalog table: sysobjstate


• The sysobjstate system catalog table contains one row for every
constraint, trigger, and index.
• The table columns are:

Name Type Description


objtype char(1) C = Constraint
I = Index
T = Trigger
owner char(8) Owner of the object
name char(18) Name of the object
tabid integer The tabid of the table
state char(1) D = Disabled
E = Enabled
F = Filtering with no error
G = Filtering with error


System catalog table: sysobjstate


The sysobjstate system catalog table contains one row for every constraint, trigger, and
index.
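The current mode of every object on a table can be checked with a query such as the following; the table name is an example:

```sql
-- state: D = disabled, E = enabled, F = filtering, G = filtering with error
SELECT s.objtype, s.name, s.state
FROM sysobjstate s, systables t
WHERE t.tabname = "customer"
  AND s.tabid = t.tabid;
```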


System catalog table: sysviolations


• The sysviolations catalog table stores information about the violations
and diagnostics tables for the target database table.
• The table columns are:

Name Type Description

targettid integer Table ID that joins with systables to find
the target table name.
viotid integer Violations table ID.
diatid integer Diagnostics table ID.
maxrows integer Maximum number of rows allowed in the
violations table (null if no maximum).


System catalog table: sysviolations


The sysviolations catalog table stores information about the violations and diagnostics
tables for the target database table.
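For example, the tables that currently have violation logging started can be listed by joining sysviolations to systables:

```sql
-- maxrows is null when no MAX ROWS limit was specified
SELECT t.tabname target_table, v.maxrows
FROM sysviolations v, systables t
WHERE v.targettid = t.tabid;
```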


Example 1 (1 of 4)
CREATE TABLE customer (
customer_num SERIAL NOT NULL CONSTRAINT nn_cn,
name CHAR(15),
PRIMARY KEY (customer_num) CONSTRAINT pk_cn);

CREATE TABLE orders (
order_num SERIAL NOT NULL CONSTRAINT nn_on,
customer_num INT NOT NULL CONSTRAINT nn_cn2,
ship_instruct CHAR(40));

ALTER TABLE orders ADD CONSTRAINT
FOREIGN KEY (customer_num)
REFERENCES customer CONSTRAINT fk_cust;

Example 1
The next few pages illustrate how you can use object modes.
The example shows two tables, customer and orders. The customer table is the
parent table with a primary key and the orders table is the child table with a
constraint referencing the customer table. This referential constraint means that a
corresponding customer row must be present for every orders row.


Example 1 (2 of 4)
• The following statement produces an error:
SET CONSTRAINTS, TRIGGERS, INDEXES
FOR customer DISABLED;
• Instead:
SET CONSTRAINTS, TRIGGERS, INDEXES
FOR orders DISABLED;
SET CONSTRAINTS, TRIGGERS, INDEXES
FOR customer DISABLED;
• Now start violation logging:
START VIOLATIONS TABLE FOR customer;
START VIOLATIONS TABLE FOR orders;


Suppose you want to perform a large load where some data might cause some
temporary referential integrity errors. You decide to disable the constraints for the
customer and orders table.
SET CONSTRAINTS order
You must execute the SET CONSTRAINTS statement in the proper order. You
cannot disable an object when other enabled objects refer to it. Because the
referential constraint for orders refers to the customer table, disabling customer
constraints first produces an error. Instead, disable the orders constraint first.
START VIOLATIONS TABLE
To log violations, inform the database server by using the START VIOLATIONS
TABLE statement. This statement creates the two violations tables for the table
listed. In the example, four additional tables are created: customer_vio,
customer_dia, orders_vio, and orders_dia.


Example 1 (3 of 4)
• This statement is successful:
INSERT INTO orders
(order_num, customer_num, ship_instruct)
VALUES (0, 2, "ship tomorrow");
• However, an error occurs when constraints are enabled:
SET CONSTRAINTS, TRIGGERS, INDEXES
FOR customer ENABLED;
SET CONSTRAINTS, TRIGGERS, INDEXES
FOR orders ENABLED;
971: Integrity violations detected.


Once you disable constraints, they are not checked. That is why the INSERT
statement shown in the example is successful. If constraints are enabled, the
statement fails because you cannot add an orders row without a corresponding
customer row.
But when you try to enable the constraints, the database server must check all rows
to make sure that there are no violations. The SET CONSTRAINTS... ENABLED
statement fails because of the violation introduced with the INSERT statement.


Example 1 (4 of 4)
• Errors placed in the violations table:
SELECT * FROM orders_vio, orders_dia
WHERE orders_vio.informix_tupleid =
orders_dia.informix_tupleid;

order_num 1
customer_num 2
ship_instruct ship tomorrow
informix_tupleid 1
informix_optype S
informix_recowner informix
informix_tupleid 1
objtype C
objowner informix
objname fk_cust


When the SET CONSTRAINTS...ENABLED statement is executed, any violations
are placed in the violations table (if the START VIOLATIONS TABLE statement was
executed for the table). The administrator can browse through the violations table
and determine (by the objname column) which constraints were violated.
In the example, the fk_cust constraint was violated. The administrator can run
dbschema and look for the fk_cust constraint that was created as follows:
ALTER TABLE orders ADD CONSTRAINT
FOREIGN KEY (customer_num) REFERENCES customer
CONSTRAINT fk_cust;


Example 2 (1 of 2)
• Checking for violations during normal SQL activity:
SET CONSTRAINTS, INDEXES
FOR customer FILTERING;
SET CONSTRAINTS, INDEXES
FOR orders FILTERING;
• This row is not inserted, but no error is returned!
INSERT INTO orders
(order_num, customer_num, ship_instruct)
VALUES (0, 4, "ship tomorrow");


Example 2
Suppose for some reason that the administrator wants to know what violations are
occurring during a load process without causing the load process to fail. The
administrator can determine violations by setting constraints and indexes to
FILTERING.
Because the WITH ERROR clause is not included in the SET statement, the
application is not notified that a violation occurred. The INSERT statement in the
example fails and an entry is put in the violations table. However, the application
receives no error.
Serial values
Even though the row in the example is not inserted, the serial counter for the table is
incremented. The violations table shows the serial value of order_num as it would
have been if the row had been successfully inserted.


Example 2 (2 of 2)
• Enable constraints:
SET CONSTRAINTS, INDEXES FOR customer ENABLED;
SET CONSTRAINTS, INDEXES FOR orders ENABLED;
• Turn off violations logging:
STOP VIOLATIONS TABLE FOR customer;
STOP VIOLATIONS TABLE FOR orders;
• Fix errors that caused violations to occur:
INSERT INTO customer (customer_num, name)
VALUES (4, "SCHMIDT");
• Insert rows that caused violations into target table:
INSERT INTO orders
SELECT order_num, customer_num, ship_instruct
FROM orders_vio;


Now the administrator wants to reconcile any violations. First, the administrator
enables the constraints and indexes. Then the administrator turns off violations
logging. These two steps are required to insert the violations back into the target
table, thus avoiding any endless cycles (violations added to the violations table and
later inserted back into the customer table).
Next, the administrator fixes any errors that caused the violations to occur. In the
example above, the parent row is missing for a referential constraint.
Finally, the administrator can copy any rows in the violations table into the target
table with the INSERT INTO... SELECT FROM statement.


Exercise 12
Modes and violation detection
• Use modes and violations to record and correct errors in data


Exercise 12: Modes and violation detection


Exercise 12:
Modes and violation detection

Purpose:
In this exercise, you will learn how to capture data that violates constraints.

Task 1. Using the violations table.


In this task, you will disable all constraints on the items table, insert a row into the
items table and start violation logging. You will enable the constraints and find
errors have occurred in the violations tables. You will fix the errors and re-enable
the constraints.
1. Disable all constraints and other objects for the items table.
2. Insert the following row into the items table:
INSERT INTO items
(item_num, order_num, stock_num, manu_code,
quantity, total_price)
VALUES (2, 1001, 1, "JKL", 2, 100.00);
3. Start violation logging for the items table.
4. Enable the constraints and other objects.
5. What errors occurred in the violation tables?
6. How do you know which constraint has been violated?
7. Fix the errors by changing the manu_code to "HSK".
8. Enable the constraints and other objects again.
Task 2. Using filtering mode.
In this task, you will query the sysobjstate system catalog to ensure violation
logging is enabled and set constraints for the items table to filtering. You will insert
a row into the items table and query the items table to see if the row inserted. You
will enable the constraints and stop violation logging. You will find errors have
occurred in the violations tables, fix the errors, and enable the constraints again.
1. Query the sysobjstate system catalog to ensure that the objects are enabled
for the items table.
2. Set the filtering mode on all constraints for the items table.
3. Insert the following row into the items table:
INSERT INTO items
(item_num, order_num, stock_num, manu_code,
quantity, total_price)
VALUES (3, 1010, 1, "WIL", 5, 300.00);


4. Query the items table to see if the row inserted.


5. Was the row inserted into the items table?
6. Was an error returned? Why or why not?
7. Enable all constraints for the items table.
8. Fix the errors by inserting the following row into the manufact table:
INSERT INTO manufact VALUES ("WIL", "Wilson", "3");
9. Use the data in the items_vio table to re-insert the last row into the items table.
10. Verify that the new item was inserted into the items table for order_num 1010.
11. Stop violations logging for the items table.
12. Drop the items violations and diagnostics tables.
Results:
In this exercise, you learned how to capture data that violates constraints.


Exercise 12:
Modes and violation detection – Solutions

Purpose:
In this exercise, you will learn how to capture data that violates constraints.

Task 1. Using the violations table.


In this task, you will disable all constraints on the items table, insert a row into the
items table and start violation logging. You will enable the constraints and find
errors have occurred in the violations tables. You will fix the errors and re-enable
the constraints.
1. Disable all constraints and other objects for the items table.
SET CONSTRAINTS, INDEXES FOR items DISABLED;
2. Insert the following row into the items table:
INSERT INTO items
(item_num, order_num, stock_num, manu_code,
quantity, total_price)
VALUES (2, 1001, 1, "JKL", 2, 100.00);
3. Start violation logging for the items table.
START VIOLATIONS TABLE FOR items;
4. Enable the constraints and other objects.
SET CONSTRAINTS, INDEXES FOR items ENABLED;
5. What errors occurred in the violation tables?
971: Integrity violations detected.
6. How do you know which constraint has been violated?
By checking the violations table.
Use the following SELECT statement:

SELECT * FROM items_vio, items_dia
WHERE items_vio.informix_tupleid =
items_dia.informix_tupleid;


7. Fix the errors by changing the manu_code to "HSK".


Note that the invalid row was successfully inserted into the items table
because constraints were disabled. When enabling the constraints failed,
the row was not removed from the table, so it must be updated (or
deleted, fixed, and re-inserted) before enabling the constraints again. Note
also that the data remains in the items_vio and items_dia tables unless
physically removed.

UPDATE items
SET manu_code = "HSK"
WHERE item_num = 2
AND order_num = 1001;
8. Enable the constraints and other objects again.
SET CONSTRAINTS, INDEXES FOR items ENABLED;
Task 2. Using filtering mode.
In this task, you will query the sysobjstate system catalog to ensure violation
logging is enabled and set constraints for the items table to filtering. You will insert
a row into the items table and query the items table to see if the row inserted. You
will enable the constraints and stop violation logging. You will find errors have
occurred in the violations tables, fix the errors, and enable the constraints again.
1. Query the sysobjstate system catalog to ensure that the objects are enabled
for the items table.
SELECT s.objtype, s.name, s.state,
s.tabid, t.tabname
FROM sysobjstate s, systables t
WHERE t.tabname = "items" AND s.tabid = t.tabid;
2. Set the filtering mode on all constraints for the items table.
SET CONSTRAINTS, INDEXES
FOR items FILTERING;
3. Insert the following row into the items table:
INSERT INTO items
(item_num, order_num, stock_num, manu_code,
quantity, total_price)
VALUES (3, 1010, 1, "WIL", 5, 300.00);
4. Query the items table to see if the row inserted.
SELECT * FROM items
WHERE order_num = 1010;


5. Was the row inserted into the items table? No.


6. Was an error returned? Why or why not? No error message was returned
because the WITH ERROR clause was not included in the SET
CONSTRAINTS statement. However, the message returned by dbaccess
indicated 0 rows were inserted.
7. Enable all constraints for the items table.
SET CONSTRAINTS, INDEXES FOR items ENABLED;
8. Fix the errors by inserting the following row into the manufact table:
INSERT INTO manufact VALUES ("WIL", "Wilson", "3");
9. Use the data in the items_vio table to re-insert the last row into the items table.
INSERT INTO items
SELECT item_num, order_num, stock_num, manu_code,
quantity, total_price
FROM items_vio
WHERE manu_code = "WIL";
10. Verify that the new item was inserted into the items table for order_num 1010.
SELECT * FROM items
WHERE order_num = 1010;
11. Stop violations logging for the items table.
STOP VIOLATIONS TABLE FOR items;
12. Drop the items violations and diagnostics tables.
DROP TABLE items_vio;
DROP TABLE items_dia;
Results:
In this exercise, you learned how to capture data that violates constraints.


Unit summary
• Enable and disable constraints and indexes
• Use the filtering mode for constraints and unique indexes
• Reconcile the violations recorded in the database


Unit summary


Unit 13 Concurrency control

Concurrency control

Informix (v12.10)

© Copyright IBM Corporation 2017




Unit objectives
• Use the different concurrency controls
• Monitor the concurrency controls for lock usage
• Use the Retain Update Lock feature

Concurrency control © Copyright IBM Corporation 2017

Unit objectives


ANSI SQL-92 transaction isolation


• The ANSI committee has defined the following levels of transaction
isolation (SQL-92):
 Read uncommitted
 Read committed
 Repeatable read
 Serializable


ANSI SQL-92 transaction isolation


The isolation level for a query determines the degree to which the query is isolated
from modifications made by other concurrently executing UPDATE, DELETE, and
INSERT statements.
Informix supports all four levels of transaction isolation, as defined by the ANSI
committee.


Informix isolation
• Informix allows users to control the level of isolation for their query by
using:
 SET TRANSACTION
− ANSI compliant
− Supports access modes
− Can be set once per transaction
 SET ISOLATION
− Not ANSI compliant
− Does not support access modes
− Can be changed within a transaction


Informix isolation
Informix provides two SQL statements for setting the current isolation level in an
application. The first, SET TRANSACTION, complies with the ANSI SQL-92
specification. The second statement, SET ISOLATION, is not ANSI compliant and does
not support access modes, but does allow you to specify an Informix defined isolation
level, cursor stability.
A primary difference between SET TRANSACTION and SET ISOLATION is that a SET
TRANSACTION statement remains in effect only for the duration of the transaction.
Additionally, you can execute only one SET TRANSACTION statement within a
transaction.
The SET ISOLATION statement allows you to change the effective isolation level within
a single transaction. For example:
BEGIN WORK;
SET ISOLATION TO DIRTY READ;
SELECT * FROM customer;
SET ISOLATION TO REPEATABLE READ;
INSERT INTO cust_info …
Once you set an isolation level by using the SET ISOLATION statement, it remains in
effect until the next SET ISOLATION statement or until the end of the session.


Comparison

ANSI Levels        SET TRANSACTION     SET ISOLATION

Read uncommitted   READ UNCOMMITTED    DIRTY READ
Read committed     READ COMMITTED      COMMITTED READ
Not defined        Not available       COMMITTED READ LAST COMMITTED
Not defined        Not available       CURSOR STABILITY
Repeatable read    REPEATABLE READ     REPEATABLE READ
Serializable       SERIALIZABLE        REPEATABLE READ


Comparison
Informix supports each of the defined ANSI isolation levels and two additional isolation
levels, CURSOR STABILITY (specific to cursor manipulation) and COMMITTED READ
LAST COMMITTED.


Access methods
• By default, Informix transactions are always read/write capable
• To control the access mode, use the statements:
SET TRANSACTION READ WRITE;
SET TRANSACTION READ ONLY;
• Read-only transactions cannot:
 Update, insert, or delete rows
 Add, drop, alter, or rename database objects
 Update database statistics
 Grant or revoke privileges


Access methods
The ANSI SQL-92 defines both read/write and read-only transactions.


READ UNCOMMITTED
• ANSI:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
• Informix:
SET ISOLATION TO DIRTY READ;

(Diagram: the database server process reads rows from a database table without checking for locks.)

READ UNCOMMITTED
When the isolation level is read uncommitted or dirty read, the database server does
not place any locks or check for existing locks when resolving your query. During
retrieval, you can look at any row, even rows that contain uncommitted changes.
Dirty-read isolation makes it possible for your query to retrieve phantom rows. A
phantom row is a row inserted by a transaction that is later rolled back rather than
committed. Although the phantom row is never committed, and therefore never truly
exists in the database, it is visible to any process using dirty-read isolation.
Dirty-read isolation can be useful when:
• The table is static
• 100 percent accuracy is not as important as speed and freedom from contention
• You cannot wait for locks to be released
Dirty read isolation is the only isolation level available for non-logging databases.
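The visibility rule described above can be modeled in a few lines of Python. This is a toy model written for illustration only, not a description of the server's implementation: a dirty read ignores locks entirely, while a committed read returns only rows whose writing transactions have committed.

```python
# Toy model of the two isolation levels (for illustration only; not how the
# server is implemented). Each row records whether the transaction that
# wrote it has committed; an uncommitted insert is a "phantom" candidate.

class Row:
    def __init__(self, value, committed):
        self.value = value
        self.committed = committed   # False => still exclusively locked

def scan(rows, isolation):
    """Return the row values visible under the given isolation level."""
    if isolation == "DIRTY READ":
        return [r.value for r in rows]                  # no lock checks
    if isolation == "COMMITTED READ":
        return [r.value for r in rows if r.committed]   # lock test succeeds
    raise ValueError("unsupported isolation level")

table = [Row("smith", True), Row("jones", False)]   # "jones" not committed
print(scan(table, "DIRTY READ"))       # ['smith', 'jones']
print(scan(table, "COMMITTED READ"))   # ['smith']
```

If the transaction that inserted "jones" later rolls back, the dirty reader has seen a row that never existed in the committed database.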


READ COMMITTED
• ANSI:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
• Informix:
SET ISOLATION TO COMMITTED READ;

(Diagram: the database server process reads rows from a database table after seeing that a shared lock can be acquired.)


READ COMMITTED
Queries in logged databases default to ANSI read-committed isolation. Read committed
isolation is synonymous with the Informix committed read isolation. Read committed
isolation ensures that all rows read are committed to the database. To perform a
committed read, the database server attempts to acquire a shared lock on a row before
trying to read it. It does not place the lock; rather, it checks whether it can acquire the
lock. If it can, it is guaranteed that the row exists and is not being updated by another
process while it is being read. Remember, a shared lock cannot be acquired on a row
that is locked exclusively, which is always the case when a row is being updated.
As you are scanning rows using committed read, you are not looking at any phantom
rows or dirty data. You know that the current row was committed (at least when your
process read it). After your process reads the row, however, other processes can
change it.
Committed read can be useful for:
• Lookups
• Queries
• Reports that yield general information


CURSOR STABILITY
• Informix:
SET ISOLATION TO CURSOR STABILITY;

(Diagram: the database server process reads rows from a database table and places a shared lock on each row as it is read; the lock is held until the next row is fetched.)


CURSOR STABILITY
With CURSOR STABILITY, a shared lock is acquired on each row as it is read by a
cursor. This shared lock is held until the next row is retrieved. If data is retrieved by
using a cursor, the shared lock is held until the next FETCH is executed.
At this level, not only can you look at committed rows, but you are assured the row
continues to exist while you are looking at it. No other process (UPDATE or DELETE)
can change that row while you are looking at it.
You can use SELECT statements that use an isolation level of CURSOR STABILITY
for:
• Lookups
• Queries
• Reports yielding operational data
Example
A SELECT statement that uses CURSOR STABILITY is useful for detail-type reports
like price quotes or job-tracking systems.
If the isolation level of CURSOR STABILITY is set and a cursor is not used, CURSOR
STABILITY behaves in the same manner as READ COMMITTED (the shared lock is
never placed).
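The one-lock-at-a-time behavior can be sketched as follows. This is a toy model, not server internals: each simulated FETCH releases the shared lock on the previous row and takes one on the row just read.

```python
# Toy model (not server internals) of the rule above: under CURSOR STABILITY
# exactly one shared lock is held at a time, and fetching the next row
# releases the lock on the previous one.

def cursor_stability_fetches(rows):
    """Yield (row, locked_rowids) after each simulated FETCH."""
    held = set()
    for rowid, row in enumerate(rows):
        held.clear()          # FETCH releases the lock on the previous row
        held.add(rowid)       # ...and share-locks the row just read
        yield row, set(held)

states = list(cursor_stability_fetches(["r1", "r2", "r3"]))
print(states)   # [('r1', {0}), ('r2', {1}), ('r3', {2})]
```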


REPEATABLE and SERIALIZABLE reads


• ANSI:
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
• Informix:
SET ISOLATION TO REPEATABLE READ;

(Diagram: the database server process puts locks on all rows examined to satisfy the query.)


REPEATABLE and SERIALIZABLE reads


Within an Informix database, repeatable reads and serializable reads exhibit the same
behavior and characteristics. A repeatable or serializable read isolation level places a
shared lock on all the rows that the database server examines; all these locks are held
until the transaction is committed. Other users can read the data, but cannot modify it in
any way. You are assured the row continues to exist not only while you are looking at it,
but also when you reread it later within the same transaction.
Informix databases created as MODE ANSI use REPEATABLE READ as the default
isolation level.
Repeatable reads are useful when you must treat all rows read as a unit or you need to
guarantee that a value does not change. For example:
• Critical, aggregate arithmetic (as in account balancing)
• Coordinated lookups from several tables (as in reservation systems)
Repeatable read guarantees the consistency of the data set for the duration of the
transaction. To do so, it must lock not only the rows that meet the query filter conditions,
but any rows and index keys that had to be read to resolve the query. If the query
performs a sequential scan as opposed to an indexed read, the database server
optimizes the scan by replacing the page or row locks required with a single share lock
on the table.
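The lock-count consequence of that optimization can be illustrated with a back-of-the-envelope sketch (the row counts are arbitrary example numbers): an indexed repeatable read holds one shared lock per examined row until COMMIT, while a sequential scan is covered by a single shared table lock.

```python
# A back-of-the-envelope illustration of the note above: an indexed
# repeatable read holds one shared lock per examined row until COMMIT,
# while a sequential scan is covered by a single shared table lock.

def locks_for_repeatable_read(examined_rows, sequential_scan):
    if sequential_scan:
        return 1                 # one shared lock on the whole table
    return examined_rows         # one lock per examined row, held to commit

print(locks_for_repeatable_read(5_000_000, sequential_scan=False))  # 5000000
print(locks_for_repeatable_read(5_000_000, sequential_scan=True))   # 1
```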


COMMITTED READ LAST COMMITTED


• Informix:
SET ISOLATION TO COMMITTED READ LAST COMMITTED;

(Diagram: the database server process reads rows from a database table and, if a row is locked, returns the data at its last committed value.)


COMMITTED READ LAST COMMITTED


COMMITTED READ LAST COMMITTED relieves the problems of waiting for other
users to release locks and the risk of retrieving data in an inconsistent state (in the
middle of a transaction).
When the COMMITTED READ LAST COMMITTED is set and another process holds
exclusive row-level locks on the data, the data returned is the data as it existed when
last committed.
This isolation level also relieves the issue of deadlocks. Even if locks are set, the user
gets the data as last committed and does not have to wait for locks held by another
user to be released.
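A minimal sketch of this behavior, assuming a simplified two-image model of a row (the server's actual mechanism differs): each row keeps its last committed value alongside any in-flight update, and a LAST COMMITTED reader returns the committed image without waiting.

```python
# Minimal sketch of the behavior (the server's actual mechanism differs):
# each row keeps its last committed image alongside any in-flight update,
# and a LAST COMMITTED reader returns the committed image without waiting.

class RowImage:
    def __init__(self, committed_value):
        self.committed = committed_value
        self.uncommitted = None       # set while an update is in flight

    def update(self, new_value):      # exclusive lock taken, not committed
        self.uncommitted = new_value

    def commit(self):
        if self.uncommitted is not None:
            self.committed = self.uncommitted
            self.uncommitted = None

def read_last_committed(row):
    return row.committed              # never blocks on the exclusive lock

row = RowImage("sysindices")
row.update("JOHN")                    # another session, mid-transaction
print(read_last_committed(row))       # sysindices
row.commit()
print(read_last_committed(row))       # JOHN
```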


COMMITTED READ LAST COMMITTED example


User 1:
CREATE DATABASE d1 WITH LOG;
CREATE TABLE t1 (c1 serial, c2 char(20));
INSERT INTO t1 SELECT 0, tabname FROM systables;
BEGIN WORK;
UPDATE t1 SET c2="JOHN" WHERE c1=3;

User 2:
SET ISOLATION TO COMMITTED READ LAST COMMITTED;
SELECT * FROM t1 WHERE c1=3;

c1  c2
 3  sysindices


COMMITTED READ LAST COMMITTED example


This example illustrates what happens in a session that has the COMMITTED READ
LAST COMMITTED isolation level set.
User 1 is in the middle of a transaction modifying the row that User 2 wants. Instead of
retrieving dirty data or waiting for the lock to be released, the value of column c2 when it
was last committed is returned to User 2. This ensures that the data is valid and has not
been changed.


Configuring COMMITTED READ LAST COMMITTED


• Parameter:
 USELASTCOMMITTED
− ONCONFIG parameter
− Environment variable
 Values:
− COMMITTED READ
− DIRTY READ
− ALL
− NONE

• SQL statement:
SET ISOLATION TO COMMITTED READ LAST COMMITTED;
SET ENVIRONMENT USELASTCOMMITTED 'ALL|COMMITTED READ|DIRTY READ|NONE';


Configuring COMMITTED READ LAST COMMITTED


Using COMMITTED READ LAST COMMITTED can improve concurrency in sessions
that use the committed read, dirty read, read committed, or read uncommitted isolation
levels. It reduces the risk of locking conflicts when two or more sessions attempt to
access the same row in a table using row-level locking granularity.
Configuring this isolation level can be done in several ways.
• Set the parameter USELASTCOMMITTED in the ONCONFIG file. As with certain
other ONCONFIG parameters, USELASTCOMMITTED is a global setting for all
sessions, but can be overridden by session environment variables and SQL
statements.
• The ONCONFIG value for USELASTCOMMITTED can be dynamically set
using the onmode -wf or onmode -wm commands.
• Set the environment variable USELASTCOMMITTED. This environment variable
overrides the value for the ONCONFIG parameter USELASTCOMMITTED. The
values for this variable are the same as for the ONCONFIG parameter.
• Execute the SQL statement SET ISOLATION TO COMMITTED READ LAST
COMMITTED. The SET ISOLATION statement overrides the
USELASTCOMMITTED session environment setting.


• Execute the SQL statement SET ENVIRONMENT USELASTCOMMITTED. This


session-level statement determines the action to be taken when the operation
encounters a lock and the isolation level of the session is set to either
COMMITTED READ or DIRTY READ.
SET ENVIRONMENT USELASTCOMMITTED can specify whether queries
should use the most recently committed version of the data rather than wait for
the lock to be released. This statement can override the USELASTCOMMITTED
configuration parameter setting for the duration of the current session.
Set USELASTCOMMITTED to one of the following values:
• COMMITTED READ: The database server reads the most recently committed
version of the data when it encounters an exclusive lock while attempting to read
a row in the committed read or read committed isolation level.
• DIRTY READ: The database server reads the most recently committed version of
the data if it encounters an exclusive lock while attempting to read a row in the
dirty read or read uncommitted isolation level.
• ALL: The database server reads the most recently committed version of the data
if it encounters an exclusive lock while attempting to read a row in the committed
read, dirty read, read committed, or read uncommitted isolation level.
• NONE: Disables the USELASTCOMMITTED feature. Under this setting, if a
session encounters an exclusive lock when attempting to read a row in the
committed read, dirty read, read committed, or read uncommitted isolation level,
the transaction is not allowed to read that row until the concurrent transaction
holding the exclusive lock is committed or rolled back.
Any SPL routine can use these statements to specify the committed read last
committed transaction isolation level during a session. These statements enable SQL
operations that read data to use the last committed version when an exclusive lock is
encountered during an operation that reads a row.
In cross-server distributed queries, if the isolation level of the session that issued the
query has the LAST COMMITTED isolation level option in effect, but one or more of the
participating databases does not support this LAST COMMITTED feature, then the
entire transaction conforms to the committed read or dirty read isolation level of the
session that issued the transaction without the LAST COMMITTED option enabled.


LAST COMMITTED considerations


• Supports:
 B+ tree indexes
 Functional indexes
• Does NOT support:
 R-tree indexes
 Tables accessed by DataBlade Modules
 Tables with columns of collection data types
 Tables created using Virtual Table Interface (VTI)
 Tables with page-level or exclusive locks
 Unlogged tables/databases


LAST COMMITTED considerations


The COMMITTED READ LAST COMMITTED feature supports B+ tree and functional
indexes, but not R-tree indexes.
COMMITTED READ LAST COMMITTED also does not support tables that are
accessed by DataBlade modules, tables with columns of collection data types, tables
created using a virtual table interface, tables with page-level locking, tables with
exclusive table-level locks, unlogged tables, or tables in databases with no transaction
logging.


Locks and concurrency


• Informix lock granularity: • Informix lock types:
 Database level  Shared
 Table level  Exclusive
 Page level  Update (promotable)
 Row level
 Key level


Locks and concurrency


To ensure data consistency and integrity in a multiuser (concurrent access)
environment, a database server must place locks on data being modified.
Lock granularity
Informix provides five different levels of locking granularity:
• Database-level locks are useful for some administrative activities, such as imports
and exports.
• Table-level locks are useful and more efficient when an entire table or most of the
table’s rows are being updated.
• Page locking provides the optimum in lock efficiency when rows are being
accessed and modified in physical order.
• Row locks deliver the highest degree of concurrent access and are most useful
for OLTP activity.
• Key locking is automatic with row-level locking to ensure the same optimal level
of concurrency during index updates.


Lock types
Informix supports three types of locks:
• Shared locks prevent other processes from updating the locked object. Other
users maintain read access to the object. Multiple share locks can be placed on a
single object.
• Exclusive locks are automatically placed by the database server on any object
actively being modified. Exclusive locks prevent all reads except dirty reads. A
DBA or user can also explicitly request an exclusive lock on a database or table
to perform administrative or batch operations.
• Promotable or update locks are placed on objects that are retrieved for update
but are not yet being updated. They prevent other users from acquiring exclusive
or promotable locks on the object. As an example, when you open a cursor with
the FOR UPDATE clause, the database server acquires an update lock on each
row fetched. The lock is promoted to an exclusive lock when the UPDATE
WHERE CURRENT statement is executed.
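The interactions among the three lock types can be summarized with a conventional lock-compatibility matrix. The matrix below is a sketch consistent with the descriptions above (S = shared, U = update/promotable, X = exclusive); the server's actual rules are more detailed.

```python
# A conventional lock-compatibility matrix consistent with the three lock
# types described above (S = shared, U = update/promotable, X = exclusive).
# The real server's rules are more detailed; this is a sketch.

COMPATIBLE = {
    ("S", "S"): True,  ("S", "U"): True,  ("S", "X"): False,
    ("U", "S"): True,  ("U", "U"): False, ("U", "X"): False,
    ("X", "S"): False, ("X", "U"): False, ("X", "X"): False,
}

def can_grant(requested, held_locks):
    """Grant a lock only if it is compatible with every lock already held."""
    return all(COMPATIBLE[(held, requested)] for held in held_locks)

print(can_grant("S", ["S", "S"]))   # True: many readers may coexist
print(can_grant("U", ["S"]))        # True: update lock allows readers
print(can_grant("U", ["U"]))        # False: only one update lock at a time
print(can_grant("X", ["S"]))        # False: exclusive waits for readers
```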


Database-level locking

DATABASE stores EXCLUSIVE;

Other users cannot access


database

stores
database X


Database-level locking
This condition can be the case if you are:
• Executing many updates that involve many tables.
• Archiving the database files for backups.
• Altering the structure of the database.
You can lock the entire database by using the DATABASE statement with the
EXCLUSIVE option. The EXCLUSIVE option opens the database in an exclusive mode
and allows only the current user access to the database.
To allow other users access to the database, you must execute the CLOSE
DATABASE statement and then reopen the database.
Users with any level of database permission can open the database in exclusive mode.
Doing so does not give them any greater level of access than they normally have.


Locking a table in Share Mode

LOCK TABLE customer IN SHARE MODE;

(Diagram: others can SELECT from the customer table, but cannot INSERT, UPDATE, or DELETE.)

Locking a table in Share Mode


If you want to give other users read access to the table but prevent them from modifying
any of the data that it contains, use the LOCK TABLE statement with the IN SHARE
MODE option.
When a table is locked in SHARE mode, other users can SELECT data from the table
but they cannot INSERT, DELETE, or UPDATE rows in the table or ALTER the table.
Locking a table in SHARE MODE does not prevent row locks from being placed for
updates by your process. To avoid exclusive row locks in addition to the share lock on
the table, you must lock the table in EXCLUSIVE mode.
Table locks and transactions
If your database is logged, table locks are allowed only within transactions. You must
execute a BEGIN WORK statement before a LOCK TABLE statement.


Locking a table in Exclusive Mode

LOCK TABLE customer IN EXCLUSIVE MODE;

(Diagram: others cannot SELECT from the customer table, nor can they INSERT, UPDATE, or DELETE.)

Locking a table in Exclusive Mode


If your process is modifying a large percentage of the rows in a table, it might be useful
to place an exclusive lock on the entire table. The exclusive table lock prevents users
from reading the data, except with dirty-read isolation, which ensures that their process
is also aware that updates are actively being performed on the data.
Additionally, since an exclusive lock is placed on the table, the database server can
avoid placing exclusive locks on each page or row and key modified. If you are
modifying 5 million rows, this can be a dramatic savings.
Exclusive table locks and simple large objects stored in blobspaces
When a table is locked in exclusive mode, the only additional locks that the database
server can place on table data are locks on simple-large-object values stored in
blobspaces. Two additional locks are placed per blobpage. Tables that contain simple
large objects do not obtain additional locks.


Unlocking a table

UNLOCK TABLE customer;

(Diagram: the customer table is unlocked; SELECT, INSERT, UPDATE, and DELETE are allowed again.)

Unlocking a table
When a database is logged, table locks are automatically released when the
transaction commits. When the database does not use logging, transactions are not
supported and the table lock persists until the process completes or until you execute
an explicit UNLOCK TABLE statement.


Row and page locks


• You determine whether the database server acquires page locks or
row and key locks when you create a table or when it is altered:
CREATE TABLE orders (
order_num SERIAL NOT NULL,
customer_num INTEGER,
order_date DATE)
LOCK MODE ROW;
ALTER TABLE orders LOCK MODE (PAGE);


Row and page locks


When you create a table, you choose the lock mode used when any rows from that
table are accessed. Page-level locking locks an entire data page whenever a single
row located on that page needs to be locked. Row-level locking locks only the row in
question. The default lock mode when you create a table is page-level.
Page locks are useful when, in a transaction, you process rows in the same order
as the cluster index of the table or process rows in physically sequential order.
Row locks are useful when, in a transaction, you process rows in an arbitrary order.
When the number of locked rows becomes large, you run these risks:
• Number of available locks becomes exhausted
• Overhead for lock management becomes significant
A trade-off between these two levels of locking is that page-level locking requires
fewer resources than row-level locking, but it also reduces concurrency. If a page
lock is placed on a page that contains many rows, other processes that need other
data from that same page might be denied access to that data.
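The concurrency cost of page-level locking can be made concrete with a toy model. The 100-rows-per-page figure below is an arbitrary number chosen for the example, not an Informix page capacity.

```python
# Toy model of the trade-off: with page-level locking, locking one row
# effectively blocks every other row on the same page; with row-level
# locking only the target row is blocked. (100 rows per page is an
# arbitrary number chosen for the example.)

ROWS_PER_PAGE = 100

def blocked_rows(locked_rowid, lock_mode):
    if lock_mode == "ROW":
        return {locked_rowid}
    page = locked_rowid // ROWS_PER_PAGE       # all rows sharing the page
    return set(range(page * ROWS_PER_PAGE, (page + 1) * ROWS_PER_PAGE))

print(len(blocked_rows(42, "ROW")))    # 1
print(len(blocked_rows(42, "PAGE")))   # 100
```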


Configurable lock mode


• Configurable lock mode allows you to:
 Globally define the lock mode of newly created tables per session or per
database server
 No need to use the LOCK MODE clause
• Can be set by using:
DEF_TABLE_LOCKMODE configuration parameter
IFX_DEF_TABLE_LOCKMODE environment variable


Configurable lock mode


The DEF_TABLE_LOCKMODE configuration parameter is set by your system
administrator in the configuration file for your database server. The default setting is
page-level locking. To override page-level locking, set the
IFX_DEF_TABLE_LOCKMODE environment variable or have your system
administrator change the default setting to ROW. For example, using the UNIX Korn
shell, execute the command:
export IFX_DEF_TABLE_LOCKMODE=ROW


Setting the lock mode


• Wait for lock to be released
SET LOCK MODE TO WAIT;

• Do not wait for lock to be released


SET LOCK MODE TO NOT WAIT;

• Wait 20 seconds for lock to be released


SET LOCK MODE TO WAIT 20;


Setting the lock mode


The default behavior for a database server is to immediately return an error to a
process when an SQL request is blocked by an existing lock. If you prefer that the
database server wait for the lock to be released, you can use the SET LOCK MODE
statement to specify how long the database server waits for the lock to be released.
If the lock is not released within the period you specify, the operation fails and an
error is returned to the requesting process.
Use care when you set the lock mode to WAIT without specifying a maximum wait
interval. If you do not specify a wait interval for SET LOCK MODE, your process
could, theoretically, wait forever. For example:
SET LOCK MODE TO WAIT;
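The three settings behave much like the acquisition modes of an ordinary mutex. The sketch below uses Python's threading.Lock purely as an analogy (the database server's lock manager works differently): NOT WAIT maps to a non-blocking attempt, WAIT n to a bounded wait, and bare WAIT to an unbounded wait that could in theory block forever.

```python
# Rough analogue of the three settings using Python's threading.Lock (an
# analogy only; the database server's lock manager works differently):
# NOT WAIT is a non-blocking attempt, WAIT n a bounded wait, and bare WAIT
# an unbounded wait that could in theory block forever.

import threading

def try_acquire(lock, mode, seconds=0):
    if mode == "NOT WAIT":
        return lock.acquire(blocking=False)   # fail immediately if held
    if mode == "WAIT":
        if seconds:
            return lock.acquire(timeout=seconds)   # SET LOCK MODE TO WAIT n
        return lock.acquire()                      # may wait forever
    raise ValueError(mode)

row_lock = threading.Lock()
row_lock.acquire()                              # "another session" holds it
print(try_acquire(row_lock, "NOT WAIT"))            # False
print(try_acquire(row_lock, "WAIT", seconds=0.1))   # False after ~0.1 s
```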


RETAIN UPDATE LOCKS


• Syntax:
SET ISOLATION TO DIRTY READ RETAIN UPDATE LOCKS;
SET ISOLATION TO COMMITTED READ RETAIN UPDATE LOCKS;
SET ISOLATION TO CURSOR STABILITY RETAIN UPDATE LOCKS;


RETAIN UPDATE LOCKS


The RETAIN UPDATE LOCKS feature is a switch that you can turn on and off at any
time during a user connection to the database server. It only affects SELECT...FOR
UPDATE statements with dirty read, committed read and cursor stability isolation levels.
When the update lock is in place on a row during a FETCH of a SELECT... FOR
UPDATE statement with one of the isolation levels above, it is not released at the
subsequent FETCH or when the cursor is closed. The update lock is retained until the
end of the transaction. This feature lets you avoid the overhead of the repeatable-read
isolation level or workarounds, such as dummy updates on a row.
To monitor the isolation level that a session uses, use the onstat -g ses and
onstat -g sql commands. The following lock values signify RETAIN UPDATE LOCKS.

Value  Description
DRU    Dirty read with RETAIN UPDATE LOCKS.
CRU    Committed read with RETAIN UPDATE LOCKS.
CSU    Cursor stability with RETAIN UPDATE LOCKS.


Deadlock detection

(Diagram: Process A and Process B each hold a lock on one of two rows, x and y, and each wants a lock on the row the other holds.)

Deadlock detection
In some databases, when multiple users request locks on the same resources,
deadlocks can occur. Deadlocks are serious problems, as they can halt a major portion
of the activity in a database system.
Informix has a built-in, sophisticated mechanism that detects potential deadlocks and
prevents them from happening. To prevent local deadlocks from occurring, the
database server maintains a list of locks for every user on the system. Before a lock is
granted, the lock list for each user is examined. If a lock is currently held on the
resource that the process requests to lock, the owner of that lock is identified, and the
owner's lock list is traversed to see whether the owner is waiting on any locks held by
the user who wants the new lock. If so, the deadlock is detected at that point and an
error message is returned to the user who requested the lock.
The ISAM error code returned is:
-143 ISAM error: deadlock detected
Deadlocks cannot occur when the isolation level is set to COMMITTED READ LAST
COMMITTED.
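The traversal described above amounts to a cycle check in a wait-for graph. The following is a simplified sketch of that idea, not the server's actual data structures: before letting the requester wait, walk the chain of sessions that the lock owner is waiting on; if the chain leads back to the requester, granting the wait would close a cycle.

```python
# Simplified version of the traversal described above: before letting the
# requester wait, walk the chain of sessions the lock owner is waiting on;
# if the chain leads back to the requester, a deadlock would form and error
# -143 is returned instead.

def would_deadlock(wait_for, requester, owner):
    """wait_for maps each session to the session it is waiting on."""
    seen = set()
    current = owner
    while current is not None and current not in seen:
        if current == requester:
            return True            # cycle: requester -> owner -> requester
        seen.add(current)
        current = wait_for.get(current)
    return False

# A waits on B; now B requests a lock that A holds -> deadlock.
print(would_deadlock({"A": "B"}, requester="B", owner="A"))   # True
print(would_deadlock({"A": "B"}, requester="C", owner="A"))   # False
```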


What happens after a delete?

(Diagram: the B+ tree scanner and the B+ tree scanner pool)
1. When a deleted item is committed, its page number is put in the B+ tree scanner pool.
2. The btscanner thread reads the pool occasionally and removes all committed deleted items on each page it finds.

The engine skips index items when:
 Delete flag set
AND
 Delete transaction committed


What happens after a delete?


Since a delete operation does not physically delete the associated index keys, but
rather sets their delete flag to 1, another mechanism must physically delete the key
item. That mechanism is known as the btscanner thread.
How the B+ tree scanner thread works
When an item is deleted, the delete flag is set. When the transaction is committed, a
request to delete the item is placed in a pool in shared memory called the B+ tree
scanner pool. The request is a 20-byte structure that consists of the tblspace number,
the page number, and the key number for the key to be deleted. Only one request is
placed in the pool for each page. The B+ tree scanner pool starts out at 1 kilobyte, but if
this space becomes full, another kilobyte is allocated to the pool for more requests.
Once every minute, or if the number of requests in the B+ tree scanner pool exceeds
100, the btscanner thread wakes up and reads the requests in the B+ tree scanner
pool. For each request, the btscanner thread finds the page and deletes the key that is
marked as deleted. Before it deletes the key, however, the btscanner thread makes
sure that the row was committed by test locking it.
How other sessions see deleted keys
If another session encounters a deleted key while it reads an index, the session checks
to see if the key value is still locked. If it is, the session assumes that row still exists. If,
however, the row is marked as deleted but is not locked (the B+ tree scanner has not
deleted the key yet), the session skips over the key entry as if it is not there.
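The reader-side rule can be sketched as a small filter over index entries. This is an illustration of the logic just described, not the server's on-disk format: a flagged key whose row is still locked is treated as existing (the delete might roll back), while a flagged key whose row is no longer locked is skipped as if the btscanner had already removed it.

```python
# Sketch of the reader-side rule: a flagged key whose row is still locked is
# treated as existing (the delete might roll back); a flagged key whose row
# is no longer locked is skipped as if already removed by the btscanner.

def visible_keys(index_entries):
    """index_entries: (key, delete_flag, row_locked) tuples."""
    out = []
    for key, delete_flag, row_locked in index_entries:
        if delete_flag and not row_locked:
            continue              # committed delete: skip the entry
        out.append(key)           # live key, or delete not yet committed
    return out

entries = [(10, False, False),    # ordinary key
           (20, True, True),      # delete in flight, still locked
           (30, True, False)]     # delete committed, awaiting btscanner
print(visible_keys(entries))      # [10, 20]
```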


Row versioning
• Used to determine if row changed:
 Can help detect collisions in ER
 Can be used to ensure application not updating stale data
• Creates two shadow columns:
ifx_insert_checksum
− Generated on initial insert
− Never changes
ifx_row_version
− Incremented each time row updated
• Not required, but improves performance of redirected writes


Row versioning
Row versioning is a feature that allows an application to check that it is not updating a
stale row. It is also used by Informix Enterprise Replication to help detect collisions in
data rows.
When row versioning is enabled, two shadow columns are created:
ifx_insert_checksum: A checksum generated when the row is initially inserted. This
value never changes.
ifx_row_version: This value is incremented each time the row is updated.
These columns are not visible, and are not returned with a SELECT * from the table. In
order to view their values, they must be explicitly listed in the SELECT list.
While row versioning is not required, it can help improve performance in redirected
writes, since only the checksum and version values are checked instead of comparing
the entire row.
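As a sketch (assuming a table tab1 that was created WITH VERCOLS; the table and
column names are illustrative), the shadow columns appear only when named explicitly:

```sql
-- SELECT * does not return the shadow columns
SELECT * FROM tab1;

-- to see their values, list them in the SELECT list
SELECT col1, ifx_insert_checksum, ifx_row_version
FROM tab1;
```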


Managing versioning
• Can be created when table created:
 CREATE TABLE tab1 (
     col1 INTEGER,
     ...
 ) WITH VERCOLS;
• Can be added/dropped using ALTER TABLE:
 ALTER TABLE tab1 ADD VERCOLS;
ALTER TABLE tab1 DROP VERCOLS;


Managing versioning
Row versioning is implemented using the VERCOLS keyword.
It can be implemented when the table is created, or later using the ALTER TABLE
statement, as shown in the visual.
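A complete version of the elided slide example might look like the following sketch
(the column definitions are illustrative):

```sql
CREATE TABLE tab1 (
    col1 INTEGER,
    col2 CHAR(20)
) WITH VERCOLS;

-- the version columns can later be dropped and re-added:
ALTER TABLE tab1 DROP VERCOLS;
ALTER TABLE tab1 ADD VERCOLS;
```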


ifx_row_id virtual column


• Hidden virtual column consisting of four colon-delimited parts:
 partition number
 rowid
 ifx_insert_checksum
 ifx_row_version
− Example: 1048928:257:741480809:1
• In table without version columns, ifx_insert_checksum and
ifx_row_version not present:
− Example: 1048928:257


ifx_row_id virtual column


There is a hidden virtual column named ifx_row_id that consists of four colon-delimited
values:
• partition number (as decimal)
• rowid
• ifx_insert_checksum
• ifx_row_version
If VERCOLS is not enabled, then ifx_insert_checksum and ifx_row_version are not
present.
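For example, a query of the following form (table and column names are illustrative)
returns the virtual column alongside ordinary columns:

```sql
SELECT col1, ifx_row_id
FROM tab1;

-- with VERCOLS enabled, ifx_row_id values take the form
-- partnum:rowid:checksum:version, such as 1048928:257:741480809:1;
-- without VERCOLS, only partnum:rowid, such as 1048928:257
```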


Versioning code example


Without Version Columns:

/* declare and open a cursor */
$declare cur1 cursor for select * from T1 for update;

$open cur1;

/* fetch data */
$fetch cur1 into :my_id, :my_name;

/* process the data and make a decision */
do_update = get_user_input(my_id, my_name);

/* update the row if required */
if (do_update)
{
    $update T1 set name = :my_name where current of cur1;
}

With Version Columns:

/* declare and open a cursor with row version */
$declare cur1 cursor for select *, ifx_row_id from T1 for update;

$open cur1;

/* fetch data */
$fetch cur1 into :my_id, :my_name, :row_id;

/* process the data and make a decision */
do_update = get_user_input(my_id, my_name);

/* update the row if required */
if (do_update)
{
    /* compare current ifx_row_id with the one read earlier */
    $update T1 set name = :my_name where id = :my_id and
        ifx_row_id = :row_id;

    /* check whether update succeeded, raise warning otherwise */
    if (sqlca.sqlerrd[2] < 1)
        raise_warning("trying to update stale data");
}


Versioning code example


The visual shows two code examples.
The first example is how the code is written when row versioning has not been enabled.
The second example shows how code could be written to use row versioning to ensure
that updates are done to current rows, rather than stale rows.
The second example shows the use of the ifx_row_id virtual column to check that the
row is current.


Exercise 13
Concurrency control
• Use isolation/transaction levels and lock modes to control the
effects of SQL statements


Exercise 13: Concurrency control


Exercise 13:
Concurrency control

Purpose:
In this exercise, you will learn how to use concurrency control by setting
isolation / transaction levels and lock modes.

Task 1. Using SET TRANSACTION READ WRITE and


READ ONLY.
In this task, you will be using the SET TRANSACTION syntax to set your session for
READ ONLY access mode. You will try to insert a row into the customer table.
You will ROLLBACK WORK on your transaction and change the SET
TRANSACTION to READ WRITE. You will try the insert statement for the customer
table, and then delete this row from the customer table.
1. In a dbaccess session:
• Begin a transaction.
• Set the transaction mode to READ ONLY.
• Insert the following row into the customer table:
Henry Smith
Smith Studios
12345 Barney Road
St. Bob,CA 90123
What happened?
2. Rollback your work.
3. Rerun the INSERT statement in Step 1 and change the SET TRANSACTION
access mode to READ WRITE.
4. Determine the customer number for Henry Smith.
5. Delete the Henry Smith row from the customer table using the SET
TRANSACTION READ ONLY access mode and the customer number you
found in step 4.
6. Rollback your work.
7. Change the SET TRANSACTION access mode to READ WRITE and rerun the
DELETE statement in Step 5.


Task 2. Using DIRTY READ isolation level.


In this task, you will see how a phantom row can be produced using the DIRTY
READ isolation level on the customer table. In one window, you will begin a
transaction and insert a row into the customer table. In another window, you will set
your isolation level to dirty-read and query for the newly inserted row. This will
display the newly inserted row even though the transaction has not been committed.
You will rollback work on the newly inserted row and requery for this row.
1. Open two dbaccess sessions.
2. In one session:
• Begin a transaction.
• Insert the following row into the customer table:
Patrick Edmonds
Edmonds Goods
789 W. Stuckey Ave
Central, IL, 60580
3. In the second session, run the following SELECT statement:
SET ISOLATION TO DIRTY READ;
SELECT * FROM customer
WHERE fname LIKE "Pat%"
AND lname LIKE "Edmonds%";
Why did the newly inserted customer display?
4. In the first session, rollback your transaction.
5. In the second session, run the query again. Notice that the customer "Patrick
Edmonds" is no longer in the customer table.
Task 3. Using COMMITTED READ isolation level.
In this task, you will see how a phantom row cannot be produced using the
COMMITTED READ isolation level on the stock table. In one window, you will
begin a transaction and insert a row into the stock table. In the second window, you
will select the row inserted. You will open a third window to examine the locks held
using onstat -K. You will query the systables table for the partnum of the stock
table to verify the locks being held.
1. In one window:
• Begin a transaction.
• Insert the following row into the stock table:
0, "ANZ", "golf balls", 28.00, "case", "12/case"


2. In the second window, run the following SELECT statement:


SET ISOLATION TO COMMITTED READ;
SELECT * FROM stock
WHERE manu_code = "ANZ";
Why did an error happen with this SELECT statement?
3. To find the locks being held on your table, you will need to know the partnum of
the stock table.
In the second window, run the following query:
SELECT tabname, hex(partnum)
FROM systables
WHERE tabname = "stock";
4. Open a third window and run the following command:
$ onstat -k
Find your partnum in the tblsnum column of the onstat -k output. Notice the X
in the type column. This indicates an exclusive lock is being held.
5. Rollback your transaction in the first window.
Task 4. Using COMMITTED READ LAST COMMITTED
isolation level.
In this task, you will see how the last committed value is returned using the
COMMITTED READ LAST COMMITTED isolation level. In one window, you will
begin a transaction and update a row in the customer table. In the second window,
you will select the row being updated.
1. In one window:
• Begin a transaction.
• Run the following SQL statement:
UPDATE customer
SET lname = "Smith"
WHERE customer_num = 125;
2. In the second window, run the following SQL statements:
SET ISOLATION TO COMMITTED READ LAST COMMITTED;
SELECT lname, fname
FROM customer
WHERE customer_num = 125;
What happened with this SELECT statement?
3. Rollback your transaction in the first window.


Task 5. Using REPEATABLE READ and SERIALIZABLE READ.


In this task, you will select a row from the manufact table and try to delete the
same row from another window. You will open a third window to examine the locks
held using onstat -K. You will query the systables table for the partnum of the
manufact table to verify the locks being held.
1. In one window:
• Begin a transaction.
• Set the isolation level to REPEATABLE READ.
• Query the manufact table for the manu_code equal to HRO.
2. To find the locks being held on your table, you will need to know the partnum of
the manufact table.
In a second window, run the following query:
SELECT tabname, hex(partnum)
FROM systables
WHERE tabname = "manufact";
3. Run the following command to find the locks currently held by the query in
Step 1:
$ onstat -K
Notice that there is an IS type lock (intent-shared lock) on the entire table (rowid
= 0 indicates a table lock). An S type lock (shared lock) is being held on page
100 (a rowid ending in two zeros indicates a page lock) and on the key.
4. In the second window:
• Begin a transaction.
• Delete the row from the manufact table where the manu_code is HRO.
5. Run the following command to find the additional locks currently for the
manufact table:
$ onstat -K
Notice the additional IX type lock, or intent-exclusive. This lock is for the session
trying to delete the manu_code HRO. The delete session will wait until the
other locks have been released before executing.
6. Rollback work in all windows.


Task 6. Using lock modes.


In this task, you will alter the lock mode of the manufact table and try to select
from the manufact table from another window. You will open a third window to
examine the locks held using onstat -K. You will query the systables table for the
partnum of the manufact table to verify the locks being held.
1. In one window:
• Alter the manufact table to page-level locking.
• Begin a transaction.
• Update the manufact table using the following statement:
UPDATE manufact
SET lead_time = "2"
WHERE manu_code = "ANZ";
2. In the second window:
• Set the isolation level to COMMITTED READ.
• Set the lock mode to wait.
• Select all rows from the manufact table.
3. Run the following command to find the locks currently held in the above steps:
$ onstat -k
What happened and why?
4. Commit work in the UPDATE session.
What happened to the second session and why?
RESULTS:
In this exercise, you learned how to use concurrency control by setting
isolation / transaction levels and lock modes.


Exercise 13:
Concurrency control - Solutions

Purpose:
In this exercise, you will learn how to use concurrency control by setting
isolation / transaction levels and lock modes.

Task 1. Using SET TRANSACTION READ WRITE and


READ ONLY.
In this task, you will be using the SET TRANSACTION syntax to set your session for
READ ONLY access mode. You will try to insert a row into the customer table.
You will ROLLBACK WORK on your transaction and change the SET
TRANSACTION to READ WRITE. You will try the insert statement for the customer
table, and then delete this row from the customer table.
1. In a dbaccess session:
• Begin a transaction.
• Set the transaction mode to READ ONLY.
• Insert the following row into the customer table:
Henry Smith
Smith Studios
12345 Barney Road
St. Bob,CA 90123

BEGIN WORK;
SET TRANSACTION READ ONLY;
INSERT INTO customer (customer_num, fname, lname,
company, address1, city, state, zipcode)
VALUES ( 0, "Henry", "Smith", "Smith Studios",
"12345 Barney Road", "St. Bob", "CA", "90123");

What happened?
This results in the following error:
878: Invalid operation for a READ-ONLY transaction.
The statement failed because the transaction was set to READ-ONLY.
READ-ONLY ensures that data can only be read and not altered.


2. Rollback your work.


ROLLBACK WORK;
3. Rerun the INSERT statement in Step 1 and change the SET TRANSACTION
access mode to READ WRITE.
BEGIN WORK;
SET TRANSACTION READ WRITE;
INSERT INTO customer (customer_num, fname, lname,
company, address1, city, state, zipcode)
VALUES ( 0, "Henry", "Smith", "Smith Studios",
"12345 Barney Road", "St. Bob", "CA", "90123");
COMMIT WORK;
4. Determine the customer number for Henry Smith.
SELECT customer_num FROM CUSTOMER
WHERE fname = "Henry" and lname = "Smith";
5. Delete Henry Smith from the customer table using the SET TRANSACTION
READ ONLY access mode and the customer number you found in step 4.
BEGIN WORK;
SET TRANSACTION READ ONLY;
DELETE FROM customer
WHERE customer_num = 4713;
This results in the following error:
878: Invalid operation for a READ-ONLY transaction.
The statement failed because the transaction was set to READ-ONLY.
READ ONLY ensures that data can only be read and not altered.
6. Rollback your work.
ROLLBACK WORK;
7. Change the SET TRANSACTION access mode to READ WRITE and rerun the
DELETE statement in Step 5.
BEGIN WORK;
SET TRANSACTION READ WRITE;
DELETE FROM customer
WHERE customer_num = 4713;
COMMIT WORK;


Task 2. Using DIRTY READ isolation level.


In this task, you will see how a phantom row can be produced using the DIRTY
READ isolation level on the customer table. In one window, you will begin a
transaction and insert a row into the customer table. In another window, you will set
your isolation level to dirty-read and query for the newly inserted row. This will
display the newly inserted row even though the transaction has not been committed.
You will rollback work on the newly inserted row and requery for this row.
1. Open two dbaccess sessions.
2. In the first session:
• Begin a transaction.
• Insert the following row into the customer table:
Patrick Edmonds
Edmonds Goods
789 W. Stuckey Ave
Central, IL, 60580

BEGIN WORK;
INSERT INTO customer (customer_num, fname, lname,
company, address1, city, state, zipcode)
VALUES ( 0, "Patrick", "Edmonds", "Edmonds Goods",
"789 West Stuckey Ave", "Central", "IL", "60580");
3. In the second session, run the following SELECT statement:
SET ISOLATION TO DIRTY READ;
SELECT * FROM customer
WHERE fname LIKE "Pat%"
AND lname LIKE "Edmonds%";
Why did the newly inserted customer display?
With DIRTY READ you can retrieve rows being inserted before the
transaction has been committed. These are sometimes called phantom
rows.
4. In the first session, rollback your transaction.
ROLLBACK WORK;
5. In the second session, run the query again. Notice that the customer "Patrick
Edmonds" is no longer in the customer table.


Task 3. Using COMMITTED READ isolation level.


In this task, you will see how a phantom row cannot be produced using the
COMMITTED READ isolation level on the stock table. In one window, you will
begin a transaction and insert a row into the stock table. In the second window, you
will select the row inserted. You will open a third window to examine the locks held
using onstat -K. You will query the systables table for the partnum of the stock
table to verify the locks being held.
1. In the first session:
• Begin a transaction.
• Insert the following row into the stock table:
0, "ANZ", "golf balls", 28.00, "case", "12/case"

BEGIN WORK;
INSERT INTO stock (stock_num, manu_code, description,
unit_price, unit, unit_descr)
VALUES (0, "ANZ", "golf balls", 28.00, "case",
"12/case");

2. In the second session, run the following SELECT statement:


SET ISOLATION TO COMMITTED READ;
SELECT * FROM stock
WHERE manu_code = "ANZ";

This returns the following errors:


245: Could not position within a file via an index.
144: ISAM error: key value locked.

Why did an error happen with this SELECT statement?


COMMITTED READ ensures that you are not looking at any phantom
rows.


3. To find the locks being held on your table, you will need to know the partnum of
the stock table.
In the second session, run the following query:

SELECT tabname, hex(partnum)


FROM systables
WHERE tabname = "stock";

Your output will be similar to the following:

4. Open a new Informix Server (putty) window (login = docker/tcuser; run docker
exec -it iif_developer_edition bash), and run the following command:
$ onstat -K
Find your partnum in the tblsnum column of the onstat -k output. Notice the X
in the type column. This indicates an exclusive lock is being held.

5. Rollback your transaction in the first session window:


ROLLBACK WORK;


Task 4. Using COMMITTED READ LAST COMMITTED


isolation level.
In this task, you will see how the last committed value is returned using the
COMMITTED READ LAST COMMITTED isolation level. In one window, you will
begin a transaction and update a row in the customer table. In the second window,
you will select the row being updated.
1. In the first session window:
• Begin a transaction.
• Run the following SQL statement:
UPDATE customer
SET lname = "Smith"
WHERE customer_num = 125;

BEGIN WORK;
UPDATE customer
SET lname = "Smith"
WHERE customer_num = 125;

2. In the second session window, run the following SQL statements:

SET ISOLATION TO COMMITTED READ LAST COMMITTED;


SELECT lname, fname
FROM customer
WHERE customer_num = 125;

What happened with this SELECT statement?


The value returned for lname was the value existing prior to the
uncommitted UPDATE statement. Also, because the value returned was
the last one committed, it was not necessary to wait on any locks held by
the current UPDATE statement.

3. In the first session window, rollback your transaction:


ROLLBACK WORK;


Task 5. Using REPEATABLE READ and SERIALIZABLE READ.


In this task, you will select a row from the manufact table and try to delete the
same row from another window. You will open a third window to examine the locks
held using onstat -K. You will query the systables table for the partnum of the
manufact table to verify the locks being held.
1. In the first session window:
• Begin a transaction.
• Set the isolation level to REPEATABLE READ.
• Query the manufact table for the manu_code equal to HRO.
BEGIN WORK;
SET ISOLATION TO REPEATABLE READ;
SELECT *
FROM manufact
WHERE manu_code = "HRO";

To find the locks being held on your table, you will need to know the partnum of
the manufact table.
2. In the second session window, run the following query:
SELECT tabname, hex(partnum)
FROM systables
WHERE tabname = "manufact";

(Your results might vary):


3. Run the following command to find the locks currently held by the query in
Step 1:
$ onstat -K
Notice that there is an IS type lock (intent-shared lock) on the entire table (rowid
= 0 indicates a table lock). An S type lock (shared lock) is being held on row 105.

4. In the second session window:


• Begin a transaction.
• Delete the row from the manufact table where the manu_code is HRO.
BEGIN WORK;
DELETE FROM manufact
WHERE manu_code = "HRO";

This results in the following errors:


240: Could not delete a row.
107: ISAM error: record is locked.


5. Run the following command to find the additional locks currently for the
manufact table:
$ onstat -K
Notice the additional IX type lock, or intent-exclusive. This lock is for the session
trying to delete the manu_code HRO. The delete session will wait until the
other locks have been released before executing.

6. Rollback work in all session windows.


ROLLBACK WORK;
Task 6. Using lock modes.
In this task, you will alter the lock mode of the manufact table and try to select
from the manufact table from another window. You will open a third window to
examine the locks held using onstat -K. You will query the systables table for the
partnum of the manufact table to verify the locks being held.
1. In the first session window:
• Alter the manufact table to page-level locking.
• Begin a transaction.
• Update the manufact table using the following statement:
UPDATE manufact
SET lead_time = "2"
WHERE manu_code = "ANZ";

ALTER TABLE manufact LOCK MODE (page);


BEGIN WORK;
UPDATE manufact
SET lead_time = "2"
WHERE manu_code = "ANZ";


2. In the second session window:


• Set the isolation level to COMMITTED READ.
• Set the lock mode to wait.
• Select all rows from the manufact table.
SET ISOLATION TO COMMITTED READ;
SET LOCK MODE TO WAIT;
SELECT * FROM manufact;

This session runs part way, but is waiting on the release of the
exclusive lock held by the first session running the update transaction.
3. In the third session window, run the following command to find the locks
currently held in the above steps:
$ onstat -K
What happened and why?

The SELECT statement is waiting to complete execution. When it


encounters the exclusive locks created by the UPDATE process, it
waits until the locks are released. The onstat -K output shows the
session waiting for the lock held by the one doing the updating.
4. Commit work in the UPDATE session window.
What happened to the second session and why?
The COMMIT WORK statement releases the locks. At this point the
second session continues with its reading of the manufact table. You
can confirm this in the 3rd session window.
RESULTS:
In this exercise, you learned how to use concurrency control by setting
isolation / transaction levels and lock modes.


Unit summary
• Use the different concurrency controls
• Monitor the concurrency controls for lock usage
• Use the Retain Update Lock feature


Unit summary

Data security

Data security

Informix (v12.10)

© Copyright IBM Corporation 2017



Unit objectives
• Use the database, table, and column-level privileges
• Use the GRANT and REVOKE statements
• Use role-based authorization

Data security © Copyright IBM Corporation 2017

Unit objectives


Levels of data security

Database

Table
View Fragment in dbs1

Column
Fragment in dbs2

Routine Fragment in dbs3


Levels of data security


Once a user gains access to the database server, the user can only access databases
and database objects for which they were granted privileges.
Informix allows the DBA to grant:
• Database-level privileges
• Object-level privileges, including:
• Table
• Column
• View
• Routine
• Fragment
All database server administration tasks are restricted to the user informix.


Database-level privileges
• The three levels of database access are:
 Connect
 Resource
 DBA


Database-level privileges
Informix organizes database privileges into three levels of access.
CONNECT
The CONNECT privilege allows you to connect to a database, create temporary tables
and indexes, create views and synonyms, grant permissions on objects that you create
and own, and drop or alter any objects that you own. You cannot create permanent
tables and indexes.
RESOURCE
The RESOURCE privilege gives you all CONNECT privileges and the ability to create
permanent tables and indexes, stored procedures, and functions.
DBA
A user with DBA privilege has full access to the database. The only restriction placed
on users with DBA status is the inability to revoke the DBA privilege from themselves.
However, a user with DBA status can grant the privilege to another user who can then
revoke it from the grantor.
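For example, assuming janet currently holds the DBA privilege (the user names are
illustrative):

```sql
-- janet cannot revoke DBA from herself, but she can
-- grant DBA status to another user:
GRANT DBA TO mark;

-- mark, once connected, can then revoke it from janet:
REVOKE DBA FROM janet;
```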


Table and column-level privileges

ALTER Add, delete, or modify columns


DELETE Remove rows from a table
INDEX Create indexes for a table
SELECT Retrieve information from the columns in a
table
UPDATE Modify information in the columns of a
table
INSERT Insert rows into a table
REFERENCES Reference columns in referential
constraints
ALL Perform any or all of the preceding
operations


Table and column-level privileges


Even though you gain access to a database, you cannot access or modify database
objects until you are granted access permission to the object.
Table privileges
For tables, you can grant permission for read-only access (SELECT), data-modification
access (DELETE, INSERT, UPDATE), or table-administration access (ALTER, INDEX,
REFERENCES). A special privilege, UNDER, applies only to typed tables. The UNDER
privilege grants you the ability to create a subtable under an existing typed table. To
grant all of these privileges to a user, the administrator can use the keyword ALL.
Column privileges
The SELECT, UPDATE, and REFERENCES privileges can also be granted only on a
specific column within a table.
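For example (the user and column names are illustrative):

```sql
-- allow liz to update only the name columns of customer
GRANT UPDATE (fname, lname) ON customer TO liz;

-- allow all users to read only two columns of customer
GRANT SELECT (customer_num, lname) ON customer TO PUBLIC;
```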


Default privileges
• Database level:
 When you create a database, you are automatically granted DBA privileges
• Table level:
 Non-ANSI databases:
− All table-level privileges except ALTER and REFERENCES granted to all users
− Can use environment variable NODEFDAC to grant no privileges
 MODE ANSI databases:
− No default privileges granted


Default privileges
Informix grants certain default database and table-level privileges.
• Default database-level privileges: If you want to allow other users to access the
database, you must grant them CONNECT, RESOURCE, or DBA privileges.
• Default table-level privileges: For non-ANSI databases, table-level privileges are
automatically granted to the public when a table is created. For ANSI-compliant
databases, table-level privileges are not automatically granted; you must explicitly
grant privileges to specific users or to public.
Environment variable NODEFDAC
To prevent the granting of default table-level privileges in a non-ANSI database, set the
NODEFDAC environment variable to YES before you execute the CREATE TABLE
statement.
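In a shell session, the variable might be set as in this sketch; it must be set in the
environment of the client that issues the CREATE TABLE statement:

```shell
NODEFDAC=yes
export NODEFDAC
# tables created through this session's client (for example,
# dbaccess) now grant no default table-level privileges to PUBLIC
```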


Granting database-level privileges


• Examples:

GRANT CONNECT TO PUBLIC;
(CONNECT is granted to all users.)

GRANT RESOURCE TO maria, joe;

GRANT DBA TO janet;


Granting database-level privileges


You can use the GRANT statement to grant database-access privileges to users. The
components of the GRANT statement are:

privilege One of the database-level access types: CONNECT,


RESOURCE, or DBA.

PUBLIC The keyword that you use to specify access privileges for
all users.

user-list A list of login names for the users to whom you are
granting access privileges. You can enter one or more
names, separated by commas.

In the first example, the CONNECT privilege is granted to all users (PUBLIC). In the
second example, the RESOURCE privilege is granted only to the user maria and the
user joe. In the third example, janet is given DBA privilege.


Revoking database-level privileges


• Examples:

REVOKE CONNECT FROM mike;

REVOKE RESOURCE FROM maria;


Revoking database-level privileges


You can use the REVOKE statement to revoke database-access privileges from users.
The components of a REVOKE statement are:

privilege One of the database-level access types: CONNECT,


RESOURCE, or DBA.

PUBLIC The keyword that you use to specify access privileges for all
users.

user-list A list of login names for the users from whom you are
revoking access privileges. You can enter one or more
names, separated by commas.

If you revoke the DBA or RESOURCE privilege from one or more users, they are left
with the CONNECT privilege. To revoke all database privileges from users with DBA or
RESOURCE status, you must revoke CONNECT as well as DBA or RESOURCE.
In the first example, the CONNECT privilege is revoked from mike.
In the second example, the RESOURCE privilege is revoked from the user maria. That
user now has the CONNECT privilege.
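Removing all database access from such a user therefore takes two revokes, as in this
sketch (the user name is illustrative):

```sql
-- maria holds RESOURCE; this leaves her with CONNECT:
REVOKE RESOURCE FROM maria;

-- removing her database access entirely also requires:
REVOKE CONNECT FROM maria;
```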


Even though CONNECT has been revoked from user mike in this example, remember
that CONNECT TO PUBLIC was granted in the previous slide. Since mike is always a
member of PUBLIC, this statement has no effect unless CONNECT was only granted
to specific users.


Granting table-level privileges


• Examples:

GRANT ALL ON customer TO PUBLIC;

GRANT UPDATE ON orders TO liz
WITH GRANT OPTION;
(Allows liz to grant UPDATE to other users.)

GRANT INSERT, DELETE ON items TO mike
AS maria;
(Grantor becomes maria.)


Granting table-level privileges


You can use the GRANT statement to specify the operations that a user can perform
on a table that you have created. The components of a table-level GRANT are:

privilege One or more of the table access types: ALTER, DELETE,
INDEX, INSERT, SELECT, UPDATE, REFERENCES, and
ALL.

table or view The name of the table or view for which you are granting
access privileges.

PUBLIC The keyword that you use to specify access privileges for all
users.

user-list A list of login names for the users to whom you are granting
access privileges. You can enter one or more names,
separated by commas.

WITH GRANT Allows the user or users listed in the GRANT statement to
OPTION grant the same privileges to other users.

AS [user] Makes the grantor of the permission another user.


In the first example shown above, all privileges are granted to all users (PUBLIC) on the
customer table.
In the second example, liz is given UPDATE permission on the orders table, with the
ability to give that permission to other users.
The third example has the INSERT and DELETE privileges granted to the user mike by
maria.
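Because liz holds UPDATE with the grant option, she can pass the privilege along herself. A sketch of a statement liz could then run (the user tom is illustrative):

```sql
-- Executed as liz, who was granted UPDATE ... WITH GRANT OPTION
GRANT UPDATE ON orders TO tom;
```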


Revoking table-level privileges


• Examples:

REVOKE ALL ON orders FROM PUBLIC;

REVOKE DELETE, UPDATE ON customer
FROM mike, maria;

REVOKE INSERT, UPDATE ON items
FROM mike AS maria;
(Revoker becomes maria.)


Revoking table-level privileges


You can use the REVOKE statement to prevent specific operations that a user can
perform on a table that you have created. The components of a REVOKE statement
are:

privilege One or more of the table-access types: ALTER,
REFERENCES, DELETE, INDEX, INSERT, SELECT,
UPDATE, and ALL.

table or view The name of the table or view for which you are revoking
access privileges.

PUBLIC The keyword that you use to revoke access privileges from
all users.

user-list A list of login names for the users from whom you are
revoking access privileges. You can enter one or more
names, separated by commas.


Although you can grant UPDATE and SELECT privileges for specific columns, you
cannot revoke these privileges column by column. If you revoke UPDATE or SELECT
privileges from a user, all UPDATE and SELECT privileges that you have granted to
that user are revoked.
In the first example shown above, all privileges are revoked from all users (PUBLIC) on
the orders table.
In the second example, the DELETE and UPDATE privileges are revoked from the
users mike and maria.
The third example shows INSERT and UPDATE permissions revoked from mike by
maria.


Granting column-level privileges


• Only SELECT, UPDATE, and REFERENCES privileges can be
granted to individual columns.
• Column-level privileges are granted in the same way that table-level
privileges are granted, except that a column list must follow the
privilege in the GRANT statement.
• Examples:
GRANT SELECT (company, fname, lname)
ON customer TO PUBLIC;
GRANT INSERT, UPDATE (quantity), SELECT
ON items TO maria;


Granting column-level privileges


When you grant privileges for a table, you can specify the SELECT, UPDATE, and
REFERENCES privileges to apply to only certain columns in the table.
In the first example shown above, the SELECT privilege is granted to all users for
columns - company, fname, and lname - of the customer table.
In the second example, the UPDATE privilege is granted only on the quantity column,
but the INSERT and SELECT privileges are granted on all columns of the table.
Privileges cannot be revoked at the column level. To remove column-level privileges,
first revoke the privilege at the table level and regrant privileges on various columns as
appropriate.
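For example, to move maria's column-level UPDATE privilege from the quantity column to a different column, you cannot revoke column by column; you revoke the whole UPDATE privilege and then regrant it. A sketch (the target column total_price is illustrative):

```sql
-- Column-level privileges cannot be revoked individually:
-- revoke the entire privilege, then regrant what should remain.
REVOKE UPDATE ON items FROM maria;
GRANT UPDATE (total_price) ON items TO maria;
```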


Routine privileges
• Examples:

GRANT EXECUTE ON total_orders TO PUBLIC;

GRANT EXECUTE ON square ( x INT ) TO maria;

REVOKE EXECUTE ON cancel_orders FROM joe, tom;


Routine privileges
When you create a user-defined routine, either a stored procedure or a function, you
own and are automatically granted execute privileges for that routine. The execute
privilege allows you to issue an EXECUTE PROCEDURE or EXECUTE FUNCTION
statement for the routine. If you want to allow other users to execute the routine, you
must grant them the EXECUTE privilege by using the GRANT EXECUTE ON
statement.
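As a sketch of the full cycle, the square function from the slide might be created and shared like this (the SPL body shown is illustrative):

```sql
CREATE FUNCTION square (x INT)
    RETURNING INT;
    RETURN x * x;
END FUNCTION;

-- The creator already holds the Execute privilege; extend it to maria
GRANT EXECUTE ON square (x INT) TO maria;

-- maria can now run:
EXECUTE FUNCTION square (4);
```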


DataBlade privileges
• Examples:

GRANT EXTEND TO maria;

REVOKE EXTEND FROM joe, ram;


DataBlade privileges
DataBlades are a set of user-defined routines bundled into a shared library.
In the example, if the DBA has the IFX_EXTEND_ROLE configuration parameter set to
1 (on), the GRANT statement allows the user to create or drop a user-defined routine
(UDR) that has the EXTERNAL clause. The REVOKE statement removes that ability:
the listed users can no longer create or drop UDRs that have the EXTERNAL clause.
If the DBA leaves IFX_EXTEND_ROLE set to 0 (off), then no restrictions are applied to
which users can manipulate UDRs.
When you grant the EXTEND role to a specific user, that user is automatically granted
EXECUTE permission on the sysbldprepare UDR, and the sysroleauth system catalog
table is updated to reflect the new built-in role.
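You can check the result in the catalog. A hedged sketch, assuming the documented sysroleauth columns rolename and grantee (the stored case of the role name can vary by version):

```sql
-- List users who hold the built-in EXTEND role
SELECT grantee
FROM sysroleauth
WHERE rolename = 'EXTEND';
```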


Roles

sales role

marketing role salesadmin role


Roles
A role is a named group of users that can be granted security privileges as a unit. Roles
make the job of administering security easier: once users are assigned to a role, the
system administrator needs only to GRANT and REVOKE table and column privileges
for the role rather than for each user individually.
You can nest roles within other roles.
The example shows users that are part of the marketing role, and users that are part of
the salesadmin role. However, all users are also part of the sales role.


Creating roles
• Examples:

CREATE ROLE mkting;


CREATE ROLE slsadmin;
CREATE ROLE sales;

GRANT mkting TO jim, mary, ram;


GRANT slsadmin TO andy, liz, sam;
GRANT sales TO mkting, slsadmin;


Creating roles
The CREATE ROLE statement creates a role. The statement effectively puts an entry
in the sysusers system catalog table where the user type is G. The role name must be
less than or equal to 32 characters. All user and role names on the system must be
unique. For example, if you have a user name gus that can connect to the database
server, you cannot create a role called gus. In order to enforce this rule, the following
checks are in place:
• The CREATE ROLE statement checks to make sure that the role name is not
present in the password file.
• A user is not able to connect if the user name is created as a role name.
The CREATE ROLE statement can only be executed by a user who has DBA
permissions on the database. A role is a database object, meaning that it is only
applicable for the database in which it is created.
Once the role is created, the next step is to assign users to roles. The GRANT
statement assigns one or more users to the role specified. You can also assign one role
to another, as shown in the example. A successful GRANT statement puts an entry in
the sysroleauth system catalog table.
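Both effects can be confirmed in the system catalogs described above. A sketch:

```sql
-- Roles appear in sysusers with usertype 'G'
SELECT username
FROM sysusers
WHERE usertype = 'G';

-- Role membership is recorded in sysroleauth
SELECT rolename, grantee
FROM sysroleauth;
```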


Using roles (1 of 2)
• Examples:

REVOKE ALL ON orders FROM public;

GRANT SELECT ON orders TO sales;

GRANT INSERT, UPDATE, DELETE ON orders
TO slsadmin;

What permission does the marketing role have on the orders table?


Using roles
Table and column-level privileges can be assigned to roles by using the GRANT
statement. However, database privileges cannot be assigned to roles.


Using roles (2 of 2)
• A user can either inherit a default role, or specify a role to use in their
session:
 Default roles are assigned by the DBA using the GRANT ROLE statement:
GRANT DEFAULT ROLE slsadmin TO liz;
 A user can set their own role through the SET ROLE SQL statement:
SET ROLE slsadmin;
SET ROLE DEFAULT;
• Default roles can be granted to PUBLIC:
GRANT DEFAULT ROLE slsadmin to PUBLIC;
• Default roles can be revoked with the REVOKE statement:
REVOKE DEFAULT ROLE FROM ram;


Before a user can gain access to ROLE permissions, they have to either inherit them
through default roles or put themselves into the ROLE through the SET ROLE
statement. Default roles can be granted and revoked from users through the GRANT
and REVOKE SQL statements as described in the example.
If different default roles are assigned to a user and to PUBLIC, the default role of the
user takes precedence. If a default role is not assigned to a user, the user only has
individually granted and public privileges.
Discussion
Assume that the DBA executes the following statements:
CREATE ROLE mkting;
CREATE ROLE slsadmin;
CREATE ROLE sales;
GRANT mkting to jim, mary, ram;
GRANT slsadmin to andy, liz, sam;
GRANT sales to mkting, slsadmin;
REVOKE ALL on orders from public;
GRANT select ON orders TO sales;
GRANT insert, update, delete ON orders to slsadmin;
The following statements are run by user mary. Which statements will fail? Why?
SELECT * FROM orders;
SET ROLE mkting;
SELECT * FROM orders;


GRANT and REVOKE FRAGMENT


• Examples:
REVOKE ALL ON orders
FROM PUBLIC;
GRANT SELECT ON orders
TO PUBLIC;

REVOKE FRAGMENT ALL
ON orders
FROM user1;
(All fragments of orders are read-only by user1.)

GRANT FRAGMENT INSERT, UPDATE, DELETE
ON orders(dbspace1)
TO user1;
(user1 can now INSERT, UPDATE, or DELETE rows in only the fragment in dbspace1.)

GRANT and REVOKE FRAGMENT


Two examples of the GRANT FRAGMENT and REVOKE FRAGMENT statement are
shown.
These examples show how you can grant read-only privileges to all fragments but the
one in dbspace1. If user1 tries to INSERT a row into any fragment but the one in
dbspace1, the following error occurs:
977: No permission on fragment (dbspace1).
271: Could not insert new row into the table.
The fragment-level privileges that you can grant are INSERT, UPDATE, DELETE, and
ALL. They can be granted whether you have table-level privileges or not. Table-level
privileges take precedence over fragment-level privileges. For example, if you have
table-level insert capability, fragment-level insert privileges are not checked.
REVOKE FRAGMENT and GRANT FRAGMENT are only valid when executed on
tables fragmented by expression.
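Because the dbspace is named in the statement, a table fragmented by expression across several dbspaces can be opened up fragment by fragment. A sketch, assuming orders is fragmented across dbspace1 and dbspace2:

```sql
-- Each user may modify rows in only one fragment of orders
GRANT FRAGMENT INSERT, UPDATE, DELETE ON orders (dbspace1) TO user1;
GRANT FRAGMENT INSERT, UPDATE, DELETE ON orders (dbspace2) TO user2;
```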


Discussion
• The orders table is fragmented so that orders for customer numbers 1 -
10,000 are in dbspace1 and orders for customer numbers 10,001 -
20,000 are in dbspace2.
• Given the GRANT and REVOKE FRAGMENT statements on the
previous page, which of these statements would fail (if executed by
user1)?
INSERT INTO orders (cust_nbr) VALUES (100);
SELECT * FROM orders;
UPDATE orders SET cust_nbr = 12200
WHERE cust_nbr = 220;


Discussion
The INSERT statement shown in the example succeeds because user1 has INSERT
permissions into the fragment in dbspace1.
The SELECT statement succeeds because user1 has SELECT permissions on the
table (fragment permissions are only for INSERT, UPDATE, and DELETE statements).
The UPDATE statement fails because user1 does not have UPDATE permissions for
the fragment in dbspace2. The user requires UPDATE permissions both for the
fragment from which the row is moving and for the fragment to which it is moving.


Exercise 14
Data security
• Assign and revoke privileges at the user and role levels


Exercise 14: Data security


Exercise 14:
Data security

Purpose:
In this exercise, you will learn how to use the built-in features of Informix that
control data security.

Task 1. Using GRANT statements.


In this task, you will be granting table level privileges for a second login on the
items table. These grant privileges include the ability to create and drop tables,
select and insert rows, and update only one column in the items table.
1. Open two dbaccess sessions.
2. In the first session:
• Use the GRANT statement to allow all users to connect to your database.
• Use the GRANT statement to give user bob the ability to create a table and
drop any objects that he owns, but not the ability to drop the database.
• Exit dbaccess.
3. In a second dbaccess session:
• Connect to your database as user bob.
Do this by using the Connection > Connect options from the dbaccess menu.
Select the local_tcp server, then log in as user bob (password bob).
• Create a new table called bob1 with one column col1 of data type
character(10).
• Drop the new table.
Did this work?
__ a. Connect to the sysmaster database.
__ b. Drop the stores_demo database.
What happened?
4. In the first window:
• Reconnect to your stores_demo database.
• Revoke all privileges on the items table from public.
• Using the GRANT statement, give user jane the ability to select and insert
rows in the items table.


5. In the second window:


• Connect to your database as user jane (password jane).
• Delete the items from order_num 1022.
Why did this statement fail?
• Select all the items for order_num 1022.
What happened?
6. In the first window:
• Using the GRANT statement, give user sam the ability to update only the
manu_code column in the items table.
7. In the second window:
• Connect to your database as user sam (password sam).
• Update the items table and set the manu_code to HSK for order 1001.
What happened?
Task 2. Using GRANT and REVOKE statements.
In this task, you will revise the privileges previously assigned to the other users
using the GRANT and REVOKE statements.
1. In your first dbaccess window:
User jane no longer needs to select from or insert into the items table.
• Execute the SQL statement needed to change jane’s access privileges.
User joe needs to select from only the order_num and total_price columns of
the items table.
• Execute the SQL statement needed to change joe’s access privileges.
2. In a second dbaccess window:
• Connect to your database as user joe (password joe).
• Select all rows from the items table.
What happened?


Task 3. Using roles.


In this task, you will create a role for the purchasing department and grant and
revoke privileges for various SQL statements.
1. In your first dbaccess window:
• Create a role for the purchasing department.
• Grant the purchasing department role the appropriate privileges so that users
can insert into the stock table, but only update the unit_price column.
• Revoke all privileges on the stock table from public except SELECT.
• Grant the purchasing role to user frank.
2. In the second window:
• Connect to your database as user frank (password frank).
• Insert a row into the stock table using the following statement:
INSERT INTO stock (stock_num, manu_code)
VALUES (1, "ANZ");
Did the insert work?
What do you need to do to make it work?
• Fix the problem and insert the row.
3. In the second window:
• Connect to your database as user mary (password mary).
• Insert a row into the stock table using the following statement:
INSERT INTO stock (stock_num, manu_code)
VALUES (2, "ANZ");
Did the insert work?
What do you need to do to make it work?
• Fix the problem and insert the row.


Task 4. Using GRANT and REVOKE on fragments.


In this task, you will revoke all on the customer table and give select privileges to
public. For a second user, you will revoke all on the customer table. Grant the
second user only insert on the customer table in dbspace4.
1. In the first window:
• Revoke all privileges on the customer table from public.
• Grant SELECT to public on the customer table.
• Grant DELETE on the customer table in dbspace4 to user bob.
2. In the second window:
• Connect to your database as user bob (password bob).
• Delete the customer with the last name of “Currie” from the customer table.
Is the row deleted?
Results:
In this exercise, you learned how to use the built-in features of Informix that
control data security.


Exercise 14:
Data security - Solutions

Purpose:
In this exercise, you will learn how to use the built-in features of Informix that
control data security.

Task 1. Using GRANT statements.


In this task, you will be granting table level privileges for a second login on the
items table. These grant privileges include the ability to create and drop tables,
select and insert rows, and update only one column in the items table.
1. Open two dbaccess sessions.
2. In the first dbaccess session:
• Use the GRANT statement to allow all users to connect to your database.
GRANT CONNECT TO public;
• Use the GRANT statement to give user bob the ability to create a table and
drop any objects that he owns, but not the ability to drop the database.
GRANT RESOURCE TO bob;
• Exit dbaccess.
3. In a second dbaccess session:
• Connect to your database as user bob.
Do this by using the Connection > Connect options from the dbaccess menu.
Select the dev server, then log in as user bob (password bob).
• Create a new table called bob1 with one column col1 of data type
character(10).
CREATE TABLE bob1 (
col1 CHAR(10)
);
• Drop the new table.
DROP TABLE bob1;
Did this work?
Yes. Bob was allowed to drop his own table.


• Connect to the sysmaster database.


• Drop the stores_demo database.
What happened?
The DROP DATABASE command returns the following error:
389: No DBA permission.
User bob does not have sufficient privileges to drop the stores_demo
database.
4. In the first session:
• Reconnect to your stores_demo database.
• Revoke all privileges on the items table from public.
REVOKE ALL ON items FROM PUBLIC;
• Using the GRANT statement, give user jane the ability to select and insert
rows in the items table.
GRANT SELECT, INSERT ON items TO jane;
5. In the second session:
• Connect (dev) to your database as user jane (password jane).
• Delete the items from order_num 1022.
DELETE FROM items WHERE order_num = 1022;
Why did this statement fail?
Because jane does not have delete privileges on the items table.
274: No DELETE permission for items.
• Select all the items for order_num 1022.
SELECT * FROM items WHERE order_num = 1022;
What happened?
Jane was able to select the items because she has SELECT permission
on the items table. (No rows found.)
6. In the first session:
• Using the GRANT statement, give user sam the ability to update only the
manu_code column in the items table.
GRANT UPDATE (manu_code) ON items TO sam;


7. In the second session:


• Connect to your database as user sam (password sam).
• Update the items table and set the manu_code to HSK for order 1001.
UPDATE items
SET manu_code = "HSK"
WHERE order_num = 1001;

What happened?
The update failed because sam only has update privileges on the
manu_code column and cannot select the order_num column.
272: No SELECT permission for items.order_num.
Task 2. Using GRANT and REVOKE statements.
In this task, you will revise the privileges previously assigned to the other users
using the GRANT and REVOKE statements.
1. In your first session:
User jane no longer needs to select from or insert into the items table.
• Execute the SQL statement needed to change jane’s access privileges.
REVOKE SELECT, INSERT ON items FROM jane;
User joe needs to select from only the order_num and total_price columns of
the items table.
• Execute the SQL statement needed to change joe’s access privileges.
GRANT SELECT (order_num, total_price) ON items TO joe;
2. In your second session:
• Connect to your database as user joe (password joe).
• Select all rows from the items table.
SELECT * FROM items;
What happened?
Only the columns granted SELECT privileges (order_num, total_price) are
returned to the user.


Task 3. Using roles.


In this task, you will create a role for the purchasing department and grant and
revoke privileges for various SQL statements.
1. In your first session:
• Create a role for the purchasing department.
CREATE ROLE purchasing;
• Grant the purchasing department role the appropriate privileges so that users
can insert into the stock table, but only update the unit_price column.
GRANT INSERT, UPDATE (unit_price) ON stock TO purchasing;
• Revoke all privileges on the stock table from public except SELECT.
REVOKE ALL ON stock FROM public;
GRANT SELECT ON stock TO public;
• Grant the purchasing role to user frank.
GRANT purchasing TO frank;
2. In your second session:
• Connect to your database as user frank (password frank).
• Insert a row into the stock table using the following statement:
INSERT INTO stock (stock_num, manu_code)
VALUES (1, "ANZ");
Did the insert work?
No. The insert returns the following error:
275: The insert privilege is required for this operation.
What do you need to do to make it work?
Since no default roles have been defined, user frank needs to grant
himself the purchasing role and re-insert the row.

• Fix the problem and insert the row.


SET ROLE purchasing;
INSERT INTO stock (stock_num,manu_code)
VALUES (1, "ANZ");


3. In the second session:


• Connect to your database as user mary (password mary).
• Insert a row into the stock table using the following statement:
INSERT INTO stock (stock_num, manu_code)
VALUES (2, "ANZ");
Did the insert work?
No. The insert returns the following error:
275: The insert permission is required for this operation.

What do you need to do to make it work?


Since only the purchasing role can insert into the stock table, user
mary needs to grant herself the purchasing role and re-insert the row.
• Fix the problem and insert the row.
SET ROLE purchasing;
INSERT INTO stock (stock_num,manu_code)
VALUES (2, "ANZ");
The SET ROLE statement returns the following errors:
19805: No privilege to set to the role.
111: ISAM error: no record found.
Mary has not been granted the role of purchasing, so she cannot assign
herself to that role.


Task 4. Using GRANT and REVOKE on fragments.


In this task, you will revoke all on the customer table and give select privileges to
public. For a second user, you will revoke all on the customer table. Grant the
second user only insert on the customer table in dbspace4.
1. In the first session:
• Revoke all privileges on the customer table from public.
REVOKE ALL ON customer FROM PUBLIC;
• Grant SELECT to public on the customer table.
GRANT SELECT ON customer TO PUBLIC;
• Grant DELETE on the customer table in dbspace4 to user bob.
GRANT FRAGMENT DELETE ON customer(dbspace4) TO bob;
2. In the second session:
• Connect to your database as user bob (password bob).
• Delete the customer with the last name of “Currie” from the customer table.
DELETE FROM customer
WHERE lname = "Currie";
Is the row deleted?
No, because the customer named “Currie” is located in dbspace2 and
bob only has delete permissions on customers in dbspace4.
Results:
In this exercise, you learned how to use the built-in features of Informix that
control data security.


Unit summary
• Use the database, table, and column-level privileges
• Use the GRANT and REVOKE statements
• Use role-based authorization


Unit summary


Unit 15 Views

Views

Informix (v12.10)




Unit objectives
• Create views
• Use views to present derived and aggregate data
• Use views to hide joins from users

Views © Copyright IBM Corporation 2017

Unit objectives


What is a view?

• A virtual table
• A dynamic window

V
I
Data
E
W


What is a view?
A view is often called a virtual table. As far as the user is concerned, it acts like an
ordinary table. But in fact, a view has no existence in its own right. Rather, it is
derived from columns in real tables.
A view can also be called a dynamic window on your database.
A view can present the result of a computation, such as SUM(total_price). As the
individual prices change, the value of the calculated sum is always up to date.


Creating a view
CREATE VIEW ordsummary AS
SELECT order_num, customer_num, ship_date
FROM orders;

CREATE VIEW they_owe (ordno, orddate, cnum) AS
SELECT order_num, order_date, customer_num
FROM orders
WHERE paid_date IS NULL;


Creating a view
The CREATE VIEW statement consists of a CREATE VIEW clause and a SELECT
statement.
You can also give names to the columns in a view by listing them in parentheses after
the view name. If you do not assign names, the view uses the names of columns in the
underlying table.
Exceptions to syntax
Follow normal rules for writing the SELECT statement, except the following syntax is
prohibited:
• FIRST
• ORDER BY
• INTO TEMP
In the first example, the view ordsummary has three columns. They are given the same
names as the columns in the orders table.
In the second example, the view they_owe also has three columns. However, the
column names for the view differ from the column names in the orders table. They are
called ordno, orddate, and cnum instead of order_num, order_date, and
customer_num. In addition, the view they_owe shows only certain rows of the orders
table where paid_date is NULL.
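Once created, a view is queried like any table, using the view's column names. A sketch against the they_owe view (the customer number is illustrative):

```sql
-- The view exposes ordno, orddate, and cnum, not the base names
SELECT ordno, orddate
FROM they_owe
WHERE cnum = 104;
```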


Dropping a view
• Example:

DROP VIEW ordsummary;


Dropping a view
The DROP VIEW command allows you to remove a view from your database.
When you drop a view, no data is deleted. The underlying tables remain intact.
You cannot ALTER a view. To change a view, you must first remove it by using
DROP VIEW, and then recreate it with CREATE VIEW.
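For example, to add the paid_date column to the ordsummary view defined earlier, you would drop and recreate it:

```sql
-- There is no ALTER VIEW: drop the view, then recreate it
DROP VIEW ordsummary;
CREATE VIEW ordsummary AS
    SELECT order_num, customer_num, ship_date, paid_date
    FROM orders;
```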


Views: Access to columns


• Example:

CREATE VIEW ordsummary AS
SELECT order_num, customer_num, ship_date
FROM orders;


Views: Access to columns


Views can restrict access to certain columns within a table or tables.
This can be useful for two reasons:
• Information in some columns can be sensitive and should be restricted from
general access. For example, a salary column in an employee table should not
be accessible to all users.
• Some columns can contain irrelevant data for some users. By leaving those
columns out of a view, the database looks simpler and less cluttered.
In the example, a view is created that contains only three columns: order_num,
customer_num, and ship_date. The other columns are not listed when the data is
selected through this view.


Views: Access to rows


• Example:

CREATE VIEW baseball AS
    SELECT *
    FROM stock
    WHERE description MATCHES "*baseball*";


Views: Access to rows


Views can also restrict access to certain rows within a table or tables.
This restriction can be valuable for two reasons:
• Some rows can contain sensitive data or data that should be restricted to certain
users.
• Some rows can be unimportant to certain users. For example, the Accounts
Receivables Department might only be interested in orders that are not paid.
The example shows a view that lists only baseball equipment from the stock table.


Views: A virtual column


• Example:

CREATE VIEW ship_cost (ordno, cnum, s_wt,
        s_chg, chg_per_lb) AS
    SELECT order_num, customer_num, ship_weight,
        ship_charge, ship_charge / ship_weight
    FROM orders;


Views: A virtual column


You can create a view with a SELECT statement that includes an expression. The
result of the expression is called a virtual column.
In the example, the view includes a computed column: the ship charge per pound. It
is computed with the formula chg_per_lb = ship_charge / ship_weight. When a user
queries the view, any virtual columns in the view look like real columns.
The result of the computation:
ship_charge / ship_weight
is displayed in the virtual column chg_per_lb.
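A query against the view can reference the virtual column like any real column. The example below is a sketch; the 2.00 threshold is an arbitrary sample value:

```sql
SELECT ordno, s_chg, chg_per_lb
FROM ship_cost
WHERE chg_per_lb > 2.00;
```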


Views: An aggregate function


• Example:

CREATE VIEW manu_total (m_code, total_sold) AS
    SELECT manu_code, SUM(total_price)
    FROM items
    GROUP BY manu_code;


Views: An aggregate function


You can also include aggregate functions (such as SUM, MIN, MAX, AVG, COUNT) in
a SELECT statement for a view.
The example shows a view that selects the sum of the total price for each group of
items with a different manu_code. The aggregate function is placed in a virtual column
called total_sold.
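Querying the view returns one summary row per manufacturer code. For example, a simple sketch that lists the totals in descending order:

```sql
SELECT *
FROM manu_total
ORDER BY total_sold DESC;
```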


A view that joins two tables


• Example:

CREATE VIEW stock_info AS
    SELECT stock.*, manu_name
    FROM stock, manufact
    WHERE stock.manu_code = manufact.manu_code;


A view that joins two tables


You can use a view to hide joins from users, making a complicated join invisible to
them.
In the example, the result of a SELECT on the view is a combination of data from the
stock and manufact tables. The view creates a useful illusion that the data is located in
one place, called stock_info. To the user, stock_info looks like a single table.
In reality, stock_info is not a single table, but a view based on two underlying tables,
stock and manufact.
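For example, a user can query stock_info without knowing that a join is involved. The manufacturer name "Husky" below is a sample value assumed to exist in the demonstration database:

```sql
SELECT stock_num, description, manu_name
FROM stock_info
WHERE manu_name = "Husky";
```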


A view on another view


• Example:

CREATE VIEW manu_total (m_code, total_sold) AS
    SELECT manu_code, SUM(total_price)
    FROM items
    GROUP BY manu_code;

CREATE VIEW manu_new AS
    SELECT manu_name, total_sold
    FROM manufact, manu_total
    WHERE manufact.manu_code = manu_total.m_code;


A view on another view


A view can be based wholly or partially on another view.
The example shows the creation of a view called manu_total that selects the total price
for each manu_code group. The view manu_new takes the data selected from the
manu_total view and joins it with the manu_name column in the manufact table.
The view manu_new displays two pieces of data: the column manu_name and the
virtual column total_sold.


Restrictions on views
• You cannot create indexes on a view
• A view depends on its underlying tables
• Some views restrict inserts, updates, and deletes
• You must have full SELECT privileges on all columns


Restrictions on views
Several restrictions are imposed on views:
• You cannot create indexes on a view. However, when querying, you do receive
the benefit of existing indexes on columns in the underlying tables.
• A view depends on its underlying tables (and views). If you drop a table, all views
derived from that table are automatically dropped. If you drop a view, any views
derived from that view are automatically dropped.
• Some views restrict inserts, updates, and deletes. These restrictions are
described on the next page.
• You must have full SELECT privileges on all columns in order to create a view on
a table.


Views: INSERT, UPDATE, and DELETE


• You cannot INSERT, UPDATE, or DELETE from a view if it has:
 A join
 An aggregate
• You cannot UPDATE a view with a virtual column
• You cannot INSERT into a view with a virtual column
• You can DELETE from a view with a virtual column


Views: INSERT, UPDATE, and DELETE


Some restrictions for inserting, updating, and deleting rows of views are shown in the
visual.


The WITH CHECK OPTION clause (1 of 2)


• Compare:

CREATE VIEW no_check AS
    SELECT * FROM stock
    WHERE manu_code = "HRO";

CREATE VIEW yes_check AS
    SELECT * FROM stock
    WHERE manu_code = "HRO"
    WITH CHECK OPTION;


The WITH CHECK OPTION clause


The views we have created so far let you insert rows into the database even if those
rows are outside the scope of the view.
For example, the view no_check allows you to insert rows with a manu_code with a
value other than HRO. Every row you insert immediately becomes inaccessible through
the view.
You can fix the situation by using the WITH CHECK OPTION clause at the end of the
CREATE VIEW statement. The view yes_check allows you to insert only data that
satisfies the selection criteria of the view.
A view with the CHECK option gives the database administrator the ability to add an
extra level of security. The database administrator can require use of a view to update,
delete, or insert into a table. That view can enforce special restrictions against certain
columns in a table, as in the example.


The WITH CHECK OPTION clause (2 of 2)


Which of the following will succeed, and why?

INSERT INTO no_check
    VALUES (1, "ANZ", "soccer ball", 30,
        "each", "each");

INSERT INTO yes_check
    VALUES (1, "ANZ", "soccer ball", 30,
        "each", "each");


The example shows what could happen when the WITH CHECK OPTION clause is not
used:
• A user inserts a row through the view no_check.
• A moment later, the user runs the following:
SELECT * FROM no_check;
• The newly added row does not show up in the output.
How do you determine whether the soccer ball was successfully entered into the
database?
If the user was using yes_check instead of no_check, then the INSERT would have
been rejected with an error message (data value out of range).


Views and access privileges


REVOKE ALL ON stock FROM PUBLIC;

REVOKE ALL ON stock_info FROM PUBLIC;

GRANT SELECT ON stock_info TO
    dennis, karen, mari;


Views and access privileges


You can GRANT and REVOKE table-level privileges on views as if they were tables.
However, INSERT, UPDATE, and DELETE privileges cannot be granted if such
privileges would violate the rules discussed under Restrictions on Views. Also, the
ALTER privilege is not available for views.
You can revoke privileges on a table and then grant privileges on a view that accesses
that table, forcing users to use a view to access the table.
In the example, after the statements are executed, no user can access the stock table
unless the stock_info view is used.


System catalog tables for views


• sysviews
 Stores the CREATE VIEW statement
• sysdepend
 Pairs each view with its underlying tables and views
• systables
 Listed with tabtype = "V"


System catalog tables for views


Information on views is stored in the system catalog tables shown on the visual.
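For example, you can list all views in the current database, or retrieve the stored text of a particular view. This is a sketch: sysviews stores the view text in its viewtext column, one row per line of text, ordered by seqno:

```sql
SELECT tabname
FROM systables
WHERE tabtype = "V";

SELECT viewtext
FROM sysviews v, systables t
WHERE v.tabid = t.tabid
  AND t.tabname = "ordsummary"
ORDER BY v.seqno;
```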


Exercise 15
Views
• create simple views
• create complex views


Exercise 15: Views


Exercise 15:
Views

Purpose:
In this exercise, you will learn how to create simple and more complex views
and how you can do data validation using views.

Task 1. Creating simple views.


In this task, you will create a view from the customer table. You will drop this view
and create another view from the customer table using different column names.
1. Create a view called customer_names that includes the following columns:
• First name
• Last name
• City
• State
2. Select all columns from the customer_names view.
3. Drop the customer_names view.
4. Create another view called customer_names_custom that includes the same
columns as above but with different column names.
5. Select from the customer_names_custom view.
6. Create another view called customer_restrict that will only return companies
with “Golf” in the company name.
7. Select from the customer_restrict view.
Task 2. Creating a view with two tables.
In this task, you will be creating a view that matches customers with orders that
have been placed.
1. Create a view that will match up each customer with the orders he or she has
placed. Include only the orders that have not been shipped yet (ship_date is
null). Display the following information:
• Customer number
• Company name
• Order number
• Order date
• Date paid


Task 3. Creating a view with an aggregate function.


In this task, you will be creating a view that will give the total value of each order.
1. Create a view that will give the total value of each order. Display the following:
• Order number
• Total value of the order
Task 4. Creating a view using the WITH CHECK option.
In this task, you will be creating a view that will allow other users to insert an order
without any shipping information.
1. Create a view to only allow users to insert order information into the orders
table but no shipping information, and where the PO number begins with the
letter B.
• The columns ship_date, ship_weight, ship_charge, and ship_instruct
should not be included in the view definition.
• Use the WITH CHECK OPTION.
2. Test the view with an INSERT statement.
Results:
In this exercise, you learned how to create simple and more complex views
and how you can do data validation using views.


Exercise 15:
Views - Solutions

Purpose:
In this exercise, you will learn how to create simple and more complex views
and how you can do data validation using views.

Task 1. Creating simple views.


In this task, you will create a view from the customer table. You will drop this view
and create another view from the customer table using different column names.
1. Create a view called customer_names that includes the following columns:
• First name
• Last name
• City
• State
CREATE VIEW customer_names AS
SELECT fname, lname, city, state
FROM customer;
2. Select all columns from the customer_names view.
SELECT * FROM customer_names;
3. Drop the customer_names view.
DROP VIEW customer_names;
4. Create another view called customer_names_custom that includes the same
columns as above but with different column names.
CREATE VIEW customer_names_custom
(first, last, city, st) AS
SELECT fname, lname, city, state
FROM customer;
5. Select from the customer_names_custom view.
SELECT * FROM customer_names_custom;
Notice the column heading names that are used.


6. Create another view called customer_restrict that will only return companies
with “Golf” in the company name.
CREATE VIEW customer_restrict
(first, last, city, st, company) AS
SELECT fname, lname, city, state, company
FROM customer
WHERE company MATCHES "*Golf*";
7. Select from the customer_restrict view.
SELECT * FROM customer_restrict;
Task 2. Creating a view with two tables.
In this task, you will be creating a view that matches customers with orders that
have been placed.
1. Create a view that will match up each customer with the orders he or she has
placed. Include only the orders that have not been shipped yet (ship_date is
NULL). Display the following information:
• Customer number
• Company name
• Order number
• Order date
• Date paid
CREATE VIEW order_view AS
SELECT c.customer_num, company,
order_num, order_date, paid_date
FROM customer c, orders o
WHERE c.customer_num = o.customer_num
AND ship_date IS NULL;


Task 3. Creating a view with an aggregate function.


In this task, you will be creating a view that will give the total value of each order.
1. Create a view that will give the total value of each order. Display the following:
• Order number
• Total value of the order
CREATE VIEW sum_view (ordno, sumprice) AS
SELECT order_num, sum(total_price)
FROM items
GROUP BY order_num;
Task 4. Creating a view using the WITH CHECK option.
In this task, you will be creating a view that will allow other users to insert an order
without any shipping information.
1. Create a view to only allow users to insert order information into the orders
table but no shipping information, and where the PO number begins with the
letter B.
• The columns ship_date, ship_weight, ship_charge, and ship_instruct
should not be included in the view definition.
• Use the WITH CHECK OPTION.
CREATE VIEW ins_view AS
SELECT order_num, order_date, customer_num,
backlog, po_num, paid_date
FROM orders
WHERE po_num MATCHES "B*"
WITH CHECK OPTION;


2. Test the view with an INSERT statement.


For example:
INSERT INTO ins_view
VALUES (0, TODAY - 1, 107, "n", "M6300", TODAY);

The following error is returned:


385: Data value out of range.

The view only allows rows with a po_num starting with a B to be entered into the
table.

INSERT INTO ins_view
VALUES (0, TODAY - 1, 107, "n", "B6300", TODAY);

SELECT * FROM orders WHERE customer_num = 107;

Results:
In this exercise, you learned how to create simple and more complex views
and how you can do data validation using views.


Unit summary
• Create views
• Use views to present derived and aggregate data
• Use views to hide joins from users


Unit summary

Unit 16 Introduction to stored procedures

Introduction to stored
procedures

Informix (v12.10)





Unit objectives
• Explain the purpose of stored procedures
• Explain advantages of using stored procedures

Introduction to stored procedures © Copyright IBM Corporation 2017

Unit objectives


What are stored procedures?


• Characteristics:
 Stored procedure language (SPL) statements
 Include SQL statements
 Stored in database


What are stored procedures?


Stored procedures are SQL statements and stored procedure language (SPL)
statements that are stored as objects in a database. Some other characteristics of a
stored procedure:
• A stored procedure can contain any SQL statements except CREATE
PROCEDURE and database statements (CREATE DATABASE, DATABASE,
CLOSE DATABASE).
• The only other non-SQL statements allowed in a stored procedure are the
specialized SPL (Stored Procedure Language) statements.
• The procedure is stored in the database in a set of system catalog tables. A
stored procedure executed for the first time must be retrieved from the database.


Example of a stored procedure


CREATE PROCEDURE credit_order(p_order_num INT)
UPDATE orders
SET paid_date = TODAY
WHERE order_num = p_order_num;
END PROCEDURE;


Example of a stored procedure


The example shows a stored procedure that changes a customer order to paid. The
order number is passed in by the calling routine.
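The procedure can then be run from any SQL client. The order number 1002 below is a sample value chosen for illustration:

```sql
EXECUTE PROCEDURE credit_order(1002);
```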


SQL statements in a procedure

Regular SQL statement:

    application --(pass SQL statement)--> db server
        (SQL parsed, optimized, and executed)

SQL statement inside a procedure:

    application --(pass EXECUTE PROCEDURE statement)--> db server
        (SQL retrieved and executed)


SQL statements in a procedure


The visual compares the execution process of a regular SQL statement that is sent
from the application versus an SQL statement that is in a stored procedure.
Regular SQL statement
An SQL statement is normally passed from the front-end application to the database
server. The statement is then parsed, optimized, and executed. The results are passed
back to the application. A prepared SQL statement is more efficient than an unprepared
SQL statement, but all statements must be parsed and optimized at least once each
time the program is executed.
SQL statement inside a procedure
An SQL statement inside of a procedure does not need to be parsed since that
operation was already done when it was stored. It might or might not need to be
optimized. Only the statement used to execute the procedure (EXECUTE
PROCEDURE) is passed from the front-end application. If several SQL statements are
in a procedure, the traffic between the two processes is much less than it would be if
they were executed outside a procedure.


Compiling a stored procedure


A stored procedure is compiled when the CREATE PROCEDURE statement is
executed. The SPL statements are parsed and optimized, and converted into pcode
(pseudo code that is quickly executed by an interpreter).
Calling a stored procedure
A stored procedure can be called in one of the following ways:
• The SQL statement EXECUTE PROCEDURE can be used to call a stored
procedure. For example:
EXECUTE PROCEDURE credit_order(1012);
• A stored procedure can be implicitly called as part of a SELECT statement. For
example, the following procedure calculates a discounted price for a particular
manufacturer:
CREATE PROCEDURE discount(manuf CHAR(3), price MONEY)
RETURNING MONEY;
IF manuf = "HSK" THEN
RETURN price * 0.9;
ELSE
RETURN price;
END IF;
END PROCEDURE;
SELECT order_num, item_num, manu_code,
discount(manu_code, total_price)
FROM items
WHERE order_num = 1014;
• A stored procedure can be executed as part of the action taken by a trigger. You
will learn about triggers in the next unit.
How the stored procedure is executed
A stored procedure is executed as follows:
1. In Informix, stored procedures are cached inside the virtual portion of Informix
shared memory. When any session requires the use of a stored procedure for the
first time, the database server reads the system catalog tables to retrieve the
code for the stored procedure. The pcode is retrieved from the system catalog
and converted to binary format.
2. The arguments passed by the EXECUTE PROCEDURE or CALL statement are
parsed and evaluated.
3. If changes in the database tables require reoptimization, it occurs at this time. If
an item needed in the execution of the SQL statement is missing (if a column or
table has been dropped, for example), an error occurs at this time.
4. The interpreter then executes the pcode instructions.


Some advantages of stored procedures


• Stored procedures can reduce program complexity
• Performance gains occur in some cases
• An extra level of security can be added
• Business rules can be enforced
• Different applications can share the same code
• In a client/server environment, you do not have to distribute code to
hundreds of clients; there is only one source
• Where code can be executed by clients with different user interfaces,
you have to maintain only one set of code


Some advantages of stored procedures


Some advantages of using stored procedures include:
• Stored procedures can reduce program complexity by taking some of the code
that accomplishes a specific function out of the program and putting it in the
database server. For example, a common operation that debits a customer
savings account and credits a checking account might be a perfect candidate for
a stored procedure. The program would execute that procedure, passing in the
needed variable information.
• Some performance gains might be seen by putting multiple SQL statements
inside a stored procedure. The performance gains are from the decreased traffic
between the client application and the database server, especially if the
application resides on another machine. Performance gains can occur because
of the diminished need to parse and optimize the procedure.
• Stored procedures offer an ability to restrict access to a table. Without using
stored procedures, if an administrator grants insert permissions to a user, that
user can insert a row using dbaccess, or a program. This could be a problem if an
administrator wants to enforce any business rules (see next bullet). Rather than
granting insert privileges, an administrator can force users to execute a procedure
to perform the insert.


• Using the extra level of security that a stored procedure provides, you can use
stored procedures to enforce business rules. For example, you can prohibit users
from deleting a row without first storing it in an archive table by writing a stored
procedure to accomplish both tasks and prohibit users from directly accessing the
table.
• Different programs requiring use of the same code can execute a stored
procedure rather than having the same code included in each program. The code
is stored in only one place, eliminating duplicate code.
• Stored procedures are especially helpful in a client/server environment. If a
change is made to application code, it must be distributed to every client. If a
change is made to a stored procedure, it resides in only one location.
• Instead of centralizing database code in applications, you move this code to the
database server. This allows applications to concentrate on user interface
interaction. This is especially important if there are multiple types of user
interfaces required.


Stored procedure performance


• The procedure might be retrieved from disk
• The procedure must be converted from the ASCII representation to the
binary executable form
• A decision whether to reoptimize must be made


Stored procedure performance


A stored procedure has extra costs associated with its execution that SQL does not
have:
• Stored procedures are stored in system catalog tables on disk. Before they can
be used for the first time, they must be retrieved from disk. Informix servers use
the stored procedure in shared memory if it is present. Stored procedures are
kept in the stored procedure cache on a most recently used basis.
• Before the procedure can be used, it needs to be changed from character format
to an executable format.
• The procedure is reoptimized if columns, tables, or indexes that are involved in a
procedure have been changed. This checking occurs each time the procedure is
executed.
Because of the extra costs of executing a stored procedure, you may not see any
performance gains by using procedures to execute single SQL statements.


System catalog tables


• The sysprocedures table lists the characteristics for each function and
procedure
• The sysprocbody table describes the compiled version of each
procedure or function
• The sysprocplan table describes the query-execution plans and
dependency lists


System catalog tables


The tables listed in the visual are some of the system catalog tables used to track
stored procedures within a database. They include entries for both internal and external
routines and procedures.
For more information on stored procedures, consider taking the course IX711 - IBM
Informix Stored Procedures and Triggers.
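For example, the procedures stored in the current database can be listed with a query against sysprocedures. This is a sketch using the procname and numargs columns of that catalog table:

```sql
SELECT procname, numargs
FROM sysprocedures;
```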


Exercise 16
Introduction to stored procedures
• Create a stored procedure


Exercise 16: Introduction to stored procedures


Exercise 16:
Introduction to stored procedures

Purpose:
In this exercise, you will learn how to create a stored procedure, execute it
explicitly and how to use it in a SELECT statement.

Task 1. Using a stored procedure in a SELECT statement.


In this task, you will create a stored procedure, execute the procedure explicitly, and
use the procedure in a SELECT statement.
1. From time to time, certain manufacturers offer promotional pricing for their
products. Using the example provided in this module, create a procedure that
accepts a manufacturer’s code value and a price and returns a price based on
the following discounts:
Manufacturer Name Discount
Husky 10%
Norge 15%
ProCycle 5%
For all other manufacturers, the undiscounted price is to be returned.

Hint: SPL Syntax: IF-THEN


IF expression THEN
statement_block
[ELIF expression THEN
statement_block ]
[...]
END IF;

2. Execute the procedure you just wrote using sample manufacturer’s codes and
prices.


3. Write a SELECT statement to select all rows in the items table for order
numbers 1047, 1062, 1065 and 1080. Display the following information sorted
by order and item number:
• Order number
• Item number
• Manufacturer code
• Original Item price
• Discounted price
Results:
In this exercise, you learned how to create a stored procedure, execute it
explicitly and how to use it in a SELECT statement.


Exercise 16:
Introduction to stored procedures - Solutions
Purpose:
In this exercise, you will learn how to create a stored procedure, execute it
explicitly and how to use it in a SELECT statement.

Task 1. Using a stored procedure in a SELECT statement.


In this task, you will create a stored procedure, execute the procedure explicitly, and
use the procedure in a SELECT statement.
1. From time to time, certain manufacturers offer promotional pricing for their
products. Using the example provided in this module, create a procedure that
accepts a manufacturer’s code value and a price and returns a price based on
the following discounts:
Manufacturer Name Discount
Husky 10%
Norge 15%
ProCycle 5%
For all other manufacturers, the undiscounted price is to be returned.
Hint: SPL Syntax: IF-THEN
IF expression THEN
statement_block
[ELIF expression THEN
statement_block ]
[...]
END IF;

CREATE PROCEDURE discount(manuf CHAR(3), price MONEY)
RETURNING MONEY;
IF manuf = "HSK" THEN
RETURN price * .9;
ELIF manuf = "NRG" THEN
RETURN price * .85;
ELIF manuf = "PRO" THEN
RETURN price * .95;
ELSE
RETURN price;
END IF;
END PROCEDURE;


2. Execute the procedure you just wrote using sample manufacturer’s codes and
prices.
EXECUTE PROCEDURE discount ("HSK", 1000);
EXECUTE PROCEDURE discount ("NRG", 1000);
EXECUTE PROCEDURE discount ("PRO", 1000);

3. Write a SELECT statement to select all rows in the items table for order
numbers 1047, 1062, 1065 and 1080. Display the following information sorted
by order and item number:
• Order number
• Item number
• Manufacturer code
• Original Item price
• Discounted price
SELECT order_num, item_num, manu_code, total_price,
discount(manu_code, total_price) AS net_price
FROM items
WHERE order_num IN (1047, 1062, 1065, 1080)
ORDER BY order_num, item_num;

Results:
In this exercise, you learned how to create a stored procedure, execute it
explicitly, and use it in a SELECT statement.


Unit summary
• Explain the purpose of stored procedures
• Explain advantages of using stored procedures

Introduction to stored procedures © Copyright IBM Corporation 2017

Unit summary


Unit 17 Triggers

Triggers

Informix (v12.10)



Unit objectives
• Create and execute a trigger
• Drop a trigger
• Use the system catalogs to access trigger information


Unit objectives


What is a trigger?

Trigger EVENT on a table → ACTION

EVENT            ACTION
INSERT           INSERT
UPDATE           UPDATE
DELETE           DELETE
SELECT           EXECUTE PROCEDURE


What is a trigger?
A trigger is a database mechanism that automatically executes an SQL statement when
a certain event occurs. The event that can trigger an action can be an INSERT,
UPDATE, DELETE, or a SELECT statement on a specific table. The statement that
triggers an action can specify either a table, or one or more columns within the table.
The table on which the trigger event operates is called the triggering table.
When the trigger event occurs, the trigger action is executed. The action can be any
combination of one or more INSERT, UPDATE, DELETE, or EXECUTE PROCEDURE
statements.
Triggers are a feature of the database server, so the type of application tool you use to
access the database is irrelevant in the execution of a trigger. By invoking triggers from
the database, a DBA can ensure that data is treated consistently across application
tools and programs. Triggers are frequently used with stored procedures, since the
SQL statement that a trigger executes can be an EXECUTE PROCEDURE statement.


CREATE TRIGGER
CREATE TRIGGER trigger_name ...

• Additional clauses define:
 Trigger event: What event causes the trigger to execute
 REFERENCING clause: Provides before and after values to the trigger action
 WHEN clause: Conditionality
 Trigger action: What action to execute as a result of the event


CREATE TRIGGER
The CREATE TRIGGER statement is used to create a trigger in the database.
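The clauses listed above fit together as in the following sketch (the table and column names are hypothetical, not objects in the demonstration database):

```sql
CREATE TRIGGER low_stock_warn              -- trigger name
UPDATE OF qty ON stock_level               -- trigger event
REFERENCING OLD AS pre NEW AS post         -- values for the action
FOR EACH ROW WHEN (post.qty < 10)          -- conditionality
(INSERT INTO stock_warnings                -- trigger action
    VALUES (post.stock_num, post.qty, CURRENT));
```

Each of these clauses is covered in the pages that follow.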


Trigger events
• Trigger events:
 INSERT
 DELETE
 UPDATE
 SELECT
• Can define multiple triggers for same event on table
• Can define multiple INSTEAD OF triggers for same event on same
view


Trigger events
The trigger event can be an INSERT, UPDATE, SELECT, or DELETE SQL statement.
• You can define multiple triggers for INSERT, DELETE, UPDATE, and SELECT
types of triggering events on the same table.
• You can define multiple INSTEAD OF triggers for INSERT, DELETE, and
UPDATE types of triggering events on the same view.
Remote tables
The table specified by the trigger event must be a table in the current database. You
cannot specify a remote table.
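For example, the server accepts two independent triggers for the same INSERT event on one table; a sketch with hypothetical tables:

```sql
-- Both triggers fire when a row is inserted into stock_level
CREATE TRIGGER ins_audit INSERT ON stock_level
REFERENCING NEW AS post
FOR EACH ROW (INSERT INTO stock_audit
    VALUES (post.stock_num, CURRENT));

CREATE TRIGGER ins_count INSERT ON stock_level
AFTER (UPDATE stock_stats SET num_inserts = num_inserts + 1);
```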


Trigger action
• The trigger action specifies the action that should occur and when it
should occur:
 Execute before rows are processed:
− BEFORE (define action here)
 Execute after each row is processed:
− FOR EACH ROW (define action here)
 Execute after all rows are processed:
− AFTER (define action here)
• The action can be either a SQL statement or a stored procedure.


Trigger action
Trigger actions are executed at the following times:
• Before the trigger event occurs: The BEFORE triggered action list executes once
before the trigger event executes. Even if no rows are processed by the trigger
event, the BEFORE trigger actions are still executed.
• After each row is processed by the trigger event: The FOR EACH ROW trigger
action occurs once after each row is processed by the trigger event.
• After the trigger event completes: The AFTER triggered action list executes once
after the trigger event executes. If no rows are processed by the triggering
statement, the AFTER triggered action list is still executed.
You cannot reference the triggering table in any of the trigger action SQL statements.
Exceptions include an UPDATE statement that updates columns not listed in the
trigger event, and SELECT statements in a subquery or stored procedure.
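All three action lists can appear in one trigger definition; a sketch that uses the items table as the triggering table (the two log tables are hypothetical):

```sql
CREATE TRIGGER del_items_log
DELETE ON items
REFERENCING OLD AS pre
BEFORE (INSERT INTO del_log                 -- once, even if no rows match
    VALUES ('start', CURRENT))
FOR EACH ROW (INSERT INTO del_detail        -- once per deleted row
    VALUES (pre.order_num, pre.item_num))
AFTER (INSERT INTO del_log                  -- once, after all rows
    VALUES ('end', CURRENT));
```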


REFERENCING example
• Use the REFERENCING clause to provide before and after values to the action:
CREATE TRIGGER salary_upd
UPDATE OF salary ON employee
REFERENCING NEW AS post OLD AS pre
FOR EACH ROW
(INSERT INTO salary_audit
(update_dtime, whodunit, old_salary, new_salary)
VALUES (CURRENT, USER, pre.salary, post.salary));
CREATE TRIGGER items_upd
UPDATE OF total_price ON items
REFERENCING NEW AS post OLD AS pre
FOR EACH ROW
(UPDATE orders
SET order_price = order_price +
post.total_price - pre.total_price
WHERE order_num = post.order_num);

REFERENCING example
The salary_upd trigger shown is an example of how to insert a row into an audit table
whenever the salary of an employee is changed.
The items_upd trigger is an example of how to update the derived value order_price in
the orders table whenever the price of the items in the order has changed.
The NEW and OLD correlation values are needed for the actions in both triggers.
Please note: These examples are for illustration purposes only. The column
order_price does not exist in the demonstration database, nor does the employee table.


The WHEN condition


• The WHEN condition allows you to base the triggered action on the
outcome of a test:
CREATE TRIGGER ins_cust_calls
INSERT ON cust_calls
{Flag problems for billing dept. review}
REFERENCING NEW AS post
FOR EACH ROW WHEN(post.call_code = 'B')
(INSERT INTO warn_billing
VALUES(post.customer_num));


The WHEN condition


You can specify that the trigger action only occurs if a certain condition is true by
including the WHEN clause.
When the WHEN condition evaluates to true, the accompanying trigger action
statements are executed. When the WHEN condition evaluates to false or unknown,
the trigger action statements are not executed.
You can include one or more WHEN conditions after the BEFORE, FOR EACH ROW,
and AFTER keywords. Each WHEN condition is evaluated separately; for example:
FOR EACH ROW
WHEN (post.call_code = "B")
INSERT INTO warn_billing VALUES(post.customer_num),
WHEN (post.call_code = "C")
INSERT INTO complaints VALUES(post.customer_num)
The condition can contain boolean expressions such as BETWEEN, IN, IS NULL, LIKE,
and MATCHES. You can use a subquery as a part of the condition. The condition can
also contain keywords such as TODAY, USER, CURRENT, and SITENAME.
The example demonstrates a conditional trigger action. Billing complaints
(call_code = B) are flagged by putting the customer number in the warn_billing table
when they occur.
The WHEN clause is not available with SELECT triggers.


Cascading triggers
CREATE TRIGGER del_cust --cascading delete
DELETE ON customer
REFERENCING OLD AS pre_del
FOR EACH ROW (
DELETE FROM orders
WHERE customer_num = pre_del.customer_num,
DELETE FROM cust_calls
WHERE customer_num =
pre_del.customer_num);
CREATE TRIGGER del_orders
DELETE ON orders
REFERENCING OLD AS pre_del
FOR EACH ROW (
DELETE FROM items
WHERE order_num = pre_del.order_num);


Cascading triggers
Executing one trigger can cause another trigger to be executed, as shown in the
example. Deleting a customer row causes the del_cust trigger to execute. The del_cust
trigger deletes a row from the orders table, which in turn triggers the del_orders trigger.
When these triggers complete, the DELETE statements are executed in this order:
DELETE customer
DELETE orders
DELETE items
DELETE cust_calls
This technique was frequently used before cascading deletes became a feature of the
CREATE TABLE statement. Cascading deletes make it possible to define a referential
constraint in which the database server automatically deletes child rows when a parent
row is deleted.
Cascading is not supported with ON SELECT triggers.
You can place comments within a trigger in a line by prefixing it with two dashes (--) as
shown in the example. You can also include a comment by enclosing it between two
braces ({}). The use of two dashes is the ANSI-compliant method of introducing a
comment.


Multiple triggers
• Multiple triggers on a table can include the same or different columns

CREATE TRIGGER trig1
UPDATE OF item_num, stock_num ON items
REFERENCING OLD AS pre NEW AS post
FOR EACH ROW(EXECUTE PROCEDURE proc1());

CREATE TRIGGER trig2
UPDATE OF manu_code ON items
BEFORE(EXECUTE PROCEDURE proc2());

CREATE TRIGGER trig3
UPDATE OF order_num, stock_num ON items
BEFORE(EXECUTE PROCEDURE proc3());

Multiple triggers
Multiple triggers on a table can include the same or different columns.
In the example, update trigger trig3 on the items table includes stock_num in its column
list, which is also a triggering column in trig1.


Multiple triggers: Execution order


• Column number of triggering columns determines order of execution:
 If multiple triggers set on same column or columns, order of execution not
guaranteed
CREATE TABLE taba (a int, b int, c int, d int);
CREATE TRIGGER trig1 UPDATE OF a, c ON taba
AFTER (UPDATE tabb SET y = y + 1);
CREATE TRIGGER trig2 UPDATE OF b, d ON taba
AFTER (UPDATE tabb SET z = z + 1);
UPDATE taba SET (b, c) = (b + 1, c + 1);
 In this example, trig1 executes first
− Column a has lower colnum than column b


Multiple triggers: Execution order


When an UPDATE statement updates multiple columns that have different triggers, the
column numbers of the triggering columns determine the order of trigger execution.
Execution begins with the lowest triggering column number and proceeds in order to
the highest triggering column number. If several update triggers are set on the same
column or on the same set of columns, however, the order of trigger execution is not
guaranteed.
The following example shows a table - taba - with four columns (a, b, c, d):
CREATE TABLE taba (a int, b int, c int, d int);
Define trig1 as an update on columns a and c, and define trig2 as an update on
columns b and d, as the following example shows:
CREATE TRIGGER trig1 UPDATE OF a, c ON taba
AFTER (UPDATE tabb SET y = y + 1);
CREATE TRIGGER trig2 UPDATE OF b, d ON taba
AFTER (UPDATE tabb SET z = z + 1);


The following example shows a triggering statement for the update trigger:
UPDATE taba SET (b, c) = (b + 1, c + 1);
Trig1 for columns a and c executes first, and trig2 for columns b and d executes next. In
this case, the lowest column number in the two triggers is column 1 (a), and the next is
column 2 (b).


INSTEAD OF trigger on view


• Initiated when triggering DML references specified view
 Can be INSERT, UPDATE, or DELETE
• INSTEAD OF trigger action replaces trigger event:
 Trigger event not executed
 Trigger action executed instead
• Can have multiple INSTEAD OF triggers defined for each type of
triggering event


INSTEAD OF trigger on view


A view can have any number of INSTEAD OF triggers defined for each type of
INSERT, DELETE, or UPDATE triggering event.
The INSTEAD OF trigger replaces the trigger event with the specified trigger action on
a view, rather than executing the triggering INSERT, DELETE, or UPDATE operation.


INSTEAD OF trigger on view: Example


• View on dept and emp tables:
CREATE VIEW manager_info AS
SELECT d.deptno, d.deptname, e.empno, e.empname
FROM dept d, emp e
WHERE d.manager_num = e.empno;
• INSTEAD OF trigger on view manager_info:
CREATE TRIGGER manager_info_insert
INSTEAD OF INSERT ON manager_info
REFERENCING NEW AS n
FOR EACH ROW
(EXECUTE PROCEDURE instab(n.deptno, n.empno));
• Triggering event:
INSERT INTO manager_info(deptno, empno) VALUES (8, 4232);
• INSERT not executed on view
• Procedure instab executed instead

INSTEAD OF trigger on view: Example


In this example, dept and emp are tables that list departments and employees:
CREATE TABLE dept (
deptno INTEGER PRIMARY KEY,
deptname CHAR(20),
manager_num INT
);
CREATE TABLE emp (
empno INTEGER PRIMARY KEY,
empname CHAR(20),
deptno INTEGER REFERENCES dept(deptno),
startdate DATE
);
ALTER TABLE dept ADD CONSTRAINT
(FOREIGN KEY (manager_num) REFERENCES emp(empno));
The next statement defines manager_info, a view of columns in the dept and emp
tables that includes all the managers of each department:
CREATE VIEW manager_info AS
SELECT d.deptno, d.deptname, e.empno, e.empname
FROM dept d, emp e
WHERE d.manager_num = e.empno;


The following CREATE TRIGGER statement creates manager_info_insert, an
INSTEAD OF trigger that is designed to insert rows into the dept and emp tables
through the manager_info view:
CREATE TRIGGER manager_info_insert
INSTEAD OF INSERT ON manager_info --defines trigger event
REFERENCING NEW AS n --new manager data
FOR EACH ROW --defines trigger action
(EXECUTE PROCEDURE instab(n.deptno, n.empno));
CREATE PROCEDURE instab (dno INT, eno INT)
INSERT INTO dept(deptno, manager_num)
VALUES(dno, eno);
INSERT INTO emp (empno, deptno)
VALUES (eno, dno);
END PROCEDURE;
The database server treats the following INSERT statement as a triggering event:
INSERT INTO manager_info(deptno, empno) VALUES (8, 4232);
This triggering INSERT statement is not executed, but this event causes the trigger
action to be executed instead, invoking the instab SPL routine.
The INSERT statements in the SPL routine insert new values into both the emp and
dept base tables of the manager_info view.


Triggers and stored procedures


• Stored procedures can be called from triggers
• Some restrictions are:
 The stored procedure called from a trigger cannot contain:
− BEGIN WORK
− COMMIT WORK
− ROLLBACK WORK
− SET CONSTRAINTS
 As the trigger action, the stored procedure cannot be a cursory procedure
(returning more than one row)


Triggers and stored procedures


A common method to perform complex processing within a trigger is to have it call one
or more stored procedures. However, stored procedures have some restrictions if they
are used as part of a trigger action:
• The stored procedure cannot contain the BEGIN WORK, COMMIT WORK,
ROLLBACK WORK, or SET CONSTRAINTS statement.
• The stored procedure included in a CREATE TRIGGER statement cannot return
more than one row (with the RETURN WITH RESUME statement). The following
error message appears if you do:
686: Procedure (xxx) has returned more than one row.


Trigger procedures
• SPL routine which can only be invoked from FOR EACH ROW section
• Must include WITH TRIGGER REFERENCES when using EXECUTE
PROCEDURE statement to invoke trigger
CREATE TRIGGER ins_trig_tab1 INSERT ON tab1
REFERENCING NEW AS post
FOR EACH ROW
(EXECUTE PROCEDURE proc1()
WITH TRIGGER REFERENCES);


Trigger procedures
A trigger procedure is an SPL routine that EXECUTE PROCEDURE can invoke only
from the FOR EACH ROW section of the action clause of a trigger definition.
You must include the WITH TRIGGER REFERENCES keywords when you use the
EXECUTE PROCEDURE statement to invoke a trigger procedure.
Such procedures must include the REFERENCING clause and the FOR clause in the
CREATE PROCEDURE statement that defined the procedure.
This REFERENCING clause declares names for correlated variables that the
procedure can use to reference the old column value in the row when the trigger event
occurred, or the new value of the column after the row was modified by the trigger.
The FOR clause specifies the table or view on which the trigger is defined.
The following statement defines an insert trigger on tab1 that calls proc1 from the FOR
EACH ROW section as its triggered action, and performs an INSERT operation that
activates this trigger:
CREATE TRIGGER ins_trig_tab1 INSERT ON tab1
REFERENCING NEW AS post
FOR EACH ROW(EXECUTE PROCEDURE proc1() WITH TRIGGER REFERENCES);


Procedure triggers and Boolean operators


• Boolean operators check DML of triggering event:
CREATE PROCEDURE proc1()
REFERENCING OLD AS o NEW AS n FOR tab1;
IF (INSERTING) THEN
LET n.col1 = n.col1 + 1;
INSERT INTO temptab1 VALUES(0,n.col1,1,n.col2);
END IF
IF (UPDATING) THEN
INSERT INTO temptab1 values(o.col1,n.col1,o.col2,n.col2);
END IF
IF (SELECTING) THEN
INSERT INTO temptab1 VALUES(o.col1,0,o.col2,0);
END IF
IF (DELETING) THEN
DELETE FROM temptab1 WHERE temptab1.col1 = o.col1;
END IF
END PROCEDURE;
• New values can be modified:
INSERT INTO tab1 VALUES(111,222); -- inserts values (112, 222)


Procedure triggers and Boolean operators


In the trigger example shown, note that the required REFERENCING clause also
includes the clause FOR tab1. The FOR clause identifies the table or view on which the
trigger was defined - in this case tab1.
The following example defines three tables and a trigger procedure that references one
of these tables in its FOR clause:
CREATE TABLE tab1 ( col1 INT, col2 INT );
CREATE TABLE tab2 ( col1 INT );
CREATE TABLE temptab1 (
old_col1 INT,
new_col1 INT,
old_col2 INT,
new_col2 INT
);
CREATE PROCEDURE proc1()
REFERENCING OLD AS o NEW AS n FOR tab1;
IF (INSERTING) THEN
LET n.col1 = n.col1 + 1;
INSERT INTO temptab1 VALUES(0,n.col1,1,n.col2);
END IF
IF (UPDATING) THEN
INSERT INTO temptab1 values(o.col1,n.col1,o.col2,n.col2);
END IF


IF (SELECTING) THEN
INSERT INTO temptab1 VALUES(o.col1,0,o.col2,0);
END IF
IF (DELETING) THEN
DELETE FROM temptab1 WHERE temptab1.col1 = o.col1;
END IF
END PROCEDURE;
This trigger procedure illustrates that the triggered action can be a different DML
operation from the triggering event - in this case, the insert trigger from the previous
visual.
This procedure inserts a row when an insert trigger calls it and deletes a row when a
delete trigger calls it. It also performs INSERT operations if it is called by a select trigger
or by an update trigger.
The proc1 trigger procedure in this example uses Boolean conditional operators that
are valid only in trigger routines.
The INSERTING operator returns true only if the procedure is called from the FOR
EACH ROW action of an INSERT trigger. This procedure can also be called from other
triggers whose trigger event is an UPDATE, SELECT, or DELETE statement because
the UPDATING, SELECTING and DELETING operators return true if the procedure is
invoked in the triggered action of the corresponding type of triggering event.
The REFERENCING clause of the trigger declares a correlation name for the NEW
value that is different from the correlation name that the trigger procedure declared.
These names do not need to match, because the correlation name that was declared in
the trigger procedure has that procedure as its scope of reference.
The following statement activates the ins_trig_tab1 trigger, which executes the proc1
procedure.
INSERT INTO tab1 VALUES (111,222);
Because the trigger procedure increments the new value of col1 by 1, the values
inserted are (112 and 222), rather than the value 111 that the original triggering event
(INSERT) specified.


Discontinuing an operation
• Use a stored procedure to roll back a triggering event:

CREATE PROCEDURE stop_processing()
RAISE EXCEPTION -745;
END PROCEDURE;

CREATE TRIGGER trig1 INSERT ON tab1
REFERENCING NEW AS new_val
FOR EACH ROW WHEN (new_val.col2 > 20)
(EXECUTE PROCEDURE stop_processing());


Discontinuing an operation
Stored procedure language has a statement called RAISE EXCEPTION that
discontinues the stored procedure with an error (if the error is not trapped in a stored
procedure with the ON EXCEPTION statement) and returns control to the application.
The RAISE EXCEPTION statement can be used to discontinue both the trigger event
and the trigger action. If the database has been created with logging, the application
can then roll back the transaction.
Error number -745 is reserved for use with triggers. The error message that the users
receive is:
745: Trigger execution has failed.
The application code is responsible for checking for errors after the triggering SQL
statement and issuing a ROLLBACK WORK.
Any error code could be used in the RAISE EXCEPTION statement that is called from
the trigger. You do not have to use error 745.


Capturing the error in the application


• Example using 4GL:

BEGIN WORK
INSERT INTO tab1 VALUES(1,30)
IF (sqlca.sqlcode < 0) THEN
   DISPLAY 'error ', sqlca.sqlcode,
           ' on insert statement'
   ROLLBACK WORK
ELSE
   COMMIT WORK
END IF
....

Failure of a trigger action can be captured by checking the sqlcode.


Capturing the error in the application


Error -745 (or any other Informix error that occurs from a trigger action) is returned in
the sqlca structure to the application. It is up to the application programmer to check for
the sqlcode after an SQL statement. If an error occurs, the programmer can roll back
the transaction.


Customizable error messages


• Use error -746 to display a custom message:

CREATE PROCEDURE stop_processing_col2()
RAISE EXCEPTION -746,0, 'Error, col2 exceeds 20.';
END PROCEDURE;

CREATE TRIGGER trig1 INSERT ON tab1
REFERENCING NEW AS new_val
FOR EACH ROW WHEN(new_val.col2 > 20)
(EXECUTE PROCEDURE stop_processing_col2());


Customizable error messages


To customize the error message returned to the user, use error -746 in the RAISE
EXCEPTION statement. The third parameter of the RAISE EXCEPTION statement is
the error message. The customized text is placed in the sqlerrm field of the sqlca
structure. You can place any text within quotes in the third parameter and it is returned
to the application process as the error message. Limit the text of the message to 72
characters or less.


Dropping a trigger
• The DROP TRIGGER statement deletes the trigger from the database.

DROP TRIGGER trig_name;


Dropping a trigger
Deleting a table causes triggers that reference that table in the Trigger Event clause to
be deleted.
When you alter a table and drop a column, the column is dropped from the trigger
column list in the trigger event. Triggers that reference the table in the trigger action
are not deleted. You must find and drop those triggers yourself.


Cursors and triggers


• INSERT cursors:
 When the row is flushed to the database server, the complete trigger is
executed for each INSERT statement.

• UPDATE cursors:
 Each UPDATE WHERE CURRENT OF statement executes the complete
trigger.


Cursors and triggers


Use an insert cursor to increase performance because it buffers the contents of several
inserts in application memory before they are sent to the database server. The rows are
flushed to the database server when the FLUSH statement is executed or when the
buffer is full. When the data is flushed to the database server, each row causes the
trigger to be executed in full as if it were a singleton INSERT statement. Select cursors
also execute the trigger for each row.
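A sketch of the insert-cursor pattern in Informix 4GL (the table and record names are hypothetical; check the PUT and FLUSH statements in your 4GL documentation):

```
DECLARE ins_cur CURSOR FOR
    INSERT INTO stock_audit VALUES (p_audit.*)

OPEN ins_cur
FOR i = 1 TO 5
    PUT ins_cur        # row buffered in application memory
END FOR
FLUSH ins_cur          # buffered rows sent; the trigger fires
                       # once, in full, for each row
CLOSE ins_cur
```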
UPDATE or DELETE statements within cursors act differently than a singleton
UPDATE or DELETE statement. The entire trigger is executed with each UPDATE or
DELETE with the WHERE CURRENT OF clause. For example, if five rows are
changed with a cursor, the BEFORE, FOR EACH, and AFTER trigger actions are
executed five times, once for each row.


Example of an UPDATE cursor


• Example using 4GL:
WHENEVER ERROR CONTINUE
UPDATE customer
   SET customer.* = gr_customer.*
   WHERE CURRENT OF lockcust
IF sqlca.sqlcode < 0 THEN
   ERROR 'Error number ',
         sqlca.sqlcode USING '-<<<<',
         ' has occurred.'
   ROLLBACK WORK
ELSE
   MESSAGE 'Customer updated.'
   COMMIT WORK

The entire trigger is executed for each UPDATE WHERE CURRENT OF statement.


Example of an UPDATE cursor


The Informix 4GL code excerpt shown is an example of an UPDATE statement within a
cursor. The CURRENT OF keywords are used to update the current row of the active
set of a cursor. In a case like the one shown, if a trigger was activated because of the
UPDATE statement, the entire trigger (BEFORE, FOR EACH ROW, and AFTER
operations) would be executed for each UPDATE statement (that is, once for each
row), even if it occurred within a cursor.


Triggers and constraint checking


• Constraint checking is deferred during the execution of the trigger
action
• After the trigger is executed, all constraints are checked for violations


Triggers and constraint checking


If logging is enabled for a database, the database server defers checking of all
constraints until the trigger action has completed to prevent a violation of constraints
when the trigger action is executed. All constraints are checked after the trigger action.
This is equivalent to running SET CONSTRAINTS ALL DEFERRED before the
statements and SET CONSTRAINTS constraint_list IMMEDIATE after the statements.
For databases without logging, an error is generated if a constraint violation occurs.


System catalogs for triggers


• Two system catalog tables were created specifically for storing trigger
information:
 systriggers: Holds miscellaneous information about the trigger.
 systrigbody: Contains both the English text for the trigger and the code
used to execute the trigger action.


System catalogs for triggers


The systriggers and systrigbody system catalog tables hold information about triggers.
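A sketch of how these catalogs might be queried (the column names follow the Informix system catalog documentation; verify them against your release):

```sql
-- Which triggers are defined on the customer table?
SELECT t.trigname, t.event, t.old, t.new
  FROM systriggers t, systables s
 WHERE t.tabid = s.tabid
   AND s.tabname = 'customer';

-- Readable text of one trigger (the datakey codes for the
-- text rows are assumed here; other rows hold compiled form)
SELECT b.data
  FROM systrigbody b, systriggers t
 WHERE b.trigid = t.trigid
   AND t.trigname = 'del_cust'
   AND b.datakey IN ('D', 'A')
 ORDER BY b.seqno;
```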


Managing triggers
• If a table is dropped, all associated triggers are dropped
• If the database is dropped, all triggers are dropped
• Managing triggers as if they were application code is recommended


Managing triggers
Triggers are created with SQL statements, which makes them easy to create and drop.
However, triggers contain important rules for the data and can be easily overlooked
when it comes to proper source-code maintenance procedures.
If a DBA unwittingly drops a database (perhaps to recreate it later) without saving the
triggers associated with it, the triggers are lost.
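One way to preserve trigger definitions is to capture the schema with the dbschema utility before dropping anything (a sketch; confirm the option syntax in your release's utilities documentation):

```
dbschema -d stores_demo > stores_demo_schema.sql
```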


Security and triggers


• You must have permission to run both the triggering event and the
trigger action unless the owner has WITH GRANT OPTION.
• For stored procedures as trigger actions, you must have permission to
execute the stored procedure, or the owner of the trigger must have
been granted EXECUTE permissions WITH GRANT OPTION.


Security and triggers


Without the WITH GRANT OPTION, the user must have permissions on all the tables
in the trigger action.
If a stored procedure is part of the trigger action, the owner of the trigger should be
granted EXECUTE permissions WITH GRANT OPTION for all users to execute the
stored procedure. Without the WITH GRANT OPTION, each user must have
permissions to execute the stored procedure.
In addition, users need permissions to execute SQL statements within the stored
procedure. Creating a DBA stored procedure allows any user to execute statements
within that stored procedure.
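For example, if a trigger action executes a stored procedure, the procedure owner might grant the trigger owner EXECUTE permission WITH GRANT OPTION. The user and procedure names in this sketch are hypothetical:

GRANT EXECUTE ON PROCEDURE log_delete TO trig_owner WITH GRANT OPTION;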


Exercise 17
Triggers
• Create a trigger that records any deletes from the customer table


Exercise 17: Triggers


Exercise 17:
Triggers

Purpose:
In this exercise, you will learn how to create and manage triggers.

Task 1. Create a trigger.


In this task, you will create a table called history with the same columns as the
customer table. You will create a delete trigger that takes a row deleted from the
customer table and inserts it into the history table.
1. Create a history table with the same columns as the customer table.
2. Create a DELETE trigger named del_cust that will take a row deleted from the
customer table and insert it into the history table.
3. Execute the following DELETE statement:
DELETE FROM customer WHERE customer_num = 102;
4. Verify that a row was inserted in the history table.
Task 2. Create multiple triggers on the same column.
In this task, you will create and execute multiple SELECT triggers on the same
column in a table.
1. Create the order_queries table, consisting of columns order_num and
customer_num, the login ID of the user doing the select, and a timestamp
column for recording when the query was run down to the second.
2. Create the hit_list table, consisting of columns order_num and num_selects
and a counter for the number of times the column was selected.
3. Initialize the hit_list table with all the order numbers, and a value of 0 for
num_selects.
4. Create a SELECT trigger on the order_num column of the orders table. This
trigger will insert an audit record into the order_queries table that consists of
the order number, customer number, the user ID of the user executing the
query, and the timestamp of the date and time (to the second) when the query
was made.
5. Create another select trigger on the order_num column of the orders table.
This trigger will increment the num_selects column of the hit_list table every
time a select of the order_num column is made.


6. Query the order_queries and hit_list tables to check existing data for
order 1017.
7. Select information from the orders table for order 1017.
8. Query the order_queries and hit_list tables again for order 1017 to verify the
trigger actions.
Task 3. Retrieve trigger information from the database server.
In this task, you will use the dbschema utility to retrieve information about the
trigger created in a previous task.
1. Run the following command and view the information on the triggers:
$ dbschema -d stores_demo -t orders
2. Challenge: Execute a SELECT statement to retrieve the CREATE TRIGGER
statement for the del_cust trigger from the system catalog tables.
RESULTS:
In this exercise, you learned how to create and manage triggers.


Exercise 17:
Triggers - Solutions

Purpose:
In this exercise, you will learn how to create and manage triggers.

Task 1. Create a trigger.


In this task, you will create a table called history with the same columns as the
customer table. You will create a delete trigger that takes a row deleted from the
customer table and inserts it into the history table.
1. Create a history table with the same columns as the customer table.
CREATE TABLE history (
customer_num INTEGER NOT NULL,
fname CHAR(20),
lname CHAR(20),
company CHAR(20),
address1 CHAR(20),
address2 CHAR(20),
city CHAR(15),
state CHAR(2),
zipcode CHAR(5),
phone CHAR(18)
);
2. Create a DELETE trigger named del_cust that will take a row deleted from the
customer table and insert it into the history table.
CREATE TRIGGER del_cust DELETE ON customer
REFERENCING old AS pre
FOR EACH ROW
(INSERT INTO history VALUES
(pre.customer_num, pre.fname,
pre.lname, pre.company, pre.address1,
pre.address2, pre.city, pre.state,
pre.zipcode, pre.phone));


3. Execute the following DELETE statement:


DELETE FROM customer WHERE customer_num = 102;
4. Verify that a row was inserted in the history table.
SELECT * FROM history;
Task 2. Create multiple triggers on the same column.
In this task, you will create and execute multiple SELECT triggers on the same
column in a table.
1. Create the order_queries table, consisting of columns order_num and
customer_num, the login ID of the user doing the select, and a timestamp
column for recording when the query was run down to the second.
CREATE TABLE order_queries (
order_num INTEGER,
customer_num INTEGER,
user CHAR(18),
select_dtime DATETIME YEAR TO SECOND);
2. Create the hit_list table, consisting of columns order_num and num_selects
and a counter for the number of times the column was selected.
CREATE TABLE hit_list (
order_num INTEGER,
num_selects INTEGER);
3. Initialize the hit_list table with all the order numbers, and a value of 0 for
num_selects.
INSERT INTO hit_list (order_num, num_selects)
SELECT order_num, 0 FROM orders;


4. Create a SELECT trigger on the order_num column of the orders table. This
trigger will insert an audit record into the order_queries table that consists of
the order number, customer number, the user ID of the user executing the
query, and the timestamp of the date and time (to the second) when the query
was made.
CREATE TRIGGER trig1
SELECT OF order_num ON orders
REFERENCING OLD AS pre
FOR EACH ROW
(INSERT INTO order_queries
VALUES (pre.order_num, pre.customer_num, USER,
CURRENT));
5. Create another select trigger on the order_num column of the orders table.
This trigger will increment the num_selects column of the hit_list table every
time a select of the order_num column is made.
CREATE TRIGGER trig2
SELECT OF order_num ON orders
REFERENCING OLD AS pre
FOR EACH ROW
(UPDATE hit_list SET num_selects = num_selects + 1
WHERE order_num = pre.order_num);
6. Query the order_queries and hit_list tables to check existing data for
order 1017.
SELECT * FROM order_queries
WHERE order_num = 1017;
SELECT * FROM hit_list
WHERE order_num = 1017;
7. Select information from the orders table for order 1017.
SELECT * FROM orders
WHERE order_num = 1017;
8. Query the order_queries and hit_list tables again for order 1017 to verify the
trigger actions.
SELECT * FROM order_queries
WHERE order_num = 1017;
SELECT * FROM hit_list
WHERE order_num = 1017;


Task 3. Retrieve trigger information from the database server.


In this task, you will use the dbschema utility to retrieve information about the
trigger created in a previous task.
1. Run the following command and view the information on the triggers:
$ dbschema -d stores_demo -t orders
2. Challenge: Execute a SELECT statement to retrieve the CREATE TRIGGER
statement for the del_cust trigger from the system catalog tables.
SELECT datakey, seqno, data
FROM systrigbody b, systriggers t
WHERE (datakey = 'D' OR datakey = 'A')
AND b.trigid = t.trigid
AND trigname = 'del_cust'
ORDER BY datakey DESC, seqno;
RESULTS:
In this exercise, you learned how to create and manage triggers.


Unit summary
• Create and execute a trigger
• Drop a trigger
• Use the system catalogs to access trigger information


Unit summary


Terminology

Informix (12.10)

Appendix A Terminology


Unit objectives
• Review Informix terminology

Terminology © Copyright IBM Corporation 2017

Unit objectives


There are many terms that you must be familiar with as an Informix DBA. This is a
partial list with brief definitions. If you need more information about a term, please
consult the Informix Knowledge Center or ask your instructor.
• Database server - A database server is the program that manages the content of
the database as it is stored on disk. The database server knows how tables,
rows, and columns are organized in physical computer storage. The database
server also interprets and executes all SQL commands. An Informix database
server, or instance, is the set of database server processes together with the
shared memory and disk space that the server processes manage. Multiple
instances can exist on the same computer.
• Shared Memory - Informix shared memory consists of the resident portion, the
virtual portion, and the message portion. Shared memory is used for caching data
from the disk (resident portion), maintaining and controlling resources needed by
the processors (virtual portion), and providing a communication mechanism for
the client and server (message portion).
• Disk - The disk component is a collection of one or more units of disk space
assigned to the database server. All the data in the databases, plus all the system
information necessary to maintain the server, are stored within the disk
component.
• Processes - The processes that make up the database server are known as
virtual processors. These processes are each called oninit. Each virtual
processor (VP) belongs to a virtual processor class. A VP class is a set of
processes responsible for a specific set of tasks.
• Chunk - A chunk is a contiguous unit of space that is assigned to the server to
use; the server manages the use of space within that chunk.
• Dbspace - A dbspace is a logical collection of chunks. You can create databases
and tables in dbspaces.
• Root dbspace - The root dbspace is a required dbspace where all the system
information that controls the database server is located.
• Temporary dbspace - Special dbspaces, called temporary dbspaces, can be
created for the storing of temporary tables or temporary files. Having temporary
dbspaces prevents temporary tables and files from unexpectedly filling file
systems or contending for space in the dbspace with data tables, and can speed
up the creation of temporary tables.
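A temporary dbspace is typically created with the onspaces utility. In this sketch, the dbspace name, chunk path, offset, and size (in KB) are only examples:

$ onspaces -c -d tmpdbs1 -t -p /dev/informix/tmpchunk1 -o 0 -s 102400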


• Page - When a chunk is assigned to the database server, it is broken down into
smaller units called pages. The page is the basic unit of I/O for an Informix server.
All data in a server is stored in pages. All pages used by the server have a fixed
data structure. When data is read from disk into a buffer in shared memory, the
entire page on which that data is stored is read into the buffer.
• Extent - An extent is a contiguous group of physical pages allocated to a single
table, index, or table fragment. The size of an extent that stores rows for a table is
specified when the table is created or altered. Each table has two extent sizes
defined: an initial (or first) extent size and a size for all subsequent extents.
• Tblspace - A tblspace is the logical collection of all the pages allocated for a given
table or, if the table is fragmented across dbspaces, a fragment of the table
located in a dbspace. The space represented by a tblspace is not necessarily
contiguous; pages can be spread out on a single chunk, or pages for a table can
be on different chunks. A tblspace is always contained within a single dbspace.
• Simple large object - Simple large objects, or binary large objects (blobs), are
streams of bytes of arbitrary value and length. A simple large object might be a
digitized image or sound, an object module or a source code file. There are two
types of simple large objects: TEXT and BYTE. The TEXT data type is used for
the storage of printable ASCII text such as source code and scripts. The BYTE
data type is used for storing any kind of binary data such as saved spreadsheets,
program load modules, and digitized images or sound.
• Blobspace - To improve the efficiency of storage and retrieval of simple large
objects, Informix also offers special dbspaces with characteristics customized for
these data types, called a blobspace. The blobspace forms a pool of storage
space that can be used for any simple large object columns in the server. Any
single blobspace can contain blob data from different columns in different tables,
even in different databases.
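In a CREATE TABLE statement, each simple-large-object column can name its storage location. This sketch, modeled on the stores_demo catalog table, places the BYTE data in a blobspace named blobs1 and the TEXT data in the table's own tblspace:

CREATE TABLE catalog (
catalog_num SERIAL,
cat_picture BYTE IN blobs1,
cat_advert TEXT IN TABLE
);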
• Blobpage - When a blobspace is created, a blobpage size is specified for that
blobspace. This value is the number of pages that make up a single blobpage.
Simple large object data stored in a blobspace is stored on one or more
blobpages.
• Smart large objects - Smart large objects are a category of large objects that
support random access to the data. There are two smart large object types:
BLOB (binary large object) and CLOB (character large object).
• Sbspace - Another special-purpose dbspace, called a smart blobspace, or
sbspace, is used for storing smart large objects. Unlike blobspaces, an sbspace
stores data in standard Informix pages, just like a dbspace. Since smart large object
values can be very large, a single value can occupy many pages.


• System Monitoring Interface (SMI) - The SMI provides you with point-in-time
information about the contents of the shared memory data structures used by
Dynamic Server, as well as information about the various objects contained in the
Informix instance. The SMI is implemented as the sysmaster database.
• Sysmaster - The sysmaster database holds the tables for the System Monitoring
Interface. One sysmaster database is automatically created for each Dynamic
Server instance the first time the instance is brought online. The sysmaster
database contains its own system-catalog tables and views, and a set of virtual
tables that serve as pointers to shared memory data. It can be used to gather
status, performance, and diagnostic information about the Informix instance.
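For example, while connected to the sysmaster database you can list the dbspaces in the instance with a query such as this sketch (sysdbspaces is one of the SMI views):

SELECT dbsnum, name, is_temp, is_blobspace
FROM sysdbspaces
ORDER BY dbsnum;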
• Dbaccess - dbaccess is a query tool designed for character-based environments.
In addition to being a query tool with a full-screen text editor, dbaccess offers
many other features. It provides a menu-driven interface for creating databases
and tables, as well as an option for selecting column, index, permission,
constraint, and status information for existing tables. Status information includes
such things as the number of rows in the table and the row size of the table.
Dbaccess can also be used to execute a text file containing SQL commands.
• OpenAdmin Tool (OAT) - The OpenAdmin Tool, also known as OAT, is a web-
based tool for administering one or more Informix database servers from a single
location. OAT's graphical interface includes options to perform administrative
tasks and to analyze system performance.


Unit summary
• Review Informix terminology


Unit summary



Data types

Informix (12.10)

Appendix B Data types


Unit objectives
• Review Informix data types

Data types © Copyright IBM Corporation 2017

Unit objectives


Informix supports many data types that you must be familiar with as an Informix DBA.
This is a partial list with brief definitions. If you need more information about a data type,
please consult the Informix Knowledge Center or ask your instructor.
Informix contains a number of built-in data types that allow you to store and manage
data for most application needs. Built-in data types include character, Boolean,
numeric, time, large object, JSON and BSON data types.
Additionally, Informix includes DataBlade data types. A DataBlade (extension or
add-on) is a package of user-defined data types and routines that are designed to
extend the database server capabilities for a particular purpose. Some DataBlades
are included with some Informix editions, and others can be purchased separately.
Finally, Informix includes the ability to handle extended data types. Extended data
types enable you to characterize data that cannot be easily represented with the
built-in data types. They include complex data types and user-defined data types.
(Use of extended data types is not covered in this course.)
Character data types
Three of the character data types in Informix are CHAR (or CHARACTER),
VARCHAR, and LVARCHAR. CHAR holds fixed length character strings;
VARCHAR and LVARCHAR hold varying length character strings. Character data
types are also known as alphanumeric data types.
• CHARACTER - The CHARACTER (or CHAR) data type stores any combination
of letters, numbers, and symbols. Tabs and spaces can be included. No other
non-printable characters are allowed. A CHAR(n) column has a fixed length of n
bytes, where n is a value between 1 and 32767.
• VARCHAR - The VARCHAR(m,r) data type stores character strings of varying
length, where m is the maximum size (in bytes) of the column and r is the
minimum number of bytes reserved for that column.
• LVARCHAR - The LVARCHAR(m) data type also stores character strings of
varying length, where m is the maximum size (in bytes) of the column.
Boolean data types
• BOOLEAN - The BOOLEAN data type stores the Boolean values for true and
false using a single byte of storage. The Boolean data type can be set to a value
of either 'T' (or 't') or 'F' (or 'f'). The Boolean data type is not case sensitive.


Numeric data types


There are many numeric data types in Informix, including the following. SMALLINT,
INTEGER (or INT), INT8, and BIGINT data types hold the whole number
representation of a value. SERIAL, SERIAL8, and BIGSERIAL are special numeric
data types that are useful for providing unique identifiers (primary key values) for
rows by incrementing the value by 1 each time they are used. FLOAT and
SMALLFLOAT data types store binary floating point numbers. DECIMAL and
MONEY data types store numbers based on the number of digits specified by the
user.
• SMALLINT - The SMALLINT data type holds numbers from -32,767 to +32,767
and requires 2 bytes of disk space. The SMALLINT value is stored internally as a
signed binary integer.
• INTEGER - The INTEGER (or INT) data type holds numbers from -2,147,483,647
to +2,147,483,647 and requires 4 bytes of disk space. An INTEGER value is
stored internally as a signed binary integer.
• INT8 - The INT8 data type holds numbers from -9,223,372,036,854,775,807 to
+9,223,372,036,854,775,807. INT8 storage requirements are platform-
dependent; 8 bytes on 64-bit platforms and 10 bytes on 32-bit platforms.
• BIGINT - The BIGINT data type stores integers in the range of
-9,223,372,036,854,775,807 to +9,223,372,036,854,775,807, and uses 8 bytes of
storage. It has some storage advantages over INT8. Also, because BIGINT values
are stored in native 8-byte format, BIGINT data types can provide better
performance than INT8 data types.
• SERIAL - The SERIAL(n) data type stores a sequential integer using the INT data
type. The default starting value is 1, but you can assign a different initial value by
specifying n.
• SERIAL8 - The SERIAL8(n) data type stores a sequential integer using the INT8
data type. The SERIAL8 data type behaves the same as the SERIAL data type,
but with a longer range of values.
• BIGSERIAL - The BIGSERIAL(n) data type stores a sequential integer using a
BIGINT data type. The BIGSERIAL data type behaves the same as the SERIAL
data type, but with a longer range of values.


• SMALLFLOAT - The SMALLFLOAT data type stores single-precision floating-


point numbers with approximately nine significant digits.
• FLOAT - The FLOAT data type stores double-precision floating-point numbers
with up to 17 significant digits.
• DECIMAL - The DECIMAL(p,s) data type stores numbers with a given
precision and scale.
• p - Precision is the total number of digits (1 to 32).
• s - Scale is the number of digits to the right of the decimal point.
• MONEY - The MONEY(p,s) data type stores currency amounts. Like the
DECIMAL data type, MONEY stores fixed point numbers up to a maximum of 32
significant digits, where p (precision) is the total number of digits, and s (scale) is
the number of digits after the decimal point.
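A table definition that exercises several of these numeric types might look like this sketch; the table and column names are hypothetical:

CREATE TABLE invoice (
invoice_num SERIAL(1000),
line_count SMALLINT,
quantity INTEGER,
amount MONEY(12,2),
tax_rate DECIMAL(5,4)
);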
Date and time data types
There are two date / time data types: DATE and DATETIME. The DATE data type
stores a calendar date. The DATETIME data type stores an instance in time,
expressed as a calendar date and the time of day. The precision of a DATETIME
data type can range from a year to a fraction of a second.
• DATE - The DATE data type is an integer representing the number of days since
December 31, 1899.
• DATETIME - The DATETIME data type allows the granularity to which a point in
time is measured to be selectable; that is, you can define data items that store
points of time with granularities from a year to a fraction of a second.
Interval data type
The INTERVAL data type stores a value that represents a span of time.
• INTERVAL - INTERVAL data types are divided into two types: year-month and
day-time. Year-month interval classes are YEAR and MONTH. Day-time interval
classes are DAY, HOUR, MINUTE, SECOND, and FRACTION (of a second with
up to 5 positions, default of 3).
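DATETIME and INTERVAL qualifiers combine as in this sketch; adding a day-time interval to a DATETIME value yields another DATETIME:

CREATE TABLE shipment (
ship_id SERIAL,
departed DATETIME YEAR TO SECOND,
transit INTERVAL DAY(3) TO MINUTE
);
SELECT departed + transit FROM shipment;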


Large objects
Large object data types store large ASCII or binary data values. Informix provides
four large object data types, BYTE, TEXT, BLOB, and CLOB. The BYTE and TEXT
types are referred to as simple large objects. The BLOB and CLOB data types are
referred to as smart large objects.
• BYTE - BYTE data types represent large amounts of unstructured data with
unpredictable contents.
• TEXT - TEXT data types represent large text files, and can contain both single-
byte and multibyte characters that the locale supports.
• BLOB - The BLOB data type stores any type of binary data. BLOBs offer features
not available with BYTE, including random access to object data.
• CLOB - The CLOB data type stores any type of text data. CLOBs offer features
not available with TEXT, including random access to object data.
JSON and BSON data types
JSON and BSON are Informix built-in data types that are used to support relational
database operations on data in JSON or BSON document store format. The JSON data
type is in plain text format, while the BSON data type is the binary representation of the
JSON data type.
• JSON - The acronym JSON stands for JavaScript Object Notation. It is a plain
text format for entering and displaying structured data. It is language
independent, and is a self-describing data-interchange format.
• BSON - BSON is the binary (internal) representation of a JSON document.


DataBlade data types


A DataBlade (extension or add-on) is a package of user-defined data types and
routines that are designed to extend the database server capabilities for a particular
purpose. They include data type definitions, casts to other data types, access
methods (for indexing), and support functions. Some DataBlades are included with
some Informix editions, and others can be purchased separately.
• Binary - The Binary DataBlade Module incorporates two data types: BINARYVAR
and BINARY18.
• Basic Text Search - The Basic Text Search (BTS) DataBlade Module allows
basic text searching for words and phrases in a document stored in a column of a
table.
• Spatial - The Spatial DataBlade Module integrates spatial and nonspatial data,
providing a seamless point of access using SQL (Structured Query Language). It
provides SQL functions capable of comparing the values in spatial columns to
determine if they intersect, overlap, or are related in many different ways.
• Time Series - The TimeSeries DataBlade Module makes it possible to store and
manipulate a series of date and time data values. The TimeSeries DataBlade
supports both regularly and irregularly occurring time series.
• Web - The Web DataBlade Module is used to create web applications that
include data from the Informix database. Web DataBlade tags and functions are
embedded in HTML pages, and dynamically run SQL statements to access data
from the database and format and display the results.

Extended data types

Extended data types allow you to characterize data that cannot easily be represented
with the built-in data types. Extended data types include complex and user-defined.
• Complex - A complex data type is usually a composite of other existing data
types. Complex data types include collection data types (LIST, SET, MULTISET)
and row types (named and unnamed).
• User-Defined - A user-defined data type (UDT) is a data type derived from an
existing data type. UDTs can be used to extend the built-in types already available
and to create customized data types.
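Both kinds of extended types are created with SQL statements, as in this sketch with hypothetical names:

CREATE ROW TYPE address_t (
street VARCHAR(40),
city VARCHAR(25),
state CHAR(2)
);
CREATE DISTINCT TYPE percent AS DECIMAL(5,2);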


Unit summary
• Review Informix data types


Unit summary



XML publishing

Informix (12.10)

Appendix C XML publishing


Objectives
• Describe the XML capabilities of the Informix server
• Create XML documents using SQL

XML publishing © Copyright IBM Corporation 2017

Objectives


XML publishing
• Provides way to transform results of SQL Query to XML structure
• Can optionally include XML schema and header
• Special characters automatically handled
• Can store results in Informix database
• Must start idsxmlvp VP to use XML functions


XML publishing
XML publishing provides a way to transform the results of SQL queries into XML
structures.
When you publish an XML document using the built-in XML publishing functions, you
transform the result set of an SQL query into an XML structure, optionally including an
XML schema and header.
Special characters, such as the less than (<), greater than (>), double quote (“),
apostrophe (‘), and ampersand (&) characters are automatically converted to their XML
notation.
You can store the XML in the Informix database for use in XML-based applications.
To use these functions, you must start the idsxmlvp virtual processor.
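For example, the following sketch publishes one row of the stores_demo customer table as an XML element named customer; the exact output layout is described in the Informix documentation:

SELECT genxml(ROW(customer_num, fname, lname), 'customer')
FROM customer
WHERE customer_num = 101;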


XML functions (1 of 2)
• Two sets of functions publish XML from SQL queries
• Functions return LVARCHAR or CLOB:
 genxml, genxmlclob
− Return rows of SQL results as XML elements
 genxmlelem, genxmlelemclob
− Return each column value as separate elements
 genxmlschema, genxmlschemaclob
− Return an XML schema in XML format
 genxmlquery, genxmlqueryclob
− Return results of SQL query in XML format
 genxmlqueryhdr, genxmlqueryhdrclob
− Return results of SQL query in XML format with XML header


XML functions
Several functions let you publish XML from SQL queries. The functions that are
provided in Informix are of two types: functions that return LVARCHAR and functions
that return CLOB. All of the functions handle NULL values and special characters.
The functions are:
• genxml and genxmlclob: Return rows of SQL results as XML elements.
• genxmlelem and genxmlelemclob: Return each column value as separate
elements.
• genxmlschema and genxmlschemaclob: Return an XML schema and result in
XML format.
• genxmlquery and genxmlqueryclob: Return the result set of a query in XML
format. These functions accept an SQL query as a parameter.
• genxmlqueryhdr and genxmlqueryhdrclob: Return the result set of a query in XML
format with the XML header, providing a quick method for generating the
required header.
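A quick sketch of calling one of these functions directly; the classes table is the one used in the examples later in this appendix, and the first parameter is assumed to name the enclosing XML element, as in the genxmlqueryhdr example shown later:

```sql
EXECUTE FUNCTION genxmlquery('classes', 'SELECT * FROM classes');
```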


XML functions (2 of 2)
• Functions return LVARCHAR or CLOB (continued):
 extract, extractxmlclob
− Evaluate XPATH expression on XML column, document, or string and return XML
fragment
 extractvalue, extractxmlclobvalue
− Evaluate XPATH expression on XML column, document, or string and return value
of the XML node
 existsnode
− Determines whether XPATH evaluation results in at least one XML element.
 idsxmlparse
− Parses XML document to determine whether it is well formed


More XML functions:


• extract and extractxmlclob: Evaluate an XPATH expression on an XML column,
document, or string, returning an XML fragment.
• Example: Evaluate the XML contained in col2 of table tab1 and return the
given name for Jason Ma.
SELECT extract(col2,'/personnel/person[@id="Jason.Ma"]/name/given')
FROM tab1;
• The second parameter passed is the XPATH to the XML element
desired.
• The results would look like:
<given>Jason</given>
• extractvalue and extractxmlclobvalue: Return the value of the XML node.
• Example: Return the number of docks in various cities.
SELECT warehouse_name, extractvalue(e.warehouse_spec,
'/Warehouse/Docks') "Docks" FROM warehouses e
WHERE warehouse_spec IS NOT NULL;


• existsnode: Verifies that a specific node exists in an XML document.


• Example: Return a list of warehouse IDs and names for every warehouse that
has an associated dock.
SELECT warehouse_id, warehouse_name FROM warehouses WHERE
existsnode(warehouse_spec, '/Warehouse/Docks') = 1;
• idsxmlparse: Parses an XML document to determine whether it is well formed.
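A minimal sketch of idsxmlparse (the XML literal here is made up for illustration): a well-formed document is returned, while a malformed one raises an error.

```sql
EXECUTE FUNCTION idsxmlparse('<person><name>Jason</name></person>');
```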


XML publishing examples (1 of 5)


• The classes table: classid class subject

1 125 Chemistry
2 250 Physics
3 375 Mathematics
4 500 Biology
• The query:
SELECT genxml(classes, "row") FROM classes;
 Returns data in XML format
 If first parameter is name of table, returns all columns
 Second parameter is name of XML element
• The results:
<row classid="1" class="125" subject="Chemistry"/>
<row classid="2" class="250" subject="Physics"/>
<row classid="3" class="375" subject="Mathematics"/>
<row classid="4" class="500" subject="Biology"/>


XML publishing examples


This example shows how to retrieve XML rows from an SQL query.
Using the genxml function, the first parameter passed specifies what data to return. In
this case, the parameter is the name of the table, which indicates to return all columns.
The second parameter, row, is the name of the XML element that contains each
returned row.


XML publishing examples (2 of 5)


• The query:
SELECT genxml(row(classid, class), "row")
FROM classes;
 Use “row( )” construct in first parameter to specify columns to return
• The results:
<row classid="1" class="125"/>
<row classid="2" class="250"/>
<row classid="3" class="375"/>
<row classid="4" class="500"/>


From the same classes table as the example in the previous slide, this example uses
the row( ) construct to return only the columns classid and class.


XML publishing examples (3 of 5)


• The query:
SELECT genxmlelem(classes, "classes") FROM classes
WHERE classid = 1;
 Returns columns as individual elements
 Table name as first parameter specifies return all columns
• The results:
<classes>
<classid>1</classid>
<class>125</class>
<subject>Chemistry</subject>
</classes>


This example uses the same classes table as the previous examples.
The genxmlelem function returns the columns in the table as individual elements.
The first parameter, the name of the table, specifies that all columns are to be returned.
The same syntax as in the previous visual could be used to only return specific
columns:
SELECT genxmlelem(row(classid, subject), "classes")
FROM classes
WHERE classid = 1;
The results of that query would be:
<classes>
<row>
<classid>1</classid>
<subject>Chemistry</subject>
</row>
</classes>


XML publishing examples (4 of 5)


• The query:
EXECUTE FUNCTION genxmlqueryhdr("classes",
"SELECT * FROM classes");
 Returns data in XML format with XML header
 First parameter specifies name of XML element
 Second parameter is SQL query
• The results:
<?xml version="1.0" encoding="en_US.819" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<classes>
<row>
<classid>1</classid>
<class>125</class>


The genxmlqueryhdr function produces XML output similar to the genxmlelem function
shown on the previous slide.
The difference is that in addition to the data, the genxmlqueryhdr function also produces
an XML header in the output.


XML publishing examples (5 of 5)


• genxmlschema XML function
 Returns table schema and data in XML format
• The query:
SELECT genxmlschema(classes,"classes")
FROM classes;
 First parameter specifies columns to return
− If name of table, returns all columns
− Can use the “row(col1, col2, …)” format to limit columns
 Second parameter specifies name of XML element


The genxmlschema function is identical to the genxml function, but also generates an
XML schema along with the data output.
Using the name of the table in the first parameter specifies that all columns are to be
returned.
You can use the row(col1, col2, …) format in the first parameter to specify the list of
columns desired.
The second parameter is the name of the XML element to be returned in the results.
The output of this query would be:
<?xml version="1.0" encoding="en_US.819" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
  targetNamespace="http://schemas.ibm.com/informix/2006/sqltypes"
  xmlns="http://schemas.ibm.com/informix/2006/sqltypes"
  elementFormDefault="qualified">
  <xs:element name="classes">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="classid" type="xs:serial"/>
        <xs:element name="class" type="xs:smallint"/>
        <xs:element name="subject" type="xs:char(15)"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
<classes>
  <row>
    <classid>1</classid> <class>125</class> <subject>Chemistry </subject>
  </row>
  <row>
    <classid>2</classid> <class>250</class> <subject>Physics </subject>
  </row>
  <row>
    <classid>3</classid> <class>375</class> <subject>Mathematics </subject>
  </row>
  <row>
    <classid>4</classid> <class>500</class> <subject>Biology </subject>
  </row>
</classes>


Exercise
XML publishing
• Configure the Informix instance for XML publishing
• Generate XML output from SQL queries


Exercise: XML publishing


Exercise:
XML publishing

Purpose:
In this exercise, you will learn how to use the XML publishing features of
Informix.

Task 1. Configure Informix for XML publishing.


In this task, you will configure the instance to enable XML Publishing and generate
several formats of XML documents. The process will include configuring the correct
virtual processor within the Informix engine.
1. Edit the $ONCONFIG file.
• Add the parameter to configure one XML VP to the bottom of the
configuration file.
• Set the SBSPACENAME parameter to the name of your sbspace.
2. Save the $ONCONFIG file and cycle the engine (bring it offline and back
online).
Task 2. Generate XML output.
1. Create and run an SQL query to unload all the data from the customer table in
XML format to a file named customer.xml. Assume the output will be greater
than 32 KB in size. Use the OUTPUT TO syntax to create the file.
2. Create and run an SQL query to unload only the customer number, last name,
and first name from the customer table for customer number 101 in XML
format to a file named cust_101.xml.
3. Generate an XML table schema with data for the stock table containing only
stock number, manufacturer code, and description, and only items from the
manufacturer Hero (HRO). Generate the output to a file named
stock_schema.xml.
4. Create and run an SQL query to unload only the customer last name, first
name, and phone number from the customer table for customer number 122 in
XML format to a file named cust_122.xml. Make sure the output includes an
XML header.
Results:
In this exercise, you learned to use the XML publishing features of Informix.


Exercise:
XML publishing - Solution

Purpose:
In this exercise, you will learn how to use the XML publishing features of
Informix.

Task 1. Configure Informix for XML publishing.


In this task, you will configure the instance to enable XML Publishing and generate
several formats of XML documents. The process will include configuring the correct
virtual processor within the Informix engine.
1. Edit the $ONCONFIG file.
vi $INFORMIXDIR/etc/$ONCONFIG
• Add the parameter to configure one XML VP to the bottom of the
configuration file.
VPCLASS idsxmlvp,num=1
• Set the SBSPACENAME parameter to the name of your sbspace.
SBSPACENAME s9_sbspc
2. Save the $ONCONFIG file and cycle the engine (bring it offline and back
online).
onmode -ky
oninit -v


Task 2. Generate XML output.


1. Create and run an SQL query to unload all the data from the customer table in
XML format to a file named customer.xml. Assume the output will be greater
than 32 KB in size. Use the OUTPUT TO syntax to create the file.
OUTPUT TO 'customer.xml'
SELECT genxmlclob(customer, "rows")
FROM customer;
2. Create and run an SQL query to unload only the customer number, last name,
and first name from the customer table for customer number 101 in XML
format to a file named cust_101.xml.
OUTPUT TO 'cust_101.xml'
SELECT genxml(row(customer_num, lname, fname), "cust101")
FROM customer
WHERE customer_num = 101;
3. Generate an XML table schema with data for the stock table containing only
stock number, manufacturer code, and description, and only items from the
manufacturer Hero (HRO). Generate the output to a file named
stock_schema.xml.
OUTPUT TO 'stock_schema.xml'
SELECT genxmlschema(row(stock_num,
manu_code,description),"stock")
FROM stock
WHERE manu_code = "HRO";
4. Create and run an SQL query to unload only the customer last name, first
name, and phone number from the customer table for customer number 122 in
XML format to a file named cust_122.xml. Make sure the output includes an
XML header.
OUTPUT TO 'cust_122.xml'
SELECT genxmlqueryhdr("cust122",
"SELECT lname, fname, phone FROM customer WHERE
customer_num = 122")
FROM customer
WHERE customer_num = 122;
Results:
In this exercise, you learned to use the XML publishing features of Informix.


Unit summary
• Describe the XML capabilities of the Informix server
• Create XML documents using SQL


Unit summary


Basic Text Search DataBlade module

Basic Text Search


DataBlade module

Informix (v12.10)

© Copyright IBM Corporation 2017


Appendix D Basic Text Search DataBlade module


Objectives
• Describe the features of the Basic Text Search DataBlade Module
• Search the database for text content using the Basic Text Search
DataBlade Module
• Use the XML data index and search features of the Basic Text Search
DataBlade Module

Basic Text Search DataBlade module © Copyright IBM Corporation 2017

Objectives


Basic Text Search DataBlade module


• Bundled with Informix
• Allows text search in unstructured repository stored in database table
• Must be stored in column of type:
 BLOB
 CHAR, NCHAR
 CLOB
 LVARCHAR
 VARCHAR, NVARCHAR
• Can be used with most multi-byte character sets:
 BLOB columns: only ASCII text supported
 Does not support ideographic languages


Basic Text Search DataBlade module


The Basic Text Search (BTS) DataBlade allows you to search words and phrases in an
unstructured document repository stored in a column of a table.
In traditional relational database systems, you must use a LIKE or MATCHES condition
to search for text data and use the database server to perform the search.
This text search package and its associated functions, known as the text search
engine, are designed to perform fast retrieval and automatic indexing of text data. The
text search engine runs in a database server-controlled virtual processor.
The Basic Text Search DataBlade has two principal components: the bts_contains
search predicate and the set of Basic Text Search DataBlade functions.
To use Basic Text Search, you must store the text data in a column of data type BLOB,
CHAR, NCHAR, CLOB, LVARCHAR, VARCHAR, or NVARCHAR.
Although you can store searchable text in a column of the BLOB data type, Basic Text
Search does not support indexing binary data. BLOB data type columns must contain
ASCII text.
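The contrast with a traditional LIKE search can be sketched as follows; the products table and description column are illustrative, and the second query assumes a BTS index exists on the column:

```sql
-- Traditional search: the server pattern-matches every row
SELECT id FROM products WHERE description LIKE '%standard%';

-- BTS search: the text search engine answers from the BTS index
SELECT id FROM products WHERE bts_contains(description, 'standard');
```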


Configure Basic Text Search


• Register BTS DataBlade Module in database
 blademgr>register bts.1.10 mydb
• Define one bts virtual processor in onconfig file
 VPCLASS bts,noyield,num=1
• Create extspace for BTS index:
 Create directory on disk
− mkdir /bts_extspc
 Create extspace using onspaces command
− onspaces –c –x ext_space1 –l "/bts_extspc"
 Index structure created in BTS external space
− extspace/db_name/owner_name/index_name/


Configure Basic Text Search


The steps necessary in order to use the text search capabilities of the Basic Text
Search DataBlade are:
1. Register the BTS DataBlade in the database you want to search. If you are
using Informix version 11.70 or greater, this is done automatically when
you first access one of the DataBlade objects. It can also be done
manually using the BladeManager tool.
2. Define a bts virtual processor in the ONCONFIG file. The syntax is:
VPCLASS bts,noyield,num=1
Once the ONCONFIG file has been modified for the BTS VP, the engine
must be stopped and restarted. Only one BTS VP can be configured.
3. Create an external space for the BTS index.
Create a directory on disk for storing the index. Note: It is not necessary to
create the directory at this point. As long as user informix has permissions
to create the directory, it is created automatically when the first BTS index
is created. Permissions on this directory must be 770.


Use the onspaces command to create the extspace, pointing to the
directory just created.
The index resides in the directory structure:
<extspace>/<database_name>/<index_owner_name>/<index_name>/
External spaces are not listed in the onstat -d display, but information about
them can be obtained from the sysmaster:sysextspaces table.
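For example, to list the registered external spaces:

```sql
SELECT * FROM sysmaster:sysextspaces;
```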


Text DataBlade indexing


• Data indexable using BTS index:
 Supports comparison:
− Equal
− Greater than, greater than or equal to
− Less than, less than or equal to
 Supports text searches
• Must create BTS index for each column to be searched
 Index cannot be altered—it must be dropped/recreated
• Does not support:
 Composite index
 Fillfactor
 Index clustering
 Unique indexes


Text DataBlade indexing


The Text DataBlade uses a BTS index structure. The index supports comparisons such
as equal, greater than, less than, and so on, in addition to various text search
capabilities. You must create a BTS index for each text column you plan to search.
You cannot alter the characteristics of a BTS index after you create it. Instead, you
must drop the index and re-create it with the desired characteristics.
The following characteristics are not supported for BTS indexes:
• Composite indexes
• Fill factors
• Index clustering
• Unique indexes


Creating a BTS index


• CREATE INDEX parameters:
 index_name
 table_name
 column_name and operator_class
 XML index parameters
 deletion_mode [optional]:
− Immediate
− Deferred (default)
 stopwords [optional]
 extspace_name
• Assuming a table products with a column brand of data type CHAR,
and an external space named ext_space1:
CREATE INDEX brand_idx
ON products (brand bts_char_ops)
USING bts (delete = 'deferred') IN ext_space1;

Creating a BTS index


Creating a BTS index is similar to creating a standard B+ tree index: using the CREATE
INDEX SQL statement, specifying the name of the index, the name of the table, and the
name of the column on which the index is being created.
The column parameter must include the index ops specification. The exact specification
depends on the data type of the column being indexed, as shown on the following
page.
In the example, the column brand is of data type CHAR, requiring the operator clause
bts_char_ops.
Next, the syntax must specify USING bts for the access method. An optional delete
clause specifies the delete method.
The delete method determines when the references to deleted rows are physically
removed from the index structure.
The delete method can be deferred or immediate, where deferred is the default.
These are discussed on a later page.
You can also specify a custom stopword list using the stopword parameter. This
parameter is optional.
Finally, you must specify the name of the external space where the index structure
is to be stored.


BTS Index Operator class


• Must specify operator class when creating bts index
• Operator class dependent on data type of text column

Data Type Operator Class

BLOB bts_blob_ops
CHAR bts_char_ops
CLOB bts_clob_ops
LVARCHAR bts_lvarchar_ops
VARCHAR bts_varchar_ops
NCHAR bts_nchar_ops
NVARCHAR bts_nvarchar_ops


BTS Index Operator class


This table shows the operator classes that must be specified when creating a BTS
index.
The operator class depends on the data type of the column on which the index is being
created.


Stopwords
• Can create custom stopword list
• Specify stopwords parameter when creating bts index
• Stopwords must be lowercase
• Input can be:
 Inline with comma-delimited values
− stopwords="(word1,word2,word3)"

 An external file
− stopwords="file:/directory/filename"

 A table column
− stopwords="table:table_name.column_name"


Stopwords
You can optionally create a custom stopword list to replace the default list. Invoking a
custom stopword list is done by specifying the stopword parameter when creating the
BTS index.
The stopword list can be entered either inline in the stopword parameter, or by
referencing an external file or a column in a database table.
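As a sketch, an index with an inline stopword list might be created like this (the table, column, and extspace names follow the earlier examples; the stopword values are illustrative):

```sql
CREATE INDEX desc_idx
ON products (description bts_lvarchar_ops)
USING bts (stopwords="(a,an,the,of)") IN ext_space1;
```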


Maintaining BTS indexes: Manual maintenance


• When index created, default method for delete is ‘deferred’
• Index item marked for deletion when table row deleted
• Queries do not return item marked for deletion
• Disk space not released
• Must manually maintain index using oncheck
oncheck –ci –y <dbname>:<table_name>#<index_name>
or bts_index_compact function
bts_index_compact('extspace/db_name/owner_name/idx_name/')
• Index locked while rewritten


Maintaining BTS indexes: Manual maintenance


When a BTS index is created, you can specify a delete mode of either deferred or
immediate. If no delete mode is specified, it is deferred.
When the delete mode is deferred and an item with a BTS index is deleted from the
database table, the index item is only marked for deletion but is not removed from the
index.
Delete operations are faster in deferred mode, but disk space is not released.
Maintaining a BTS index when the delete option is deferred is a manual process.
In order to remove items marked for deletion and release disk space, one of two
processes must be run.
oncheck
The first option is to run the oncheck –ci command. This command removes any
items marked for deletion from the index.
bts_index_compact
The other method is to run the bts_index_compact function. The result is the same
as running the oncheck command.


The difference is that you only need to know the names of the database, table, and
index for the oncheck command, while you must know the directory path to the BTS
index in order to use the bts_index_compact function.
The index is locked while these processes run.
The deferred mode is best for large indexes that are updated frequently.


Maintaining BTS indexes: Automatic maintenance


• Create index with (delete='immediate') option
• Index item immediately removed when table row deleted
• Disk space released immediately
• Index locked while rewritten


Maintaining BTS indexes: Automatic maintenance


By setting the delete mode to immediate when creating a BTS index, maintenance
becomes automatic.
The index is rewritten immediately each time a row is deleted from the database table,
and this releases disk space. However, delete operations are slower, and the index is
locked and unavailable during the time it takes to rewrite it.
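Adapting the earlier CREATE INDEX example to automatic maintenance is only a matter of the delete option (table, column, and extspace names as in that example):

```sql
CREATE INDEX brand_idx
ON products (brand bts_char_ops)
USING bts (delete = 'immediate') IN ext_space1;
```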


BTS query syntax


• Syntax:
bts_contains(column, 'query_string', score # REAL);
• column
 Name of column to be searched
• query_string:
 Word or phrase being searched
 Enclosed in single quotes
 Includes optional search operators
• score # REAL:
 Optional argument to limit search based on relevance score
 Values from 0.0 to 100.0 inclusive
 Can use ‘score’ as query filter
SELECT id FROM products
WHERE bts_contains(brands, 'standard', score # REAL)
AND score > 70.0;

BTS query syntax


BTS searches use the bts_contains predicate, which takes two required parameters
and one optional parameter (score).
Column name
The first required parameter is the name of the column containing the data to be
searched. In order for bts_contains to use this column, a BTS index must have been
created on this column.
Query string
The second required parameter is the query string, enclosed in single quotation marks.
The query string can consist of any combination of single words and phrases. If the
query string contains a phrase, the phrase is enclosed within double quotation marks
within the outer single quotation marks required by the bts_contains predicate.
Score
Score, an optional parameter, is a REAL number between 0.0 and 100.0 that indicates
the relevance of the results to the search criteria compared to that of other indexed
records. The higher the document score value, the more closely the results match the
criteria.


BTS search restrictions


• Searches not case-sensitive
• White space, punctuation ignored
• Can create custom stopword list
• Terms within "<", ">" brackets not interpreted as HTML/XML tags
 Use XML index parameters to search specific XML field content
• Boolean operators AND, OR, and NOT:
 Cannot use between bts_contains predicates
 Can be used between words within the search string
 Not allowed:
bts_contains(column1,'word1')
AND bts_contains(column1,'word2')
 Allowed:
bts_contains(column1,'word1 AND word2')

BTS search restrictions


There are some restrictions in the use of the BTS Basic Text Search DataBlade:
• Searches are not case-sensitive
• White space and punctuation are ignored
• Terms within brackets (< >) are not interpreted as HTML or XML tags unless you
create the index using XML index parameters. These parameters are discussed
later in this unit.
• You cannot use the boolean operators AND, OR, and NOT between bts_contains
predicates.
This means you cannot create a search query using the construct:
bts_contains(column1, 'word1') AND bts_contains(column1,'word2')
You can use boolean operators within a single text search string, such
as:
bts_contains(column1, 'word1 AND word2')
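Putting the restriction into a full query (the products table and description column are illustrative and assume a BTS index on the column):

```sql
-- Not allowed: a boolean operator between two bts_contains predicates
-- SELECT id FROM products
--   WHERE bts_contains(description, 'wireless')
--     AND bts_contains(description, 'headset');

-- Allowed: the boolean operator inside a single query string
SELECT id FROM products
WHERE bts_contains(description, 'wireless AND headset');
```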


BTS searches: Words and phrases


• Word
bts_contains(column, 'Coastal')
• Phrase
bts_contains(column, ' "Black and Tan" ')
• Wildcards:
 Can use '*' for multiple-character wildcard search
bts_contains(column, 'c*t')
 Can use '?' for single-character wildcard search
bts_contains(column, 'te?t')
 Cannot use wildcard as first character of search string
• Special characters
 Require leading backslash (‘\’)
• Regular expressions not supported

BTS searches: Words and phrases


A BTS search supports the following types of searches:
• Words and phrases
• A search can be on a single word or a phrase.
• The query string is enclosed in single quotes. If the search string is a phrase, then
the phrase is enclosed in double quotes within the single-quoted query string.
• Wildcards
• Wildcards are supported.
• Multiple-character wildcard searches are represented by the asterisk.
• Single-character wildcard searches are represented by the question mark.
• Wildcards cannot be used as the first character in a search string.
• Special characters, such as *, ?, and \, must be preceded by a backslash
character.
• Regular expressions are not supported.
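A sketch of escaping a special character so that it is matched literally rather than treated as a wildcard (illustrative table and column, BTS index assumed):

```sql
SELECT id FROM products
WHERE bts_contains(description, 'what\?');
```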


BTS searches: Boolean operators (1 of 2)


• AND:
 Must contain all terms:
(column, ' UNIX AND "operating system" ')
(column, ' UNIX && "operating system" ')
• OR:
 Must contain at least one of the terms
 Default operator:
(column, ' UNIX Windows ')
(column, ' UNIX OR Windows ')
(column, ' UNIX || Windows ')
• Required operator ‘+’:
(column, ' +UNIX +"operating system" ')
(column, ' +UNIX "operating system" ')

BTS searches: Boolean operators


The following Boolean operators can be used in a BTS search:
• AND: The entire query string is enclosed in single quotes. Within the single
quotes, you can use the AND operator between words or phrases to indicate that
all terms must be found in order to be returned.
• OR: The OR condition is the default action when no Boolean operator is specified
between words or phrases.
• Required operator: You can specify that a word or phrase must be present by
using the plus sign immediately before that word or phrase. The bts_contains
predicate supports both the word AND and the standard double-ampersand (&&)
notation. You can specify OR by using the word OR or the double-pipe (||) syntax.
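As a sketch, using a hypothetical documents table with a BTS index on its abstract column, the Boolean forms combine like this:

```sql
-- All terms required (AND and && are equivalent):
SELECT id FROM documents
 WHERE bts_contains(abstract, 'UNIX AND "operating system"');

-- OR is the default when no operator is given:
SELECT id FROM documents
 WHERE bts_contains(abstract, 'UNIX Windows');

-- '+' marks an individual term as required:
SELECT id FROM documents
 WHERE bts_contains(abstract, '+UNIX "operating system"');
```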


BTS searches: Boolean operators (2 of 2)


• NOT:
 Must not contain negated term:
(column, ' UNIX AND NOT Windows ')
(column, ' UNIX AND !Windows ')
(column, ' +UNIX -Windows ')
• Grouping terms:
 Can group terms using parentheses:
(column, ' (UNIX OR Windows) AND
"operating system" ')


These Boolean operators are also supported:


• NOT: The NOT operator instructs the search engine to reject data that contains
the specified word or phrase.
• Grouping terms: Terms can be grouped in parentheses to form more complex
queries. You can use the word NOT, the exclamation point, or the minus sign to
specify the NOT condition.
In the example above, the search is looking for results containing the phrase operating
system together with either the word UNIX or the word Windows.


BTS searches: Fuzzy and proximity searches


• Fuzzy:
 Searches for text with close match
 Use tilde ‘~’
bts_contains(column,' bank~ ')
 Can add degree of similarity
− Values 0-1
− Default value 0.5
bts_contains(column,' bank~0.9 ')
• Proximity:
 Enclose search terms in double-quotes
 Add tilde and specify maximum number of non-search words between items
bts_contains(column, ' "curb lake"~8 ')


BTS searches: Fuzzy and proximity searches


Fuzzy searches and proximity searches are supported in the following way:
• Fuzzy searches:
• Fuzzy searches look for text that closely matches the word in the query string.
This can help when you are not sure of the spelling of the search term.
• To specify a fuzzy search, append a tilde to the term in the query string.
• You can also specify a degree of similarity by appending a value between 0
and 1 after the tilde.
• The default value for the degree of similarity is 0.5.
• Proximity searches:
• In situations where you know the text to be searched contains specific terms,
but are not sure of how many words there are between them, use a proximity
search.
• This is done by enclosing the terms within double-quotes, then appending a
tilde after the phrase, and specifying the number of non-search words allowed.
• The terms within the expression are not order dependent.
• You can also use Boolean operators between proximity expressions, meaning
you might have multiple proximity search expressions within a query string.
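Using the same hypothetical documents table and abstract column, fuzzy and proximity searches look like this:

```sql
-- Fuzzy: 'bank~' matches words spelled similarly to bank;
-- 'bank~0.9' demands a closer match than the default 0.5:
SELECT id FROM documents
 WHERE bts_contains(abstract, 'bank~0.9');

-- Proximity: 'curb' and 'lake', in either order, with at most
-- 8 other words between them:
SELECT id FROM documents
 WHERE bts_contains(abstract, '"curb lake"~8');

-- Proximity expressions can be combined with Boolean operators:
SELECT id FROM documents
 WHERE bts_contains(abstract, '"curb lake"~8 AND "boat dock"~5');
```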


BTS searches: Range searches and boosting


• Range searches:
 Inclusive:
− Enclose values in square brackets ‘ [ ] ‘
− Includes values in search
bts_contains(column,' [2105 TO 2401] ')
 Exclusive:
− Enclose values in curly braces ‘ { } ’
− Excludes values from search
bts_contains(column,' {2105 TO 2401} ')
• Weighting (boosting) terms:
 Can weight importance of search term
 Specified by using caret ‘^’
 By default, all terms equal
 Must be positive; usually an integer greater than 1, but decimal values less than 1 are allowed
 Results same but boosted term appears higher in list
bts_contains(column, ' ocean^20 lake^10 river ')

BTS searches: Range searches and boosting


BTS searches also support range searches and boosting terms.
• Range searches:
• Two types of range searches are also supported.
• One includes the values in the range expression, and one excludes those
values.
• To include the values in the range expression, use square brackets to enclose
the range expression. In the example above, values greater than or equal
to 2105 and less than or equal to 2401 are both returned.
• To exclude the values in the range expression, use curly braces to enclose the
range expression. In the example, only results greater than 2105 and less than
2401 would be returned.


• Weighting (boosting) terms:


• Terms within a query string can be weighted to give them more importance in
the return list. This is called boosting.
• Boosting does not affect which rows are returned; it only affects their order in the return listing.
• Terms that have been boosted are displayed first in the list, with the highest
boosted value displayed first.
• To boost a term, append a caret and a positive number, typically an integer
greater than 1; decimal values less than 1 can also be used to lower a term's weight.
• In the example, documents containing the word ocean are displayed first,
followed by documents containing the word lake, and then documents containing
the word river.
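Again as a sketch against a hypothetical documents table, range searches and boosting look like this:

```sql
-- Inclusive range: 2105, 2401, and everything between are returned:
SELECT id FROM documents
 WHERE bts_contains(abstract, '[2105 TO 2401]');

-- Exclusive range: 2105 and 2401 themselves are excluded:
SELECT id FROM documents
 WHERE bts_contains(abstract, '{2105 TO 2401}');

-- Boosting: same result set, but rows matching 'ocean' sort first,
-- then 'lake', then 'river':
SELECT id FROM documents
 WHERE bts_contains(abstract, 'ocean^20 lake^10 river');
```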


XML searches
• Must specify fields to be searched:
 With xmltags index parameter, default field is first field
 With all_xmltags index parameter, no default field
• Specify field/path name followed by colon (:)
bts_contains(column,'fruit:Orange')
bts_contains(column,'fruit:"Orange Juice"')
bts_contains(column,'/fruit/citrus:"Orange Juice"')
• If the include_namespaces index parameter is enabled, you must escape the
colon within the namespace
bts_contains(column,'fruit\:citrus:Orange')
• BTS search modifiers also apply to XML searches


XML searches
Searches of XML data are also done using the bts_contains predicate, but now you can
specify specific tags or XML paths to search. This is done by specifying the field or path
name followed by a colon (:) and then the search word or phrase.
When creating an index for searching XML data there are a number of parameters you
can specify that are specific to XML data. These parameters are discussed in the
following pages.
The search modifiers discussed in previous pages, such as fuzzy, proximity, range, and
wildcards also apply to XML searches.
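For instance, assuming a bts index created with xmltags="(title,author)" on the xml_data column of the books table used in the following pages, field-qualified searches look like this:

```sql
-- Search a single field:
SELECT xml_data FROM books
 WHERE bts_contains(xml_data, 'author:stewart');

-- Phrase within a field:
SELECT xml_data FROM books
 WHERE bts_contains(xml_data, 'title:"graph theory"');

-- Modifiers such as wildcards also work per field:
SELECT xml_data FROM books
 WHERE bts_contains(xml_data, 'title:graph*');
```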


XML index parameters


• XML documents indexed as unstructured text unless use XML index
parameters
• Gives ability to search XML data in different ways
• Parameters
 xmltags
• Flag parameters (yes/no):
 all_xmltags
 xmlpath_processing
 include_namespaces
 include_subtag_text
 include_contents
 strip_xmltags


XML index parameters


XML index parameters give you the ability to manipulate searches of XML data in
different ways.
When you do not use XML index parameters, XML documents are indexed as
unstructured text. The XML tags, attributes, and values are included in searches and
are indexed in a single field called contents. By contrast, when you use XML index
parameters, the XML tag values are indexed in separate fields either by tag name or by
path. The attributes of XML data are not indexed.
The xmltags or all_xmltags parameters identify the tags to index. The
xmlpath_processing parameter enables searches based on XML paths. The
include_namespaces parameter allows you to index text and to search data based on
namespaces. The include_subtag_text parameter allows you to index text within
markup tags. The include_contents parameter puts the XML data in original format into
the contents field. The strip_xmltags parameter puts the XML data in an untagged
format into the contents field.


XML index parameters: xmltags (1 of 3)


• Identifies tags or paths to index
• Mutually exclusive with all_xmltags
• Specify by list, file, or table column:
xmltags="(field1,field2,path)"
xmltags="file:directory/filename"
xmltags="table:tabname.colname"
• If field names are upper/mixed case use file or table column
• XML path can be full or relative path when xmlpath_processing
parameter enabled
• Tags or paths become field names in bts index
• Default field is first tag or path in field list


XML index parameters: xmltags


When you specify the xmltags index parameter, the XML tags or paths that you specify
become the field names in the bts index. Attributes of XML tags are not indexed in the
field. The text values within fields can be searched.
In searches, the default field is the first tag or path in the field list.
The Basic Text Search DataBlade does not check to see if the tags exist in the column.
This means that you can specify fields for tags that you add to the column after you
have created the index.
The input for the field names for the xmltags parameter can be one of three forms:
• Inline comma-separated values
• An external file
• A table column
If the xmlpath_processing parameter is enabled, you can specify paths for the xmltags
values. For example:
xmltags="(/text/book/title,/text/book/author,/text/book/date)"


XML tags are case-sensitive. When you use the inline comma-separated field names
for input, the field names are transformed to lowercase characters. If the field names
are uppercase or mixed case, use an external file or a table column for input instead.
The file or table that contains the field names must be readable by the user creating the
index. The file or table is read only when the index is created. If you want to add new
field names to the index, you must drop and re-create the index. The field names in the
file or table column can be separated by commas, whitespaces, newlines, or a
combination.


XML index parameters: xmltags (2 of 3)


• XML data:
<book>
<title>Graph Theory</title>
<author>Stewart</author>
<date>January 14, 2008</date>
</book>
• CREATE INDEX statement:
CREATE INDEX book_idx
ON books(xml_data bts_lvarchar_ops) USING bts
(xmltags="(title,author,date)")
IN ...;
 Indexed fields:
− title
− author
− date

This visual shows the CREATE INDEX statement using the xmltags index parameter to
specify indexing the title, author, and date XML fields.
The fields indexed are:
• title:graph theory
• author:stewart
• date:january 14, 2008


XML index parameters: xmltags (3 of 3)


• SELECT statement:
SELECT xml_data FROM books
WHERE bts_contains(xml_data,'author:stewart');
• Data returned:
<book>
<title>Graph Theory</title>
<author>Stewart</author>
<date>January 14, 2008</date>
</book>


Given the XML data and CREATE INDEX statement from the previous visual, the
example query returns the data shown in XML format.


XML index parameters: all_xmltags (1 of 3)


• Enables indexing of all XML tags
• Set to "yes" or "no" in CREATE INDEX statement
• With xmlpath_processing parameter:
 Full paths indexed
 Attributes of XML tags not indexed in individual fields
• Syntax:
...
USING bts(all_xmltags="yes")
...


XML index parameters: all_xmltags


Use the all_xmltags index parameter to enable searches on all the XML tags or paths in
a column.
The all_xmltags parameter has the format all_xmltags="yes" or all_xmltags="no".
All the XML tags are indexed as fields in the BTS index.
If you use the xmlpath_processing parameter, full paths are indexed. The attributes of
XML tags are not indexed in a field. The text value within fields can be searched.


XML index parameters: all_xmltags (2 of 3)


• XML data:
<book>
<title>Graph Theory</title>
<author>Stewart</author>
<date>January 14, 2008</date>
</book>
• CREATE INDEX statement:
CREATE INDEX book_idx
ON books(xml_data bts_lvarchar_ops) USING bts
(all_xmltags="yes")
IN ...;
 Indexed fields:
− title
− author
− date

To create an index for all the XML tags, use the SQL statement:
CREATE INDEX ... USING bts(all_xmltags="yes") ...;
The index contains three fields that can be searched:
• title:graph theory
• author:stewart
• date:january 14, 2008
The top-level <book></book> tags are not indexed because they do not contain text
values. If you enable path processing with the xmlpath_processing parameter, you can
index the full paths:
CREATE INDEX...USING bts(all_xmltags="yes",xmlpath_processing="yes")...;
The index contains three fields with full paths that can be searched:
• /book/title:graph theory
• /book/author:stewart
• /book/date:january 14, 2008


XML index parameters: all_xmltags (3 of 3)


• SELECT statement:
SELECT xml_data FROM books
WHERE bts_contains(xml_data,'author:stewart');
• Data returned:
<book>
<title>Graph Theory</title>
<author>Stewart</author>
<date>January 14, 2008</date>
</book>


Given the XML data and CREATE INDEX statement from the previous visual, the query
in this example returns the data shown in XML format.


XML index parameters: xmlpath_processing (1 of 3)


• Enables searches based on XML paths
• Set to "yes" or "no" in CREATE INDEX statement
• Paths can be relative or absolute
• Requires using xmltags or setting all_xmltags parameter
• Syntax:
...
USING bts(all_xmltags="yes",xmlpath_processing="yes")
...


XML index parameters: xmlpath_processing


Use the xmlpath_processing index parameter to enable searches based on XML paths.
The xmlpath_processing parameter has the format xmlpath_processing="yes" or
xmlpath_processing="no". Setting the parameter to yes enables path processing.
The xmlpath_processing index parameter requires that you specify tags with the
xmltags parameter or that you enable the all_xmltags parameter.
When you enable xmlpath_processing, the path becomes the field. Text within a tag
that is not within the path cannot be searched.
If xmlpath_processing is not enabled, only individual tags can be searched. The XML
path can be either a full path or a relative path.
Full paths begin with a slash (/). If you use the all_xmltags parameter with
xmlpath_processing, all of the full paths are indexed.
You can index specific full or relative paths when you use the xmltags parameter.


XML index parameters: xmlpath_processing (2 of 3)


• XML data:
<book>
<title>Graph Theory</title>
<author>Stewart</author>
<date>January 14, 2008</date>
</book>
• CREATE INDEX statement:
CREATE INDEX book_idx
ON books(xml_data bts_lvarchar_ops) USING bts
(all_xmltags="yes",xmlpath_processing="yes")
IN ...;
 Indexed paths:
− /book/title
− /book/author
− /book/date

This example shows the CREATE INDEX statement using the xmlpath_processing
index parameter.
This indexes the following paths:
• /book/title:graph theory
• /book/author:stewart
• /book/date:january 14, 2008


XML index parameters: xmlpath_processing (3 of 3)


• SELECT statement:
SELECT xml_data FROM books
WHERE bts_contains
(xml_data,'/book/author:stewart');
• Data returned:
<book>
<title>Graph Theory</title>
<author>Stewart</author>
<date>January 14, 2008</date>
</book>


Given the XML data and CREATE INDEX statement from the previous visual, the query
in this example returns the data shown in XML format.


XML index parameters: include_contents (1 of 3)


• Allows search of XML tag values and XML tag names
• Set to "yes" or "no" in CREATE INDEX statement
• Requires using xmltags or setting all_xmltags parameter
• Syntax:
...
USING bts(all_xmltags="yes",include_contents="yes")
...
• Adds single contents field to index:
 Includes all XML tags and values in one field
contents:<book> <title>Graph Theory</title>
<author>Stewart</author> <date>January 14, 2008</date>
</book>


XML index parameters: include_contents


The include_contents index parameter allows you to search the tag names and attribute
names in addition to the text.
When you do not use XML index parameters, XML documents are indexed as
unstructured text in the contents field.
Use the include_contents index parameter to add the contents field to the index. The
contents field is a single field that includes all the data in the text, including the tags.
The include_contents parameter has the format include_contents="yes" or
include_contents="no".
The include_contents index parameter must be used with either the xmltags parameter
specified or with the all_xmltags parameter enabled.


XML index parameters: include_contents (2 of 3)


• XML data:
<book>
<title>Graph Theory</title>
<author>Stewart</author>
<date>January 14, 2008</date>
</book>
• CREATE INDEX statement:
CREATE INDEX book_idx
ON books(xml_data bts_lvarchar_ops) USING bts
(all_xmltags="yes",include_contents="yes")
IN ...;
 Indexed fields:
− title
− author
− date
− contents

This visual shows the CREATE INDEX statement using the include_contents index
parameter.
This indexes the title, author, and date fields, and adds the contents field, which
includes all text and tags.
The actual fields indexed are:
• title:graph theory
• author:stewart
• date:january 14,2008
• contents:<book> <title>Graph Theory</title> <author>Stewart</author>
<date>January 14, 2008</date> </book>


XML index parameters: include_contents (3 of 3)


• SELECT statement:
SELECT xml_data FROM books
WHERE bts_contains
(xml_data,'contents:stewart');
• Data returned:
<book>
<title>Graph Theory</title>
<author>Stewart</author>
<date>January 14, 2008</date>
</book>


Given the XML data and CREATE INDEX statement from the previous visual, the query
on the contents field in this example returns the data shown in XML format.


XML index parameters: include_namespaces


• Allows indexing of XML tags that include namespaces in format of
prefix:localpart
• Set to "yes" or "no" in CREATE INDEX statement
• Requires using xmltags or setting all_xmltags parameter
• Syntax:
...
USING bts(all_xmltags="yes",include_namespaces="yes")
...
• Must escape colon within namespace with backslash ( \ )
bts_contains("/book/book\:title:theory")


XML index parameters: include_namespaces


The include_namespaces parameter allows you to index XML tags that include
namespaces in the qualified namespace format prefix:localpart.
For example: <book:title></book:title>
The include_namespaces parameter has the format include_namespaces="yes" or
include_namespaces="no".
The include_namespaces index parameter must be used with either the xmltags
parameter specified or with the all_xmltags parameter enabled.
When you enable include_namespaces and the data includes the namespace in the
indexed tags, you must use the namespace prefix in your queries and escape each
colon with a backslash.
For example, to search for the text Theory in the field book:title, use the format:
bts_contains("/book\:title:theory").


XML index parameters: include_subtag_text (1 of 2)


• Allows indexing of XML tags and subtags as one string
• Set to "yes" or "no" in CREATE INDEX statement
• Requires using xmltags or setting all_xmltags parameter
• Syntax:
...
USING bts(all_xmltags="yes",include_subtag_text="yes")
...


XML index parameters: include_subtag_text


Use the include_subtag_text parameter to index XML tags and subtags as one string.
Use this parameter when you want to index text that has been formatted with bold
(<b></b>) or italic (<i></i>) tags.
The include_subtag_text parameter has the format include_subtag_text="yes" or
include_subtag_text="no".
The include_subtag_text index parameter must be used with either the xmltags
parameter specified or with the all_xmltags parameter enabled.


XML index parameters: include_subtag_text (2 of 2)


• XML data:
<comment>
this
<bold> highlighted </bold>
text is very
<italic> <bold>important</bold> </italic>
to me
</comment>
• Indexed fields:
 include_subtag_text disabled
comment:this
comment:text is very
comment:to me
 include_subtag_text enabled
comment:this highlighted text is very important to me

Given the XML fragment shown, if you create a BTS index with the include_subtag_text
parameter disabled (include_subtag_text="no"), the index has three separate comment
fields:
• comment:this
• comment:text is very
• comment:to me
If you create a BTS index with the include_subtag_text parameter enabled
(include_subtag_text="yes"), all of the text is indexed in a single comment field:
comment:this highlighted text is very important to me


XML index parameters: strip_xmltags (1 of 3)

• Adds untagged values to contents field in index


• Do not need to specify xmltags or all_xmltags parameters
• Creates contents field automatically
• Set to "yes" or "no" in CREATE INDEX statement
• Syntax:
...
USING bts(strip_xmltags="yes")
...


XML index parameters: strip_xmltags


You can use the strip_xmltags index parameter to add the untagged values to the
contents field in the index.
The strip_xmltags parameter has the format strip_xmltags="yes" or strip_xmltags="no".
Unlike other XML index parameters, you can use strip_xmltags in a CREATE INDEX
statement without specifying the xmltags or enabling the all_xmltags index parameters.
In this case, the contents field is created automatically.
However, if either you specify xmltags or if you enable all_xmltags, then you must also
enable the include_contents parameter.


XML index parameters: strip_xmltags (2 of 3)


• XML data:
<book>
<title>Graph Theory</title>
<author>Stewart</author>
<date>January 14, 2008</date>
</book>
• CREATE INDEX statement – untagged values only:
CREATE INDEX book_idx
ON books(xml_data bts_lvarchar_ops) USING bts
(strip_xmltags="yes")
IN ...;
 Indexed fields:
− contents


To create an index with the untagged values only, use the statement:
CREATE INDEX ... USING bts(strip_xmltags="yes") ...;
The index contains a single contents field:
contents:Graph Theory Stewart January 14, 2008


XML index parameters: strip_xmltags (3 of 3)

• CREATE INDEX statement – tagged and untagged values:


CREATE INDEX book_idx
ON books(xml_data bts_lvarchar_ops) USING bts
(all_xmltags="yes",include_contents="yes",
strip_xmltags="yes")
IN ...;
 Indexed fields:
− title
− author
− date
− contents


To create an index that has XML tag fields as well as a field for the untagged values,
use the statement:
CREATE INDEX ... USING
bts(all_xmltags="yes",include_contents="yes",strip_xmltags="yes") ...;
The index contains XML tag fields as well as the untagged values in the contents field:
title:graph theory author:stewart date:january 14, 2008 contents:Graph
Theory Stewart January 14, 2008


Basic Text Search DataBlade restrictions

• Replication is not supported


• Distributed queries are not supported
• Parallel database queries (PDQ) are not supported


Basic Text Search DataBlade restrictions


There are a few restrictions with the Basic Text Search DataBlade:
• Replication is not supported
• Distributed queries are not supported
• Parallel database queries (PDQ) are not supported.


Exercise
Basic Text Search DataBlade module
• Configure the Informix instance for basic text searches using the Basic
Text Search DataBlade module
• Conduct searches of textual data using the functions of the BTS
DataBlade
• Explore indexing and searching of XML data


Exercise: Basic Text Search DataBlade module


Exercise:
Basic Text Search DataBlade module

Purpose:
In this exercise, you will learn how to use the Basic Text Search DataBlade.

Task 1. Configure the Basic Text Search DataBlade module.


In this task, you will configure the instance to use the Basic Text Search (BTS)
DataBlade Module, and conduct several text searches. The process will include
configuring the correct virtual processor within the Informix engine and registering
the BTS DataBlade Module with the database.
1. Edit the $ONCONFIG file to start a BTS VP when the engine starts.
• Continuing to edit the $ONCONFIG file, enter a default sbspace for the
system to use, with a value of s9_sbspc.
• Add the parameter to configure one BTS VP to the bottom of the
configuration file. This VP should not yield.
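The two edits might look like the following fragment of the $ONCONFIG file (a sketch for this lab environment; your file's existing entries differ):

```
SBSPACENAME s9_sbspc        # default sbspace for the system
VPCLASS bts,noyield,num=1   # one BTS virtual processor that does not yield
```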
2. Save the $ONCONFIG file and cycle the engine (bring it offline and back
online).
3. Run the BladeManager program by typing in the command blademgr at the
command line.
4. List any DataBlade Modules which are registered with the database
stores_demo.
5. If not registered, list the DataBlade Modules installed in the instance.
The BladeManager command to list the DataBlade Modules which have
been installed in the instance is show modules. This can be abbreviated to
sho mod.
6. Register the Basic Text Search DataBlade Module in the stores_demo
database.
You must type the name of the DataBlade Module exactly as it is shown in
the sho mod listing. The name is case sensitive. If it has the letter “c” after it,
the letter “c” can be ignored.
You will be prompted to be sure you want to install the DataBlade Module.
The default response is Y. Press the Return key to accept it.
You will get a message indicating the BTS DataBlade Module was
successfully registered with the stores_demo database.


7. Exit BladeMgr by typing bye.


8. Create a subdirectory named bts_space under /home/informix/local to be
used as an external space by the instance and make the permissions on the
directory 770.
9. Use the onspaces command to create an external space in the instance, using
the directory just created.
10. Change directories to /home/informix/labs.
11. Modify the customer table to add a text column by running the command:
dbaccess stores_demo alter_cust.sql
This script adds a column thoughts of data type LVARCHAR, and
populates the column with text.
12. In dbaccess, create a BTS index on the newly-created and loaded thoughts
column of the customer table. Use the deferred delete method.
Task 2. Conduct BTS searches.
1. Create and run an SQL query that lists the customer full name and thoughts text
where the thoughts text contains either of the words vegetable or fruit.
2. Create and run an SQL query that lists the customer full name and thoughts text
where the thoughts text contains the exact phrase white chocolate.
3. Create and run an SQL query that lists the customer full name and thoughts text
where the thoughts text contains the words eat and chocolate within 8 words of
each other.
4. You know a customer mentioned something about an emergency plan. You
know the words in the comment are close to each other, but not adjacent.
Create and run an SQL query that lists the customer full name and thoughts text
for this customer.
5. Create and run an SQL query that lists the customer full name and thoughts text
where the thoughts text contains a word similar to contract.


Task 3. Conduct XML searches.


1. Create and load a new table named boats by running the following command:
$ dbaccess stores_demo boats.sql
This script creates a table named boats in your database with an ID column
of type INTEGER and an xml_data column of type LVARCHAR, and loads
three rows into the table.
2. Create a BTS index on the xml_data column of the boats table.
You should index all XML tags and include a contents field.
3. Create and run an SQL query that returns the XML data for records that have
the word “Black” anywhere in the record.
4. Create and run an SQL query that returns the XML data for records that have a
word similar to “Quinn” in the boat name.
Results:
In this exercise, you learned how to use the Basic Text Search DataBlade.


Exercise:
Basic Text Search DataBlade module - Solutions

Purpose:
In this exercise, you will learn how to use the Basic Text Search DataBlade.

Task 1. Configure the Basic Text Search DataBlade module.


In this task, you will configure the instance to use the Basic Text Search (BTS)
DataBlade Module, and conduct several text searches. The process will include
configuring the correct virtual processor within the Informix engine and registering
the BTS DataBlade Module with the database.
1. Edit the $ONCONFIG file to start a BTS VP when the engine starts.
$ vi $INFORMIXDIR/etc/$ONCONFIG
• Continuing to edit the $ONCONFIG file, enter a default sbspace for the
system to use.
• Locate the entry for the SBSPACENAME parameter (tip: to search for a
line in vi, press ESC, then type /SBSPACENAME and press Return)
• Set the parameter's value to s9_sbspc
SBSPACENAME s9_sbspc
• Add the parameter to configure one BTS VP to the bottom of the
configuration file. This VP should not yield. (In vi, press G, or use the Page
Down key, to reach the bottom of this large file; then press A to append at
the end of the last line.)
VPCLASS bts,noyield,num=1
2. Save the $ONCONFIG file and cycle the engine (bring it offline and back
online).
$ onmode -ky
$ oninit -v
3. Run the BladeManager program by typing in the command blademgr at the
command line.
$ blademgr
4. List any DataBlade Modules which are registered with the database
stores_demo.
dev> list stores_demo


5. If not registered, list the DataBlade Modules installed in the instance.


The BladeManager command to list the DataBlade Modules which have
been installed in the instance is show modules. This can be abbreviated to
sho mod.
dev> show mod
6. Register the Basic Text Search DataBlade Module in the stores_demo
database.
You must type the name of the DataBlade Module exactly as it is shown in
the sho mod listing. The name is case sensitive. If it has the letter “c” after it,
the letter “c” can be ignored.
You will be prompted to be sure you want to install the DataBlade Module.
The default response is Y. Press the Return key at this prompt.
You will get a message indicating the BTS DataBlade Module was
successfully registered with the stores_demo database.
dev> register bts.3.10 stores_demo
7. Exit BladeMgr by typing bye.
dev> bye
8. Create a subdirectory named bts_space under /home/informix/local to be
used as an external space by the instance and make the permissions on the
directory 770.
$ mkdir /home/informix/local/bts_space
$ chmod 770 /home/informix/local/bts_space
9. Use the onspaces command to create an external space in the instance, using
the directory just created.
$ onspaces -c -x bts_space -l /home/informix/local/bts_space
10. Change directories to /home/informix/labs.
$ cd /home/informix/labs
11. Modify the customer table to add a text column. Use the prewritten script
alter_cust.sql to accomplish this:
$ dbaccess stores_demo alter_cust.sql
The script adds a column - thoughts (of data type LVARCHAR) - and
populates the column with text.


12. In dbaccess, create a BTS index on the newly-created and loaded thoughts
column of the customer table. Use the deferred delete method.
CREATE INDEX bts_idx
ON customer (thoughts bts_lvarchar_ops)
USING bts
(delete = 'deferred')
IN bts_space;
Task 2. Conduct BTS searches.
1. Create and run an SQL query that lists the customer full name and thoughts text
where the thoughts text contains either of the words vegetable or fruit.
SELECT TRIM(fname) || " " || lname, thoughts
FROM customer
WHERE bts_contains(thoughts,'vegetable fruit');
2. Create and run an SQL query that lists the customer full name and thoughts text
where the thoughts text contains the exact phrase white chocolate.
SELECT TRIM(fname) || " " || lname, thoughts
FROM customer
WHERE bts_contains(thoughts,'"white chocolate"');

3. Create and run an SQL query that lists the customer full name and thoughts text
where the thoughts text contains the words eat and chocolate within 8 words of
each other.
SELECT TRIM(fname) || " " || lname, thoughts
FROM customer
WHERE bts_contains(thoughts,'"eat chocolate"~8');

4. You know a customer mentioned something about an emergency plan. You


know the words in the comment are close to each other, but not adjacent.
Create and run an SQL query that lists the customer full name and thoughts text
for this customer.
SELECT TRIM(fname) || " " || lname, thoughts
FROM customer
WHERE bts_contains(thoughts,'"emergency plan"~3');


5. Create and run an SQL query that lists the customer full name and thoughts text
where the thoughts text contains a word similar to contract.
SELECT TRIM(fname) || " " || lname, thoughts
FROM customer
WHERE bts_contains(thoughts,'contract~');

Task 3. Conduct XML searches.


1. Create and load a new table named boats by running the following command:
$ dbaccess stores_demo boats.sql
This script creates a table named boats in your database with an ID column
of type INTEGER and an xml_data column of type LVARCHAR, and loads
three rows into the table.
2. Create a BTS index on the xml_data column of the boats table.
You should index all XML tags and include a contents field.

CREATE INDEX boats_idx ON boats (xml_data bts_lvarchar_ops)


USING bts(all_xmltags="yes",include_contents="yes")
IN bts_space;

3. Create and run an SQL query that returns the XML data for records that have
the word “Black” anywhere in the record.
SELECT * FROM boats
WHERE bts_contains(xml_data, ' contents:black ');

4. Create and run an SQL query that returns the XML data for records that have a
word similar to “Quinn” in the boat name.
SELECT * FROM boats
WHERE bts_contains(xml_data, ' boatname:quinn~ ');

Results:
In this exercise, you learned how to use the Basic Text Search DataBlade.


Unit summary
• Describe the features of the Basic Text Search DataBlade Module
• Search the database for text content using the Basic Text Search
DataBlade Module
• Use the XML data index and search features of the Basic Text Search
DataBlade Module

Basic Text Search DataBlade module © Copyright IBM Corporation 2017

Unit summary

Node DataBlade module

Node DataBlade module

Informix (v12.10)

© Copyright IBM Corporation 2017


Appendix E Node DataBlade module


Objectives
• Describe how to use the Node DataBlade Module to index
hierarchical data
• Write SQL queries to return hierarchical data using the Node
DataBlade

Node DataBlade module © Copyright IBM Corporation 2017

Objectives


Solving the hierarchical problem


• Who works for whom?
CREATE TABLE employee
(emp_id serial PRIMARY KEY,
mgr_id int,
FOREIGN KEY (mgr_id) REFERENCES employee (emp_id));
• Procedure:
 Select all the employees for the specified manager
 Recursively select all the employees under each person
 Major performance impact
− Execution time increases exponentially with the number of levels


Solving the hierarchical problem


The hierarchical problem of self-referencing tables has been an ongoing problem for
DBAs.
In the typical example shown, there is an employee table with the employee ID as the
primary key.
Since each employee has a manager, who is also an employee, each employee record
references back to the main employee table for the employee manager record.
This results in a recursive select of employees under each manager, which can have a
significant performance impact, especially as the number of levels increases.


Typical employee tree structure

Employee/Manager

Employee Employee/Manager Employee

Employee Employee/Manager Employee

Employee Employee Employee


Typical employee tree structure


This diagram shows a typical employee tree. Each manager has several employees
working under them, some of whom are managers themselves with their own subordinates.


Standard query
• For each level, select the employee count:
SELECT count(*) FROM employee e1, employee e2, employee e3,
employee e4, employee e5
WHERE e5.mgr_id = :value AND e4.mgr_id = e5.emp_id
AND e3.mgr_id = e4.emp_id AND e2.mgr_id = e3.emp_id
AND e1.mgr_id = e2.emp_id;
• Without an extensible engine, must flatten relationships into master-
detail relationship:
 The relationships are nested, not regular 1:N model format
 Requires multiple passes through the data to find all the recursive
relationships
 Can only be solved with procedural or set processing
 As levels increase, programming becomes more complex, losing the ability
to dynamically create SQL operations


Standard query
Running a query such as the one shown above against the tree on the previous
page requires multiple passes through the data. It can have a significant
performance impact, particularly as the number of levels increases.
Without an extensible database engine (such as Informix), the relationships must be
flattened into master-detail or parent-child relationships.


Node DataBlade module


• Improves query performance for many recursive queries
• The Node data type:
 Opaque data type
 Variable length up to 256 characters
 Supports indexes and other relational functionality
 Operations involving ER replication are supported
 Deep copy and LIKE matching statements not supported
• With an extensible engine, can use a data type and associate functions
that represent data in its native, business format as a hierarchical
relationship


Node DataBlade module


The Node DataBlade Module, which is included as part of the Informix bundle, can
significantly improve performance for many recursive queries.
It is able to represent data in its native, business format as a hierarchical relationship.
The Node data type is an opaque type of variable length up to 256 characters, supports
indexes and other relational functionality, and also can be replicated.
The Node data type does not support the LIKE matching statement, however.


A workable hierarchy

1.0
CREATE TABLE employee (
emp_id node PRIMARY KEY,
name varchar(50));

1.1 1.2 1.3

1.2.1 1.2.2 1.2.3

1.2.3.2 1.2.3.3 1.2.3.4

1.2.3.4.5

A workable hierarchy
This chart depicts how the employee table is represented in a hierarchical format using
the Node data type of the Node DataBlade Module.


How the Node DataBlade works (1 of 2)


• Adjacent levels represent the manager and employee identification:
1.2 employee #2, manager #1
Similarly:
1.2.3.4 employee #4, reporting to manager #3, who reports to
manager #2, who reports to manager #1
• Employee table
CREATE TABLE employee (
emp_id node PRIMARY KEY,
name varchar(50));

INSERT INTO employee VALUES ('1.0', 'Sam Smith');


INSERT INTO employee VALUES ('1.2', 'Bob Apple');
INSERT INTO employee VALUES ('1.2.3', 'Roy Brown');
INSERT INTO employee VALUES ('1.4', 'Jane Rogers');
INSERT INTO employee VALUES ('1.4.2', 'Tom Jones');


How the Node DataBlade works


Adjacent levels represent the relationships between the entities, in this case, the
employee and the manager.


How the Node DataBlade works (2 of 2)


• Performance impact
 Becomes linear processing using either table scan or partial index scan
• Functional comparisons are now possible:
 LessThan, LessThanOrEqual, Equal, GreaterThan, GreaterThanOrEqual,
NotEqual
1.12.1 > 1.4.17.8
 IsAncestor, IsChild, IsDescendant, IsParent, Ancestors
• Other admin functions on the structure of the data
 Graft, Increment, NewLevel, GetMember, GetParent


Because of the way the Node DataBlade represents the hierarchy, processing
becomes linear instead of recursive, using either table scans or partial index scans.
This representation allows functional comparisons such as equal, less than, greater
than, less than or equal to, equal to or greater than, and not equal.
Other functions allow administrative tasks to be performed on the structure:
• Graft: Moves sections of the node tree.
• Increment: Determines the next node at the same level.
• NewLevel: Creates a node level.
• GetMember: Returns information about a node level.
• GetParent: Returns the parent node.
• Ancestors: Returns the ancestor list back to the root node.


Node DataBlade queries


• Who is your boss?
SELECT name FROM employee
WHERE emp_id = GetParent('1.4.3');
• Who works for you?
SELECT name FROM employee
WHERE emp_id > '1.4.3'
AND emp_id < Increment('1.4.3')
AND Length(emp_id) = 4;
• The revised employee count query
SELECT count(*) FROM employee
WHERE emp_id > GetParent('1.4.3.2.5')
AND emp_id < Increment(GetParent('1.4.3.2.5'))
AND Length(emp_id) = 5;


Node DataBlade queries


The example SQL statements briefly show how the Node DataBlade could be used to
get various information about managers, subordinates, and counts from the employee
table.
The first query returns the name of the manager of the employee with an emp_id of 1.4.3.
The second query returns the names of any persons this employee directly manages.
The Length function is used to limit the result to employees one level below the
current employee. Without this filter, the query would return all employees at all
levels below the current employee.
The third query is a rewritten version of the query from several slides ago, which
used multiple self-joins, with the number of joins depending on the level of the
tree for which the count was requested.
Using the Node DataBlade, the query does not need to be rewritten based on which
level of the tree is being searched. Only the value of the employee ID and the length
need to be changed as variables.
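The point about variables can be sketched as a single parameterized statement. The :emp placeholder follows the :value convention used on the Standard query slide; this generalized form is an illustration, not a query from the lab files, and it assumes (per the earlier examples) that Length returns the number of levels in a node value:

```sql
-- Count the direct reports of the employee whose node value is :emp.
-- Only the :emp value changes; the query shape is identical at every level.
SELECT count(*) FROM employee
WHERE emp_id > :emp
  AND emp_id < Increment(:emp)
  AND Length(emp_id) = Length(:emp) + 1;
```

With :emp set to '1.4.3', this reduces to the "Who works for you?" query shown above, since Length('1.4.3') + 1 = 4.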


Node functions
• Equals • NewLevel
• NotEqual • GetParent
• LessThan • IsParent
• LessThanOrEqual • IsChild
• GreaterThan • Ancestors
• GreaterThanOrEqual • Graft
• Compare • IsDescendant
• Increment • GetMember
• Length
• Depth


Node functions
A list of the functions that are part of the Node DataBlade Module is shown here.


Exercise
Node DataBlade module
• Configure and use the Node DataBlade module for indexing
hierarchical data


Exercise: Node DataBlade module


Exercise:
Node DataBlade module

Purpose:
In this exercise, you will learn how to use the Node DataBlade.

Task 1. Register the Node DataBlade module in the database.


In this task, you will register the Node DataBlade module with the database.
1. Run the BladeManager program by typing in the command blademgr at the
command line.
2. List any DataBlade Modules which are registered in your stores_demo
database.
3. If not registered, list the DataBlade Modules installed in the instance.
The BladeManager command to list the DataBlade Modules which have
been installed in the instance is show modules. This can be abbreviated to
sho mod.
4. Register the Node DataBlade Module in your stores_demo database.
You must type the name of the DataBlade Module exactly as it is shown in
the sho mod listing. The name is case sensitive. If it has the letter “c” after it,
the letter “c” can be ignored.
You will be prompted to be sure you want to install the DataBlade Module.
The default response is Y. Press the Return key at this prompt.
You will get a message indicating the Node DataBlade Module was
successfully registered with your stores_demo database.
5. Exit BladeMgr by typing bye.


Task 2. Explore the traditional recursive query.


In this task, you will explore the traditional recursive query and then observe how
the query changes by utilizing the Node DataBlade Module.
1. If the table emp1 exists, drop it.
2. Create the emp1 table by executing the following command:
$ dbaccess stores_demo emp1.sql
This query creates and populates the emp1 table with 15 rows.
3. Run the following query by executing the command:
$ dbaccess stores_demo emp1_q1.sql
This is a recursive query, generating 5 passes through the table in order to get
the data it needs.
The query also generates an Explain output file.
SET EXPLAIN ON;
SELECT count(*) FROM emp1 e1, emp1 e2, emp1 e3,
emp1 e4, emp1 e5
WHERE e5.mgr_id = 9
AND e4.emp_id = e5.mgr_id
AND e3.emp_id = e4.mgr_id
AND e2.emp_id = e3.mgr_id
AND e1.emp_id = e2.mgr_id;
SET EXPLAIN OFF;
4. Examine the output for the access and join methods and the cost.
Note that the engine processed the query as if it were a 5-table join, essentially
making 5 passes through the table.


Task 3. What difference the Node DataBlade makes.


In this task, you will utilize the Node DataBlade model to index hierarchical data and
examine the Explain output to evaluate the effects.
1. Create the emp2 table by executing the following command:
$ dbaccess stores_demo emp2.sql
This query creates and populates the emp2 table with 15 rows.
CREATE TABLE emp2 (
emp_id SERIAL,
emp_node NODE,
name VARCHAR(30),
ssn CHAR(11),
phone CHAR(14),
dept CHAR(10)
);
Note that there is no column for mgr_id.
2. Run the following query by executing the command:
$ dbaccess stores_demo emp2_q1.sql
This query will obtain the same data as the recursive query from the previous
section.
The query also generates an Explain output file.
SET EXPLAIN ON;
SELECT count(*) FROM emp2
WHERE emp_node > GetParent('1.1.1.1.2')
AND emp_node < Increment(GetParent('1.1.1.1.2'))
AND Length(emp_node) = 5;
SET EXPLAIN OFF;
3. Examine the Explain file generated for access and join methods and costs.
Note that the engine now only makes one pass through the table to get the
information it needs.
In addition, to run the query for different levels of the hierarchy, only the value of
the Node and level variables need change; whereas in the previous example
the entire query would have to be rewritten specific to each level.
Results:
In this exercise, you learned how to use the Node DataBlade.


Exercise:
Node DataBlade module - Solutions

Purpose:
In this exercise, you will learn how to use the Node DataBlade.

Task 1. Register the Node DataBlade module in the database.


In this task, you will register the Node DataBlade module with the database.
1. Run the BladeManager program by typing in the command blademgr at the
command line.
2. List any DataBlade Modules which are registered in your stores_demo
database.
dev>list stores_demo
3. If not registered, list the DataBlade Modules installed in the instance.
The BladeManager command to list the DataBlade Modules which have been
installed in the instance is show modules. This can be abbreviated to sho mod.
4. Register the Node DataBlade Module in your stores_demo database.
You must type the name of the DataBlade Module exactly as it is shown in the
sho mod listing. The name is case sensitive. If it has the letter “c” after it, the
letter “c” can be ignored.
You will be prompted to be sure you want to install the DataBlade Module. The
default response is Y. Press the Return key at this prompt.
You will get a message indicating the Node DataBlade Module was successfully
registered with your stores_demo database.
dev>register Node.2.0 stores_demo
5. Exit BladeMgr by typing bye.
dev>bye


Task 2. Explore the traditional recursive query.


In this task, you will explore the traditional recursive query and then observe how
the query changes by utilizing the Node DataBlade Module.
1. If the table emp1 exists, drop it.
2. Create the emp1 table by executing the following command:
$ dbaccess stores_demo emp1.sql
This query creates and populates the emp1 table with 15 rows.
3. Run the following query by executing the command:
$ dbaccess stores_demo emp1_q1.sql
This is a recursive query, generating 5 passes through the table in order to get
the data it needs.
The query also generates an Explain output file.
SET EXPLAIN ON;
SELECT count(*) FROM emp1 e1, emp1 e2, emp1 e3,
emp1 e4, emp1 e5
WHERE e5.mgr_id = 9
AND e4.emp_id = e5.mgr_id
AND e3.emp_id = e4.mgr_id
AND e2.emp_id = e3.mgr_id
AND e1.emp_id = e2.mgr_id;
SET EXPLAIN OFF;


4. Examine the output for the access and join methods and the cost.
Note that the engine processed the query as if it were a 5-table join, essentially
making 5 passes through the table.
$ more sqexplain.out


Task 3. What difference the Node DataBlade makes.


In this task, you will utilize the Node DataBlade model to index hierarchical data and
examine the Explain output to evaluate the effects.
1. Create the emp2 table by executing the following command:
$ dbaccess stores_demo emp2.sql
This query creates and populates the emp2 table with 15 rows.
CREATE TABLE emp2 (
emp_id SERIAL,
emp_node NODE,
name VARCHAR(30),
ssn CHAR(11),
phone CHAR(14),
dept CHAR(10)
);
Note that there is no column for mgr_id.
2. Run the following query by executing the command:
$ dbaccess stores_demo emp2_q1.sql
This query will obtain the same data as the recursive query from the previous
section.
The query also generates an Explain output file.
SET EXPLAIN ON;
SELECT count(*) FROM emp2
WHERE emp_node > GetParent('1.1.1.1.2')
AND emp_node < Increment(GetParent('1.1.1.1.2'))
AND Length(emp_node) = 5;
SET EXPLAIN OFF;
3. Examine the Explain file generated for access and join methods and costs.
Note that the engine now only makes one pass through the table to get the
information it needs.
$ more sqexplain.out
In addition, to run the query for different levels of the hierarchy, only the value of
the Node and level variables need change; whereas in the previous example
the entire query would have to be rewritten specific to each level.


Results:
In this exercise, you learned how to use the Node DataBlade.


Unit summary
• Describe how to use the Node DataBlade module to index
hierarchical data
• Write SQL queries to return hierarchical data using the Node
DataBlade


Unit summary


Using Global Language Support

Using Global Language


Support

Informix (v12.10)

© Copyright IBM Corporation 2017


Appendix F Using Global Language Support


Objectives
• Set environment variables necessary for Global Language Support
• List the components of a locale
• Use the NCHAR and NVARCHAR data types
• Explain the effect of collation sequence on various SQL statements

Using Global Language Support © Copyright IBM Corporation 2017

Objectives


Global Language Support


• Global Language Support (GLS) provides support for:
 International characters (non-ASCII)
 Localized collation sequence
 National currency symbols and format
 Local date format
 Local time format
 Code set conversion

À ö
é
ñ

Global Language Support


Global Language Support (GLS) provides support for international cultural and
language conventions.
Using GLS:
• All user-specifiable objects such as tables, columns, views, statements, cursors,
and variables, can be identified with national code sets including multibyte code
sets.
• You have the option of using a localized collation sequence by using the data
types NCHAR and NVARCHAR instead of CHAR and VARCHAR. The localized
collation sequence is used in the ORDER BY and GROUP BY clauses of the
SELECT statement and when an index is created on an NCHAR or NVARCHAR
column. It is also used in the WHERE clause whenever logical, relational, or
regular expression operators are used.
• Monetary formats can be used which reflect the language or cultural specifics of a
country or culture outside the U.S.
• Different code sets can be specified for client applications, the database, and the
database server. A process called code set conversion translates the characters
passed from one locale to another.
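The second point can be illustrated with a minimal sketch (the table and column names here are illustrative, not from the course database):

```sql
-- CHAR/VARCHAR columns sort in code-set (byte) order;
-- NCHAR/NVARCHAR columns sort using the localized collation sequence.
CREATE TABLE customer_gls (
    cust_code CHAR(10),     -- code-set order
    cust_name NVARCHAR(50)  -- localized collation
);

-- The ORDER BY below applies the localized collation of the locale in effect:
SELECT cust_name FROM customer_gls ORDER BY cust_name;
```

The same localized collation would also apply to an index created on the cust_name column, as described above.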


What is a locale?
• A locale is a language environment composed of:
 A code set
 A collation sequence
 A character classification
 Numeric (non-money) formatting
 Monetary formatting
 Date and time formatting
 Messages
• Define a locale with an environment variable. For example:
 $ setenv CLIENT_LOCALE ja_jp.sjis


What is a locale?
A GLS locale represents the language environment for a specific location. It contains
language specifications as well as regional and cultural information. A locale consists of
a code set, a collation sequence, formatting specifications for numeric, money, date,
and time values, and message definitions.
You can define separate locales for the client application, the database, and the
database server. The three environment variables which you can set are:
• CLIENT_LOCALE
• DB_LOCALE
• SERVER_LOCALE
The specification of a locale defines the GLS behavior. No other flags or environment
variables need to be set. The default locale for the client application, the database, and
the database server is U.S. English (en_us.8859-1).


Locale naming convention


A locale name is composed of the following set of identifiers: language, territory, and
code set. The language and the territory identifiers are each two characters separated
by an underscore. The code set identifier is the suffix and is prefaced with a period. The
client and database locales must be the same (unless there is code set conversion,
which is explained in the following pages). The optional four-character modifier
specifies an override collation sequence, such as phone-book or dictionary order (phon or dict).
Acceptable modifiers are listed along with the locale names when you run glfiles (see
following pages).
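Because the naming convention is purely positional, a locale name can be parsed mechanically. The following Python sketch is a hypothetical helper (not part of any Informix product), and the @modifier placement is an assumption based on the POSIX locale convention:

```python
def parse_locale(name):
    """Split an Informix-style locale name, e.g. 'ja_jp.sjis' or
    'en_us.8859-1@dict', into language, territory, code set, modifier."""
    base, _, modifier = name.partition("@")      # optional @modifier suffix
    lang_terr, _, codeset = base.partition(".")  # code set follows the period
    language, _, territory = lang_terr.partition("_")
    return language, territory, codeset, modifier

print(parse_locale("ja_jp.sjis"))         # ('ja', 'jp', 'sjis', '')
print(parse_locale("en_us.8859-1@dict"))  # ('en', 'us', '8859-1', 'dict')
```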
Messages
If you want error and warning messages in a language other than English, install an
Informix Language Supplement for a particular language. To reference pre-existing
error messages in non-GLS directories, use the DBLANG environment variable.


A locale specifies a code set

ASCII codes (decimal):  65   67   77   69   32   67   111
ASCII symbols:           A    C    M    E          C     o


A locale specifies a code set


All data accessed by computers is represented by a series of ones and zeroes. Non-
binary data, such as a letter or symbol, must have a unique binary code to be
recognized by the computer. A set of character codes used to represent all the
characters and symbols in a language is called a code set. A code set is a mapping of
characters to their binary representations. The ASCII code set is composed of 128
symbols including lower and uppercase letters, digits 0-9, and various additional
symbols such as {,/,+, and (.
One byte of storage is required for every character in the ASCII code set. The
maximum number of symbols that can be represented by 1 byte is 256. Many
languages are able to extend the ASCII code set beyond the standard 128 symbols and
still use only one byte of storage per character.
A locale specifies a particular code set. The default code set used by Informix
databases is ISO8859-1 (ASCII is a subset of ISO8859-1). The ISO8859-1 code set
uses 8 bits, whereas ASCII uses 7 bits.
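The mapping from characters to codes is easy to observe in Python, which exposes code-point values directly. This is a generic illustration, not Informix-specific:

```python
text = "ACME Co"
print([ord(c) for c in text])    # [65, 67, 77, 69, 32, 67, 111]

# ISO 8859-1 uses the eighth bit for additional characters: ASCII characters
# keep their one-byte codes, and 'À' also fits in a single byte (0xC0 = 192).
assert "A".encode("iso8859-1") == b"\x41"
assert "À".encode("iso8859-1") == b"\xc0"
```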


Multibyte code sets

Logical character representation

A B C D

A1 A2 B C1 C2 C3 D

Physical storage representation


Multibyte code sets


One byte of storage can represent a maximum of 256 symbols. If a code set defines
more than 256 characters, some characters require more than 1 byte of storage. These
are referred to as multibyte code sets. Some Asian languages use thousands of
characters, some of which require 2 or 3 bytes of storage. Informix GLS supports
multibyte code sets using up to 4 bytes of storage per symbol.
In an environment that uses multibyte code sets, character strings can contain a
mixture of single-byte and multibyte characters. All character data types (CHAR,
VARCHAR, NCHAR, and NVARCHAR) can accommodate multibyte characters.
The diagram illustrates the physical storage of multibyte characters and the
corresponding logical characters. For example, A1A2 represent the first and second
bytes of a logical character designated as A. B and D are single-byte characters, and C
is a multibyte character that requires 3 bytes of storage.
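The logical-versus-physical distinction can be reproduced with any multibyte code set. This Python sketch uses Shift-JIS, the code set of the ja_jp.sjis locale mentioned earlier:

```python
s = "東a京"                 # 2 multibyte characters around 1 single-byte 'a'
raw = s.encode("shift_jis")  # each kanji occupies 2 bytes in Shift-JIS
print(len(s), len(raw))      # 3 logical characters, 5 bytes of storage
```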


Using Multibyte code sets


3 logical characters and
6 bytes of storage

A1 A2 B1 B2 C1 C2

1 2 3 4 5 6

• Column length: must accommodate physical length


CREATE TABLE gls_test(
multi_col CHAR(6)
...)
• Substring designators: operate on physical length
multi_col[1,2] - displays logical character A
multi_col[2,4] - displays logical character B


Using Multibyte code sets


Column lengths, substring offsets, and substring lengths are defined in terms of the
physical number of bytes, not the logical number of characters. When defining column
length in a multibyte code set environment, make allowances for the maximum number
of bytes each character can require. If the multibyte maximum is 2 bytes per character,
the maximum length for any of the character data types would be:
2 * (maximum number of logical characters)
You might want to use VARCHAR, NVARCHAR, or TEXT if the number of characters is
variable.
Substring designators
Substring designators specify the byte offset and length of a portion of a string. They
operate on physical storage, not on logical characters. For example, in a string
composed of 2-byte characters, the expression multi_col[1,2] retrieves the first
logical character, A1A2.
It is possible to retrieve a partial character with a substring designator. For example, if
multi_col contains the string A1A2B1B2C1C2, the expression multi_col[2,4] retrieves
logical character B plus the partial character A2. The database server resolves
partial characters by returning white spaces. This behavior applies to the data types
CHAR, VARCHAR, NCHAR, and NVARCHAR.


For data types BYTE and TEXT, the database server returns all bytes without partial
character replacement.
Substring designators should be used only when it is possible to determine the physical
location of the logical characters desired. The SQL functions LENGTH,
OCTET_LENGTH, and CHAR_LENGTH can be used to determine the physical and
logical lengths of strings in columns. LENGTH returns the string length in bytes,
excluding trailing white spaces. OCTET_LENGTH returns the number of bytes, and
CHAR_LENGTH returns the number of logical characters.
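The same byte arithmetic can be tried in Python on a Shift-JIS string of three 2-byte characters (the A1A2 B1B2 C1C2 pattern). Note one difference: where Informix pads partial characters with blanks, Python simply refuses to decode them:

```python
s = "あいう"                          # 3 logical characters: A, B, C
raw = s.encode("shift_jis")           # A1A2 B1B2 C1C2 -- 6 bytes
print(raw[0:2].decode("shift_jis"))   # bytes 1-2, like multi_col[1,2]: あ

# bytes 2-4, like multi_col[2,4], begin in the middle of character A:
try:
    raw[1:4].decode("shift_jis")
except UnicodeDecodeError:
    print("slice begins with a partial character")
```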
SQL identifiers
GLS allows you to use any alphabetic characters of a code set to form most SQL
identifiers (names of tables, columns, views, indexes, and so on). The servername,
dbspace names, and blobspace names are the exceptions. The locale defines which
characters within a code set are considered alphabetic. Multibyte characters can be
used within an identifier, but the physical length of an identifier must not exceed 128
bytes. An identifier that contains multibyte characters therefore holds fewer logical
characters than its length in bytes.


A locale specifies a collation order

• Code set order: the physical order of characters in the code set
• Localized order: the language-specific order of characters

Code Set Order Localized Order

A A
C À
a a
b b
c C
À c


A locale specifies a collation order


Collation is the order in which characters are sorted within a code set. Informix
database servers support two types of collation.
• Code set order: The order of the character codes in the code set determines the
sort order. For example, in the ASCII code set, A = 65 and B = 66. A sorts before
B because 65 is less than 66. However, because a = 97 and M = 77, the string
abc sorts after Me.
• Localized order: The locale determines the sort order. For example, even though
the character À might be represented by a code set code of 133, the locale file
could list this character after A and before B (A = 65, À = 133, B = 66). This
represents the more proper sort order for the language represented by the locale.
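The two orders can be contrasted directly. In this Python sketch, the locale file is stood in for by a hand-written collation table (an assumption made purely for illustration; real servers read the ordering from the locale):

```python
names = ["b", "À", "a", "C", "A", "c"]

# Code set order: compare raw character codes (A=65 ... a=97 ... À=192)
print(sorted(names))                      # ['A', 'C', 'a', 'b', 'c', 'À']

# Localized order: rank characters by a locale-defined sequence instead
collation = {ch: rank for rank, ch in enumerate("AÀaBbCc")}
print(sorted(names, key=collation.get))   # ['A', 'À', 'a', 'b', 'C', 'c']
```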


NCHAR and NVARCHAR

Data Types Collation Order

CHAR Code set order


VARCHAR(max,reserve)
TEXT

NCHAR Localized order


NVARCHAR(max,reserve)


NCHAR and NVARCHAR


The NCHAR and NVARCHAR data types differ from the CHAR and VARCHAR data
types in that the database server sorts the data with a localized collation order. For example, an
index created on an NCHAR column is ordered in the localized sequence, whereas an
index created on a CHAR column is ordered in code set sequence. Data selected from
a CHAR column and sorted with the ORDER BY clause is output in code set order,
whereas the output from an NCHAR column is in localized order.
The syntax for using NCHAR and NVARCHAR is essentially the same as for CHAR
and VARCHAR:
CREATE TABLE gls_test(
col_1 NCHAR(20),
col_2 NVARCHAR(128,10)
);


Collation Order and SQL statements


• Data types NCHAR and NVARCHAR only:

CREATE INDEX nchar_idx ON gls_test(nchar_col1)

SELECT * FROM gls_test ORDER BY nchar_col1

SELECT * FROM gls_test


WHERE nchar_col1 BETWEEN 'À' and 'b'
...WHERE nchar_col1 IN ('Àlvin','Johnson','Lane')
...WHERE nchar_col1 MATCHES 'À*'
...WHERE nchar_col1[1,1] > 'À'


Collation Order and SQL statements


The collation order of NCHAR and NVARCHAR data types depends on the localized
order as defined in the locale. The major instances where localized collation order
impacts processing are:
• CREATE INDEX on an NCHAR or NVARCHAR column
• SELECT...ORDER BY <NCHAR or NVARCHAR column>
• SELECT...WHERE <NCHAR or NVARCHAR column> clause containing
relational operators (=,<,>,>=,<=,!=), IN, BETWEEN, LIKE, or MATCHES
The localized collation order specified by a locale dictates a specific ordering of
characters. The statement:
SELECT col1 FROM tab1 WHERE col1 < 'c'
executed on a CHAR or VARCHAR column (code set order) might return A, C, a,
and b. Executed on an NCHAR or NVARCHAR column (localized order), the results
might be different: A, À, a, b, and C.


Localized collation sequences can specify case folding (case insensitivity) or sets of
characters that are treated as equivalent. For example, if collation is in code set order
(data types CHAR or VARCHAR), the statement:
SELECT lname FROM customer
WHERE lname IN ('Azevedo','Llaner','Oatfield')
returns only the exact match Azevedo, not azevedo or Àzevedo, whereas in localized
order all three might be returned.
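A rough stand-in for such equivalence rules can be written with Unicode normalization. The folding logic below is an assumption made for illustration, since each locale defines its own equivalences:

```python
import unicodedata

def fold(s):
    """Case-fold and strip accents -- a crude stand-in for locale equivalence."""
    decomposed = unicodedata.normalize("NFD", s.casefold())
    return "".join(c for c in decomposed if not unicodedata.combining(c))

names = ["Azevedo", "azevedo", "Àzevedo"]
print([n for n in names if fold(n) == fold("Azevedo")])   # all three match
```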


Locales: Numeric and Monetary formats

• Numeric:
 US English
− 3,225.01

 French
− 3 225,01
• Monetary:
 US English
− $100,000.49

 French
− 100 000,49FF


Locales: Numeric and Monetary formats


Numeric formats can be specified by the locale. They can impact the decimal separator,
the thousands separator (and the number of digits in between), and the positive and
negative symbol. This type of formatting applies to the end-user formats of numeric
data (DECIMAL, INTEGER, SMALLINT, FLOAT, SMALLFLOAT) within a client
application. It does not impact the format of the numeric data types in the database.
The locale can have monetary formatting information that impacts values stored as
the MONEY data type. The format might impact the currency symbol, the decimal
separator, the thousands separator (and the number of digits in between), the
positive and negative position and symbol, and the number of fractional digits to
display. This formatting applies to the end-user format of MONEY data within a
client application. It does not impact the format of the data stored in the database.
DBMONEY
You can also use the DBMONEY environment variable to specify the currency
symbol for monetary values and the location of that symbol. The DBMONEY
environment variable overrides the settings of the monetary category of the locale
file.
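The separators themselves are simple to emulate. This Python sketch hard-codes the French conventions shown on the slide rather than reading them from a locale file, so the replacement rules are an assumption for illustration:

```python
value, money = 3225.01, 100000.49

us_num = f"{value:,.2f}"
print(us_num)                                    # 3,225.01
print(f"${money:,.2f}")                          # $100,000.49

# French style: space as thousands separator, comma as decimal separator
fr_num = us_num.replace(",", " ").replace(".", ",")
print(fr_num)                                    # 3 225,01
print(f"{money:,.2f}".replace(",", " ").replace(".", ",") + "FF")  # 100 000,49FF
```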


A locale specifies Date and Time formats

Julian Year Ming Guo Year

1993 82
1912 01
1911 -01
1910 -02
1900 -12


A locale specifies Date and Time formats


The locale can include date and time formatting specifications. This can influence
DATE and DATETIME column values, and can include names and abbreviations for
days of the week and months of the year, commonly used representations for dates,
time (12-hour and 24-hour), and date/time. GLS supports non-Gregorian calendars (for
example, the Taiwanese Ming Guo year and the Arabic lunar calendar).
Locale-specific date and time formatting impacts the presentation and entry of data at
the client, not the way in which the data is stored in the database.


Date and Time customization


• Order of precedence:
 DBDATE
• setenv DBDATE Y4MD-
• => 1995-10-25
 DBTIME (for DATETIME year to second)
• setenv DBTIME '%Y-%m-%d %H:%M:%S'
• => 1995-10-25 16:30:28
 GL_DATE
• setenv GL_DATE 'Day %d Month %m Year %Y (%A)'
• => Day 14 Month 11 Year 1995 (Tuesday)
 GL_DATETIME (for DATETIME year to second)
• setenv GL_DATETIME '%b %d, %Y at %H h %M m %S s'
• => Oct 25, 1995 at 16 h 30 m 28 s


Date and Time customization


GLS recognizes the following environment variables for customizing date and time
values (listed in order of precedence):
• DBDATE
• DBTIME (ESQL/C and ESQL/COBOL only)
• GL_DATE
• GL_DATETIME
It is recommended that you use GL_DATE and GL_DATETIME because of the greater
flexibility. DBDATE and DBTIME are compatible with earlier versions.
Extensive date and time customization is available using these environment variables.
They provide support for alternative dates and times including (Asian) formats such as
the Taiwanese Ming Guo year and the Japanese Imperial-era dates.
When the client requests a connection, it sends the date and time environment
variables to the database server.
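The %-directives follow the strftime convention, so the formats above can be checked with Python's datetime module. The month abbreviation assumes the default C locale:

```python
from datetime import datetime

ts = datetime(1995, 10, 25, 16, 30, 28)
print(ts.strftime("%Y-%m-%d"))                     # 1995-10-25, like DBDATE Y4MD-
print(ts.strftime("%b %d, %Y at %H h %M m %S s"))  # Oct 25, 1995 at 16 h 30 m 28 s
```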


Locales: Client, Database, and Server


Locales: Client, Database, and Server


A separate locale exists for a client application, a database, and a database server.
When a database is created, a condensed version of the database locale is stored in
the systables system catalog table. This information is used by the database server for
operations such as handling regular expressions, collating character strings, and
ensuring proper use of code sets. The
database locale for a particular database cannot be changed. If you want to change the
locale of a database, you must:
• Unload the data.
• Drop the database.
• Create a database with the desired locale (by setting DB_LOCALE).
• Load the data.
Client applications use the client locale when they perform read and write operations on
the client computer. Operations include reading a keyboard entry or a file, and writing to
the screen, a file, or a printer. Most localized date, number, money, and message
processing is performed by the client.


The server locale determines how the database server performs I/O operations on the
server computer. These I/O operations include reading or writing the following files:
• Diagnostic files that the database server generates to provide additional
diagnostic information
• Log files that the database server generates to record events
• Explain file, sqexplain.out, that is generated by executing the SQL statement SET
EXPLAIN
The database server is the only IBM Informix product that needs to know the server
locale.
Locale compatibility
The languages and territories of the client, database, and server locales might be
different if the code sets are the same. Be careful, however, because GLS does not
provide semantic translation. If the locale stored in the database is en_us.8859-1 and
the CLIENT_LOCALE is fr_fr.8859-1, a value stored in the database as $10.00 is
displayed on the client as 10,00FF. There is no exchange rate calculation.
Additionally, the code set of the locale stored in the database might differ from the
CLIENT_LOCALE code set. However, there are restrictions. If a database is created
with DB_LOCALE = aa_bb.cs1, then the CLIENT_LOCALE might equal any locale,
cc_dd.cs2, but only if locale cc_dd.cs1 exists and there is code set conversion between
cs1 and cs2 (code set conversion is explained later in the unit). If cc_dd.cs1 does not
exist, then an error -23101 is returned.
If the SERVER_LOCALE is not compatible with the DB_LOCALE (that is, the code sets
are different and not convertible), data is written to external files without code set
conversion.
Most processing relating to collation sequence or character classification is handled
by the database server. Most processing related to formatting of date, number, and
money values is performed by the client.


Specifying locales
• Default:
 setenv CLIENT_LOCALE en_us.8859-1
 setenv DB_LOCALE en_us.8859-1
 setenv SERVER_LOCALE en_us.8859-1
• Example:
 setenv CLIENT_LOCALE ja_jp.sjis
 setenv DB_LOCALE ja_jp.ujis
 setenv SERVER_LOCALE ja_jp.ujis


Specifying locales
The following three environment variables specify the locales for the client application,
database, and database server.
• CLIENT_LOCALE
• DB_LOCALE
• SERVER_LOCALE
When the client requests a connection, it sends CLIENT_LOCALE and DB_LOCALE to
the database server. If the client and database locales sent by the client are not
compatible with what is stored in the database, a warning is returned to the client in the
SQL communications area (SQLCA) via the SQLWARN7 warn flag (except when the
code sets differ and code set conversion is available). The client application should
check this flag after connecting to a database.
The server locale, specified by SERVER_LOCALE, determines how the database
server reads and writes external files.


Multiple locales: Code set conversion


Multiple locales: Code set conversion


In a client/server environment, character data might need to be converted from one
code set to another if the client, database, or server computers use different code sets
to represent the same characters. Converting character data from one code set to
another is called code set conversion.
Code set conversion is needed when:
• One language has different code sets representing subsets of the language.
• Different operating systems encode the same characters in different ways.
In the client/server environment, the following situations require code set conversion:
• If the client locale and database locale specify different code sets, the client
application performs code set conversion so that the server computer is not
loaded with this type of processing.
• If the database locale and the server locale specify different code sets, the
database server performs code set conversion when it writes to and reads from
operating-system files such as log files.


Code set conversion does not convert words to different languages. For example, it
does not convert the English word yes to the French word oui. It only ensures that each
character is processed or printed the same regardless of how it is encoded. Code set
conversion does not:
• Perform semantic translation. Words are not translated from one language to
another.
• Create characters which do not exist in the target code set. Conversion is from a
valid source character to a valid target character via a conversion file.
Code set conversion file
A code set conversion file is used to map source characters to target characters. If a
conversion file does not exist for the source-to-target relationship, an error is returned to
the client application when it begins execution. BYTE data is never converted. Use the
glfiles utility to generate a listing of the code set conversion files available on your
system.
Compatible locales
The code set of the CLIENT_LOCALE (cc_dd.cs2) might differ from the code set of
the locale stored in the database (aa_bb.cs1), only if locale cc_dd.cs1 exists and
there is a code set conversion file between cs1 and cs2.
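Code set conversion can be sketched generically with Python codecs. Unlike Informix, which maps code set to code set through a conversion file, Python goes through Unicode, but the two properties above (same characters, no invented characters) hold either way:

```python
text = "東京"
sjis = text.encode("shift_jis")   # bytes under the ja_jp.sjis code set
ujis = text.encode("euc_jp")      # the same characters under ja_jp.ujis
print(sjis != ujis)               # True: identical text, different byte encodings
print(ujis.decode("euc_jp") == sjis.decode("shift_jis"))   # True

# A character absent from the target code set cannot be created by conversion:
try:
    "À".encode("shift_jis")
except UnicodeEncodeError:
    print("'À' does not exist in the target code set")
```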


Conversion: Performance consideration


• Minimize code set conversion
• Determine number and locales of clients
• Build databases with the locales that minimize code set conversion
• Be aware of where code set conversion occurs:
 Client: CLIENT_LOCALE != DB_LOCALE
 Database server: DB_LOCALE != SERVER_LOCALE


Conversion: Performance consideration


Code set conversion requires processing resources. You should analyze your system
configuration to determine the locale settings for clients, databases, and database
servers which minimize code set conversion. For example, if an environment consists
of 100 clients with locale ja_jp.ujis and 2 clients with locale ja_jp.sjis, it would be
reasonable to create the database with locale ja_jp.ujis.


Multibyte character: Utilities and APIs

• Informix utilities:
onaudit onshowaudit dbaccess
dbload onstat dbexport
oncheck onunload dbimport
onload dbschema

• ESQL/C
• ESQL/COBOL


Multibyte character: Utilities and APIs


Most Informix utilities support multibyte characters (and 8-bit characters). ESQL/C and
ESQL/COBOL support multibyte characters as long as your compiler supports the
same single-byte or multibyte code set that the ESQL source file uses. If your C
compiler does not support the code set, you can use the CC8BITLEVEL environment
variable (documented in the Guide to GLS Functionality) as a workaround to specify the
preprocessing environment for your C compiler. For example, setting CC8BITLEVEL to
0 indicates to the ESQL preprocessor that the compiler does not support using the
eighth bit in strings and comments.


The glfiles utility


• Output from the glfiles utility displays GLS files on your system:
 GLS locale files
 Informix code set conversion files
 Informix code set files:
− glfiles –lc {for locale files}
− glfiles –cv {for conversion files}
− glfiles –cm {for code set files}


The glfiles utility


You can use the glfiles utility to find out what locale files, code set conversion files, and
code set files are stored on your system. When you execute the glfiles utility, the output
is stored in a series of files in the current directory.
Locale files
The locales are stored in a file named lcX.txt, where X is the version of the locale object
file. The lcX.txt file lists the locales in alphabetic order sorted on the name of the GLS
object locale file.
Code set conversion files
The code set conversion files are stored in files named cvY.txt, where Y is the version
number of the code set conversion object file. The cvY.txt file lists the code set
conversions in alphabetic order, sorted on the name of the object code set conversion
file. Most code set to code set conversions have two code set conversion files: code set
A => code set B and code set B => code set A.
Code set files
The list of code set files is stored in files named cmZ.txt, where Z is the version number
of the code set object file format. The cmZ.txt file lists the code sets in alphabetic order,
sorted on the name of the GLS object code set file.


Migrating to GLS from NLS or ALS


• Determine the NLS or ALS locale.
• Determine whether GLS supports the old locale.
• If the old locale is supported, decide whether to keep the old locale or
convert to a GLS custom locale.
• If staying with the old locale, no special steps are needed. The
database is converted to GLS when opened.
• If changing to a new locale, the database must be unloaded and then
re-created and loaded with the new locale.


Migrating to GLS from NLS or ALS


Older versions of Informix supported NLS (Native Language Support) and ALS (Asian
Language Support). If you are using NLS or ALS and are migrating to Version 7.2 or
greater, you must convert to GLS.
Two types of locales are supported in Version 7.2:
• GLS custom locales: These are the same for all operating systems. Distributed
queries across different platforms yield the same results as queries between
different databases on the same database server.
• Locales compatible with operating-system locales: These might be different from
one platform to another. In pre-7.2 versions of Informix, NLS and ALS use
operating system locales.
Decide whether to use the current locale or to convert to a GLS custom locale.
• Upgrading with the current, operating-system locale requires no special action on
your part. However, distributed queries across dissimilar platforms might give
incorrect results because of different locale category definitions.
• Changing from an operating system locale to a custom locale requires that you
unload, then reload your data.


Unit summary
• Set environment variables necessary for Global Language Support
• List the components of a locale
• Use the NCHAR and NVARCHAR data types
• Explain the effect of collation sequence on various SQL statements


Unit summary

IBM Training

© Copyright IBM Corporation 2017. All Rights Reserved.
