DB2 Basic Material

INTRODUCTION TO A RELATIONAL DATABASE

What is a Database Management System?
A database management system (DBMS) is a software package that manages data stored
in databases. You use a DBMS to control and use the information stored on your computer
system more effectively and efficiently than you can with standard file processing. The
DBMS is an additional layer of software between the program and the stored data. It is the
DBMS, rather than the application program, that has a complete picture of the physical and
logical structure of the stored data.

In a relational system:
• Data is perceived by users as a collection of tables, called relations.
• Operators generate new tables from old ones.
• The rows are referred to as tuples.
• Columns are referred to as attributes.
• Each row in a relation is distinct, i.e. duplicate rows are not allowed.
• Each relation must have a primary key: an attribute or a combination of attributes
whose value uniquely identifies each row.
• A relation may also contain foreign keys, which reference other relations in the database.
• The entire information content of the database is represented as explicit data values
(values in column positions within rows of a table).
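
The key rules above can be tried out in a small sketch. It uses Python's standard sqlite3 module as a stand-in for DB2, and the supplier table S and shipment table SP, with their columns, are assumptions made only for this illustration.

```python
import sqlite3

# Hypothetical supplier (s) and shipment (sp) tables, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.execute("CREATE TABLE s (sno TEXT PRIMARY KEY, sname TEXT, city TEXT)")
conn.execute(
    "CREATE TABLE sp (sno TEXT REFERENCES s(sno), pno TEXT, qty INTEGER, "
    "PRIMARY KEY (sno, pno))"
)
conn.execute("INSERT INTO s VALUES ('S1', 'Smith', 'London')")
conn.execute("INSERT INTO sp VALUES ('S1', 'P1', 300)")


def try_insert(sql):
    """Run one INSERT and report whether a key constraint rejected it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


# Same primary key value as an existing row: not allowed.
duplicate_key = try_insert("INSERT INTO s VALUES ('S1', 'Jones', 'Paris')")
# Foreign key value with no matching supplier: not allowed.
dangling_ref = try_insert("INSERT INTO sp VALUES ('S9', 'P1', 100)")
# New primary key value: allowed.
new_row = try_insert("INSERT INTO s VALUES ('S2', 'Jones', 'Paris')")
print(duplicate_key, dangling_ref, new_row)
```

The first two inserts are refused because they would break the primary key and foreign key rules respectively; the third introduces a new key value and is stored.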

DB2
IBM's Database 2, commonly referred to as DB2, became generally available in the fall
of 1983. DB2 was marketed as a relational database management system for the MVS
operating system. It uses tables to represent the data that it manages.

HOW DOES DB2 ORGANIZE AND ACCESS INFORMATION?

DB2 is a database management system. A more complete description is that DB2 is a
relational database management system that uses the industry standard Structured Query
Language (SQL) as a central component. A relational database system presents all
information in the form of tables. A table is just a two-dimensional array with horizontal
rows and vertical columns. The intersection of a row and a column is a value. A value
may be a text string, a number or nothing (a null value).
The name relational database comes from the technical term for a table, a relation. The
formal names for the elements of a relation are tuple (row) and attribute (column). These
are not the only terms in use: a row is also called a record, and a column is also called
a field.
Although files and tables have similar structures, they are processed differently. In
standard file processing the unit of processing is an individual record, while in DB2 it is
an entire table, which may have one, many or no rows.

DB2 ARCHITECTURE
DB2 as perceived by an individual user

[Figure: the user works through SQL with three levels of data.
Virtual level: views (VIEW1, VIEW2).
Real level: base tables (B1, B2, B3, B4).
Stored level: VSAM data sets (D1, D2, D3, D4).]

DB2 OBJECTS

STORAGE STRUCTURE
The total collection of stored data is divided into a number of disjoint databases. Each
database is divided into a number of disjoint spaces: several table spaces and index
spaces. A space is a dynamically extendable collection of pages, where a page is a block of
physical storage (the unit of I/O, i.e. the unit transferred between primary and secondary
storage in a physical I/O operation). Pages are either 4 KB or 32 KB in size. Each
table space contains one or more stored tables. A stored table is the physical representation
of a base table.

DATABASE:

A database in DB2 is a collection of logically related objects, that is, a logical collection
of stored tables that belong together in some way, together with their associated indexes
and the various spaces that contain those tables and indexes. It thus contains a set of table
spaces (each containing one or more stored tables) together with a set of index spaces
(each containing exactly one index). A given stored table and all its associated indexes
must be wholly contained within a single database. Objects are grouped together into the
same database primarily for operational reasons. Tables can be moved from one database
to another without any logical impact on users or user programs.

TABLE SPACES:

A table space can be thought of as a logical address space on secondary storage that is used
to hold one or more stored tables (logical because it is typically not just a set of physically
adjacent areas). As the amount of data in those tables grows (or as the number of tables
increases), storage is acquired from the appropriate storage group and added to the table space
accordingly. One table space can be approximately 64 billion bytes in size, and there is
effectively no limit to the number of table spaces in a database, nor to the number of databases.
Fundamentally, the table space is the storage unit for recovery and reorganization
purposes, i.e. it is the unit that can be recovered via the RECOVER utility. If the table space
is very large, however, recovery and reorganization could take a very long time. DB2
therefore provides the option of partitioning a large table space. Table spaces come in three
forms: simple, partitioned and segmented.
A simple table space can contain more than one stored table, though one is usually the
best option. One reason for having more than one is that stored records for different stored
tables can be clustered together in such a way as to improve access times to logically
related data. In particular, certain join queries would then be handled more efficiently,
since the number of I/O operations would be reduced.
Partitioned table spaces are intended for stored tables that are sufficiently large. A
partitioned table space contains exactly one stored table, partitioned in accordance
with value ranges of a partitioning column or column combination. Individual partitions
of a partitioned table space are independent of each other in the sense that they can be
independently recovered or reorganized.
They can also be associated with different storage groups, i.e. it is possible to store
different partitions on different devices and thereby spread the table space I/O load.

SEGMENTED TABLE SPACE:

Like simple table spaces, segmented table spaces can contain any number of stored tables.
Unlike simple table spaces, however, they do not support any kind of cross-table
clustering; that is, they do not allow records for different stored tables to be interleaved on
a single page. Instead, they keep the tables physically separated.

INDEX SPACE

An index space is to an index what a table space is to a stored table. However, since the
correspondence between indexes and index spaces is always one to one,
there are no data definition statements for index spaces; instead, the necessary index space
parameters are specified on the corresponding index definition statements. Thus, for example,
there is no statement for creating an index space; the index space is created automatically
when the corresponding index is created. Like table spaces, index spaces can be reorganized and
recovered independently.

INDEX

Indexes in DB2 are based on a structure known as the B-tree. A B-tree is a multilevel,
tree-structured index with the property that the tree is always balanced, i.e. all leaf entries in
the structure are equidistant from the root of the tree, and this property is maintained as
new entries are inserted into the tree and existing entries are deleted. As a result, the index
provides uniform and predictable performance for retrieval operations. A given table can
have any number of associated indexes. Also, if the installation follows the
recommendation that every base table have a primary key, then every stored table will
have at least one index, namely the primary index.
To perform an exhaustive search on a given stored table according to a given index, the
data manager accesses all records in that stored table in the sequence defined by that
index. Since that sequence may be quite different from the stored table's physical
sequence, a given data page might be accessed many times. It follows that an exhaustive
search via an index can be much slower than an exhaustive search via physical sequence,
unless the index concerned is a clustering index, for which
the sequence defined by the index is the same as, or close to, the physical sequence. A
clustering index is used to control the physical placement of the indexed records. Clustering
indexes are extremely important for optimization purposes. The optimizer will try to choose an
access path that is based on a clustering index, if one is available and if the clustering
sequence is appropriate for the SQL under consideration.
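
The effect of clustering on an exhaustive search can be illustrated with a rough Python simulation. It is not DB2 code: the page capacity of 10 rows, the table size of 1000 rows and the assumption that only the most recently read page stays in memory are all simplifications chosen for the example.

```python
import random

ROWS_PER_PAGE = 10   # assumed page capacity, for illustration only
NUM_ROWS = 1000      # assumed table size


def page_reads(physical_positions):
    """Count page fetches when rows are visited in the given order,
    assuming only the most recently read page stays in memory."""
    reads = 0
    current_page = None
    for pos in physical_positions:
        page = pos // ROWS_PER_PAGE
        if page != current_page:
            reads += 1
            current_page = page
    return reads


# Clustering index: index sequence equals physical sequence.
clustered_order = list(range(NUM_ROWS))

# Non-clustering index: index sequence is unrelated to physical sequence.
rng = random.Random(42)
unclustered_order = clustered_order[:]
rng.shuffle(unclustered_order)

clustered_reads = page_reads(clustered_order)
unclustered_reads = page_reads(unclustered_order)
print(clustered_reads, unclustered_reads)
```

With the clustered order every page is read exactly once (100 reads for 100 pages). With the shuffled order almost every row lands on a different page from the previous one, so the same pages are fetched again and again.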

STORAGE GROUPS:

A storage group is a named collection of direct access volumes, all of which are the same
device type. Each table space and each index space normally has an associated storage
group. When storage is needed for the space or partition, it is taken from the specified
storage group. Within each storage group, spaces and partitions are stored using VSAM
linear data sets.

SYNONYM:

A synonym is an alternate name for a base table or view. Each user can define his or her own
synonyms for any table or view, for example to avoid writing the fully qualified name of a
table that was created by some other user.

VIEW:

It is a predefined selection of data in base tables. It is a virtual table that does not
physically exist but is processed as a table. It is derived from one or more base tables, or
views or combinations of views and tables. The view definition is stored in the DB2
catalog. Changes to the data in the view can change the data in the base table.

CATALOG:

The catalog contains information about every object that DB2 maintains. Whenever a DB2
object is created, dropped or altered by a DDL statement, or whenever control over the
object is changed by a DCL statement, an entry is made in the appropriate catalog
table. Catalog tables can be queried using SQL; they cannot, however, be updated directly by a
user (the system updates them).
To retrieve information from the DB2 catalog, the SELECT privilege on the catalog is
needed. Some of the catalog tables are:
• SYSTABLES
• SYSSTOGROUP
• SYSTABLESPACE

COMPONENTS OF DB2

SYSTEM STRUCTURE:
The major components of DB2 are
• SYSTEM SERVICES
• LOCKING SERVICES
• DATABASE SERVICES

[Figure: Relational database system. Under the MVS operating system, the TSO, IMS and CICS
subsystems connect to the DB2 subsystem. The DB2 subsystem comprises the resource lock
manager (locking services; isolates users), database services (access data, create objects,
DB2 catalog) and system services (perform logging, manage the log inventory, manage threads).]

SYSTEM SERVICES:
This component handles all system-wide tasks, including:
• Connections to other MVS subsystems.
• System startup and shutdown, and operator communication.
• Managing the system log: the system log is a set of predefined disk data sets that are
used to record information for recovering user and system data in the event of a failure.
When an active log data set becomes full (or on operator command), DB2 switches to
a new data set and copies the old one to an archive log data set on disk or tape. When
all the active data sets have been filled, they are reused in turn. Information regarding
the data sets themselves is recorded in the bootstrap data set (BSDS).
• Gathering system-wide statistics, performance and accounting information.

LOCKING SERVICES:
Locking services are provided by an MVS subsystem called the IMS Resource Lock Manager
(IRLM). Despite the "IMS" in its name, the IRLM does not really have anything to do with IMS.
It is a general purpose lock manager, which controls concurrent access to DB2 data.

DATABASE SERVICES:
The primary purpose of the database services component is to support the definition,
retrieval and update of database data, i.e. to implement the functions of SQL. The necessary
support is provided by a series of five subcomponents:
• Precompiler
• Bind
• Runtime Supervisor
• Data Manager
• Buffer Manager

Together these components support the preparation of application programs for
execution and the subsequent execution of those programs. The functions of the individual
components are as follows:

PRECOMPILER

This component is a preprocessor for the host language. Its function is to analyse the host
language source module, stripping out all the SQL statements it finds and replacing them with
host language CALL statements. From the SQL it encounters, the precompiler constructs
a database request module (DBRM), which becomes the input to the bind component.

THE PROCESS:

The precompiler:
• Searches for and expands DB2-related INCLUDE members.
• Searches for SQL statements in the body of the program's source code; these are the
statements delimited by EXEC SQL and END-EXEC.
• Creates a modified version of the source program in which every SQL statement
is commented out and replaced by a call to the DB2 runtime interface module, along with
applicable parameters.
• Extracts all SQL statements from the program and places them in a DBRM.
• Places a timestamp in the modified source and in the DBRM to ensure that
these two items are inextricably tied.
• Reports on the success or failure of the precompile process.

PRECOMPILATION

[Figure: The precompiler takes COBOL source module P as input and produces (a) a source
listing, diagnostics, cross-references, etc., (b) a modified COBOL source module P in which
the SQL statements are replaced by CALLs to the language interface, and (c) a DBRM for P
(SQL statements, etc.).]

• It uses the SQL statements to build a DBRM for P, which it stores away as a member of
an MVS partitioned data set.
• The DBRM contains a copy of the original SQL statements, together with additional
information.
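
The extract-and-replace idea behind precompilation can be sketched in a few lines of Python. This is a toy illustration, not the DB2 precompiler: the interface name DB2-INTERFACE and the STMT-n parameters are invented for the example, the SQL is replaced outright rather than commented out, and no timestamp is generated.

```python
import re

# Matches one embedded statement: EXEC SQL ... END-EXEC (optionally followed by a period).
EXEC_SQL = re.compile(r"EXEC\s+SQL\s+(.*?)\s*END-EXEC\.?", re.DOTALL | re.IGNORECASE)


def precompile(source):
    """Return (modified_source, dbrm) for a string of host-language source.

    Each SQL statement is collected into the dbrm list and replaced in the
    source by a placeholder CALL; 'DB2-INTERFACE' is a made-up name.
    """
    dbrm = []

    def replace(match):
        statement = " ".join(match.group(1).split())  # normalise whitespace
        dbrm.append(statement)
        return f"CALL 'DB2-INTERFACE' USING STMT-{len(dbrm)}."

    modified = EXEC_SQL.sub(replace, source)
    return modified, dbrm


sample = """
    MOVE 'S1' TO GIVEN-SNO.
    EXEC SQL
        SELECT STATUS, CITY INTO :STATUS, :CITY
        FROM S WHERE SNO = :GIVEN-SNO
    END-EXEC.
    DISPLAY CITY.
"""
modified_source, dbrm = precompile(sample)
print(modified_source)
print(dbrm)
```

The two outputs correspond to the two products described above: a modified source module with CALLs in place of SQL, and a list of extracted statements playing the part of the DBRM.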

BIND:

This component is the compiler for the SQL statements. It reads the SQL statements from the
DBRMs and produces a mechanism to access data as directed by the SQL statements being
bound. BIND PLAN accepts as input one or more DBRMs produced by previous
DB2 program precompilations. The output of BIND PLAN is an application plan
containing executable logic representing optimized access paths to DB2 data. An
application plan is executable only with its corresponding load module. Before you can run a
DB2 program, regardless of the environment, an application plan name must be specified.
The major functions of bind are:
• Parsing and syntax checking
• Optimization
• Authorization checking
• Checking that the DB2 tables and columns being accessed conform to the corresponding
DB2 catalog information.

BIND PLAN vs. BIND PACKAGE:

A DBRM need not be bound directly into a plan. Instead it may
first be bound into a package, and the packages may then be bound into a plan.
Binding DBRMs directly into an application plan had certain disadvantages:

• If an individual DBRM needed to be recompiled for any reason (e.g. some index was
dropped), the entire plan had to be recompiled and rebound.
• If multiple plans involved the same DBRM, that DBRM had to be compiled
multiple times, and if it ever needed to be recompiled, then all relevant
plans had to be recompiled and rebound in their entirety.
• Adding a new DBRM to an existing plan required (again) a recompilation and rebind
of the entire plan.
• Partly as a consequence of the foregoing points, bind and rebind times were becoming
unacceptably high in some DB2 installations, and availability was suffering as a result.

The package concept was introduced to remedy these deficiencies. If a given DBRM
needs to be recompiled, all that has to be done is an appropriate package bind; it is not
necessary to recompile the entire plan. Indeed, it may not be necessary to do a new plan
bind to incorporate the new package either.

RUNTIME SUPERVISOR:
The runtime supervisor is resident in main memory when the application program is
executing. Its job is to oversee that execution. When the application program requests
some database operation (i.e. wishes to execute some SQL), control first goes
to the runtime supervisor, which uses control information in the application plan to request
the appropriate operations on the part of the data manager.

DATA MANAGER:
The data manager can be thought of as a very sophisticated access method. It performs all
of the normal access method functions, such as search, retrieval, update and index
maintenance. Broadly speaking, the data manager is the component that manages the
physical database(s). It invokes other system components as necessary to perform
detailed functions such as locking, logging and I/O operations during the performance of
its basic task.

BUFFER MANAGER:
The buffer manager is the component responsible for physically transferring data between
external storage and virtual memory. It employs sophisticated techniques to get the best
performance out of the buffer pools under its care and to minimize the amount of physical
I/O actually performed.

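DB2 Application Program Preparation and Execution (control flow overview)

[Figure: The precompiler splits the source module into a modified source module and a DBRM.
The modified source module passes through the compiler (producing an object module) and the
linkage editor (producing a load module). The DBRM passes through BIND to produce the
application plan. At execution time the load module and the application plan come together
under the runtime supervisor, which calls the data manager, which in turn uses the buffer
manager to reach the database.]

DATA TYPES, FUNCTIONS AND OPERATORS
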
INTRODUCTION:

The basic data object is the scalar value: the object appearing at the intersection of a given
row and a given column of a given table. Each scalar value is of some scalar data type. For
each such data type, there is also an associated format for writing literals of that type.
Scalar objects can be operated upon by means of certain scalar operators. DB2 also
provides certain scalar functions, which can be regarded as scalar operators. Scalar objects
and operators can be combined to form scalar expressions.

DB2 DATATYPES

DATA TYPE       DESCRIPTION                      VALUE RANGE                  STORAGE REQUIRED
INTEGER         Full word binary integer         -2147483648 to +2147483647   4 bytes
                                                 (about +/- 2 billion)
SMALLINT        Half word binary integer         -32768 to +32767             2 bytes
DECIMAL(p,q)    p = total no. of digits,         0 < p < 16, 0 <= q <= p      packed decimal format
                q = digits to right of
                decimal point
FLOAT           Long floating point              5.4E-79 to 7.2E+75           8 bytes
CHARACTER(n)    Fixed length string              0 < n < 255                  up to 254 bytes
VARCHAR(n)      Variable length string           0 < n < 4097                 up to 4096 bytes
GRAPHIC(n)      Fixed length string of exactly   0 < n < 128                  up to 254 bytes
                n 16-bit characters
DATE            Year, month, day                 Year 0001-9999,              4 bytes
                                                 month 1-12, day 1-n
TIME            Hour, minutes, seconds           Hour 0-24, minutes 0-59,     3 bytes
                                                 seconds 0-59
TIMESTAMP       Seven part value containing      yyyymmddhhmmssnnnnnn         10 bytes
                date, time and microseconds

CONSTANTS & LITERALS

Integer   : Signed or unsigned decimal integer with no decimal point
            Eg : 4 -15 +364 0

Decimal   : Signed or unsigned decimal number with a decimal point
            Eg : 4.0 -95.7 +364.05 0.0007

Float     : Written as a decimal constant, followed by the character E,
            followed by an integer constant
            Eg : 4E3 -95.7E46 +364E-5 0.7E1

Character : Written either as a string of characters enclosed in single quotes,
            or as a string of pairs of hexadecimal digits enclosed in single
            quotes and preceded by the letter X
            Eg : '123 MAIN St.'
                 'PIGON'
                 X'F1F2F3F40D156247'

Graphic   : Written as a string of double-byte characters preceded by the
            character '<' and followed by '>', the whole enclosed in single
            quotes and preceded by the letter G
            Eg : G'<STRING>'

NUMERIC OPERATORS:

The numeric operators are +, -, * and /.

FUNCTIONS:

• CONCATENATION (||): Concatenates two character strings or two graphic strings,
e.g. INITIALS || LASTNAME
• CHAR: Converts a date, time or timestamp to its character string representation.
• DATE: Converts a scalar value to a date.
• DAY: Extracts the day portion of a date or a timestamp.
• DAYS: Converts a date or timestamp to a number of days.
• DECIMAL: Converts a number to decimal representation.
• DIGITS: Converts a number (decimal or integer) to a character string representation.
• FLOAT: Converts a number to a floating point representation.
• HEX: Converts a scalar value to a character string representing its internal hexadecimal code.
• HOUR: Extracts the hours portion of a time or timestamp.
• INTEGER: Converts a number to integer representation.
• LENGTH: Computes the length of a scalar value in bytes.
• MICROSECOND: Extracts the microsecond portion of a timestamp.
• MINUTE: Extracts the minute portion of a time or timestamp.
• SECOND: Extracts the second portion of a time or timestamp.
• SUBSTR: Extracts part of a string,
e.g. if SNAME = 'AARTI', SUBSTR(SNAME,1,3) extracts 'AAR'
• TIME: Converts a scalar value to a time.
• TIMESTAMP: Converts either (a) a single scalar value or (b) a pair of scalar values
(representing a date and a time respectively) to a timestamp.
• VALUE: Returns the first of its arguments that is not null, and so can be used to convert
a null value into a non-null value,
e.g. SELECT CUSTNO, FNAME, LNAME, VALUE('(H):' || HOMEPH, '(W):' || WORKPH, 'no phone')
• VARGRAPHIC: Converts a character string to a graphic string.
• YEAR: Extracts the year portion of a date or timestamp.
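
A few of these functions can be tried with Python's standard sqlite3 module, used here only as a stand-in for DB2. Two caveats apply: SQLite has no VALUE function, so the sketch uses COALESCE (which DB2 accepts as a synonym for VALUE), and SQLite's LENGTH counts characters rather than bytes. The CUST table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical customer table, for illustration only.
conn.execute(
    "CREATE TABLE cust (custno INTEGER, initials TEXT, lastname TEXT, "
    "homeph TEXT, workph TEXT)"
)
conn.executemany(
    "INSERT INTO cust VALUES (?, ?, ?, ?, ?)",
    [
        (1, "A.", "AARTI", "555-0100", None),
        (2, "B.", "BOSE", None, "555-0199"),
        (3, "C.", "CHOPRA", None, None),
    ],
)

rows = conn.execute(
    """
    SELECT initials || lastname,          -- concatenation
           SUBSTR(lastname, 1, 3),        -- first three characters
           LENGTH(lastname),
           COALESCE('(H):' || homeph,     -- COALESCE plays the role of VALUE
                    '(W):' || workph,
                    'no phone')
    FROM cust
    ORDER BY custno
    """
).fetchall()
for row in rows:
    print(row)
```

Concatenating with a null yields null, which is why the phone expression falls through to the next argument whenever a phone number is missing.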

SPECIAL REGISTERS

Each individual user is assigned an authorization ID. That ID is used to sign on to the
system, and serves as the primary ID for the user in question. Tables and other objects that
are purely private to that user will typically be created under the control of, and hence be
owned by, the primary ID.

Each functional area (e.g. each department) in the organization is also assigned an
authorization ID. However, that ID is typically not given sign-on authority; users sign
on to the system under their primary ID. Once signed on, users can operate under their
primary ID or, using the SQL statement SET CURRENT SQLID, they can switch to a
secondary ID. An external subsystem such as IBM's RACF keeps track of the secondary
ID(s) that can legitimately be used by a given primary ID. A given primary ID can have any
number of secondary IDs, and the same secondary ID can be used by any number of
primary IDs.

SCALAR EXPRESSIONS

The six types of scalar expression are:
• Numeric
• Character string
• Graphic string
• Date
• Time
• Timestamp

Assignments:
Assignment operations are performed when values are retrieved from the database or
stored into the database.

Comparisons:
The general form of a comparison is
    comparand operator comparand
where
• the two comparands must be compatible, and
• the operator is any of the following: >, <, =, >=, <=, ¬= (not equal)

Examples
WEIGHT * 454 = 1000
SUBSTR(PNAME,1,1) = 'C'
FINAL_DATE = REVIEW_DATE - CURRENT DATE
SUM(QTY) = 500

NULLS:
A null means missing information. Missing information is frequently
encountered in the real world. For example, historical records sometimes include such
entries as "date of birth not known"; meeting agendas often show a speaker as "to be
announced"; and police records may have entries such as "present whereabouts not known".
Hence it is desirable to have some way of dealing with such situations in our formal
database systems. SQL systems like DB2 represent missing information by means of
special markers called NULLS. For example, we might say loosely that the weight of some part
is null. What we mean by such a statement is that (a) the part exists, (b) of course it has a
weight, but (c) we do not know what that weight is. Instead we mark that slot as null.
In general, any column can contain nulls unless the definition of that column explicitly
specifies NOT NULL. If a given column is allowed to contain nulls, and a row is inserted
into the table with no value provided for that column, DB2 will automatically place a null
in that position.
In DB2, a column that can accept nulls is physically represented in the stored database by
two columns: the data column itself and a hidden indicator column, one byte wide, that is
stored as a prefix to the actual data column.
NOT NULL WITH DEFAULT means that the column in question cannot contain nulls,
but that it is nevertheless still legal to omit a value for the column on insert. If a row is
inserted and no value is provided for some column to which NOT NULL WITH DEFAULT
applies, DB2 automatically places one of the following default values in that position:
• Zero for numeric columns
• Blanks for fixed length string columns
• Empty (zero-length string) for varying length string columns.
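
The three cases (nullable, NOT NULL, and NOT NULL WITH DEFAULT) can be illustrated with Python's standard sqlite3 module as a stand-in for DB2. SQLite has no NOT NULL WITH DEFAULT keyword, so the sketch spells the defaults out explicitly; the part table P and its columns are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE p (
        pno    TEXT    NOT NULL,              -- a value must always be supplied
        weight INTEGER,                       -- nullable
        qty    INTEGER NOT NULL DEFAULT 0,    -- stands in for NOT NULL WITH DEFAULT
        city   TEXT    NOT NULL DEFAULT ''    -- empty string default for a string column
    )
    """
)

# Only PNO is supplied: WEIGHT becomes null, QTY and CITY take their defaults.
conn.execute("INSERT INTO p (pno) VALUES ('P1')")
row = conn.execute("SELECT pno, weight, qty, city FROM p").fetchone()
print(row)

# Omitting a NOT NULL column that has no default is rejected.
try:
    conn.execute("INSERT INTO p (weight) VALUES (12)")
    missing_pno = "accepted"
except sqlite3.IntegrityError:
    missing_pno = "rejected"
print(missing_pno)
```

In Python the null comes back as None, which is the counterpart of the null marker described above.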

EMBEDDED SQL:

Any SQL statement that can be used at the terminal can also be used in an application
program. There are various differences of detail between a given interactive SQL
statement and its corresponding embedded form.

• Embedded SQL statements are prefixed by EXEC SQL and are terminated by END-EXEC

• SQL statements can include references to host variables

• Any tables (base tables or views) used in the program must be declared

• After any SQL statement has been executed, feedback information is returned to the
program in an area called the SQL Communication Area (SQLCA)

• The embedded SQL SELECT statement requires an INTO clause specifying the host
variables.

HOST VARIABLES:
Host variables are variables declared in the WORKING-STORAGE SECTION and are used by DB2
when it moves data between your program and a table. They are defined according to the
rules of the host programming language.
Host variables can appear in SQL data manipulation statements, where they designate either
a target for retrieval or a value supplied by the program. They can appear in the following
positions:
• INTO clause in SELECT or FETCH (target for retrieval)
• WHERE or HAVING clause (value to be compared)
• SET clause in UPDATE (value to be assigned)
• VALUES clause in INSERT (value to be inserted)

Example

EXEC SQL
SELECT STATUS, CITY INTO :STATUS, :CITY
FROM EMPLOYEE
WHERE SNO = :GIVEN-SNO
END-EXEC.

References to host variables in SQL statements are prefixed with
a colon to distinguish them from SQL column names.

INDICATOR VARIABLE

In general, if there is a chance that the source of a retrieval operation might be null, the
user should include an INDICATOR VARIABLE in the INTO clause in addition to the target
variable, as illustrated in the following example:

EXAMPLE:

EXEC SQL
SELECT STATUS, CITY
INTO :STATUS:STATUS-IND, :CITY:CITY-IND
FROM S
WHERE SNO = :GIVEN-SNO
END-EXEC.

To find out whether a field was NULL, the program checks the indicator variable:
IF STATUS-IND < 0 THEN /* STATUS WAS NULL */

• Indicator variables can be used in the VALUES clause to insert NULL values: a negative
indicator value causes NULL to be stored in the corresponding column.

EXAMPLE:

IF COLOR-IND < 0 OR CITY-IND < 0

EXEC SQL
INSERT INTO P (PNO, COLOR, CITY)
VALUES
(:PNO, :PCOLOR:COLOR-IND, :PCITY:CITY-IND)
END-EXEC.

• Each indicator variable is a half-word integer (PIC S9(4) COMP).
• If defined as an array (OCCURS clause), it can be used for a list of columns, with the
first occurrence of the indicator variable corresponding to the first column in that list,
and so on. In this way nulls can be handled for all the fields of the table through a single
indicator structure.
• The indicator variable will have a negative value if the SELECT returns a null value.
• If the indicator variable contains a value > 0, the retrieved value was truncated and the
indicator value is the length of the character string before truncation.
• In summary, the contents of an indicator variable are interpreted as follows:
  - A negative number indicates that the column has been set to null.
  - The value -2 indicates that the column has been set to null as a result of a
    data conversion error.
  - A positive or zero value indicates that the column is not null.
• If a column defined as a CHAR data type is truncated on retrieval because the host
variable is not large enough, the indicator variable contains the original length of the
truncated column.
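
The indicator rules above can be summarised in a small Python sketch. It only mimics the conventions described in this section; the function names fetch_into and describe_indicator are invented for the illustration and are not part of any DB2 interface.

```python
def fetch_into(value, host_length):
    """Mimic retrieval of a character column into a host variable that holds
    host_length characters. Returns (host_value, indicator)."""
    if value is None:
        # Null: the host variable is not meaningful when the indicator is negative.
        return None, -1
    if len(value) > host_length:
        # Truncation: the indicator carries the original length.
        return value[:host_length], len(value)
    # Value fits: pad with blanks as a fixed length field would be.
    return value.ljust(host_length), 0


def describe_indicator(indicator):
    """Interpret an indicator value according to the rules listed above."""
    if indicator == -2:
        return "null (data conversion error)"
    if indicator < 0:
        return "null"
    if indicator > 0:
        return f"not null, truncated (original length {indicator})"
    return "not null"


for column_value in ("LONDON", "ROME", None):
    host_value, ind = fetch_into(column_value, 4)
    print(repr(host_value), ind, describe_indicator(ind))
```

A program would normally test the indicator first and look at the host variable only when the indicator is zero or positive.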

EXAMPLE:

01 DEPT-INDICATORS.
10 DEPT-IND OCCURS 5 TIMES PIC S9(4) COMP.
EXEC SQL
SELECT DEPTNO, DNAME, MGR, HO, LOC
INTO :DCLDEPT:DEPT-IND
FROM DEPT
WHERE DEPTNO = 'A00'
END-EXEC.

• Indicator variables can appear on the right hand side of an assignment in the SET
clause to set a column to NULL.

EXAMPLE:

IF STATUS-IND < 0

EXEC SQL
UPDATE S
SET STATUS = :STATUS:STATUS-IND
WHERE CITY = 'LONDON'
END-EXEC.

CURSORS:

A cursor is a mechanism that allows an application program to retrieve a set of rows. The
cursor facility allows a COBOL program to gain addressability to individual rows of a
multi-row result table.

The following steps are performed to use a cursor:
• Declare cursor : defines the cursor
• Open cursor : creates the result table
• Fetch cursor : retrieves rows from the cursor, one at a time; executed in a loop
• Close cursor : closes the cursor

Declaring (defining) a cursor can be done in the data division of your program.
The declaration is purely declarative in nature; no information is retrieved from the
database yet. The cursor name should begin with a letter and must not exceed 18
characters.
When the OPEN statement is encountered, the SELECT in the cursor declaration is executed.
The OPEN statement not only executes the selection of data from the DB2 database
but also establishes the initial position of the cursor in the result table. One program can
have multiple cursors.

Declaring a cursor:
EXEC SQL
DECLARE cursor CURSOR FOR SELECT col1, col2 ... FROM table
WHERE condition [FOR UPDATE OF col1, col2 ...]
END-EXEC.

Opening a cursor:
EXEC SQL
OPEN cursor
END-EXEC.

Fetching a row from a cursor:
EXEC SQL
FETCH cursor INTO :hostvar1, :hostvar2, ...
END-EXEC.

Closing a cursor:
EXEC SQL
CLOSE cursor
END-EXEC.

Example Based on a Single Table

Operations involving cursors:

EXEC SQL
DECLARE X CURSOR FOR
SELECT S#, SNAME, STATUS
FROM S WHERE CITY = :Y
END-EXEC.

EXEC SQL
OPEN X
END-EXEC.

LOOP UNTIL NO MORE ROWS OR ERROR

EXEC SQL
FETCH X INTO :S#, :SNAME:SNAME-IND, :STATUS
END-EXEC.

PROCESSING STATEMENTS

END LOOP

EXEC SQL
CLOSE X
END-EXEC.
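
For comparison, the same declare, open, fetch and close sequence can be sketched with Python's standard sqlite3 module, used as a stand-in for DB2. The mapping of Python calls to cursor operations in the comments is only an analogy, and the table contents are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (sno TEXT PRIMARY KEY, sname TEXT, status INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO s VALUES (?, ?, ?, ?)",
    [
        ("S1", "Smith", 20, "London"),
        ("S2", "Jones", 10, "Paris"),
        ("S4", "Clark", None, "London"),
    ],
)

y = "London"                 # plays the role of host variable :Y

cur = conn.cursor()          # roughly DECLARE: nothing is retrieved yet
cur.execute(                 # roughly OPEN: the SELECT is executed here
    "SELECT sno, sname, status FROM s WHERE city = ? ORDER BY sno", (y,)
)

fetched = []
while True:
    row = cur.fetchone()     # FETCH: one row at a time
    if row is None:          # counterpart of SQLCODE +100 (no more rows)
        break
    fetched.append(row)      # processing statements would go here

cur.close()                  # CLOSE
print(fetched)
```

The null STATUS of supplier S4 arrives as None, where the COBOL program would rely on an indicator variable.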

DATA MODIFICATION:

Often an application program must read data and then, based on its value, either update or
delete it. The UPDATE and DELETE SQL statements modify and delete
rows in DB2 tables. Like SELECT, these statements operate on
a set of rows at a time.

Reading a row first and then modifying that same row is accomplished with a cursor and a
special clause of the UPDATE and DELETE statements, usable only in embedded SQL, namely
WHERE CURRENT OF. The cursor is declared with a special FOR UPDATE OF clause.

The FOR UPDATE OF clause:

• Appears in the SELECT statement of the cursor declaration to indicate which columns can
be updated when retrieved.
• Must list the columns to be updated.
• Does not require you to select a column in order to update it.

Rules for Update

• The SELECT statement must be on a single table and not on a join.
• If the DECLARE CURSOR statement contains a subquery, it must not be on the same table as
the main query.
• You cannot use DISTINCT, GROUP BY, ORDER BY or built-in functions.
• Only the columns listed in the FOR UPDATE OF clause are eligible for update.

The CURRENT forms of UPDATE and DELETE

EXEC SQL
UPDATE table
SET field = :exp [, field = :exp]
WHERE CURRENT OF cursor
END-EXEC.

EXEC SQL
DELETE FROM table
WHERE CURRENT OF cursor
END-EXEC.

Example: (Use of Cursor)

EXEC SQL
Declare C1 cursor for
Select Deptno, Deptname, Mgrno From Dept
Where ADMRDEPT = :ADMRDEPT
for update of MGRNO
END-EXEC.

PROCEDURE DIVISION.

MOVE 'A00' TO ADMRDEPT.

EXEC SQL
OPEN C1
END-EXEC.

* NO-MORE-ROWS is assumed to be a condition name that is true when MORE-ROWS = 'NO'.
PERFORM 200-MODIFY-DEPT-INFO UNTIL NO-MORE-ROWS.

EXEC SQL
CLOSE C1
END-EXEC.

200-Modify-Dept-Info.

EXEC SQL
Fetch C1 into :deptno, :deptname, :mgrno
END-EXEC.

If sqlcode < 0
GO TO 9999-error-paragraph.

If sqlcode = +100
Move 'NO' to more-rows
Else
EXEC SQL
Update Dept Set MGRNO = '00000'
Where current of C1
END-EXEC.

DATA RETRIEVAL
Pseudocode for retrieving data with an application join, using cursors:

EXEC SQL
Declare Deptcur cursor for
Select Deptno, Deptname
From Dept
END-EXEC.

EXEC SQL
Declare Empcur cursor for
Select empno, salary from emp
where workdept = :hv-workdept
END-EXEC.

EXEC SQL
Open Deptcur
END-EXEC.

Loop until no more dept rows or error

  EXEC SQL
  Fetch Deptcur into :Deptno, :Deptname
  END-EXEC.

  Move deptno to HV-WORKDEPT

  EXEC SQL
  Open Empcur
  END-EXEC.

  Loop until no more employee rows or error

    EXEC SQL
    Fetch Empcur into :empno, :salary
    END-EXEC.

    Process retrieved data

  End of loop (2)

  EXEC SQL
  Close Empcur
  END-EXEC.

End of loop (1)

EXEC SQL
Close Deptcur
END-EXEC.
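
The nested-cursor pattern in the pseudocode above can be sketched with Python's standard sqlite3 module as a stand-in for DB2. The table contents are invented for the example, and the inner cursor is closed after each department, in the same way that Empcur has to be closed before it can be opened again.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dept (deptno TEXT, deptname TEXT);
    CREATE TABLE emp (empno TEXT, salary INTEGER, workdept TEXT);
    INSERT INTO dept VALUES ('A00', 'PLANNING'), ('B01', 'OPERATIONS');
    INSERT INTO emp VALUES ('000010', 52000, 'A00'),
                           ('000020', 41000, 'B01'),
                           ('000030', 38000, 'A00');
    """
)

result = []
deptcur = conn.cursor()
deptcur.execute("SELECT deptno, deptname FROM dept ORDER BY deptno")    # open Deptcur

for deptno, deptname in deptcur:                  # loop (1): one department at a time
    empcur = conn.cursor()
    empcur.execute(                               # open Empcur for this department
        "SELECT empno, salary FROM emp WHERE workdept = ? ORDER BY empno",
        (deptno,),                                # plays the role of :hv-workdept
    )
    for empno, salary in empcur:                  # loop (2): employees of the department
        result.append((deptno, deptname, empno, salary))   # process retrieved data
    empcur.close()                                # close Empcur before the next open

deptcur.close()                                   # close Deptcur
print(result)
```

The inner query runs once per department row, which is the cost of doing the join in the application rather than in a single SQL statement.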

- 34 -
WITH HOLD OPTION:

This is an optional specification on a cursor declaration. Its significance can be
understood by considering what happens in its absence.
Suppose we need to process some large table, one row at a time by means of a cursor, and
update a few of the rows as we go. It is often desirable to divide the work up into batches and to
make the processing of each batch into a separate transaction (by issuing a separate
commit at the end of each one); thus, e.g., a table of one million rows might be processed
by a sequence of 10,000 transactions, each one dealing with just 100 rows. This way, if it
becomes necessary to roll a given transaction back, then at most 100 updates will
have to be undone, instead of potentially as many as a million.

The problem with this approach, however, is that every time we issue a commit, we
implicitly close the cursor, thereby losing our position within the table. The first thing each
transaction has to do, therefore, is to execute some re-positioning code in order to get back
to the row that is due to be processed next. And that re-positioning code can often be quite
complex, especially if the processing sequence is determined by a combination of several
columns.

If the cursor declaration specifies WITH HOLD, however, commit does not close the cursor;
instead, it leaves the cursor open, positioned such that the next FETCH will move it to the next row
in sequence. The possibly complex code for repositioning is thus no longer required.

However, it is important to note that the first operation on the cursor following the commit
must be a FETCH. UPDATE and DELETE ... WHERE CURRENT OF are illegal at that point.
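To make the re-positioning cost concrete, here is a minimal sketch of batch processing without WITH HOLD, where each transaction finds its place again from the last key processed. It uses Python's sqlite3 module as a stand-in for DB2; the table BIG, its columns and the function names are assumptions made up for this illustration.

```python
import sqlite3


def update_in_batches(conn, batch_size):
    """Process every row, committing after each batch and re-positioning by key."""
    last_id = 0      # key of the last row processed so far
    batches = 0
    while True:
        # Re-positioning code: restart just after the last processed key.
        rows = conn.execute(
            "SELECT id FROM big WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        for (row_id,) in rows:
            conn.execute("UPDATE big SET done = 1 WHERE id = ?", (row_id,))
            last_id = row_id
        conn.commit()    # end of one transaction; the position survives only in last_id
        batches += 1
    return batches


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE big (id INTEGER PRIMARY KEY, done INTEGER DEFAULT 0)")
    conn.executemany("INSERT INTO big (id) VALUES (?)", [(i,) for i in range(1, 1001)])
    conn.commit()
    batches = update_in_batches(conn, 100)
    remaining = conn.execute("SELECT COUNT(*) FROM big WHERE done = 0").fetchone()[0]
    return batches, remaining
```

With 1,000 rows and a batch size of 100 the sketch commits 10 times, the same ratio as the one million rows and 10,000 transactions in the text. With a single-column key the re-positioning predicate is simple; with a composite ordering it grows quickly, which is the complexity WITH HOLD avoids.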

ERROR / EXCEPTION
HANDLING

WHENEVER:

The whenever statement has the syntax :

EXEC SQL
WHENEVER condition action
END-EXEC.

Where “condition” is one of the following :

NOT FOUND means SQLCODE = 100

SQLWARNING means SQLCODE >0 & NOT = 100

SQLERROR means SQLCODE < 0

“ACTION ” specifies a CONTINUE or a GO TO statement.

The WHENEVER statement causes the program to check the SQLCODE automatically after each
SQL statement. Based on the value it finds, it takes the action you specify. Some application
programmers prefer (for structured-programming reasons) to avoid the WHENEVER statement, and
code specific error checking after each SQL statement instead.
If no WHENEVER statement is coded for a condition, the default action of CONTINUE applies for that
condition. There is no limit to the number of WHENEVER statements you can use.
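The mapping from SQLCODE to the three WHENEVER conditions can be written down as a small function. This is only an illustrative sketch in Python; the function name is hypothetical and nothing like it is part of DB2.

```python
def whenever_condition(sqlcode):
    """Return the WHENEVER condition a given SQLCODE would trigger, or None for success."""
    if sqlcode < 0:
        return "SQLERROR"
    if sqlcode == 100:
        return "NOT FOUND"
    if sqlcode > 0:
        return "SQLWARNING"
    return None  # SQLCODE 0: the statement completed normally
```

For example, whenever_condition(-805) gives "SQLERROR" and whenever_condition(100) gives "NOT FOUND".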

Example using ‘WHENEVER’

1000-INQUIRY.

EXEC SQL
WHENEVER SQLERROR
GOTO 1000-UNDO
END-EXEC.

EXEC SQL
SELECT FLD1, FLD2 INTO :FLD1, :FLD2
FROM EMP_TABLE WHERE CODE = 113
END-EXEC.

GO TO 1000-EXIT.

1000-UNDO.
DISPLAY 'ERROR! CANNOT PROCEED'.
EXEC SQL
ROLLBACK
END-EXEC.

1000-EXIT.
EXIT.

TRANSACTION PROCESSING

WHAT IS A TRANSACTION ?
A transaction is a logical unit of work. Consider the following example. Suppose the
parts table P contains an additional column TOTQTY representing the total
quantity for the part in question. The value of TOTQTY for any part is supposed to be
equal to the sum of all SP.QTY values, taken over all SP rows for that part. Now consider the
following sequence of operations, the intent of which is to add a new shipment
(S5,P1,1000) to the database:

EXEC SQL
WHENEVER SQLERROR GOTO UNDO
END-EXEC.

EXEC SQL
INSERT INTO SP VALUES ('S5','P1',1000)
END-EXEC.

EXEC SQL
UPDATE P
SET TOTQTY = TOTQTY + 1000
WHERE P# = 'P1'
END-EXEC.

EXEC SQL
COMMIT WORK
END-EXEC.

GO TO FINISH

UNDO:
EXEC SQL
ROLLBACK WORK
END-EXEC.

FINISH:
RETURN

The insert adds the new shipment to the sp table, the update updates the totqty column for
the part p1 appropriately.

The point of the example is that what is presumably intended to be a single atomic transaction,
"create a new shipment", in fact involves two updates to the database. What is more,
the database is not consistent between those two updates. Thus a logical unit of work
is not necessarily just one SQL operation; rather, it is in general a sequence of several such
operations that transforms a consistent state of the database into another
consistent state, without necessarily preserving consistency at all intermediate points.
Now it is clear that what must not be allowed to happen in the example is for one of the
two updates to be executed and the other not (because then the database would be in an
inconsistent state). Ideally we would want a guarantee that both updates are done, but we cannot
have such a guarantee: there is always a chance that things will go wrong. For example, a
system crash may occur between the two updates.

But a system that supports transaction processing does provide the next best thing to such
a guarantee. Specifically, it guarantees that if the transaction executes some updates and
then a failure occurs, for whatever reason, before the transaction reaches its normal
termination, then those updates will be undone. Thus the transaction either executes in its
entirety or is totally cancelled.

The system component that provides this atomicity is known as the transaction manager, and
COMMIT WORK and ROLLBACK WORK are the keys to the way it works.

The COMMIT WORK operation signals successful end-of-transaction: it tells the
transaction manager that a logical unit of work has been successfully completed.

The ROLLBACK WORK operation, by contrast, signals unsuccessful end-of-transaction:
it tells the transaction manager that something has gone wrong, that the database may be
in an inconsistent state, and that all the updates made by the logical unit of work must
be rolled back or undone.
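The all-or-nothing behaviour can be observed with a small runnable sketch. It uses Python's sqlite3 module as a stand-in for DB2, so commit() and rollback() play the roles of COMMIT WORK and ROLLBACK WORK. The column P# is renamed PNO here because SQLite does not accept # in an unquoted name, and the function names are hypothetical.

```python
import sqlite3


def add_shipment(conn, supplier, part, qty):
    """Insert a shipment and adjust the part total as one logical unit of work."""
    try:
        conn.execute("INSERT INTO sp VALUES (?, ?, ?)", (supplier, part, qty))
        cur = conn.execute(
            "UPDATE p SET totqty = totqty + ? WHERE pno = ?", (qty, part)
        )
        if cur.rowcount != 1:
            # The second update did not happen, so the first must not survive either.
            raise ValueError("unknown part " + part)
        conn.commit()        # successful end-of-transaction
        return True
    except Exception:
        conn.rollback()      # undo every update made by this unit of work
        return False


def demo_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE p (pno TEXT PRIMARY KEY, totqty INTEGER)")
    conn.execute("CREATE TABLE sp (sno TEXT, pno TEXT, qty INTEGER)")
    conn.execute("INSERT INTO p VALUES ('P1', 0)")
    conn.commit()
    return conn
```

A call for part P1 succeeds and leaves both tables updated. A call for a part that does not exist fails at the second step, and the rollback removes the shipment row that the first step had already inserted.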

SYNTAX:

COMMIT
The SQL commit statement takes the form
EXEC SQL
COMMIT [WORK]
END-EXEC.

ROLLBACK

The SQL ROLLBACK statement takes the form

EXEC SQL
ROLLBACK [WORK]
END-EXEC.

CONCURRENCY AND LOCKING

THREE CONCURRENCY PROBLEMS:

DB2 is a shared system; i.e. it is a system that allows any number of transactions to access
the same database at the same time. Any such system requires some kind of concurrency
control mechanism to ensure that concurrent transactions do not interfere with each other's
operation, and DB2 includes such a mechanism, namely locking.
There are essentially three ways in which things can go wrong. They are
• The lost update problem
• The uncommitted dependency problem
• The inconsistent analysis problem

The lost update problem:

Consider the situation as shown in the figure:

The Lost Update Problem

TRANSACTION A       TIME    TRANSACTION B
-                           -
FETCH R             T1      -
-                           -
-                   T2      FETCH R
-                           -
UPDATE R            T3      -
-                           -
-                   T4      UPDATE R
-                           -

Transaction A retrieves some row R at time t1; transaction B retrieves the same row at
time t2; transaction A updates the row (on the basis of the values seen at time t1) at time t3;
and transaction B updates the same row (on the basis of the values seen at time t2, which
are the same as those seen at time t1) at time t4. Transaction A's update is lost at time t4,
because transaction B overwrites it without even looking at it.
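The interleaving t1 to t4 can be replayed with ordinary variables. This is a sketch in Python with no database and no locking at all; the function name and the idea that each transaction adds an amount to R are assumptions chosen only to make the loss visible.

```python
def lost_update_schedule(initial, a_delta, b_delta):
    """Replay the t1..t4 schedule with no locking and return the final value of R."""
    row = {"R": initial}
    a_seen = row["R"]                # t1: transaction A fetches R
    b_seen = row["R"]                # t2: transaction B fetches the same value
    row["R"] = a_seen + a_delta      # t3: A updates R from what it saw at t1
    row["R"] = b_seen + b_delta      # t4: B updates R from what it saw at t2
    return row["R"]
```

Starting from 100, with A adding 10 and B adding 5, the schedule ends at 105 rather than 115: A's change has been overwritten.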

The uncommitted dependency problem:

This problem arises if one transaction is allowed to retrieve (or update) a row that has been
updated by another transaction and has not yet been committed by that other transaction.
For if it has not been committed, there is always a possibility that it never will be
committed but will be rolled back instead- in which case the first transaction will have data
that now no longer exists. Consider the figure given below:

The Uncommitted Dependency Problem

TRANSACTION A       TIME    TRANSACTION B
-                           -
-                   T1      UPDATE R
-                           -
FETCH R             T2      -
-                           -
-                   T3      ROLLBACK
-                           -

TRAN A BECOMES DEPENDENT ON AN UNCOMMITTED CHANGE AT TIME T2

In this example, transaction A sees an uncommitted update at time t2. That update is then
undone at time t3. Transaction A is therefore operating on the false assumption that row R has
the value seen at time t2, whereas in fact it has whatever value it had prior to time t1.
As a result, transaction A may well produce incorrect output.

Inconsistent analysis problem:

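This problem can arise in something as simple as a summation: transaction A totals a column
over several rows while transaction B updates some of those rows part-way through, so A's
total corresponds to no state the database was ever in.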
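A summation that goes wrong in this way can be replayed step by step. The sketch below is plain Python with no database; the account names, balances and the transfer of 10 are invented for illustration.

```python
def inconsistent_sum():
    """Transaction A sums three rows while transaction B moves 10 between two of them."""
    accounts = {"ACC1": 40, "ACC2": 50, "ACC3": 30}   # the true total is 120
    total = 0
    total += accounts["ACC1"]      # A reads ACC1 (40), running total 40
    total += accounts["ACC2"]      # A reads ACC2 (50), running total 90
    accounts["ACC3"] -= 10         # B takes 10 from ACC3 ...
    accounts["ACC1"] += 10         # ... adds it to ACC1, and commits
    total += accounts["ACC3"]      # A reads ACC3 (now 20), running total 110
    return total, sum(accounts.values())
```

A reports 110 although the rows add up to 120 both before and after B's transfer. Unlike the uncommitted dependency case, every value A read was a committed one.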

How DB2 solves these problems:

DB2's concurrency control mechanism is based on a technique called locking. The basic
idea of locking is simple: when a transaction needs an assurance that some object it
is interested in (typically a database row) will not change in some unpredictable manner
while its back is turned, it acquires a lock on that object. The effect of the lock is to lock
other transactions out of the object, and thereby to prevent them from
changing it. The first transaction is thus able to carry out its processing in the certain
knowledge that the object in question will remain in a stable state for as long as that
transaction wishes it to.

The two types of locks that can be placed are the shared lock (S) and the exclusive lock (X). We
assume that if a transaction requests a lock that is currently not available, the transaction
simply waits until it is. In practice, the installation can specify a maximum wait time; then
if any transaction ever reaches this threshold in waiting for a lock, it "times out" and the
lock request fails (a negative SQLCODE is returned).

COMPATIBILITY MATRIX

                A holds
B requests      X     S

X               N     N

S               N     Y

From the compatibility matrix , two inferences can be drawn:

If transaction A holds an X lock on row R , then a request from transaction B for a lock of
either type on R will cause B to go into a wait state. B will wait until A’s lock is released.

If transaction A holds a shared (S) lock on row R, then

• A request from transaction B for an X lock on R will cause B to go into a wait
state (and B will wait until A's lock is released);

• A request from transaction B for an S lock on R will be granted (that is, B will
now also hold an S lock on R)
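The compatibility matrix and the two inferences can be captured in a few lines. This is an illustrative Python sketch; the names COMPATIBLE and request_lock are hypothetical and do not correspond to any DB2 interface.

```python
# (mode held by A, mode requested by B) -> can the request be granted at once?
COMPATIBLE = {
    ("X", "X"): False,
    ("X", "S"): False,
    ("S", "X"): False,
    ("S", "S"): True,
}


def request_lock(held, requested):
    """Return 'granted' or 'wait' for B's request, given the lock A holds (None if none)."""
    if held is None:
        return "granted"
    return "granted" if COMPATIBLE[(held, requested)] else "wait"
```

For instance, request_lock("S", "S") is "granted", while request_lock("S", "X") and request_lock("X", "S") are both "wait".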

Transaction requests for row locks are always implicit. When a transaction successfully
fetches a row, it automatically acquires an S lock on that row. When a transaction
successfully updates a row, it automatically acquires an X lock on that row. If it already
holds an S lock on that row, as it will in a FETCH/UPDATE or FETCH/DELETE
sequence, then the update or delete "promotes" the S lock to X level.

The lost update problem revisited

Handling the Lost Update Problem

TRAN A                      TIME    TRAN B
-                                   -
FETCH R                     T1      -
(acquires S lock on R)              -
-                                   -
-                           T2      FETCH R
-                                   (acquires S lock on R)
-                                   -
UPDATE R                    T3      -
(requests X lock on R)              -
wait                                -
wait                        T4      UPDATE R
wait                                (requests X lock on R)
wait                                wait
wait                                wait

The above figure shows what would happen to the interleaved execution under the locking
mechanism of DB2. As one can see, transaction A's update at time t3 is not accepted
because it is an implicit request for an X lock on R, and such a request conflicts with the S
lock already held by transaction B; so A goes into a wait state. For analogous reasons, B
goes into a wait state at time t4. Now both transactions are unable to proceed, so there is no
question of any update being lost. DB2 thus solves the lost update problem by reducing it
to another problem, but at least it does solve the original problem. This new problem is called
the deadlock problem, discussed later.

The uncommitted dependency problem revisited

Handling the Uncommitted Dependency Problem

TRAN A                      TIME    TRAN B
-                                   -
-                           T1      UPDATE R
-                                   (acquires X lock on R)
-                                   -
FETCH R                     T2      -
(requests S lock on R)              -
wait                                -
wait                        T3      SYNCPOINT
wait                                (releases X lock on R)
resume: FETCH R             T4
(acquires S lock on R)

Transaction A's operation at time t2 is not accepted, because it is an implicit request for a
lock on R, and such a request conflicts with the X lock already held by B; so A goes into a
wait state and remains so until B reaches a syncpoint (either commit or rollback), when
B's lock is released and A is able to proceed; and at that point A sees a committed value
(either the pre-B value, if B terminates with a ROLLBACK, or the post-B value
otherwise). Either way, A is no longer dependent on an uncommitted update.

Deadlocks .
A 'deadlock' occurs when program A locks page X exclusively and attempts to
lock page Y, while program B has already locked page Y exclusively and attempts to lock
page X. Neither program can continue without the required lock. DB2 cancels one of the
processes (the one with the fewest log records) with a time out code. To avoid this
situation, you can:

• Use the same sequence of update. Both programs should advance along the
tables in ascending key order; this makes it less likely that they will cross each
other's paths and attempt to lock the same page. Both programs should access
various tables in the same sequence, for the same reason.

• Avoid clashing updates through different paths (if possible). The problem
with such updates can be exemplified as follows:
User A is updating page 1 using index X for access to the data;
User B is updating page 2 using index Y for access to the data.

However, each index also references the data on the other page (i.e., Index X
references data on page 2 and index Y references data on page 1).

Therefore, after the data is reached by means of one index, and after that data is
updated, each query must modify the other index so it would reflect the new,
updated data. However, that other index is still locked by the other query.

Such deadlocks are hard to avoid. You may watch out for some conditions
which make this situation more likely:

• Large number of indexes on a table, with frequent deletes or updates.
• Multi-row updates and deletes.
• Large value used for SUBPAGES for the index.

• Use frequent COMMITs. The chance that a page needed by
program A will already be held by program B is thereby reduced.

• Use a cursor with FOR UPDATE OF ... This technique locks with INTENT
UPDATE, ahead of time, all the pages which you will need. This is a more
relaxed lock than an exclusive lock, but it still prevents other programs from
locking one of your pages exclusively. Thus, less contention, fewer deadlocks
and fewer time outs will occur.
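A deadlock is a cycle in the "who waits for whom" relationship, which the following sketch makes explicit. It is an illustration in Python only: the function name and the dictionary representation are assumptions, and it is not a description of how DB2 itself detects deadlocks.

```python
def find_deadlock(waits_for):
    """waits_for maps each waiting program to the program holding the lock it needs.

    Returns the list of programs forming a cycle, or None if there is no deadlock.
    """
    for start in waits_for:
        seen = [start]
        current = waits_for.get(start)
        while current is not None and current not in seen:
            seen.append(current)
            current = waits_for.get(current)
        if current == start:     # followed the chain and came back to the start
            return seen
    return None
```

With program A waiting for B and B waiting for A, find_deadlock({"A": "B", "B": "A"}) returns ["A", "B"]. If both programs lock pages in the same sequence, as recommended above, one of them simply waits for the other and no cycle forms: find_deadlock({"B": "A"}) returns None.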

Explicit locking facilities

In addition to the implicit locking mechanism described earlier, DB2 provides certain
explicit facilities which the programmer should be aware of. The explicit facilities
consist of: the SQL statement LOCK TABLE, the ISOLATION parameter on the
BIND PLAN command, the table space LOCKSIZE parameter and the ACQUIRE/RELEASE
parameters on the BIND PLAN command.

LOCK TABLE :

Depending on the DB2 version, this statement locks either the table or the entire table space.

SYNTAX

LOCK TABLE table IN mode MODE

Where "mode" is SHARE or EXCLUSIVE and "table" must designate a base table, not a
view.
Once an EXCLUSIVE lock is acquired, no other transaction will be able to access any part of the
table in any way until the original lock is released (a SHARE lock still allows other transactions
to read the table, but not to change it). When that original lock is released depends on
the RELEASE parameter on BIND.
The LOCK TABLE statement can be used to control locks from within a DB2 application
program. Every individual page lock uses system resources in storage and processing time.
A single table lock avoids the storage and processing time required by the many small
locks. This results in a saving of system resources.

ISOLATION PARAMETER:

Specifies the isolation level for the application plan being bound. DB2 supports two
isolation levels for every transaction:
• Cursor Stability (CS)
• Repeatable Read (RR)

These apply while a cursor is being processed.

Cursor Stability

DB2 takes a lock on the page the cursor is accessing and releases the lock on that page
when the cursor moves on to a different page.
(This is not done when the cursor is declared with a FOR UPDATE OF clause.)
The lock on the last page is released at commit time (or at thread de-allocation time).

Repeatable Read

In cursor stability, while your transaction reads data, other transactions could change the
data you have already read. In repeatable read, DB2 holds all page locks as the cursor
moves, until the transaction commits (or until thread de-allocation).
Cursor Stability provides higher concurrency, while Repeatable Read provides more
consistency.

The problem with CS is that a transaction operating at that level may have a record
changed "behind its back", as in the inconsistent analysis problem, and so may produce a
wrong answer.

By contrast, a transaction that operates under isolation level RR can behave completely as
if it were executing in a single-user system.

LOCKSIZE PARAMETER:

Physically, DB2 locks data in terms of pages or tables or table spaces, depending on what
was specified as the LOCKSIZE for the relevant table space in the CREATE or ALTER
TABLESPACE operation.

For a given table space, the LOCKSIZE can be specified as PAGE, TABLE, TABLESPACE
or ANY.

• TABLESPACE : means all locks acquired on data in the tablespace will be at the
tablespace level

• TABLE : means locks acquired on data in the tablespace will be at the table level

• PAGE : means locks acquired on data in the tablespace will be at the page level
whenever possible.

• ANY : (which is the default) means that DB2 itself will decide the appropriate
physical unit of locking for the tablespace for each plan. It defaults to a page lock, but if
the number of pages that are locked exceeds an installation default, DB2 does a lock
escalation and automatically locks a larger unit.

ACQUIRE/RELEASE PARAMETERS

DB2 always implicitly acquires locks of some kind at the tablespace level.

The ACQUIRE and RELEASE parameters on the BIND command specify when such
tablespace-level locks are to be acquired and released.
For ACQUIRE, the possible specifications are
• USE and ALLOCATE
For RELEASE, they are
• COMMIT and DEALLOCATE

While binding, DB2 thus allows us to specify when a transaction acquires and releases
its locks.

ACQUIRE:

ACQUIRE (USE) - This option tells DB2 to take locks at the time the SQL statements
are executed. This is the DB2 default.

ACQUIRE (ALLOCATE) - This option tells DB2 to take all necessary locks at the start of the
transaction.

RELEASE:

As with ACQUIRE, there are two RELEASE options.

RELEASE (COMMIT) - This option tells DB2 to release all locks at transaction commit
time. This is the DB2 default.
RELEASE (DEALLOCATE) - This option tells DB2 to release all locks only when the program
ends (i.e. the thread is de-allocated).

• All combinations except ACQUIRE (ALLOCATE) with RELEASE (COMMIT) are allowed.

• For more concurrency, use ACQUIRE (USE) and RELEASE (COMMIT).
This is the DB2 default.

• For better performance, use ACQUIRE (ALLOCATE) and RELEASE (DEALLOCATE).

Fields of SQL Communication Area
01 SQLCA.
   05 SQLCAID       PIC X(8).
   05 SQLCABC       PIC S9(9) COMP-4.
   05 SQLCODE       PIC S9(9) COMP-4.
   05 SQLERRM.
      49 SQLERRML   PIC S9(4) COMP-4.
      49 SQLERRMC   PIC X(70).
   05 SQLERRP       PIC X(8).
   05 SQLERRD       OCCURS 6 TIMES PIC S9(9) COMP-4.
   05 SQLWARN.
      10 SQLWARN0   PIC X.
      10 SQLWARN1   PIC X.
      10 SQLWARN2   PIC X.
      10 SQLWARN3   PIC X.
      10 SQLWARN4   PIC X.
      10 SQLWARN5   PIC X.
      10 SQLWARN6   PIC X.
      10 SQLWARN7   PIC X.
   05 SQLEXT        PIC X(8).

SQL WARNINGS
SQLWARN0 - This is set to 'W' if any of the following warnings
           are flagged.
SQLWARN1 - This is flagged if a value is truncated when
           assigned to a host variable.
SQLWARN2 - This is set to 'W' if some of the rows weren't
           considered by a column function because the
           column being processed contained null values.
SQLWARN3 - Here a 'W' means you did not supply enough host
           variables in an INTO clause to match the columns in
           a SELECT clause.
SQLWARN4 - This flag is set to 'W' if a dynamic SQL UPDATE or
           DELETE statement does not have a WHERE clause.
SQLWARN5 - This indicates that the SQL statement is valid only
           in SQL/DS and not in DB2.
SQLWARN6 - This is set to 'W' when an arithmetic operation
           produces an unusual date or timestamp.

SQL WARNINGS
SQLERRD is an array of six fullword items. The third of the six contains
useful information.

• After an INSERT, DELETE or UPDATE statement, SQLERRD(3) contains the number of
rows the statement affected.

• E.g.

EXEC SQL
DELETE FROM EMP
WHERE DNO IN
(SELECT DNO FROM MASTER_DEPT)
END-EXEC.

DISPLAY SQLERRD(3) ' ROWS WERE DELETED'.
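The same "rows affected" count is exposed by most SQL interfaces. As a rough analogue of SQLERRD(3), the sketch below uses the rowcount attribute of Python's sqlite3 cursor; SQLite stands in for DB2 here, and the sample rows and function names are hypothetical.

```python
import sqlite3


def delete_employees_in_master_depts(conn):
    """Run the DELETE from the example and return the number of rows it removed."""
    cur = conn.execute(
        "DELETE FROM emp WHERE dno IN (SELECT dno FROM master_dept)"
    )
    conn.commit()
    return cur.rowcount      # analogue of SQLERRD(3) after a DELETE


def demo_db():
    # Hypothetical rows: four employees, two of the three departments are "master".
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp (empno INTEGER, dno INTEGER)")
    conn.execute("CREATE TABLE master_dept (dno INTEGER)")
    conn.executemany("INSERT INTO emp VALUES (?, ?)",
                     [(1, 10), (2, 10), (3, 20), (4, 30)])
    conn.executemany("INSERT INTO master_dept VALUES (?)", [(10,), (30,)])
    conn.commit()
    return conn
```

With the sample rows, three employees belong to departments 10 or 30, so the function returns 3 and one row is left in EMP.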

DB2 ERROR CODES
SQLCODE   Keyword     Meaning
+100      SELECT      Row not found during FETCH, SELECT, UPDATE or DELETE
+304      Program     Value and host variable are incompatible
-305      Variables   Null value occurred, and no indicator variable was defined
-501      Cursor      Cursor named in FETCH or CLOSE is not open
-551      Authority   You lack the authority to access the named object, possibly
                      because its name is not spelled correctly
-803      Updating    Duplicate keys not allowed
-805      Plan        DBRM not bound into this plan
-811      SELECT      Embedded SELECT or subselect returned more than one row

DATA DEFINITION

In this topic we will understand the two database objects and the SQL commands used in
creating, modifying or removing their definitions from the database:

TABLE
• CREATE TABLE
• ALTER TABLE
• DROP TABLE

INDEX
• CREATE INDEX
• ALTER INDEX
• DROP INDEX

TABLES

A table is a collection of rows having the same attributes, or columns. In other words it is a
matrix where the intersection of every row and column is a value, which is the smallest
unit of data that can be retrieved or changed. A row is the smallest unit of data that can be
inserted or deleted.

The structure of a table in a tablespace can be created with the following command:

CREATE TABLE TABLE_NAME
(COLUMN1 DATATYPE [NOT NULL | NOT NULL WITH DEFAULT],
COLUMN2 DATATYPE [NOT NULL | NOT NULL WITH DEFAULT],
COLUMN3 DATATYPE [NOT NULL | NOT NULL WITH DEFAULT],
.
.
PRIMARY KEY (COLUMN1,COLUMN2,...),
FOREIGN KEY RELATION_NAME (COLUMN3,COLUMN4,...)
REFERENCES PARENT_TABLE_NAME
ON DELETE {RESTRICT|CASCADE|SET NULL})
IN DATABASE_NAME.TABLESPACE_NAME

A table name can be simple or fully qualified. A simple table name has to begin with a
letter and cannot be more than 18 characters long. Except for the first character, the
name can contain any alphanumeric character, including the special characters (_,@,#,$). A
fully qualified table name is a simple table name prefixed by the owner's ID. For example, if
the user 'MKTG' is creating a table 'CUSTOMER' then the table name will be
'MKTG.CUSTOMER'.
e.g.
1> A simple table CUSTOMER can be created with the following command:

CREATE TABLE CUSTOMER
(CUSTID SMALLINT NOT NULL WITH DEFAULT,
CNAME CHAR(15) NOT NULL WITH DEFAULT,
CITY CHAR(15) NOT NULL WITH DEFAULT,
PRIMARY KEY (CUSTID))
IN MKTGDATA.TABLSPC1

This command creates the CUSTOMER table with the CUSTID field as its
primary key. The NOT NULL WITH DEFAULT clause prevents the column from having null
values and specifies that a default value is to be used if data for the column is missing.
The default value of a column depends on the datatype of the column (for example, zero for
numeric columns and blanks for fixed-length character columns). The table below
shows the default values for the different datatypes in DB2.

2> Table ORDERTAB, which is a child table of the CUSTOMER and PARTS tables, can be created
after creating both the parent tables, with the following command:

CREATE TABLE ORDERTAB
(ORDID SMALLINT NOT NULL WITH DEFAULT,
ORDERDT DATE NOT NULL WITH DEFAULT,
RATE DEC(8,2) NOT NULL WITH DEFAULT,
CUSTID SMALLINT NOT NULL WITH DEFAULT,
PARTID SMALLINT NOT NULL WITH DEFAULT,
PRIMARY KEY (ORDID,CUSTID,PARTID),
FOREIGN KEY FKEY1 (CUSTID) REFERENCES CUSTOMER
ON DELETE RESTRICT,
FOREIGN KEY FKEY2 (PARTID) REFERENCES PARTS
ON DELETE RESTRICT)
IN MKTGDATA.TABLSPC1

The above command creates the table ORDERTAB with ORDID, CUSTID and PARTID together as
the primary key. This key ensures uniqueness of each record within the table. The foreign
key FKEY1 ensures that the CUSTID field in this table contains only values
that exist in the primary key field of the table named in the
REFERENCES clause, i.e. CUSTOMER. Likewise, the foreign key FKEY2 ensures that the
PARTID field in this table contains only values that exist in the
primary key field of the table named in its REFERENCES clause, i.e. PARTS.
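The effect of a foreign key with ON DELETE RESTRICT can be seen in a short runnable sketch. It uses Python's sqlite3 module as a stand-in for DB2: the PRAGMA line is SQLite-specific, the table is cut down to two columns, and the function names are hypothetical.

```python
import sqlite3


def build_schema():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only when asked
    conn.execute(
        "CREATE TABLE customer (custid INTEGER PRIMARY KEY, cname TEXT, city TEXT)"
    )
    conn.execute(
        """CREATE TABLE ordertab (
               ordid  INTEGER,
               custid INTEGER,
               PRIMARY KEY (ordid, custid),
               FOREIGN KEY (custid) REFERENCES customer (custid) ON DELETE RESTRICT)"""
    )
    conn.execute("INSERT INTO customer VALUES (1001, 'Pentasys', 'Delhi')")
    conn.execute("INSERT INTO ordertab VALUES (201, 1001)")
    conn.commit()
    return conn


def try_sql(conn, sql):
    """Run one statement; report whether the referential constraint let it through."""
    try:
        conn.execute(sql)
        conn.commit()
        return "ok"
    except sqlite3.IntegrityError:
        conn.rollback()
        return "rejected"
```

An order for a customer that does not exist is rejected, and so is an attempt to delete customer 1001 while order 201 still refers to it; an order for the existing customer is accepted.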

Tables once created can be altered with the following command:

ALTER TABLE TABLE_NAME
[ADD COLUMN1 DATATYPE [NOT NULL | NOT NULL WITH DEFAULT]]
[AUDIT (NONE|CHANGES|ALL)]
[PRIMARY KEY (COLUMN1,COLUMN2,...)]
[FOREIGN KEY RELATION_NAME (COLUMN3,COLUMN4,...)
REFERENCES PARENT_TABLE_NAME]
[DROP PRIMARY KEY]
[DROP FOREIGN KEY (FOREIGN_KEY_NAME)]

e.g.

1> In the CUSTOMER table, a CREDITLIMIT field can be added with the following
command:

ALTER TABLE CUSTOMER
ADD CREDITLIMIT INTEGER NOT NULL WITH DEFAULT

2> The primary key definition of the ORDERTAB table can be modified in the following way:

ALTER TABLE ORDERTAB
ADD PARTNO SMALLINT NOT NULL
DROP PRIMARY KEY
PRIMARY KEY (PARTNO)

INDEXES

Indexes are used for efficient access to data. An index is an ordered set of pointers to the data in a
DB2 table. The rows in a table are stored in a particular order, but without an index the
entire table may need to be scanned whenever data has to be accessed, and the performance of the system
would suffer. Here indexes can prove very significant, as they provide an alternate means
of access to the data, thereby decreasing the I/O and the processing time. A user cannot by
any means tell the system when to use or not to use an index. It is only the DB2
optimizer which makes the choice, at BIND time.

An index is a physical entity which can be based on one or more columns of a table and
can be created any time after the table is built. There is no limit on the number of indexes
which can be built on a table, but a partitioned table must have at least one index. This is
called a PARTITIONING INDEX and is used to define the scope of each partition and
thereby assign the rows of the table to their respective partitions.

The second type of index is a CLUSTER INDEX. When a cluster index is created on a
table, the rows of that table are physically stored in the order of the fields
named in the cluster index. The order of these rows can also be specified as
ascending or descending. There can be only one cluster index per table. If no such index is
created, DB2 loads the data in the order of the first index defined.

UNIQUE INDEXES in DB2 can be created to prevent the user from entering duplicate
values into a field constrained by the unique index. For example, the primary key field of a
table must have unique values in all the rows; this relational data model concept is
supported by a unique index on the primary key field.
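How a unique index rejects a duplicate key can be checked with a short sketch. Python's sqlite3 module is used as a stand-in for DB2, where the same situation is reported as SQLCODE -803; the function name is hypothetical.

```python
import sqlite3


def unique_index_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (custid INTEGER, cname TEXT, city TEXT)")
    conn.execute("CREATE UNIQUE INDEX custidx ON customer (custid)")
    conn.execute("INSERT INTO customer VALUES (1001, 'Pentasys', 'Delhi')")
    try:
        # Same CUSTID again: the unique index must refuse this row.
        conn.execute("INSERT INTO customer VALUES (1001, 'Hertz', 'Mumbai')")
        outcome = "accepted"
    except sqlite3.IntegrityError:
        outcome = "rejected"
    count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
    return outcome, count
```

The second insert is rejected and the table still holds a single row for CUSTID 1001.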

Indexes can be created with the following command :

CREATE [UNIQUE] INDEX INDEX_NAME
ON TABLE_NAME (INDEX_COLUMN_NAME [ASC|DESC] [,...])
[CLUSTER [(PART INTEGER VALUES (CONSTANT OR RANGE))]]
[SUBPAGES (1|2|4|8|16)]
[BUFFERPOOL (BP0|BP1|BP2)]

e.g.

The following command creates a unique clustering index CUSTIDX on the field
CUSTID of the CUSTOMER table:

CREATE UNIQUE INDEX CUSTIDX
ON CUSTOMER (CUSTID)
USING STOGROUP MKTG
SUBPAGES 8
CLUSTER

Each index page physically consists of 4K bytes, which can be further divided into a
maximum of 16 subpages. The subpage is the logical unit of locking for an index. Therefore,
when subpages are used, a smaller amount of storage is locked for an individual update. Thus
fewer users are kept waiting at a given point of time, and this increases
concurrency. The SUBPAGES clause defines the number of subpages into which one index
page is divided. The CLUSTER clause can be used to create a clustering index, which
is the fastest path for sequential access.
The definition of an index can be altered by the following command:

ALTER INDEX INDEX_NAME
[PART (INTEGER)]
[BUFFERPOOL (BP0|BP1|BP2)]

INDEXSPACE

Just as table data is stored in a tablespace, index data is physically stored in an indexspace. This
indexspace contains an ESDS for every index. If the index is partitioned, then each
partition has its own indexspace. The indexspace is divided into pages of 4K, and these
pages are further divided into 1 to 16 units called subpages. The subpage, as discussed
earlier, is the unit of locking.
Apart from this, unlike a tablespace, an indexspace need not be explicitly defined, as it gets
created automatically whenever an index is created.

Exercise

1. Create the following tables

SALESMAN    CUSTOMER    PARTS          ORDER

Slid        Custid      Partid         Ordid
Name        Cname       Description    OrderDate
Region      City        Act_Rate       Rate
                        Min_Stock      Units
                                       Partid
                                       Custid
                                       Slid

2. Create primary key indexes for all the above created tables so as to enforce
uniqueness for these keys.
3. Describe the different types of indexes and their advantages with the create index
command.
4. Add a new numeric field, credit limit, to the customer table with the alter command.
This field should have a default value and cannot be left null.
5. What will be the effect of the following drop table command on the other objects in
the database which are related to the table ORDERS?
Drop Table Orders.

RETRIEVAL OPERATIONS

In this topic we will discuss SQL statements for performing the following operations on
the data described below :

• SELECT
• INSERT
• UPDATE
• DELETE

SAMPLE DATABASE OF ABC Inc.

CUSTOMER TABLE

CUSTID   CNAME      CITY

1005     Spectrum   Mumbai
1004     Lloyds     Madras
1001     Pentasys   Delhi
1002     Hertz      Mumbai
1003     Procter    Pune

PARTS TABLE

PARTID   DESCRIPTION   ACT_RATE   MIN_STOCK

101      Nuts           4.00      50000
106      Bolts          2.50      15000
104      Nails          5.00      12500
105      Latches       15.00
103      Hinges         6.00      12000
102      Handle        15.00       8500

ORDERTAB TABLE

ORDID   ORDERDATE   RATE    UNITS   PARTID   CUSTID

201     01/12/96     3.00   300     101      1001
202     12/11/95     5.50   250     103      1001
206     06/22/96     4.50   375     104      1002
204     04/18/96     3.75   175     101      1004
205     05/05/96    13.50   235     102      1003
203     11/08/96     5.50   180     103      1003

SELECT STATEMENT

SQL provides the SELECT statement to retrieve data from the database. This statement has
different clauses attached to it, which are explained further below. Unlike the other SQL
statements (INSERT, UPDATE, DELETE), this statement can work on multiple tables at
the same time.
The basic syntax of the SELECT statement is:

SELECT [ALL | DISTINCT] scalar-expression(s)
FROM table(s)
[ WHERE conditional-expressions ]
[ GROUP BY column(s) ]
[ HAVING conditional-expressions ]
[ ORDER BY column(s) ]

Let us now proceed with the different types of SELECT statements :

SIMPLE SELECT

To retrieve all the columns of the CUSTOMER table the following SELECT statement
will have to be executed:

SELECT * FROM CUSTOMER

RESULT(screen dump)

The above result indicates that the statement has retrieved all the records of the
CUSTOMER table, and all the fields too. Whenever a '*' is given, all the
columns are selected in the sequence in which they were defined when the table was
created. If only specific columns have to be retrieved, the SELECT statement can be modified
as below :

SELECT CNAME, CITY
FROM CUSTOMER

RESULT(screen dump)

SELECT ELIMINATING DUPLICATE VALUES (DISTINCT)

If only unique values have to be retrieved from the database, the word DISTINCT should
be specified before the column list. For example, if the different CITY values from the
CUSTOMER table have to be retrieved, the following SELECT statement can help :

SELECT DISTINCT CITY
FROM CUSTOMER

RESULT(screen dump)

Without DISTINCT, i.e. by default, the above SELECT statement would retrieve the same
CITY multiple times; the default is ALL.
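The difference between ALL and DISTINCT can be checked against the sample CUSTOMER rows. The sketch below loads them into an in-memory SQLite database through Python's sqlite3 module, as a stand-in for DB2; the ORDER BY is added only to make the output order predictable, and the function name is hypothetical.

```python
import sqlite3

# Rows copied from the sample CUSTOMER table.
CUSTOMERS = [
    (1005, "Spectrum", "Mumbai"),
    (1004, "Lloyds", "Madras"),
    (1001, "Pentasys", "Delhi"),
    (1002, "Hertz", "Mumbai"),
    (1003, "Procter", "Pune"),
]


def city_lists():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (custid INTEGER, cname TEXT, city TEXT)")
    conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", CUSTOMERS)
    all_cities = [row[0] for row in conn.execute(
        "SELECT ALL city FROM customer ORDER BY city")]
    distinct_cities = [row[0] for row in conn.execute(
        "SELECT DISTINCT city FROM customer ORDER BY city")]
    return all_cities, distinct_cities
```

With ALL, Mumbai appears twice because two customers are located there; with DISTINCT, each city appears once.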

COMPUTED VALUES RETRIEVAL

It often happens that data held in the database has to be presented in a
different way. For example, if one has to retrieve the actual value (RATE * UNITS) of every part
sold to every individual customer from the ORDERTAB table, the statement will be:

SELECT PARTID, CUSTID, RATE*UNITS
FROM ORDERTAB

RESULT(screen dump)
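The computed column can be reproduced from the sample ORDERTAB rows. The sketch below uses Python's sqlite3 module with an in-memory SQLite database as a stand-in for DB2; only the columns needed for the query are loaded, and the function name is hypothetical.

```python
import sqlite3

# (ORDID, RATE, UNITS, PARTID, CUSTID) copied from the sample ORDERTAB table.
ORDERS = [
    (201, 3.00, 300, 101, 1001),
    (202, 5.50, 250, 103, 1001),
    (206, 4.50, 375, 104, 1002),
    (204, 3.75, 175, 101, 1004),
    (205, 13.50, 235, 102, 1003),
    (203, 5.50, 180, 103, 1003),
]


def order_values():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ordertab (ordid INTEGER, rate REAL, units INTEGER, "
        "partid INTEGER, custid INTEGER)"
    )
    conn.executemany("INSERT INTO ordertab VALUES (?, ?, ?, ?, ?)", ORDERS)
    # RATE*UNITS is evaluated for every row; nothing is stored back in the table.
    return conn.execute(
        "SELECT partid, custid, rate * units FROM ordertab ORDER BY ordid"
    ).fetchall()
```

For order 201, for instance, the third column is 3.00 * 300 = 900.0.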

Computed retrieval might also be required when an expression has to be applied to one or
more fields while retrieving. For example, if the DESCRIPTION value of the part has to
be in upper case while the value is stored in lower/mixed case, the following SELECT
statement is valid:

SELECT PARTID, UPPER(DESCRIPTION)
FROM PARTS

CONDITIONAL RETRIEVAL

To retrieve data that meets a specific criterion, or several criteria combined
in one SELECT statement, the statement has to use a WHERE condition. The
examples below explain the different ways of using WHERE.

E.g 1

SELECT CNAME, CITY
FROM CUSTOMER
WHERE CITY = 'Mumbai'

screen dump

E.g 2

SELECT ORDID, ORDERDT
FROM ORDERTAB
WHERE CUSTID = 1001 AND
UNITS > 200

screen dump

The above SELECT statement shows the data satisfying multiple conditions on the
TABLE ORDERTAB.

E.g 3

SELECT *
FROM PARTS
WHERE ACT_RATE < 4 OR
MIN_STOCK > 10000

screen dump

The above result shows that the records which satisfy either one of the two conditions
are selected.

The WHERE condition can use different operators to compare the values of the fields. The
comparison operators (<, >, =, <>, >=, <=) are most often used for comparing numeric
values, but apart from these there are some more operators, as explained below :

BETWEEN

The BETWEEN condition is used to define a range for comparing values. It can be
considered a shortcut for two conditions combined with AND. The range is inclusive,
i.e. both end values belong to it.

SELECT *
FROM PARTS
WHERE MIN_STOCK BETWEEN 10000 AND 20000

screen dump

The same result as shown in the above screen can also be obtained by the following
query :

SELECT *
FROM PARTS
WHERE MIN_STOCK >= 10000 AND
MIN_STOCK <= 20000

With the BETWEEN operator, a NOT condition will retrieve exactly the opposite data, as
shown below :

SELECT *
FROM PARTS
WHERE MIN_STOCK NOT BETWEEN 10000 AND 20000

IN

The IN operator is an alternative to several individual conditions combined with OR. When
a field has to be compared against a set of given values, the IN operator is useful.

SELECT *
FROM ORDERTAB
WHERE PARTID IN (101, 104)

screen dump

The above result can also be obtained without an IN operator with the following SELECT
statement :

SELECT *
FROM ORDERTAB
WHERE PARTID = 101 OR
PARTID = 104

screen dump

The NOT IN operator is used for excluding the above data and selecting the data whose
PARTID value is anything except 101 or 104.

SELECT *
FROM ORDERTAB
WHERE PARTID NOT IN (101, 104)

screen dump

LIKE

The LIKE operator is used for comparing a value against a pattern: a value which begins
with certain characters, ends with certain characters or contains certain characters. The
following examples deal with different ways of using the LIKE operator.

SELECT DESCRIPTION
FROM PARTS
WHERE DESCRIPTION LIKE 'N%'

screen dump

The result shows only those PARTS records whose DESCRIPTION starts with the letter N.

SELECT DESCRIPTION
FROM PARTS
WHERE DESCRIPTION LIKE '%S'

This command will retrieve only those records from the PARTS table whose DESCRIPTION has
S as its last letter, irrespective of the other characters in the DESCRIPTION field.

As we observed in the above examples, a % stands for any string of zero or more
characters. Similarly, _ (underscore) stands for exactly one character. This can be used
to search for a particular character at a particular position in a string.

SELECT *
FROM CUSTOMER
WHERE CNAME LIKE '_e%'

screen dump

This SELECT statement retrieves only those records which have the letter e in the second
position of the CNAME field.

Using NOT LIKE in place of LIKE in the last SELECT statement will retrieve all the other
records from the CUSTOMER table, except those displayed in the above screen dump.

CHECKING FOR NULL VALUES

For retrieving rows in which a field contains NULL, the operator IS has to be used. For
checking a NULL value this is the only valid operator, as a comparison with any other
operator like >, <, = etc. is never true when NULL is involved and so will always give an
erroneous result.

The examples below are invalid

SELECT *
FROM PARTS
WHERE MIN_STOCK > NULL

OR

WHERE MIN_STOCK < NULL

OR

WHERE MIN_STOCK = NULL

If one needs to retrieve the records having a NULL MIN_STOCK value, the SELECT statement
should be as follows :

SELECT *
FROM PARTS
WHERE MIN_STOCK IS NULL

screen dump

To check the values which are not NULL, the word NOT can be used with the IS operator
as shown below :

SELECT *
FROM PARTS
WHERE MIN_STOCK IS NOT NULL

screen dump

JOINS

A SELECT statement has the ability to select from multiple tables at once. This feature,
whereby two or more tables are selected from through the same statement, is called a JOIN.

DB2 offers different types of JOINS as explained below :

Simple Join

According to Relational Database concepts, a common column is responsible for creating a
relationship between two tables. In our sample database the PARTID column relates the
ORDERTAB table to the PARTS table. Similarly the CUSTID column relates the ORDERTAB table
to the CUSTOMER table. A JOIN can be used for selecting data from these tables at the
same time through one SELECT.

SELECT CNAME, RATE, UNITS
FROM CUSTOMER, ORDERTAB
WHERE CUSTOMER.CUSTID = ORDERTAB.CUSTID

screen dump

With the above query, one gets data belonging to the CUSTOMER table as well as the
ORDERTAB table. The CUSTID field, being the common field of these two tables, needs to be
compared while selections from these tables take place. It is the WHERE condition which
matches the rows of these two tables having the same values in their CUSTID fields. This
field should have a compatible datatype (ideally the same datatype and width) in both the
tables, but the name of the field may differ in the two tables. Apart from this, the
statement may also have some more WHERE conditions, as shown below :

SELECT CNAME, RATE, UNITS
FROM CUSTOMER, ORDERTAB
WHERE CUSTOMER.CUSTID = ORDERTAB.CUSTID AND CITY = 'Mumbai'

screen dump

Both the statements above check for equal values of the CUSTID fields, so they can be
termed EQUIJOINS. When one SELECT statement retrieves data from multiple tables, each
table should be connected to another table by a join condition. In short, the no. of join
conditions in a SELECT statement is at least one less than the number of tables being
selected from.
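
The rule about the number of joins can be seen with three tables. The following is only a
sketch: it assumes the sample tables CUSTOMER, ORDERTAB and PARTS with the columns used in
the earlier examples, and the aliases C, O and P are chosen here for illustration. Three
tables need two join conditions :

```sql
-- Two join conditions connect three tables:
-- CUSTOMER to ORDERTAB on CUSTID, and ORDERTAB to PARTS on PARTID.
SELECT C.CNAME, P.DESCRIPTION, O.UNITS
FROM CUSTOMER C, ORDERTAB O, PARTS P
WHERE C.CUSTID = O.CUSTID
AND O.PARTID = P.PARTID
```

If one of the two conditions is left out, every row of the unconnected table is combined
with every row of the remaining result (a Cartesian product).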

Self Join

Just as one can select from two tables through one SELECT statement, it is also possible
to select from one table twice in the same SELECT statement. Here is an example which
finds all the pairs of CUSTOMER records belonging to the same CITY :

SELECT A.CNAME, B.CNAME, A.CITY
FROM CUSTOMER A, CUSTOMER B
WHERE A.CITY = B.CITY

screen dump

In the above SELECT it appears as though two different tables A, B are being selected
from. In reality A and B are nothing but aliases used for the same table CUSTOMER.
This means the first selection of the CUSTOMER table is given a name A while the
second selection is given a name B.

We also observe in our screen dump that a value like ‘Spectrum’ is paired with itself,
because every row matches itself when selection A is compared with selection B, and that
every other pair shows up twice, once in each order, because the predicate is symmetrical.
Records like ‘Hertz’ and ‘Procter’ are repeated for the same reason. To avoid this, one
has to add a WHERE condition which accepts only one ordering of each pair, as shown in the
following SELECT statement :

SELECT A.CNAME, B.CNAME, A.CITY
FROM CUSTOMER A, CUSTOMER B
WHERE A.CITY = B.CITY
AND A.CNAME < B.CNAME

screen dump

SUB-QUERIES

A SELECT statement can also appear as a clause of another SELECT statement. Such SELECT
statements are called NESTED queries. The inner query returns a value, or a set of values,
which is compared with a value of the outer query. In other words, the outer query is
executed after the inner query. Such nesting can be done to multiple levels.

The example below selects the CNAME and the CITY of those CUSTOMERS who have ordered
PART 103. Here the outer query selects from the CUSTOMER table, as the data to be
displayed belongs to that table, while the inner query finds the CUSTID values from the
ORDERTAB table where the PARTID value is 103 :

SELECT CNAME, CITY
FROM CUSTOMER
WHERE CUSTID IN (SELECT CUSTID
                 FROM ORDERTAB
                 WHERE PARTID = 103)

screen dump

The difference between NESTED queries and JOINS is that one needs to give a JOIN only if
the data to be extracted belongs to multiple tables. Nested queries are generally used
when the data belongs to one table while the condition has to be checked in some other
table.

Performance-wise, JOINS are generally preferred, as the execution of NESTED queries can be
slower than that of JOINS (according to IBM manuals).

In our example the CUSTID is common between the two tables (ORDERTAB, CUSTOMER). It is
this field’s value which is checked between the two queries.
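
For comparison, the nested query for PART 103 could also be written as a JOIN. This is a
sketch under the same assumptions about the sample tables; the aliases C and O are
illustrative, and DISTINCT is added here because a customer with several orders for the
part would otherwise be listed once per order :

```sql
-- Join form of "customers who have ordered part 103".
-- DISTINCT removes the repeats caused by multiple matching orders.
SELECT DISTINCT C.CNAME, C.CITY
FROM CUSTOMER C, ORDERTAB O
WHERE C.CUSTID = O.CUSTID
AND O.PARTID = 103
```

Note that DISTINCT would also merge two different customers who happen to share both CNAME
and CITY, which the nested form would list separately.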

The next example shows nesting of queries to multiple levels. Irrespective of the number
of levels to which nesting is done, the execution of the queries remains the same. In
other words, it is always the innermost query which is executed first, then the enclosing
query, and so on.

To select the details of the PARTS which are ordered by the CUSTOMER ‘Hertz’, the
following NESTED query can be given :

SELECT *
FROM PARTS
WHERE PARTID IN (SELECT PARTID
                 FROM ORDERTAB
                 WHERE CUSTID = (SELECT CUSTID
                                 FROM CUSTOMER
                                 WHERE CNAME = 'Hertz'))

screen dump

CORRELATED SUB QUERY

A correlated sub-query is one whose result depends upon a variable which receives its
value from some outer query. Such a sub-query has to be executed repeatedly, once for
every new value of the variable. These types of queries can be used as an alternative
solution to JOINS.

The following example selects the CNAME values of all those customers in the CUSTOMER
table who have ordered PART 103.

SELECT CNAME
FROM CUSTOMER
WHERE 103 IN (SELECT PARTID
FROM ORDERTAB
WHERE CUSTID = CUSTOMER.CUSTID)

screen dump

Execution of such correlated queries takes place in the following way :

1. The currently selected row from the outer query is stored.
2. The sub-query is performed, where a field from the row stored in the first step is used
   for selecting specific rows from the table mentioned in this query (outer reference).
3. The condition of the outer query is evaluated on the basis of the results of the
   sub-query execution in step 2. This evaluation determines whether the record from the
   outer query should be selected or not.
4. The three steps mentioned above are performed for the next row of the outer query,
   until all the rows of the table have been evaluated.

CORRELATED QUERIES ON THE SAME TABLE

Correlated sub-queries can also be run on the same table as the outer query's table. These
types of queries are useful in selecting data from a single table which is evaluated
against some other data of the same table.

This example selects all the details from the PARTS table of those parts whose ACT_RATE is
higher than the ACT_RATE of the part ‘Nails’.

SELECT *
FROM PARTS
WHERE ACT_RATE > (SELECT ACT_RATE
                  FROM PARTS
                  WHERE DESCRIPTION = 'Nails')

screen dump
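
A sub-query on the same table becomes correlated when it refers back to the current row of
the outer query through an alias. The following is a minimal sketch, assuming ORDERTAB has
the CUSTID and UNITS columns used in the earlier examples; the aliases A and B are
illustrative. It selects the orders whose UNITS value is above the average UNITS of the
same customer :

```sql
-- The inner query is re-evaluated for every row of A,
-- because it refers to A.CUSTID (the outer reference).
SELECT *
FROM ORDERTAB A
WHERE A.UNITS > (SELECT AVG(B.UNITS)
                 FROM ORDERTAB B
                 WHERE B.CUSTID = A.CUSTID)
```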

OTHER OPERATOR (EXISTS, ANY/ SOME, ALL )

EXISTS

This operator produces a Boolean result. Unlike the other operators, where a value is
used, it takes only a sub-query as an argument. The sub-query evaluates to true if it
returns at least one row and to false otherwise, and only if the result is true does the
outer query which has the EXISTS operator produce output. The query given below extracts
data from the PARTS table only if there is at least one part having a MIN_STOCK value
> 5000.

SELECT *
FROM PARTS
WHERE EXISTS
(SELECT *
FROM PARTS
WHERE MIN_STOCK > 5000)

screen dump
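
EXISTS is most often combined with a correlated sub-query, so that the test is made
separately for every row of the outer query. A minimal sketch, assuming the CUSTOMER and
ORDERTAB tables of the earlier examples (the aliases C and O are illustrative), which
lists only those customers who have placed at least one order :

```sql
-- For each customer row, the sub-query looks for a matching order.
-- The row is kept only if at least one such order exists.
SELECT C.CNAME, C.CITY
FROM CUSTOMER C
WHERE EXISTS (SELECT *
              FROM ORDERTAB O
              WHERE O.CUSTID = C.CUSTID)
```

Replacing EXISTS with NOT EXISTS in this sketch would list the customers who have placed
no orders at all.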

ANY/SOME

The operators ANY and SOME are interchangeable, as the result produced by them is just the
same. The operator ANY / SOME takes all the values produced by the sub-query and evaluates
to true if the comparison holds between at least one of these values and the value of a
certain field of the record selected in the outer query. The field selected in the
sub-query must have a datatype compatible with the one against which it is being compared
in the main query.

This example selects all the records from the CUSTOMER table whose CUSTID appears in the
ORDERTAB table.

SELECT *
FROM CUSTOMER
WHERE CUSTID = ANY
(SELECT CUSTID
FROM ORDERTAB)
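
The comparison = ANY behaves like the IN operator. Assuming the same sample tables, the
query above could equally be sketched as :

```sql
-- IN is equivalent to = ANY over the values of the sub-query.
SELECT *
FROM CUSTOMER
WHERE CUSTID IN (SELECT CUSTID
                 FROM ORDERTAB)
```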

ALL

Just as the ANY operator works like an OR over the values of the sub-query, the ALL
operator works like an AND. In other words, the condition on the field of the outer query
must be satisfied for each and every value selected by the inner query.

The query below selects all those orders where the UNITS ordered are greater than all the
UNITS values ordered by CUSTOMER 1003.

SELECT *
FROM ORDERTAB
WHERE UNITS > ALL
(SELECT UNITS
FROM ORDERTAB
WHERE CUSTID = 1003)

screen dump

The above statement examines the UNITS values of all the orders placed by CUSTOMER 1003.
It then finds the orders having UNITS more than the highest UNITS value of CUSTOMER 1003,
i.e. 235.
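
The same comparison can be sketched with the MAX function, which makes the "more than the
highest value" reading explicit. This is an illustration only and assumes that CUSTOMER
1003 has at least one order :

```sql
-- Orders with more UNITS than the largest order of customer 1003.
SELECT *
FROM ORDERTAB
WHERE UNITS > (SELECT MAX(UNITS)
               FROM ORDERTAB
               WHERE CUSTID = 1003)
```

The two forms differ when the sub-query returns no rows: > ALL is then true for every
order, while MAX returns NULL and the comparison selects nothing.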

The NOT operator can be attached to the operators discussed above, just as with the
operators IN, LIKE and BETWEEN discussed in the sub-sections of 5.1.

AGGREGATE FUNCTIONS

Aggregate functions are those which return a single value, such as the minimum, maximum,
average or sum, after evaluating multiple values of a certain column in the table. These
functions appear in the column list of the SELECT statement. The list below gives the
names of the functions and the value returned by each.

FUNCTION   DATA TYPE     VALUE RETURNED

MAX        NUMBER/CHAR   Largest value in the column
MIN        NUMBER/CHAR   Least value in the column
COUNT      NUMBER/CHAR   Number of values in the column
SUM        NUMBER        Sum of values in the column
AVG        NUMBER        Average of values in the column

These functions can also be used in nested sub-queries or in combination with the DISTINCT
clause.

Some examples :

1. To find the total no. of CUSTOMERS :

SELECT COUNT(*)
FROM CUSTOMER

2. To find the total UNITS ordered by CUSTOMER 1003 :

SELECT SUM(UNITS)
FROM ORDERTAB
WHERE CUSTID = 1003

3. To find the orders where the no. of UNITS ordered is less than the average :

SELECT *
FROM ORDERTAB
WHERE UNITS < (SELECT AVG(UNITS)
               FROM ORDERTAB)

4. To find the no. of CUSTOMERS who have placed orders :

SELECT COUNT(DISTINCT CUSTID)
FROM ORDERTAB

GROUP BY HAVING

This is an optional clause of the SELECT statement. It groups the rows of a table
according to the values in the column on which the grouping is done. It has an optional
sub-clause, HAVING, which checks a condition on every group created by GROUP BY. Only
those groups which satisfy the condition are passed on to the column functions. One can
understand HAVING as a filter on the groups, just as WHERE acts as a filter on the
individual rows being selected.

The GROUP BY clause allows only column functions, or those columns on which the grouping
is done, to be selected in the column list of the SELECT statement. The same is applicable
to the HAVING clause.

This example selects the maximum UNITS value for every CUSTOMER :

SELECT MAX(UNITS), CUSTID
FROM ORDERTAB
GROUP BY CUSTID

screen dump

In the above example the execution of the query takes place in the following manner :

The GROUP BY clause puts all records which have the same CUSTID value together and creates
one set; the next set has the records of the next CUSTID, and so on it continues to create
groups until all CUSTIDs have been considered. The result of this is shown below. Then the
maximum no. of units is selected from each group.

screen dump of order by custid on order


We observe in the above result that a CUSTID which has placed a single order (1004) is
placed in a separate group. Let us assume the result is not supposed to show any max value
for such groups having a single record. In such a case it is necessary to attach the
HAVING clause to the GROUP BY clause. Thus the previous query changes to :

SELECT MAX(UNITS), CUSTID
FROM ORDERTAB
GROUP BY CUSTID
HAVING COUNT(*) > 1
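
WHERE and HAVING can appear in the same statement: WHERE filters the individual rows before
the groups are formed, and HAVING filters the groups afterwards. A minimal sketch, assuming
the ORDERTAB columns used in the earlier examples; the part number and the limit of 500 are
illustrative values only :

```sql
-- Step 1 (WHERE):    drop the orders for part 103 before grouping.
-- Step 2 (GROUP BY): form one group per customer from the remaining rows.
-- Step 3 (HAVING):   keep only the groups whose total UNITS exceed 500.
SELECT CUSTID, SUM(UNITS)
FROM ORDERTAB
WHERE PARTID <> 103
GROUP BY CUSTID
HAVING SUM(UNITS) > 500
```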

ORDER BY

The final result displayed by the SELECT statement can be sequenced in a particular way by
using the ORDER BY clause. Without it, the sequence of the rows is not guaranteed; the data
may be displayed in the order of its selection or in the order of its physical storage.
When the result has to be ordered on a certain field of the table, one can use the
ORDER BY clause.

In the statement below, we arrange all the records of the ORDERTAB table in the order of
the ORDERDT field.

SELECT *
FROM ORDERTAB
ORDER BY ORDERDT

screen dump

One can also order the data on multiple fields. The fields have to appear in the ORDER BY
clause in the order of the required sequence. If, in the above e.g., one has to show the
records of the ORDERTAB table in the sequence of CUSTID and, within every single value of
CUSTID, sequenced on the ORDERDT field, the query will be :

SELECT *
FROM ORDERTAB
ORDER BY CUSTID, ORDERDT
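
Each field in the ORDER BY clause is sorted in ascending sequence unless DESC is specified
for it. A minimal sketch on the same ORDERTAB table, listing the most recent order of every
customer first :

```sql
-- CUSTID ascending (ASC is the default and may be omitted),
-- and within each CUSTID the latest ORDERDT first.
SELECT *
FROM ORDERTAB
ORDER BY CUSTID ASC, ORDERDT DESC
```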

UNION

The UNION operator is used when records appearing in either or both of the tables referred
to in the SELECT statements are to be retrieved. The operator is placed between two SELECT
statements, and both these SELECT statements must

1. Have the same no. of columns
2. Have columns which belong to the same datatype

The example below selects the PARTID values of those parts which have a MIN_STOCK above
10000 or which are ordered by CUSTOMER 1001 :

SELECT PARTID
FROM PARTS
WHERE MIN_STOCK > 10000
UNION
SELECT PARTID
FROM ORDERTAB
WHERE CUSTID = 1001

screen dump

The result indicates that each PARTID is selected only once. The UNION operator implicitly
eliminates duplicate values unless ALL is used with it, as shown below :

SELECT PARTID
FROM PARTS
WHERE MIN_STOCK >10000
UNION ALL
SELECT PARTID
FROM ORDERTAB
WHERE CUSTID = 1001

screen dump

5.7 INSERT

Records can be added to a table with the INSERT statement. The INSERT statement gives a
specified value to every column of the row. The syntax of the INSERT statement is as
follows :

INSERT INTO table
[(list of columns)]
VALUES (list of values)

A simple INSERT statement like the one below has to be given values in the same order as
that of the columns defined in the table :

INSERT INTO CUSTOMER
VALUES (1006, 'SUMMIT', 'PUNE')

This statement gives the first value (1006) from the VALUES clause to the first field
(CUSTID) of the CUSTOMER table, the second value ('SUMMIT') to the second field (CNAME)
and the third value ('PUNE') to the third field (CITY).

The value NULL can also be specified for those fields which allow NULL, in other words the
fields which are not constrained by the NOT NULL constraint.

E.g. INSERT INTO CUSTOMER
VALUES (1007, 'PIONEER', NULL)

If the CITY column is defined with the NOT NULL WITH DEFAULT clause, one can skip this
column by listing only the other columns after the table name; DB2 then stores the default
value in the skipped column. The number of values must always match the number of columns
being filled.

INSERT INTO CUSTOMER (CUSTID, CNAME)
VALUES (1008, 'SPECTRUM')

In the same way, one can provide the list of column names for which values are being
supplied by the INSERT statement.

e.g. INSERT INTO PARTS (PARTID, DESCRIPTION, RATE)
VALUES (107, 'screws', 4.00)

All the above examples INSERT only one record into the specified table. When multiple
records are to be inserted through one INSERT, a sub-query can be attached to the INSERT
statement. The syntax of such an INSERT statement is :

INSERT INTO table
[(list of columns)]
sub-select

e.g. If all the records having UNITS above 250 have to be stored in the BIG_ORDER table,
the following INSERT statement has to be given :

INSERT INTO BIG_ORDER
SELECT ORDID, CUSTID, PARTID
FROM ORDERTAB
WHERE UNITS > 250

In the above query the structure of the BIG_ORDER table should be such that the values of
ORDID, CUSTID and PARTID can be stored in it. The datatypes and the order in which the
fields exist in the table have to match the order in which they appear in the SELECT
statement. If any of these fields has to be skipped, one needs to define the column list.

e.g. INSERT INTO BIG_ORDER
(ORDID, PARTID)
SELECT ORDID, PARTID
FROM ORDERTAB
WHERE UNITS > 250

UPDATE

This statement is used for modifying the values of one or more fields in the database
tables.

The syntax of the UPDATE statement is :

UPDATE table
SET column1 = value1 [, column2 = value2 ...]
[WHERE condition]

The keyword SET is used to change the value of a certain field or fields.

To modify the UNITS value for ORDID 202, the UPDATE statement should be as follows :

UPDATE ORDERTAB
SET UNITS = 300
WHERE ORDID = 202

Without the WHERE clause the UPDATE statement would update all the records of the table.
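
Several fields can be changed in one UPDATE, and the new value may be an expression based
on the old one. A minimal sketch, assuming ORDERTAB has the ORDID, UNITS and RATE columns
used in the earlier examples; the 10 percent increase is an illustrative value only :

```sql
-- Both assignments apply to the same row(s) selected by WHERE.
-- RATE * 1.10 is computed from the value stored before the update.
UPDATE ORDERTAB
SET UNITS = 300,
    RATE = RATE * 1.10
WHERE ORDID = 202
```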

DELETE

Records can be removed from a table with a DELETE statement. Like the INSERT statement, a
DELETE statement operates on entire rows of the table. It can delete one or multiple rows
from a table through one execution. The syntax of the DELETE statement is :

DELETE FROM table
[WHERE condition]

The following DELETE statement removes all CUSTOMER records belonging to the city Madras,
provided there are no dependent records for the records being deleted.

DELETE FROM CUSTOMER
WHERE CITY = 'Madras'

A DELETE statement may also have a sub-select. In such a case the sub-select gets executed
first. The result of this execution is used for evaluating a certain field through the
WHERE condition.

e.g. DELETE FROM ORDERTAB
WHERE PARTID IN
(SELECT PARTID
FROM PARTS
WHERE MIN_STOCK < 10000)

Referential Integrity with INSERT, UPDATE AND DELETE

According to Codd’s rules, referential integrity defined between tables must be maintained
during the update, insertion or deletion of data.

Whenever a referential integrity violation takes place through these statements, DB2
returns an error in response.

E.g

1) INSERT INTO ORDERTAB
VALUES (208, '01-JAN-06', 400, 109, 1004)

ERROR MESSAGE

-530 INSERT OR UPDATE VALUE OF FOREIGN KEY PARTID IS INVALID

The above statement is not executed, as the PARTID value 109 does not exist in the PARTS
table, which is a parent of ORDERTAB.

2) UPDATE ORDERTAB
SET CUSTID = 10004
WHERE CUSTID = 1003

ERROR MESSAGE

-530 INSERT OR UPDATE VALUE OF FOREIGN KEY CUSTID IS INVALID

This statement is rejected, as the new value of the foreign key CUSTID, 10004, does not
exist in the primary key table CUSTOMER.

3) DELETE FROM CUSTOMER
WHERE CUSTID = 1002

The fate of the above statement depends upon whether the option set during creation of the
referential integrity is RESTRICT, which is the default, or SET NULL or CASCADE.

If the option is the default RESTRICT, the statement will not get executed, as child
records exist for this CUSTID in the ORDERTAB table. Thus the following error will be
shown :

-532 THE RELATIONSHIP FK-COST RESTRICTS THE DELETION OF ROW WITH RID X'00000103'

If the referential integrity has been created with SET NULL, the above DELETE statement
will delete the record of customer 1002 from the CUSTOMER table and set the value of the
CUSTID field of the dependent ORDERTAB records to NULL.

If the CASCADE option was given with the referential integrity constraint definition, the
statement will delete the record not only from the CUSTOMER table but also from the child
table ORDERTAB. All the records of the ORDERTAB table having the CUSTID value 1002 will be
deleted as a result.
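
The delete rule is chosen when the foreign key is defined. The following is only a sketch
of such a definition: the constraint name FK_CUST is hypothetical, and it is assumed that
CUSTID is the primary key of CUSTOMER and that the CUSTID column of ORDERTAB allows NULL.

```sql
-- ON DELETE decides what happens to ORDERTAB rows when their
-- parent CUSTOMER row is deleted. One of the following is given:
--   ON DELETE RESTRICT  : the delete is rejected while child rows exist
--   ON DELETE SET NULL  : CUSTID of the child rows is set to NULL
--   ON DELETE CASCADE   : the child rows are deleted as well
ALTER TABLE ORDERTAB
  ADD CONSTRAINT FK_CUST
  FOREIGN KEY (CUSTID)
  REFERENCES CUSTOMER
  ON DELETE SET NULL
```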

SYSTEM CATALOG

• Querying and updating the catalog
• Synonyms

SYSTEM CATALOG (QUERYING)

According to the Data Dictionary Rule of Codd, every RDBMS must itself store and manage
the information which is vital for its smooth functioning. Adhering to this rule, DB2
creates a Data Dictionary and a Catalog consisting of multiple system tables. The Data
Dictionary is used by DB2 for its internal processing, while the Catalog is accessible to
the database users.

The Catalog contains about 30 tables which are critical to DB2 functioning. They are used
by DB2 to determine access paths, check authorization and validate the different requests
made during BIND. Users can retrieve data from these system tables by running SELECT
statements on them.

Example

When a user wants to know the different tables which have been created so far, the user
can access SYSIBM.SYSTABLES with the following SELECT statement :

SELECT * FROM SYSIBM.SYSTABLES
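
Since the statement above returns every catalog entry, it is usually narrowed down. The
following is a sketch only: it assumes that the tables were created under the user's own
authid, so that the CREATOR column matches the USER special register, and it picks a few
SYSIBM.SYSTABLES columns for illustration.

```sql
-- TYPE = 'T' keeps tables and leaves out views and aliases.
SELECT NAME, CREATOR, DBNAME, TSNAME
FROM SYSIBM.SYSTABLES
WHERE CREATOR = USER
AND TYPE = 'T'
ORDER BY NAME
```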

All catalog tables are prefixed by SYSIBM, which is the authid of these tables. Even
though SELECT statements can be run on these tables, one cannot normally update them. It
is the RUNSTATS utility, discussed in a later section, through which these tables are
updated. Only certain catalog columns, which are used for performance tuning, can be
updated. DB2 updates all these tables automatically for the operations taking place on the
database. For e.g. when a VIEW is created, DB2 updates the SYSIBM.SYSVIEWS table.

Dropping or altering an object deletes or updates the specific entry from the system table/s
related to that object.

SYSTEM CATALOG (UPDATING)

Updating of the system catalog is done by the RUNSTATS utility. This utility performs the
update after gathering statistics about the database. The optimizer chooses the proper
access path for the execution of SQL statements based on the statistics in the catalog.
Thus the system catalog is utilized in multiple ways and has its own significance.

RUNSTATS

DB2 tablespaces and their associated indexes are accessed by this utility to gather the
statistical information which is required for updating the catalog. This information is
also significant for database administration, as it is used for deciding when to
reorganize tables and indexes.

The system catalog tables updated by the RUNSTATS utility are

SYSCOLUMNS
SYSINDEXES
SYSTABLES
SYSTABLESPACE
SYSTABLEPART
SYSINDEXPART

For the optimizer to use the indexes for faster execution of queries, the system catalog
needs to be updated. For this purpose the RUNSTATS utility should be run at least once on
all those tablespaces which have indexes, as soon as data has been inserted into the
tables of these tablespaces. RUNSTATS can be run with several control statements,
individually for every tablespace and the indexes within it.

The syntax of the RUNSTATS statement is as follows :

RUNSTATS TABLESPACE dbname.tablespace INDEX (ALL)

The above statement will gather the statistics of the specified tablespace in the
specified database. With INDEX (ALL), statistical information is gathered for each and
every index of the tablespace.

OPTIMIZER

The optimizer’s major function is to decide the access path for query execution. After the
indexes are created, it is the optimizer which decides whether an index should be accessed
or not. The optimizer, as the name suggests, tries to find the best way of executing a
query with optimum utilization of the available resources.

We can consider the following example to understand the functioning of the optimizer. If a
library contains only three books on a particular subject, one may not go through the list
of books for selecting one book; but if books by multiple authors had to be found, using
the list would be preferable. In the same way, while executing a query, if the table
contains very few records the optimizer may not access the index; it would instead perform
a tablespace scan directly.

For decision making the optimizer needs to know the nature of the table, i.e. its data and
the indexes available on the table. It obtains this information from different system
catalog tables like SYSIBM.SYSINDEXES. This table has information in its clustering
columns which indicates whether the data is clustered. It also carries the number of
distinct values, in the fields FIRSTKEYCARD and FULLKEYCARD. But the above information is
added to the SYSINDEXES table only after the RUNSTATS utility is run.
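
The statistics mentioned above can be inspected with an ordinary SELECT on the catalog. The
following is a sketch only: it assumes an index exists on the sample CUSTOMER table, and it
uses the column names given in the text. On later DB2 releases the cardinality columns may
be named FIRSTKEYCARDF and FULLKEYCARDF instead.

```sql
-- One row is returned per index defined on CUSTOMER.
-- CLUSTERING / CLUSTERED describe the clustering of the data;
-- the two KEYCARD columns hold the distinct-value counts
-- that RUNSTATS fills in.
SELECT NAME, TBNAME, CLUSTERING, CLUSTERED,
       FIRSTKEYCARD, FULLKEYCARD
FROM SYSIBM.SYSINDEXES
WHERE TBNAME = 'CUSTOMER'
```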

Apart from this, it is also necessary to design the SQL queries in such a manner that the
optimizer can choose the best path. It is not only the final result of the query that is
significant but also its resource utilization.

One can find out the access method adopted by the optimizer to execute a query with the
EXPLAIN command, which is discussed at length in later chapters.

SYNONYM

A synonym is an alternate name for a table or view. Like a view, it is a logical object
whose definition is stored in the DB2 catalog (system tables). These system tables are
accessed whenever the synonym is used, and the operations on the synonym are carried out
on the underlying base table.

Synonyms can be created as follows:

CREATE SYNONYM SYNONYM_NAME
FOR [TABLE_NAME|VIEW_NAME]

e.g.

1> The CREATE statement given below can be used for creating a synonym CUST on the
CUSTOMER table :

CREATE SYNONYM CUST
FOR CUSTOMER

After the synonym is created, all operations on the CUSTOMER table can be done using the
name CUST.

e.g.
ALTER TABLE CUST
ADD RATING DECIMAL(5) NOT NULL WITH DEFAULT

OR

SELECT * FROM CUST

2> This example creates a synonym MY_VIEW on the view CUST_VW :

CREATE SYNONYM MY_VIEW
FOR CUST_VW

DB2 ERROR CODES

SQLCODE  Keyword  Meaning
-818     Plan     Timestamps in load module and plan do not agree; program was
                  probably re-precompiled without being re-bound
-901     System   Mysterious system error; permits running more SQL statements
-904     System   Unavailable resource; if the resource name is a table, view,
                  etc., this was probably caused by contention and re-trying
                  the operation may work; if the resource name is a
                  44-character VSAM file name, the file has probably been
                  archived or deleted
-911     System   Deadlock or timeout, updates rolled back
-913     System   Deadlock or timeout, updates not rolled back
