
DB2

RELATIONAL DATABASE MANAGEMENT SYSTEM

1
COURSE OBJECTIVES

After completing this course you should be able to :

• List and describe the major functions, components and data
management techniques of DB2.
• Describe DB2's SQL and its efficient use with 3GL languages
like COBOL.
• Use DB2 associated facilities like SPUFI and DCLGEN.
• Use DB2 utilities like LOAD and RUNSTATS.

2
COURSE PLAN

• Introduction to Database Management Systems.
• DB2 Overview.
• DB2 Data Objects.
• SQL - Data Definition Language.
• SQL - Data Manipulation Language.
• SQL - Data Control Language.
• DB2 - Program Preparation.
• DB2 - Application Programming.
• DB2 Features and Utilities.

3
SESSION 1

INTRODUCTION TO
DBMS

4
INTRODUCTION TO DBMS

SESSION CONTENTS :

• Data and Information.
• What is a Database ?
• Database Management Systems.
• Hierarchical DBMS.
• Network DBMS.
• Relational DBMS.

5
INFORMATION:

Information is refined data: data that has been put into a
meaningful and useful context and communicated to a recipient
who uses it to make decisions.

6
DATABASE:

A database is a collection of interrelated data stored together
with controlled redundancy to serve one or more applications
in an optimal fashion. The data is stored in such a way that it
is independent of the programs or people using it.

NEED FOR A DATABASE:

• Easy retrieval and update of data.
• Avoid inconsistency in data.
• Avoid multiple copies of the same data.

7
DBMS CHARACTERISTICS

• Centralised Controls.
• Inconsistency elimination.
• Data can be shared.
• Standards.
• Controlled redundancy.
• Authorised access.
• Data integrity.
• Data independence.
• Performance and efficiency.

8
DBMS MODEL EVOLUTION

•File Management System

•Hierarchical Database Management System

•Network Database Management System

•Relational Database Management System

9
FILE MANAGEMENT SYSTEM

It was the first method used to store data in computers. The
data was stored and retrieved sequentially from the disk.

LIMITATIONS :
• Relationships between data items unknown.
• Code is dependent on data.
• Slow search process.
• Sorting records is difficult.
• Error prone.
• Interpretation of fields by accessing programs.
• Data inconsistency.
• Data redundancy.
• Changes to data structure cumbersome.

10
HIERARCHICAL DBMS

REPRESENTATION:

• Tree structure originating from a root.
• Record types at different levels.
• Parent/Child relationship.
• Successor/Predecessor relationship.
• Node/Leaf.

FEATURES

• Unique parent for each child.
• 1:M relationship between parent and child.
• Every node except the root is accessed through its parent,
starting from the root.

11
HIERARCHICAL MODEL EXAMPLE

Department
    Department manager
    Project
        Project manager
        Project employee
    Non-project employee

12
DISADVANTAGES OF HIERARCHICAL DBMS

• Many to many relationships not possible.
• Cross relationships not possible.
• Structural changes - adding or deleting a level is cumbersome.
• Limited flexibility in accessing lower nodes.

13
NETWORK DBMS

The Network DBMS was proposed by the CODASYL (Conference on
Data Systems Languages) Data Base Task Group in 1971.

REPRESENTATION

• Using Sets and Links.

FEATURES

• Many to many relationships possible.
• Same data at multiple levels.

14
NETWORK MODEL EXAMPLE

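Record types : Department, Dept-mgr, Project, Proj-mgr, Employee,
Assignment

Sets (links between record types) :
• Managed by    : Department - Dept-mgr
• Has project   : Department - Project
• Has employee  : Department - Employee
• Manages       : Proj-mgr - Project
• Project ass.  : Project - Assignment
• Employee ass. : Employee - Assignment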

15
NETWORK MODEL LIMITATIONS

• All interrelationships are difficult to map.
• Code traces out different paths.
• Complexity with a high number of operators.
• Reorganisation is complex and has a wide impact on the system.
• Requires expertise on the part of the user.

16
RELATIONAL DBMS

• Conceptualised by Dr. E.F. Codd at IBM in 1969.
• Built on a sound mathematical foundation.

REPRESENTATION :

• Relation - Table .
• Tuple - Record/Row .
• Attributes - Field/Columns .
• Domain - Set of valid values of attributes.
• Degree - Number of columns in a table.
• Cardinality - Number of rows in a table.

17
RELATIONAL MODEL EXAMPLE

RELATION : the columns are the ATTRIBUTES, the rows are the TUPLES

EMP#   NAME     AGE   DEPT#
2000   AMIT     23    101
3000   SANJAY   23    102
4000   ARVIND   23    101
5000   SATISH   24    105

18
KEYS

PRIMARY KEY
A column or set of columns that can uniquely identify every row
in a table is termed a CANDIDATE key. Every candidate key
satisfies these two properties :
• Uniqueness - At no time do two rows have the same value for
the column or set of columns.
• Minimality - None of the columns can be removed from the key
without violating the uniqueness property.
For a given table, one candidate key is designated as the
PRIMARY key and all the others are designated as ALTERNATE keys.

19
KEYS

FOREIGN KEY

It is possible for one table to contain a column, or a set of
columns, whose values are drawn from the same domains as the
columns that form the primary key of some other table. This
column or set of columns is called the FOREIGN KEY.

20
FOREIGN KEY

EXAMPLE:

EMPLOYEE
EMP#   NAME     DEPT#
2000   AMIT     101
3000   SANJAY   102
4000   ARVIND   101
5000   SATISH   105

DEPARTMENT
DEPT#   NAME
101     MARKETING
102     PERSONNEL
103     ADMIN
104     TRAVEL
105     FINANCE

The department number column of the employee table can draw its
values from the department number column of the department table.
DEPT# is the primary key of the department table and a foreign
key of the employee table.
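
The two tables above could be declared roughly as follows. This is a minimal sketch, not the course's own DDL: the column types, lengths and the index name XDEPT are assumptions made for illustration.

```sql
-- Parent table: DEPT# is the primary key.
CREATE TABLE DEPARTMENT
  ( DEPT#  SMALLINT     NOT NULL,
    NAME   VARCHAR(20)  NOT NULL,
    PRIMARY KEY (DEPT#) );

-- DB2 for MVS expects a unique index on the primary key columns
-- before the table definition is considered complete.
CREATE UNIQUE INDEX XDEPT ON DEPARTMENT (DEPT#);

-- Dependent table: DEPT# is a foreign key drawing its values
-- from DEPARTMENT.DEPT#. It is left nullable on purpose.
CREATE TABLE EMPLOYEE
  ( EMP#   SMALLINT     NOT NULL,
    NAME   VARCHAR(20)  NOT NULL,
    DEPT#  SMALLINT,
    PRIMARY KEY (EMP#),
    FOREIGN KEY (DEPT#) REFERENCES DEPARTMENT
      ON DELETE RESTRICT );
```

With these definitions, an EMPLOYEE row with DEPT# 106 would be rejected, because no DEPARTMENT row has that key.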

21
RELATIONAL MODEL FEATURES

• Reorganisation of one table does not affect others-Dynamic


connections.
• No pre-defined connections.
• Non procedural data manipulation.
• Abandons parent child relationship.
• Data arranged in logical mathematical datasets.
• Based on mathematical concepts of relational sets.
• Each row identified by unique set of attributes.
• Same column name used to relate different tables.

22
RELATIONAL MODEL ADVANTAGES

• Flexible ,Simple and easy to use.


• 1:1 ,1:M and M:N relationships easily represented.
• Simple representation with ternary and higher order
relationships easy to represented.
• Structural changes simple to make .
• Data integrity maintained.
• Code does not trace path of data.
• Flexible querying.

23
INTEGRITY CONSTRAINTS

• ENTITY INTEGRITY

The entity integrity rule states that no column that is part of
a primary key can have a null value; otherwise it would not be
possible to identify every row uniquely. A primary key with a
NULL value is a contradiction in terms: in effect, it would be
saying that there is some entity that has no 'identity'.

24
INTEGRITY CONSTRAINTS

• REFERENTIAL INTEGRITY

The referential integrity rule states that every foreign key
must either match a primary key value in its associated table
or be wholly NULL.

25
SESSION 2

DB2 OVERVIEW
AND
DATA OBJECTS

26
DB2 OVERVIEW

SESSION CONTENTS :

• DB2 History.
• DB2 and MVS Relationship .
• DB2 objects
> Storage Groups .
> Tablespaces.
> Tables .
> Indexes .
> Bufferpools .
• DB2 Data Types .

27
DB2 OVERVIEW

This course is all about DB2, IBM's flagship relational database
management system.

DB2 is available on multiple platforms, but this course
concentrates on DB2 for MVS/ESA.

28
DB2 – SOME HISTORY
♦ The foundations of relational database technology were laid
by Dr. E.F. Codd, whose paper 'A Relational Model of Data for
Large Shared Data Banks' set out the basic principles of RDBMS.

♦ IBM built a research prototype called System R, which
resulted in two commercial releases :
> SQL/DS for VM in 1982 and
> DB2 for MVS in 1983.

♦ DB2 is available on other platforms too :
DB2/2 for OS/2
DB2/6000 for AIX
DB2 for Windows NT.

29
DB2 ENVIRONMENT

[Figure: IMS, TSO and CICS applications, online and batch, run
under MVS and access DB2 databases, IMS databases and VSAM/DAM
files held on DASD.]

30
DB2 ARCHITECTURE

[Figure: SQL statements from IMS/DB/DC, CICS and TSO, together
with the DB2 utilities, are handled by three groups of services:
Locking Services (IRLM), System Services and Data Base Services
(relational data system, data manager, buffer manager and other
components). These operate on the DB2 databases and the DB2 log.]

31
DB2 ARCHITECTURE

DB2 has three major components :

• IRLM - IMS Resource Lock Manager
IRLM provides the concurrency control mechanism, called
locking, which is required to isolate different users from each
other and to maintain integrity.

• System Services
System services control the overall DB2 execution environment.
This includes managing the log datasets, gathering statistics
for performance monitoring, and handling system startup and
shutdown.

32
DB2 ARCHITECTURE

• Data Base Services

Data base services support the functions of the SQL language,
i.e. definition, access control, retrieval and update of user
and system data. This component has several subcomponents, among
them the relational data system (RDS), the data manager and the
buffer manager.

33
DB2 DATA OBJECTS

♦ DATABASE
♦ STORAGE GROUP
♦ TABLESPACES
♦ TABLES
♦ INDEXES
♦ BUFFERPOOLS
♦ VIEWS

34
DB2 DATA OBJECTS

[Figure: storage groups provide the DASD space for tablespaces
and indexes; each tablespace holds one or more tables.]

35
STORAGE GROUPS

• A Storage Group is a named collection of direct access
volumes, all of the same device type. A volume can appear in
more than one storage group and a storage group can contain
more than one volume.

• Each tablespace and index is associated with a Storage Group.

36
DATABASE

• A Database is a logical grouping of related tablespaces,
tables and indexes for administrative purposes.
• There is no restriction on accessing data from more than one
table in more than one database.
• The only restriction is that an index must be placed in the
same database as the tablespace containing its table.
• There is no one to one mapping between databases and storage
groups.

37
DB2 TABLESPACES

♦ A tablespace can be considered as a logical address space on
secondary storage that is used to hold one or more stored tables.

♦ One tablespace can be up to approximately 64 billion bytes
(64 GB), and there is effectively no limit to the number of
tablespaces in a database.

♦ DB2 provides three types of tablespaces :
Simple
Segmented
Partitioned

38
PAGES

[Figure: a space is made up of pages; each page consists of a
header, data records and a footer.]

Space : A 'space' is a collection of one or more VSAM linear
datasets that are logically concatenated to form a linear
addressing range.
Pages : The datasets in a space contain pages. The pages of a
tablespace can be either 4KB or 32KB. All pages use a control
interval (CI) of 4KB; therefore, when a 32KB page is needed,
8 CIs are assigned.

39
PAGES

A page is the unit of transfer between secondary and primary
storage. Even to access one byte of data in a page, the complete
page is brought into main memory.

A page can hold up to 127 data records. With the compression
techniques introduced in DB2 V3, up to 255 data records can be
placed in a page. Each record is held completely in a page, that
is, records do not span pages.

40
SIMPLE TABLESPACES

TABLESPACE

PAGE 1            PAGE 2
Table1 - Row1     Table1 - Row2
Table1 - Row3     Table1 - Row4
Table2 - Row2     Table2 - Row1
Table2 - Row3     Table2 - Row4

41
SIMPLE TABLESPACE

A simple tablespace can contain more than one stored table, and
rows from different tables can be placed in the same page. If a
table in this tablespace is dropped, its rows are not deleted;
the space they occupy does not become free until the tablespace
is reorganised.

• Advantage : faster access to data from related tables.

• Disadvantage : if a query requires data from only one table,
DB2 still has to interrogate all pages, resulting in additional
I/O activity.

42
SEGMENTED TABLESPACE

SEGMENT1   SEGMENT2   SEGMENT3
TABLE1     TABLE2     TABLE3

43
SEGMENTED TABLESPACE

A segmented tablespace is intended to store more than one table.
The space within the tablespace is divided into segments, where
a segment consists of a logically contiguous set of 'n' pages
(n is a multiple of 4 between 4 and 64) and n is the same for
all segments in the tablespace.

A segmented tablespace can have between 1 and 32 VSAM linear
datasets. The maximum size of a linear dataset in a segmented
tablespace is 2 GB, so the maximum size of a segmented
tablespace is 64 GB.

44
SEGMENTED TABLESPACE

• Advantages :
- To search all rows of a table, it is not necessary to search
the whole tablespace, but only the segments that contain the
table.
- If a table in a segmented tablespace is dropped, its segments
become immediately reusable.

45
PARTITIONED TABLESPACE

Part 1 Part 2 Part 3 Part 4

46
PARTITIONED TABLESPACES

Partitioned tablespaces are intended for tables that are so
large that they are operationally difficult to deal with as a
single unit. The table is partitioned in accordance with value
ranges of the partitioning column.

• Advantages :
- improved data availability
- each part can be placed on a different DASD volume, thereby
spreading the tablespace I/O load.
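
A partitioned tablespace and its partitioning index might be defined along these lines. This is only a sketch using the index-controlled partitioning of the DB2 releases this course covers (later releases also allow the ranges to be given on CREATE TABLE). The names TRGPART, ORDERS, XORDERS and ORDER_YEAR are hypothetical, and the database and storage group are assumed to be named TRG1T01.

```sql
-- Tablespace split into four parts; each part is a separate dataset.
CREATE TABLESPACE TRGPART
  IN TRG1T01
  USING STOGROUP TRG1T01
  NUMPARTS 4
  LOCKSIZE ANY
  BUFFERPOOL BP0;

-- A partitioned tablespace holds exactly one table.
CREATE TABLE ORDERS
  ( ORDER#      INTEGER   NOT NULL,
    ORDER_YEAR  SMALLINT  NOT NULL,
    AMOUNT      DECIMAL(9,2) )
  IN TRG1T01.TRGPART;

-- The partitioning index names the highest key value of each part.
CREATE INDEX XORDERS
  ON ORDERS (ORDER_YEAR)
  CLUSTER
  ( PART 1 VALUES (1996),
    PART 2 VALUES (1997),
    PART 3 VALUES (1998),
    PART 4 VALUES (1999) );
```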

47
INDEX

Index                  TABLE
Key    Position        Position   Key
101    1               1          101
102    3               2          104
103    4               3          102
104    2               4          103

• An index is an ordered set of pointers to the data rows of a
table. The contents of the index are sorted on one or more
specified columns. Indexes are maintained by DB2 once they are
created.
• Any number of indexes can be defined on a particular table.

48
INDEXES

• Disadvantages of a large number of indexes :

- DB2 has to update both the table and its indexes, which leads
to slower processing of update requests.
- More storage space is required.

49
INDEXES – Terminology

♦ Indexing Keys - the columns of the table on which the index
is defined.

♦ Unique Index - ensures that the values of the indexed
column(s) are not duplicated.

♦ Primary Index - is defined on the primary key of the table
and is always unique.

50
INDEXES : Terminology

Partitioning Index - The column(s) on which the partitioning is
done is called the partitioning key. The partitioning index must
be defined on the partitioning key and specifies the
partitioning values.

Clustering Index - determines the order in which the records of
the base table are physically stored. Each table can have only
one clustering index. Adding a clustering index after loading
the data does not reorganize the data.

51
BUFFERPOOLS

The bufferpools in DB2 consist of 4KB slots in memory. After
being read from DASD, the data and index pages stay in the slots
until the buffer manager decides to use the slots for other
pages. The idea is that the data gets a chance to be reused,
thus minimizing I/O.
DB2 maintains logical chains of pages to be written (because
they have been updated) and waits as long as reasonable before
writing them. The page writes are performed asynchronously and
therefore do not affect response times.
There are 50 4KB bufferpools - BP0 thru BP49.
There are 10 32KB bufferpools - BP32K and BP32K1 thru BP32K9.

52
VIEWS

A VIEW is another way to represent data, a different way to
look at it. Views are derived from base tables or from other
views. Unlike base tables, which represent physically stored
data, views are virtual tables that have no associated physical
storage.

Advantages :
• A view may be used as part of a security mechanism which
allows a user to access only a portion of the table.
• Complicated queries can be stored as a view.
• Views can also minimize the program modification that may be
required when a base table changes.

53
VIEWS

Example :

Base table
Cust_no   Cust_name   Cust_dob   Cust_branch   Cust_rating
111111    MACK        19501110   12            4
222222    JACK        19770921   156           6
333333    PACK        19800202   8             3

View
Cust_no   Cust_name   Cust_branch
111111    MACK        12
222222    JACK        156
333333    PACK        8
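
The view in this example could be defined as sketched below. The base table name CUSTOMER and the view name CUST_BRANCH_V are assumptions, since the slide does not name them.

```sql
-- Expose only three of the five base table columns.
CREATE VIEW CUST_BRANCH_V
  (CUST_NO, CUST_NAME, CUST_BRANCH)
  AS SELECT CUST_NO, CUST_NAME, CUST_BRANCH
     FROM CUSTOMER;

-- The view is queried like a table; Cust_dob and Cust_rating
-- cannot be reached through it.
SELECT CUST_NO, CUST_NAME
  FROM CUST_BRANCH_V
  WHERE CUST_BRANCH = 12;
```

On the sample rows above, the query returns the single row 111111 MACK.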

54
DB2 - directory & Catalog

Information about the DB2 system is maintained in the DB2
directory and catalog.
Directory : kept solely for DB2's internal use.
Catalog : contains descriptive information about DB2 objects
such as tables, indexes and plans. It may be accessed by DB2 and
its users, and is used by DB2 to determine access paths and
manage system resources. DB2's catalog contains approximately
30 tables which are central to DB2's functioning.
Example : SYSIBM.SYSTABLES
          SYSIBM.SYSCOLUMNS
          SYSIBM.SYSINDEXES
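
Because the catalog consists of ordinary tables, it can be queried with SELECT. The queries below are an illustrative sketch: they assume a user table named EMP exists and that the user has SELECT authority on the catalog tables.

```sql
-- Tables created under the current authorization ID,
-- with the database and tablespace that hold them.
SELECT NAME, CREATOR, DBNAME, TSNAME
  FROM SYSIBM.SYSTABLES
  WHERE TYPE = 'T'
    AND CREATOR = USER;

-- Column definitions of table EMP, in column order.
SELECT NAME, COLNO, COLTYPE, LENGTH, NULLS
  FROM SYSIBM.SYSCOLUMNS
  WHERE TBNAME = 'EMP'
  ORDER BY COLNO;

-- Indexes defined on EMP and whether each is unique / clustering.
SELECT NAME, UNIQUERULE, CLUSTERING
  FROM SYSIBM.SYSINDEXES
  WHERE TBNAME = 'EMP';
```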

55
DATA TYPES: STRINGS

STRING
    CHARACTER
        Fixed length
        Varying length
    GRAPHIC
        Fixed length
        Varying length

56
DATA TYPES : STRINGS

Strings can be divided into character and graphic strings.

CHARACTER:
Character data type fields are used to store alphanumeric data
items.
• Fixed length character string : CHAR(x) or CHARACTER(x)
A fixed length character string (CHAR) must have its length
specified. Every value in the column has this length; a shorter
value is padded with blanks at the end. The length x must be
greater than 0 and less than 255, and the column occupies x
bytes.

57
DATA TYPES : STRINGS

• Varying length character string : VARCHAR(x) and LONG VARCHAR
VARCHAR is the variable length character string, with a maximum
length x greater than 0 and less than the page size. If the
maximum length is greater than 254, it is treated as a LONG
VARCHAR. A value occupies up to x+2 bytes (a 2-byte length field
plus the data).

GRAPHIC

Graphic strings are similar to character strings. The
difference is that instead of occupying one byte per character,
they use two bytes to represent a character.

58
DATA TYPES : DATETIME

DATETIME

DATE TIMESTAMP TIME

59
DATA TYPES : DATETIME

DATE
A date is represented as a sequence of eight unsigned packed
decimal digits occupying four bytes; permitted values are legal
dates in the range January 1st, 1 A.D. to December 31st,
9999 A.D.
Internal format : YYYYMMDD.

TIME
A time is represented as a sequence of six unsigned packed
decimal digits occupying three bytes; permitted values are legal
times in the range midnight to midnight, i.e. 000000 to 240000.
Internal format : HHMMSS.

60
DATA TYPES : DATETIME

TIMESTAMP

A timestamp is represented as a sequence of 20 unsigned packed
decimal digits occupying ten bytes; permitted values are legal
timestamps in the range 00010101000000000000 to
99991231240000000000.

Internal format : YYYYMMDDHHMMSSnnnnnn.

61
DATA TYPES : NUMERIC

NUMERIC
    Binary integer
        Small
        Large
    Decimal
    Floating point
        Small
        Large

62
DATA TYPES : NUMERIC

INTEGER
Integer is used to store non-decimal numeric
information - 4-byte binary integer , 31 bits for number and
32nd bit to store sign of the number.
Range : -2,147,483,648 to 2,147,483,647.

SMALLINT
Two-byte binary,15 bits for number and 16th bit for
sign of the number.
Range : -32,768 to 32,767.

63
DATA TYPES : NUMERIC

DECIMAL(x,y)

A packed decimal number with a precision of 'x' (the total
number of digits, ranging from 1 to 31) and a scale of 'y' (the
number of digits after the decimal point, ranging from 0 to the
precision value).

64
DATA TYPES : NUMERIC

FLOAT(p)

A floating point number n, represented by a binary fraction f
of p binary digits precision (-1 < f < 1, 0 < p < 54) and a
binary integer exponent e (-65 < e < 64) such that
n = f * (16 ** e).

If p < 22 the number n is single precision and occupies 4
bytes; otherwise it is double precision and occupies 8 bytes.
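
The data types described in this section can be seen side by side in a single table definition. The sketch below is illustrative only; the table name TYPE_DEMO and all column names are hypothetical.

```sql
CREATE TABLE TYPE_DEMO
  ( CODE      CHAR(5)       NOT NULL,  -- fixed length, blank padded, 5 bytes
    NOTES     VARCHAR(100),            -- varying length, 2-byte length prefix
    QTY       SMALLINT,                -- 2-byte binary, -32,768 to 32,767
    TOTAL     INTEGER,                 -- 4-byte binary
    PRICE     DECIMAL(9,2),            -- 9 digits, 2 after the decimal point
    RATIO     FLOAT(21),               -- p < 22 : single precision, 4 bytes
    FACTOR    FLOAT(53),               -- p >= 22 : double precision, 8 bytes
    HIRED     DATE,                    -- 4 bytes
    START_AT  TIME,                    -- 3 bytes
    CHANGED   TIMESTAMP );             -- 10 bytes
```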

65
SQL

STRUCTURED QUERY
LANGUAGE

66
STRUCTURED QUERY LANGUAGE

• SQL - DATA DEFINITION LANGUAGE.

• SQL - DATA MANIPULATION LANGUAGE .

• SQL - DATA CONTROL LANGUAGE .

67
SQL – Structured Query language

SQL
    DDL
    DML
    DCL

68
STRUCTURED QUERY LANGUAGE

Data Definition Language (DDL)
statements are used to create and maintain DB2 objects.

Data Manipulation Language (DML)
statements are used to access and modify the data held in
tables.

Data Control Language (DCL)
statements are control statements that govern data security.

69
SQL – DDL

DATA DEFINITION
LANGUAGE

70
DDL Operations

♦ CREATE - Defines a new object
♦ ALTER - Modifies an object
♦ DROP - Deletes a defined object

♦ DDL statements can be entered interactively or embedded in
application programs.

71
DDL – OBJECTS vs OPERATIONS

                 CREATE   ALTER   DROP
Storage Group    Yes      Yes     Yes
Database         Yes      -       Yes
Tablespace       Yes      Yes     Yes
Table            Yes      Yes     Yes
Index            Yes      Yes     Yes
Synonym          Yes      -       Yes
View             Yes      -       Yes

72
DDL – CREATE Storage Group

Syntax :

CREATE STOGROUP stogroup-name
    VOLUMES (vol1, vol2, .......)
    VCAT catalog-name
    [ PASSWORD password ]

Example :

CREATE STOGROUP TRG1T01
    VOLUMES (DBPK01, DBPK02)
    VCAT DB220TRG

CREATE STOGROUP defines a set of DASD volumes controlled by a
VSAM catalog.

73
DDL – ALTER Storage Group

Syntax :
ALTER STOGROUP stogroup-name
ADD VOLUMES (vol1, vol2, ....... )
REMOVE VOLUMES (vol1, vol2, ....... )
[PASSWORD password ]
Example :
ALTER STOGROUP TRG1T01
ADD VOLUMES (DBPK03)
REMOVE VOLUMES (DBPK01)

The ALTER STOGROUP statement can be used to add or remove DASD
volumes associated with the Storage Group.

74
DDL – CREATE Database
Syntax :
CREATE DATABASE database-name
[STOGROUP stogroup-name ]
[BUFFERPOOL bufferpool-name ]

Example :
CREATE DATABASE TRG1T01
    STOGROUP TRG1T01
    BUFFERPOOL BP0

The CREATE DATABASE statement defines a database that uses
stogroup-name as its default storage group, which supplies the
DASD space required for tablespaces and indexes within the
database.

75
DDL – CREATE TABLESPACE

Syntax :

CREATE TABLESPACE tablespace-name
    IN database-name
    USING STOGROUP stogroup-name
    PRIQTY qty
    SECQTY qty
    ERASE YES / NO
    LOCKSIZE ANY / PAGE / TABLESPACE / TABLE
    BUFFERPOOL bufferpool-name
    CLOSE YES / NO
    FREEPAGE amount
    PCTFREE amount

76
DDL -Tablespace Parameters

PRIQTY - amount of physical storage allocated when the
tablespace is created.

SECQTY - secondary allocation of space, used as the amount of
data in the tablespace grows.

ERASE - indicates whether the DB2-defined datasets are to be
erased when the tablespace is dropped.

LOCKSIZE - indicates the type of locking (Page / Table /
Tablespace / ANY, which leaves the decision to DB2).

77
DDL – Create Tablespace Parameters

BUFFERPOOL - bufferpool to be associated with the tablespace.

CLOSE - indicates whether the datasets associated with the
tablespace should be closed when there are no current users of
the tablespace.

FREEPAGE - number of pages after which an empty page is left
available.

PCTFREE - percentage of each page to be left free when the
tablespace is loaded or reorganised.
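
Putting the parameters together, a segmented tablespace might be created as sketched below. The tablespace name TRGTS01 and all quantities are illustrative assumptions; the database and storage group are taken to be the TRG1T01 objects created earlier, and SEGSIZE is the clause that makes the tablespace segmented.

```sql
CREATE TABLESPACE TRGTS01
  IN TRG1T01
  USING STOGROUP TRG1T01
    PRIQTY 100           -- primary allocation, in kilobytes
    SECQTY 20            -- each secondary extent, in kilobytes
    ERASE NO             -- do not overwrite the datasets when dropped
  SEGSIZE 32             -- 32 pages per segment (multiple of 4, 4 to 64)
  LOCKSIZE PAGE
  BUFFERPOOL BP0
  CLOSE NO
  FREEPAGE 10            -- leave an empty page after every 10 pages
  PCTFREE 5;             -- keep 5 percent of each page free
```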

78
DDL – CREATE TABLE

CREATE TABLE table-name
    ( col-name1 col-type1 [ NOT NULL / NULL /
                            NOT NULL WITH DEFAULT ]
      [, col-name2 col-type2 ............]
      [, PRIMARY KEY (col-name1, col-name2 ...) ]
      [, FOREIGN KEY [constraint-name]
             (col-name1, col-name2 ...)
             REFERENCES base-table
             [ ON DELETE RESTRICT / CASCADE / SET NULL ] ] )
    [ IN database-name.tablespace-name /
      IN DATABASE database-name ]

79
DDL – CREATE TABLE

Example :

CREATE TABLE EMP
    ( EMP#   CHAR(5)     NOT NULL,
      ENAME  VARCHAR(2)  NOT NULL,
      SAL    DECIMAL     NOT NULL WITH DEFAULT,
      PRIMARY KEY (EMP#) )

80
DDL – CREATE TABLE Like Existing Table

Syntax :

CREATE TABLE table-name
    LIKE existing-table-name

This format allows the user to create a table table-name with
the same column descriptions as an existing table
existing-table-name. The table table-name does not inherit any
primary or foreign key definitions from existing-table-name.

81
DDL – ALTER TABLE

Syntax :

ALTER TABLE table-name
    ADD column-definition
    PRIMARY KEY primary-key-definition
    DROP PRIMARY KEY
    DROP FOREIGN KEY constraint-name

This statement is used to alter the columns, keys and other
specifications of a previously defined table.
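
A few ALTER TABLE statements in the style of the syntax above are sketched here. They assume the EMP table from the CREATE TABLE example, a DEPT table whose primary key is a SMALLINT column DEPT#, and a hypothetical constraint name FKDEPT. Later DB2 releases write the key clauses as ADD PRIMARY KEY and ADD FOREIGN KEY, so the exact keywords depend on the version in use.

```sql
-- Add a nullable column; existing rows get a null value in it.
ALTER TABLE EMP
  ADD DEPT# SMALLINT;

-- Relate EMP to DEPT through the new column.
ALTER TABLE EMP
  FOREIGN KEY FKDEPT (DEPT#) REFERENCES DEPT
    ON DELETE SET NULL;

-- Remove the relationship again by naming the constraint.
ALTER TABLE EMP
  DROP FOREIGN KEY FKDEPT;
```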

82
DDL – CREATE INDEX

Syntax :

CREATE [ UNIQUE ] INDEX index-name
    ON table-name ( col-name [ ASC / DESC ], ...)
    [ USING STOGROUP stogroup-name
      PRIQTY qty
      SECQTY qty
      ERASE YES / NO ]
    [ CLUSTER ]
    [ BUFFERPOOL bufferpool-name ]
    [ CLOSE YES / NO ]
    [ PCTFREE amount ]
    [ FREEPAGE amount ]

The above statement is used to define an index on a previously
defined table. The constraints implied by the definition (for
example, uniqueness of the existing key values for a UNIQUE
index) are checked when the CREATE INDEX statement is executed.
If any of the constraints is not met, the index is not created.

83
DDL - CREATE INDEX

EXAMPLE :

CREATE UNIQUE INDEX XS
    ON SUPPLIER (S#)
    USING STOGROUP TRG1T01
    PRIQTY 16
    SECQTY 4
    ERASE NO

84
DDL – CREATE VIEW

Syntax :

CREATE VIEW view-name
    (column-name, ...)
    AS ( SELECT col-name1, col-name2 ...
         FROM table-name )
    WITH CHECK OPTION

The above statement creates a view on one or more tables or
views. The column-name list names the columns of the view. If
the column names are not specified, the view inherits the names
of the columns used in the subselect.
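
As an illustration of WITH CHECK OPTION, the sketch below defines a view over an EMP table with the columns EMP#, ENAME, DEPT# and SAL used in the DML examples of this course. The view name EMP_D10 is hypothetical.

```sql
-- Only department 10 is visible through the view.
CREATE VIEW EMP_D10 (EMP#, ENAME, DEPT#, SAL)
  AS SELECT EMP#, ENAME, DEPT#, SAL
     FROM EMP
     WHERE DEPT# = 10
  WITH CHECK OPTION;

-- Accepted: the row still satisfies DEPT# = 10 afterwards.
UPDATE EMP_D10
  SET SAL = SAL + 100
  WHERE EMP# = 1000;

-- Rejected because of WITH CHECK OPTION: the updated row would
-- no longer belong to the view.
UPDATE EMP_D10
  SET DEPT# = 20
  WHERE EMP# = 1000;
```

Without WITH CHECK OPTION the second UPDATE would succeed and the row would simply disappear from the view.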

85
DDL – DROP Statement

Syntax : DROP object-type object-name

Example : DROP DATABASE database-name
          DROP TABLE table-name

The DROP statement deletes an object. Any objects that are
directly or indirectly dependent on that object are also
deleted.

Object types can be :
TABLE, VIEW, INDEX, SYNONYM, STOGROUP, DATABASE, TABLESPACE.

86
DDL – DROP Dependencies

TABLESPACE

TABLE

TABLE

VIEW1 VIEW2 VIEW3

87
SQL – DML

DATA MANIPULATION
LANGUAGE

88
DML – Operations

♦ SELECT - Retrieves data from the table
♦ UPDATE - Changes values of columns / rows
♦ DELETE - Deletes row(s)
♦ INSERT - Inserts a new row

89
DML – EXAMPLE

EMP

EMP#   ENAME    DEPT#   SAL       MGR#
1000   Arun     10      8000.00   1002
1001   Ramesh   20      9000.00   1001
1002   Rahul    10      8500.00   1001
1003   Rohit    10      7500.00   1002

DEPT

DEPT#   DNAME
10      Finance
20      Admin
30      Sales
40      Personnel

90
SELECT – SIMPLE QUERY

♦ SELECT EMP#, SAL
  FROM EMP
  WHERE EMP# = 1000

Output -

EMP#   SAL
1000   8000.00

♦ SELECT *
  FROM EMP
  WHERE EMP# = 1000

Output -

EMP#   ENAME   DEPT#   SAL       MGR#
1000   Arun    10      8000.00   1002

91
DATA COMPARISON

• OPERATORS
> , < , = , >= , <= , <>

• BOOLEAN
NOT , AND ,OR .

• PARTIAL VALUES
% , _ , LIKE

• MISC.
IN , BETWEEN .
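
The comparison operators can be tried against the sample EMP table of this section (rows 1000 Arun, 1001 Ramesh, 1002 Rahul, 1003 Rohit). The queries below are a sketch; the result noted in each comment assumes exactly those four rows.

```sql
-- LIKE with % (any string): names starting with R
-- -> Ramesh, Rahul, Rohit
SELECT EMP#, ENAME FROM EMP WHERE ENAME LIKE 'R%';

-- LIKE with _ (exactly one character): second letter is 'a'
-- -> Ramesh, Rahul
SELECT EMP#, ENAME FROM EMP WHERE ENAME LIKE '_a%';

-- BETWEEN is inclusive at both ends -> 1000, 1001, 1002
SELECT EMP#, SAL FROM EMP WHERE SAL BETWEEN 8000 AND 9000;

-- IN tests membership in a list of values -> 1001
SELECT EMP#, DEPT# FROM EMP WHERE DEPT# IN (20, 30);

-- Boolean operators combine conditions -> 1000, 1002
SELECT EMP# FROM EMP WHERE DEPT# = 10 AND NOT SAL < 8000;
```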

92
DML – Retrieving With Ordering

SELECT *
FROM EMP
ORDER BY DEPT#

Output -

EMP#   ENAME    DEPT#   SAL       MGR#
1000   Arun     10      8000.00   1002
1002   Rahul    10      8500.00   1001
1003   Rohit    10      7500.00   1002
1001   Ramesh   20      9000.00   1001

93
DML - JOIN Queries

Simple Equijoin

SELECT EMP#, ENAME, EMP.DEPT#, DNAME
FROM EMP, DEPT
WHERE EMP.DEPT# = DEPT.DEPT#

DEPT# exists in both tables, so it must be qualified with a
table name in the select list.

Output -

EMP#   ENAME    DEPT#   DNAME
1000   Arun     10      Finance
1001   Ramesh   20      Admin
1002   Rahul    10      Finance
1003   Rohit    10      Finance

94
DML – JOIN Queries

Self-Join

♦ Join of a table with itself, here to list each employee
  together with the name of his or her manager

SELECT E.EMP#, E.ENAME, M.ENAME
FROM EMP E, EMP M
WHERE E.MGR# = M.EMP#

95
DML – Subqueries OR Nested queries

♦ Simple Subquery

List of employees whose salary is greater than the salary of
employee number 1000

SELECT EMP#
FROM EMP
WHERE SAL > ( SELECT SAL
              FROM EMP
              WHERE EMP# = 1000 )

96
DML - Correlated Subquery

A correlated subquery provides a further level of flexibility
by permitting the nested SELECT statement to refer back to
columns of the outer SELECT statement. Correlated subqueries
differ from normal subqueries in that the nested SELECT is
evaluated once for each row of the table named in the outer
SELECT, rather than once for the whole query.

97
DML - Correlated Subqueries

EXAMPLE:

SELECT DNAME
FROM DEPT
WHERE 'Arun' IN
      (SELECT ENAME
       FROM EMP
       WHERE EMP.DEPT# = DEPT.DEPT#)

98
DML – Column Functions

Column functions operate on the collection of values in a
column.

♦ COUNT - number of values in the column
♦ SUM - sum of values in the column
♦ AVG - average of values in the column
♦ MAX - largest value in the column
♦ MIN - smallest value in the column

99
DML – Column Functions

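Example :
SELECT COUNT(*)
FROM EMP

- Gives the number of employees

SELECT MAX(SAL)
FROM EMP

Output -
MAX(SAL)
9000

A column function cannot be selected together with a plain
column such as EMP# unless the query groups by that column.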

100
DML – SELECT Statement

Group By Clause

SELECT DEPT#, SUM(SAL)
FROM EMP
GROUP BY DEPT#

Output -

DEPT#   SUM(SAL)
10      24000
20      9000

101
DML – SELECT Statement

Group By – Having Clause

SELECT DEPT#
FROM EMP
GROUP BY DEPT#
HAVING COUNT(*) > 1

Output –

DEPT#
10

102
DML – INSERT Statement

Syntax :

INSERT INTO table-name
    VALUES ( literal 1, [literal 2, .........] )

Example :

INSERT INTO EMP
    VALUES ( 1004, 'Sameer', 10, 7500, 1002 )

103
DML – UPDATE Statement

Syntax :

UPDATE table-name
    SET col-name1 = expression
        [, col-name2 = expression , ......]
    [ WHERE search-condition ]

Example :

UPDATE EMP
    SET DEPT# = 20,
        SAL = SAL + 100
    WHERE EMP# = 1004

104
DML – DELETE Statement

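Syntax :

DELETE FROM table-name
    [ WHERE search-condition ]

Example :

DELETE FROM EMP
    WHERE EMP# = 1004

When other tables reference the deleted rows through a foreign
key, execution depends upon the DELETE RULE of that foreign
key : CASCADE, RESTRICT or SET NULL.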
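
The effect of the three delete rules can be seen on a single statement. The sketch below assumes that EMP.DEPT# has been declared as a foreign key referencing DEPT, which the sample tables in this section do not state explicitly.

```sql
-- Department 10 is referenced by employees 1000, 1002 and 1003.
DELETE FROM DEPT
  WHERE DEPT# = 10;

-- ON DELETE RESTRICT : the DELETE is rejected, because dependent
--                      EMP rows exist.
-- ON DELETE CASCADE  : the DEPT row is deleted together with the
--                      three dependent EMP rows.
-- ON DELETE SET NULL : the DEPT row is deleted and DEPT# is set to
--                      null in the three EMP rows (the column must
--                      be nullable).
```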

105
PROGRAM PREPARATION

106
DB2 PROGRAM PREPARATION

• Execution cycle of a DB2 program.
• Definitions
> DBRM.
> Bind.
> Plans.
> Packages.

107
PROGRAM PREPARATION

COBOL PROGRAM WITH EMBEDDED SQL
    -> DB2 PRECOMPILER
        -> MODIFIED COBOL PGM -> COMPILE & LINKEDIT -> LOAD MODULE
        -> DBRMs -> BIND (uses the DB2 CATALOG) -> PACKAGE

108
EXECUTION CYCLE OF DB2 PROGRAM

• Precompilation
Precompilation separates the SQL statements from the non-SQL
statements. From this step onwards, further processing follows
two separate paths.

NON-SQL PATH:
• Compilation and Linking
The non-SQL part of the COBOL program goes through compilation
and linking after the precompiler has commented out all the SQL
statements and replaced them with calls to DB2. This results in
a LOAD module.

109
EXECUTION CYCLE OF DB2 PROGRAM (CONT.)

SQL PATH :

• Bind

The extracted SQL part of the COBOL program, which is called
the Data Base Request Module (DBRM), goes through an analogous
process called BIND to produce an executable PLAN/PACKAGE.

110
EXECUTION CYCLE OF DB2 PROGRAM(CONT)

RUNNING THE PROGRAM

After the above steps are over, two separate physical
components have been produced :
• PLAN - containing the access path specifications for the SQL
statements in the program.
• LOAD MODULE - containing the executable machine instructions
for the COBOL statements in the program.

The program can now be executed, for example in a TSO batch
process.

111
WHAT IS ...?

• DBRM
A DBRM is a module containing the SQL statements extracted from
the source program by the DB2 precompiler. It is stored as a
member of a partitioned dataset. It is not stored in the DB2
catalog or directory.

• BIND
Bind is a DB2 routine that analyses each SQL statement and
determines the most efficient access path to the data. It also
checks for errors and accesses the DB2 catalog to check that
the resources mentioned in the SQL statement actually exist and
that the binder is authorised to perform each statement in the
program.

112
WHAT IS ...?

• PLAN

A plan is an executable module containing the access path logic
provided by the DB2 optimizer. It can be composed of one or
more DBRMs and packages. Plans are created by the BIND command.

113
WHAT IS ...?

• PACKAGE

A package is a single bound DBRM with optimized access paths.
Before DB2 V2.3 the only bind option was at the plan level. With
packages, the table access logic is packaged at a lower level
of granularity: the package, or program, level. To execute a
package it must first be included in the package list of a
plan. Packages can never be executed directly.

114
WHAT IS ...?

• COLLECTION

A collection is a user defined name (1 to 18 characters) that
the programmer must specify for every package. A collection is
not an actual, physical database object; it is a grouping of
DB2 packages. By specifying a different collection identifier
for each bind, the same DBRM can be bound into different
packages. This gives easy access to tables that have the same
structure but different owners.

115
DB2 - APPLICATION PROGRAMMING

Using Embedded SQL

116
HOST LANGUAGES

♦ COBOL
♦ PL/I
♦ C
♦ ASSEMBLER
♦ FORTRAN

117
EMBEDDED SQL

Steps to coding SQL in a program :

♦ Delimit all SQL statements.
♦ Describe host variables.
♦ Declare a Communication Area.
♦ Code SQL statements to access data.
♦ Handle exceptional conditions.

118
STATIC SQL

Statement :

♦ Coded in the program.
♦ Same function applied on the same tables and columns.

Bind : On all SQL statements

♦ Before program execution.
♦ Access strategy is permanent.

119
SQL Delimiters in COBOL

EXEC SQL
SQL Statement
END-EXEC

♦ The EXEC SQL ... END-EXEC statement must be coded within
columns 12 through 72.
♦ No COBOL statements are allowed between the delimiters.

120
COBOL Host Variables

Fields in the program's WORKING-STORAGE SECTION or LINKAGE
SECTION that are referenced in your SQL statements are called
Host Variables.

♦ All COBOL host variables must be declared in the DATA
DIVISION.

♦ A colon (:) must precede every host variable in an SQL
statement.

121
COBOL Host Variable

Example :

EXEC SQL
SELECT ENAME
INTO :WS-ENM
FROM EMP
WHERE EMP# = 1000
END-EXEC.

122
SQLCA

The SQLCA contains a set of fields that DB2 updates after each
SQL statement is executed. It indicates the result of executing
the statement. In a COBOL program the fields are a combination
of binary and alphanumeric fields.

The most important field in the SQLCA is SQLCODE, a binary
fullword field that is updated with an SQL return code after
each SQL statement.

123
INCLUDE

EXEC SQL
INCLUDE SQLCA | SQLDA | member-name
END-EXEC.

Member-name : names a member of a partitioned dataset.

The INCLUDE statement copies SQL statements or COBOL
host variable declarations from a member of a partitioned
dataset into the program.

124
ERROR HANDLING

♦ After the execution of every SQL statement,
DB2 sets the SQLCODE value.

SQLCODE = 0 , execution was successful
SQLCODE > 0 , execution was successful with a warning
SQLCODE < 0 , execution was not successful
SQLCODE = 100 , no data found

125
ERROR HANDLING

WHENEVER :

EXEC SQL
WHENEVER < condition > < action >
END-EXEC.

Condition : SQLWARNING
SQLERROR
NOT FOUND

Action : CONTINUE
GO TO para-name
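
A minimal sketch of WHENEVER in a COBOL program, using the CONTINUE and GO TO actions. WHENEVER is a directive to the precompiler: it applies to the SQL statements that follow it in the source listing, not to the order in which paragraphs execute. The program name WHENDEMO, the paragraph SQL-ERROR-PARA and the field WS-SQLCODE-DISP are assumptions made for the example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. WHENDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Host variable (length assumed) and an edited SQLCODE field.
       01  WS-ENM               PIC X(20).
       01  WS-SQLCODE-DISP      PIC -9(9).
           EXEC SQL INCLUDE SQLCA END-EXEC.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Any negative SQLCODE from here on branches to SQL-ERROR-PARA.
           EXEC SQL
               WHENEVER SQLERROR GO TO SQL-ERROR-PARA
           END-EXEC.
      * Warnings and "no data found" are left for the program to test.
           EXEC SQL WHENEVER SQLWARNING CONTINUE END-EXEC.
           EXEC SQL WHENEVER NOT FOUND CONTINUE END-EXEC.
           EXEC SQL
               SELECT ENAME
                 INTO :WS-ENM
                 FROM EMP
                WHERE EMP# = 1000
           END-EXEC.
           IF SQLCODE = 100
               DISPLAY 'EMPLOYEE NOT FOUND'
           ELSE
               DISPLAY 'NAME: ' WS-ENM
           END-IF.
           GOBACK.
       SQL-ERROR-PARA.
           MOVE SQLCODE TO WS-SQLCODE-DISP.
           DISPLAY 'SQL ERROR, SQLCODE = ' WS-SQLCODE-DISP.
           GOBACK.
```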

126
PROCESSING MULTIPLE ROWS

[ Diagram : a multi-row SELECT against the TABLE produces a
RESULT TABLE, from which the program retrieves one row at a time. ]

127
DECLARE CURSOR

EXEC SQL
DECLARE CUREMP CURSOR FOR
SELECT EMP#, ENAME
FROM EMP
WHERE DEPT# = 10
END-EXEC.

128
OPEN CURSOR

♦ The Cursor must be opened before any rows
are retrieved.

EXEC SQL
OPEN cursor-name
END-EXEC.

129
FETCH CURSOR

♦ The FETCH statement moves the cursor to the
next row of the result table and copies the
contents of that row into host variables.

EXEC SQL
FETCH CUREMP
INTO :WS-ENO, :WS-ENM
END-EXEC.

130
End-Of-Data Processing

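TABLE
1000 Arun
1002 Rahul
1003 Rohit

FETCH    HOST VARIABLES    SQLCODE
1st      1000 Arun         0
2nd      1002 Rahul        0
3rd      1003 Rohit        0
4th      1003 Rohit        100

The fourth FETCH finds no more rows : SQLCODE is set to 100
and the host variables keep the values of the last row fetched.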

131
CLOSE CURSOR

The Cursor has to be closed after processing
the rows of the result table.

EXEC SQL
CLOSE CUREMP
END-EXEC.
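
The DECLARE, OPEN, FETCH and CLOSE statements of the preceding slides fit together as in the sketch below. It is an illustration under assumptions: the program name CURDEMO, the paragraph FETCH-PARA and the host variable definitions (an integer EMP# and a 20-character ENAME) are not taken from the course material. The loop ends on the first non-zero SQLCODE, so it stops at end of data (100) and also on any warning or error.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CURDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Host variables; the column types are assumed.
       01  WS-ENO               PIC S9(9) COMP.
       01  WS-ENM               PIC X(20).
           EXEC SQL INCLUDE SQLCA END-EXEC.
      * The cursor must be declared before it is referenced.
           EXEC SQL
               DECLARE CUREMP CURSOR FOR
                   SELECT EMP#, ENAME
                     FROM EMP
                    WHERE DEPT# = 10
           END-EXEC.
       PROCEDURE DIVISION.
       MAIN-PARA.
           EXEC SQL OPEN CUREMP END-EXEC.
      * Priming fetch, then process rows until SQLCODE is non-zero.
           PERFORM FETCH-PARA.
           PERFORM UNTIL SQLCODE NOT = 0
               DISPLAY WS-ENO ' ' WS-ENM
               PERFORM FETCH-PARA
           END-PERFORM.
           EXEC SQL CLOSE CUREMP END-EXEC.
           GOBACK.
       FETCH-PARA.
           EXEC SQL
               FETCH CUREMP
                INTO :WS-ENO, :WS-ENM
           END-EXEC.
```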

132
DYNAMIC SQL

Statement :

♦ Acquired during execution.
♦ Function can vary and can be applied to
different tables and columns.

Bind : On a single statement

♦ At statement execution.
♦ Access strategy not saved.

133
Executing a program using Dynamic SQL

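[ Diagram : the program is precompiled, linked and bound, and
its static SQL is processed as usual. At run time the program
captures and formats the dynamic SQL ( DSQL ) text; DB2 then
translates, precompiles and binds the DSQL before processing it. ]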

134
STATIC vs DYNAMIC

Statement :

Static : Object and Action known
Dynamic : Object or Action not known

Bind :

Static : is bound once
Dynamic : is bound every time

135
TYPES OF DYNAMIC SQL

♦ EXECUTE IMMEDIATE

♦ NON SELECT DYNAMIC SQL

♦ FIXED LIST SELECT

♦ VARYING LIST SELECT

136
EXECUTE IMMEDIATE

Implicitly prepares and executes complete SQL statements
coded in host variables.

EXECUTE IMMEDIATE cannot be used for SELECT statements, so
no data can be retrieved with it. The SQL portion of the program
consists of two parts :

- moving the complete text of the statement to be executed
into the host variable.
- issuing an EXECUTE IMMEDIATE statement.

137
EXECUTE IMMEDIATE

Syntax :
EXEC SQL
    EXECUTE IMMEDIATE :host-variable
END-EXEC.

Example :

WORKING-STORAGE SECTION.
01 WS-HOSTVAR.
    49 WS-HOSTVAR-LEN PIC S9(4) COMP.
    49 WS-HOSTVAR-TXT PIC X(50).

138
EXECUTE IMMEDIATE

Example ( Contd. )

PROCEDURE DIVISION.

    MOVE 32 TO WS-HOSTVAR-LEN.
    MOVE 'DELETE FROM EMP WHERE DEPT# = 10' TO
        WS-HOSTVAR-TXT.
    EXEC SQL
        EXECUTE IMMEDIATE :WS-HOSTVAR
    END-EXEC.

139
NON-SELECT Dynamic SQL

• PREPARE and EXECUTE are used to prepare and then run an SQL
statement in an application program.

• Cannot be used for SELECT statements.

• Host variables cannot be coded inside the statement text
being prepared.

• Syntax :

EXEC SQL
    PREPARE statement-name
    FROM :host-variable
END-EXEC.
EXEC SQL
    EXECUTE statement-name
END-EXEC.

140
Parameter Markers

• Since host variables cannot be included in the
statement string, a similar feature is provided called
the Parameter Marker.

• It is a question mark ( ? ) which is included in the
statement wherever a value is to be supplied later.

• DB2 substitutes the values supplied in the USING clause
for the parameter markers when it executes the SQL statement.
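
A minimal sketch of a non-SELECT dynamic statement with one parameter marker. The statement text is prepared once and the value for the ? is supplied at execution time through the USING clause of EXECUTE. The program name PARMDEMO, the statement name DELSTMT and the host variable WS-DNO (assuming DEPT# is a small integer column) are assumptions made for the example; the statement string is held in a varying-length group with a length field and a text field.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PARMDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Varying-length string: length field followed by text field.
       01  WS-HOSTVAR.
           49  WS-HOSTVAR-LEN   PIC S9(4) COMP.
           49  WS-HOSTVAR-TXT   PIC X(50).
      * Value substituted for the parameter marker (type assumed).
       01  WS-DNO               PIC S9(4) COMP.
       01  WS-SQLCODE-DISP      PIC -9(9).
           EXEC SQL INCLUDE SQLCA END-EXEC.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * The statement text is 31 characters long, including the ?.
           MOVE 'DELETE FROM EMP WHERE DEPT# = ?'
               TO WS-HOSTVAR-TXT.
           MOVE 31 TO WS-HOSTVAR-LEN.
           EXEC SQL
               PREPARE DELSTMT FROM :WS-HOSTVAR
           END-EXEC.
      * Supply the value for the marker at execution time.
           MOVE 10 TO WS-DNO.
           EXEC SQL
               EXECUTE DELSTMT USING :WS-DNO
           END-EXEC.
           MOVE SQLCODE TO WS-SQLCODE-DISP.
           DISPLAY 'SQLCODE AFTER EXECUTE = ' WS-SQLCODE-DISP.
           GOBACK.
```

The same prepared statement can be executed again with a different value in WS-DNO without another PREPARE.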

141
FIXED LIST SELECT

• This is used when the structure of the result table is
known when the program is coded.

• To run a Fixed List Select Dynamic SQL, a program
uses five different SQL statements :

Declare Cursor, Prepare Statement, Open Cursor,
Fetch, Close Cursor

142
FIXED LIST SELECT Dynamic SQL

Example :

SQL to execute :
SELECT EMP#, ENAME
FROM EMP
WHERE DEPT# = ?

Move the 'SQL to execute' to WS-HOSTVAR-TXT and its
length to WS-HOSTVAR-LEN.

EXEC SQL
    DECLARE EMPCUR CURSOR FOR FLSQL
END-EXEC.
EXEC SQL
    PREPARE FLSQL FROM :WS-HOSTVAR
END-EXEC.

143
FIXED LIST SELECT Dynamic SQL

Example ( Contd. )

Move the required value to DNO.

EXEC SQL
    OPEN EMPCUR USING :DNO
END-EXEC.

Loop until there are no more rows to fetch :

EXEC SQL
    FETCH EMPCUR
    INTO :ENO, :ENM
END-EXEC.

EXEC SQL
    CLOSE EMPCUR
END-EXEC.

144
VARYING LIST SELECT Dynamic SQL

• Facility to change columns and tables during execution.

• Thus more flexible than any other type of Dynamic SQL.

• Since the number and type of host variables cannot be
known beforehand, this class of SQL is the most complicated
of the dynamic SQL types.

145
DB2 FEATURES

• Transactions and their properties.
• Recovery utilities.
• Security.
• DB2 privileges.
• DB2 Utilities.

146
Transactions : ACID Properties

Atomicity : means either all the work of the transaction is
applied or none of it.

Consistency : means that the database is in a consistent
state after the execution of the transaction.

Isolation : requires that a transaction not be influenced by
changes made by other concurrently executing
transactions.

Durability : means that the work of a successfully executed
( committed ) transaction is permanently applied to
the database.
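
Atomicity can be sketched in embedded SQL with COMMIT and ROLLBACK: two updates are treated as one unit of work, and either both are kept or both are undone. This is an illustration under assumptions: the program name UOWDEMO and the particular updates are invented for the example, and explicit SQL COMMIT and ROLLBACK statements are the form used in batch and TSO programs (CICS and IMS programs use their own sync point calls instead).

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UOWDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
           EXEC SQL INCLUDE SQLCA END-EXEC.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * First change of the unit of work.
           EXEC SQL
               UPDATE EMP SET DEPT# = 20 WHERE EMP# = 1000
           END-EXEC.
      * Second change, attempted only if the first one succeeded.
           IF SQLCODE = 0
               EXEC SQL
                   UPDATE EMP SET DEPT# = 20 WHERE EMP# = 1002
               END-EXEC
           END-IF.
      * Keep both changes or neither of them.
           IF SQLCODE = 0
               EXEC SQL COMMIT END-EXEC
               DISPLAY 'BOTH UPDATES COMMITTED'
           ELSE
               EXEC SQL ROLLBACK END-EXEC
               DISPLAY 'UPDATES ROLLED BACK'
           END-IF.
           GOBACK.
```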

147
CONCURRENCY

To provide concurrent access, the database manager uses
a software mechanism called locks.

Attributes of locks :

• Object : The resource being locked.

• Duration : How long the lock is needed.

• Mode : The type of access allowed for the lock owner as
well as the type of access permitted for concurrent
users of the locked object.

148
CONCURRENCY : DeadLock

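[ Diagram : User 1 holds a lock on Object 1 and requests a lock
on Object 2, while User 2 holds a lock on Object 2 and requests
a lock on Object 1. Neither request can be granted, so each
user waits for the other. ]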

149
NEED FOR BACKUP

FAILURES :

• Hardware Failure

• Program Failure

• Natural Calamity

150
RECOVERY UTILITIES

• BACKUP  - IMAGE COPY ( FULL, INCREMENTAL )
          - MERGECOPY

• RECOVER - RECOVER
          - QUIESCE
          - REPORT
          - CHECK
          - REPAIR

151
RECOVERY UTILITIES

COPY :

• Prepares a backup by copying the pages of a tablespace
into a sequential dataset.
• DB2 lets users take a full backup, called a FULL image copy,
or a backup of only the changes since the last backup, called
an INCREMENTAL image copy.

MERGECOPY :

• Merges all the incremental copies to produce a single
incremental copy, or merges the full image copy with all the
incremental copies to produce a new full image copy.

152
RECOVERY UTILITIES

QUIESCE :

• Records the point of consistency for related tablespaces.
• Ensures that all the tablespaces in the scope of QUIESCE are
referentially intact.

CHECK :

• Checks referential integrity of related tables.
• Checks consistency of indexes with the data.

153
RECOVERY UTILITIES

REPORT :

• The input is a single tablespace and the output is a report
containing information about related tables and tablespaces.
• Provides information necessary for the recovery of a tablespace.

REPAIR :

• This utility is designed to modify DB2 data and associated data
structures when there is an error or problem.

154
SECURITY

What can be protected ?

Access to the DB2 Subsystem.

Datasets used by DB2.

DB2 Objects.

155
How Protection Is Achieved ?

RACF : Security managers like RACF ( Resource
Access Control Facility ) are used to protect
DB2 resources - DB2 Subsystem, DB2
Objects.

• Protection of DB2 objects is done within DB2.
• Each access to a DB2 object is validated against
the set of privileges associated with the process issuing
the request.

156
DB2 Privileges

• IMPLICIT - Automatic privileges for the OWNER
of the object.

• EXPLICIT - Specific privileges provided by
the SQL GRANT statement.

• INDIRECT - Through the EXECUTE privilege or as
a member of a group.

157
GRANTING Explicit Privileges

Syntax :

GRANT privilege
ON object-type object-name
TO ( user-id | PUBLIC )
[ WITH GRANT OPTION ]

PUBLIC : to all the users in the system.

WITH GRANT OPTION : the grantee can in turn grant the privilege to other users.

158
REVOKING Privileges

REVOKE undoes a matching GRANT

Syntax :

REVOKE privilege
ON object-type object-name
FROM ( user-id | PUBLIC )

159
Cascading REVOKE

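User1 grants a privilege to User2; User2 grants the same
privilege to User3.

User1 REVOKES the privilege from User2.

Result : User2 loses the privilege, and the revoke cascades so
that User3, who received it from User2, loses it as well.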

160
DB2 - Utilities

• LOAD

• UNLOAD

• REORG

• RUNSTATS

161
LOAD Utility

Used to load data into one or more tables.

Input :
• File containing data.

Output :

• Loaded table.
• Summary report of errors encountered.

162
LOAD Utility

Features :

• Automatic data conversion between compatible
data types.

• Data is loaded in the sequence presented; no sort
is invoked.

• Indexes are built.

163
UNLOAD Utility

• An IBM-supplied program that unloads data from a table in
the LOAD utility format.
• A WHERE clause can be supplied to selectively unload rows.

OUTPUT :
• Sequential data set containing the unloaded data.
• Load control statements.

164
REORGANISING Utility

REORG :

• Eliminates fragmentation in tables and
indexes.

• May order the pages of a table according
to the order of the index.

• Will restore free space for later insertions.

165
RUNSTATS

• Updates statistics of tables and indexes and stores
this information in the DB2 catalog.

• The Optimizer ( part of Bind ) analyses this
information in determining the best access
strategy.

• Should be used after massive changes to the data
and after running REORG.

166
SUMMARY

• Introduction to Database Management System.
• DB2 Overview.
• DB2 Data Objects.
• SQL - Data Definition Language.
• SQL - Data Manipulation Language.
• SQL - Data Control Language.
• DB2 - Program Preparation.
• DB2 - Application Programming.
• DB2 Features and Utilities.

167