VSAM Performance Tuning

VSE/VSAM – Inside & Out
John Mycroft, Software Developer

CSI International
www.e-vse.com
johnm@e-vse.com
WAVV 2007, Green Bay, WI

Acknowledgement
With grateful thanks to Dan
Janda, The Swami of VSAM,
from whom most of this
presentation was stolen

Abstract
This presentation gives an overview of VSAM & its components.
We take a look at what a VSAM file really looks like and how to
soup up its performance.
We also look at some common mistakes and how to avoid them.
This presentation and its materials are copyrighted and
developed by John Mycroft from a presentation originally
copyrighted by Dan Janda. Permission is granted for WAVV to
reproduce this presentation for distribution to its members at no
charge.
Trademarks:
IBM, VSE, VSE/ESA, z/VSE, CICS & DL/I are trademarks or registered
trademarks of the IBM Corporation
The Swami of VSAM is a trademark of Dan Janda.

VSE/VSAM Overview
Virtual Storage Access Method
For disk files
Sequential – “Entry Sequence Dataset” or ESDS
Begin at the beginning, go on till you get to the end and
then stop
Indexed – “Keyed Sequence Dataset” or KSDS
Process by key or sequentially or a mixture
Direct – “Relative Record Dataset” or RRDS (fixed) or
VRDS (variable)
Calculate a record’s location in the file to access it
Alternate index (AIX) – gives an alternative route to a
KSDS
Allows unique & non-unique keys

VSE/VSAM Functional
areas
Catalog
Volume & file information
Usage statistics
Disk space management
Space allocation including secondary
allocations
VSAM and VSAM/SAM files
System files
Libraries

VSE/VSAM Functional areas
Integrity
Performance
Data transfer size
Buffering
Backup / restore
File sharing between jobs and
systems

Processing a VSAM file
Sequentially (ESDS)
Forward or backward
Keyed access (KSDS)
Direct by full or partial (generic) key
Sequentially, forward or backward
Skip sequential, forward or backward
Addressed access (RRDS, VRDS)
Direct, by record address
Sequential & skip sequential
Alternate Index Access
Same as keyed access
Also direct access by non-unique key
How VSAM stores data
We’re going to look at
How VSAM stores records logically on disk
Performance considerations
How VSAM physically stores data on disk
Disk space usage calculations
Optimizing disk capacity
Performance considerations
VSAM jargon
Control Interval
Control Area
CI & CA splits
Freespace
RDF, CIDF

VSAM Jargon
Control Interval (CI)
“Smallest unit of data transfer between main &
disk storage”
In other words, when you read a record, VSAM reads
the whole CI that contains that record
Think of it as the same as a block of records in a
sequential file if you like (though it’s laid out
differently)
A CI can initially contain 1 or more records
More can be inserted
Some or all can be deleted
When you try to add a new record to a CI with no
room, a “CI split” takes place – more about that later

Layout of a control interval
Rec 1 | Rec 2 | Rec 3 | Rec … | Freespace | RDFs | CIDF
ALL VSAM FILES ARE VARIABLE LENGTH
Even if all the records are the same size
Rec 1 – Rec n: 1 to n logical records of any length
Freespace: unused space in CI for inserting records or making existing records longer
RDFs: 3-byte Record Descriptor Field
ESDS/KSDS: one per record length, one for all consecutive records of same length
RRDS: one per numbered record slot
CIDF: 4-byte Control Interval Descriptor Field

Control Area (CA)
A CA is a group of CIs. In a KSDS, all the data CIs
in a CA are indexed by one index CI
CI 0 CI 1 CI 2 CI 3 CI 4 CI 5 CI 6 CI 7 CI 8 CI 9
CI10 CI11 CI12 CI13 CI14 CI15 CI16 CI17 CI18 CI19
CA size is the smallest of:
One cylinder or
The size of the primary allocation or
The size of the secondary allocation
The number of CIs per CA depends on the device and the CI
and CA sizes
It is generally a good idea to go for the biggest CA possible
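The "smallest of" rule above can be sketched as a one-line calculation. This is only an illustration, assuming all sizes are expressed in tracks and a 3390-style geometry of 15 tracks per cylinder; the function name and its parameters are hypothetical and not part of any VSAM interface.

```python
TRACKS_PER_CYL_3390 = 15  # assumption: 3390 geometry, 15 tracks per cylinder


def ca_size_tracks(primary_tracks, secondary_tracks,
                   tracks_per_cyl=TRACKS_PER_CYL_3390):
    """Return the CA size in tracks: the smallest of one cylinder,
    the primary allocation and the secondary allocation."""
    return min(tracks_per_cyl, primary_tracks, secondary_tracks)


# A small secondary allocation drags the CA size down with it.
print(ca_size_tracks(primary_tracks=30, secondary_tracks=5))
# Primary and secondary of at least one cylinder give a full-cylinder CA.
print(ca_size_tracks(primary_tracks=45, secondary_tracks=15))
```

This is why a tiny secondary allocation works against the "biggest CA possible" advice.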

Index Control Interval (Index CI)
A CI in an index containing pointers to
The next level in the index or
The Data CI in the CA – this is referred
to as a Sequence Set CI
Index CI
CI 0 CI 1 CI 2 CI 3 CI 4 CI 5 CI 6 CI 7 CI 8 CI 9

Index and data structure
Balanced tree
Sparse index
Always just 1 high-level index CI
There can be 0 to many intermediate
level index CIs
There can be one or more low-level
(sequence set) index CIs.
If there is only 1 sequence set CI, it
is also the high-level index CI
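The shape of the index can be sketched by counting index CIs level by level. This is a rough model, assuming one sequence set CI per CA (as described for the Control Area) and a made-up fan-out, `entries_per_index_ci`, for how many lower-level CIs one higher-level index CI can point to; the real figure depends on key compression and index CI size.

```python
def index_levels(num_cas, entries_per_index_ci):
    """Return the number of index CIs per level, sequence set first,
    high-level index last. entries_per_index_ci is a hypothetical fan-out."""
    if num_cas < 1 or entries_per_index_ci < 2:
        raise ValueError("need at least 1 CA and a fan-out of at least 2")
    levels = [num_cas]  # sequence set: one index CI per CA
    while levels[-1] > 1:
        # ceiling division: CIs needed to point at every CI in the level below
        levels.append(-(-levels[-1] // entries_per_index_ci))
    return levels


print(index_levels(1, 50))    # one CA: the sequence set CI is also the top level
print(index_levels(200, 50))  # sequence set, one intermediate level, one top CI
```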
And now the bit you’ve all
been waiting for……

Performance rules of thumb
Use largest data CI possible,
especially for sequential work
Use as small an index CI as you
can (but not too small!)
Use large data CA – allocate
primary and secondary as at
least 1 cylinder
Avoid too many extents /
allocations
Allocation calculations
CI freespace =
CI Size * Freespace %
Number of records per CI
“Fixed” length:
(CI Size - 10 - Freespace) / LRECL
Variable length:
(CI Size - 7 - Freespace) / (Average LRECL + 3)
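The two formulas can be tried out with a short calculation. This is a minimal sketch: the function names are invented for illustration, and rounding every result down to a whole byte or whole record is an assumption. The constants 10 and 7 are taken straight from the formulas; they appear to correspond to the 4-byte CIDF plus two or one 3-byte RDFs respectively.

```python
def ci_freespace_bytes(ci_size, freespace_pct):
    """CI freespace = CI size * freespace %, rounded down (assumption)."""
    return ci_size * freespace_pct // 100


def records_per_ci_fixed(ci_size, freespace_pct, lrecl):
    """(CI Size - 10 - Freespace) / LRECL, rounded down."""
    usable = ci_size - 10 - ci_freespace_bytes(ci_size, freespace_pct)
    return max(usable // lrecl, 0)


def records_per_ci_variable(ci_size, freespace_pct, avg_lrecl):
    """(CI Size - 7 - Freespace) / (Average LRECL + 3), rounded down."""
    usable = ci_size - 7 - ci_freespace_bytes(ci_size, freespace_pct)
    return max(usable // (avg_lrecl + 3), 0)


# Example values (hypothetical): 4K CI, 10% CI freespace, 100-byte records
print(ci_freespace_bytes(4096, 10))            # 409
print(records_per_ci_fixed(4096, 10, 100))     # 36
print(records_per_ci_variable(4096, 10, 100))  # 35
```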

Allocation calculations
Calculate Freespace in each CA
Get number of CIs per CA from
LISTCAT or device characteristics
(3390, 12 x 4K CIs/track, 180/cyl)
CA freespace = No of CIs per CA *
CA Freespace %, rounded up
Number of CIs loaded per CA =
CIs per CA – CA freespace
Number of records loaded per CA =
Loaded CIs in CA * No of recs in CI
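The CA-level steps above can be chained into one small calculation. This is a sketch with a hypothetical function name; the number of records per CI is passed in as a plain number, and the 180 CIs per CA is the 3390 / 4K figure quoted above.

```python
def records_loaded_per_ca(cis_per_ca, ca_freespace_pct, records_per_ci):
    """Return (free CIs, loaded CIs, records loaded) for one CA."""
    # CA freespace = CIs per CA * CA freespace %, rounded up
    free_cis = (cis_per_ca * ca_freespace_pct + 99) // 100
    loaded_cis = cis_per_ca - free_cis
    return free_cis, loaded_cis, loaded_cis * records_per_ci


# 3390 with 4K CIs (180 per CA), 5% CA freespace, 36 records per CI (example)
print(records_loaded_per_ca(180, 5, 36))  # (9, 171, 6156)
```

Multiplying the last figure by the number of CAs you can afford gives a first estimate of file capacity at load time.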
VSAM Catalogs
Exactly one master catalog
Assigned at IPL with DEF CAT or
DEFINE MCAT IDCAMS command
User catalogs – 0 to many
No more than 1 per volume
Catalog can own multiple spaces
on a volume
Many catalogs can own space on a
volume
VSAM Catalogs
Catalog contains :-
Self-describing records
User catalog pointers
Volume definitions
Space definitions
Cluster (file) definitions
Component (data, index) definitions
AIX & Path definitions

Catalog recommendations
Use naming conventions
Name Cluster, Data and Index components
explicitly
Use partition / system independent names
where applicable
Separate
Files seldom defined or deleted
Files often defined or deleted
Online critical files
Batch files
Multiple baskets – all the eggs won’t
get broken
More recommendations
Don’t use recoverable catalogs
Hangover from 2314 / 3330
Backup is vastly better
IDCAMS, Faver, Maxback, Dr D,
user-written …

CI & CA splits and freespace
You try to insert a record in a CI
or extend a record already there
If there is enough free space in
the CI, everyone moves up,
record is inserted and CI
rewritten
BUT what if there isn’t enough
free space????

CI & CA splits
CI split – 4 physical IOs
Set “Split in progress”, write CI
Move half of records to new CI &
write it
Update sequence set, write index
CI
Erase moved records from old CI,
turn off “Split in progress”, write
old CI
BUT…..
Failure in CI split
System failure
Corrected next time CI is updated
No free CI in the CA
CA split is needed
Remember – 1 physical IO =
30,000 – 40,000 CPU
instructions…
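The four-step CI split and its cost can be illustrated with a toy model. This is not how VSAM is implemented: a CI is modelled as a sorted Python list of keys, the four physical IOs are only logged as text, and the cost range simply multiplies the IO count by the 30,000 to 40,000 instructions per IO quoted above. All names are hypothetical.

```python
IO_COST_RANGE = (30_000, 40_000)  # CPU instructions per physical IO (from the slide)


def ci_split(old_ci, new_key):
    """Toy CI split. old_ci is a sorted list of keys in a full CI.
    Returns (old CI after split, new CI, log of the 4 physical IOs)."""
    mid = len(old_ci) // 2
    kept, moved = old_ci[:mid], old_ci[mid:]      # move half of the records
    # Put the new record into whichever half it belongs to (simplification).
    target = moved if moved and new_key >= moved[0] else kept
    target.append(new_key)
    target.sort()
    io_log = [
        "write old CI with 'split in progress' set",
        "write new CI holding the moved records",
        "write updated sequence set (index) CI",
        "write old CI with moved records erased, 'split in progress' off",
    ]
    return kept, moved, io_log


kept, moved, io_log = ci_split([10, 20, 30, 40], 25)
print(kept, moved)                         # [10, 20, 25] [30, 40]
low, high = (len(io_log) * c for c in IO_COST_RANGE)
print(low, high)                           # 120000 160000
```

On this estimate a single CI split costs the equivalent of roughly 120,000 to 160,000 instructions. The insert then has room, and so do later inserts in the same key range.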

CA Split
MANY physical reads and writes
Set “Split in progress”, write sequence
set CI
Maybe get new extent
Format new CA at HURBA position
Read / write half of CIs to new CA
Write new sequence set CI for new CA
Update higher level index CIs
Erase moved CIs from old CA, write
empty CIs
Write updated original sequence set CI
Recommendations
Don’t worry about CI splits
Avoid excessive CA splits by
defining CA freespace
Don’t do a reorg just because
you have done n CI / CA splits

To reorg or not to reorg?
“We’ve done 1000 CA splits –
better reorg!”
Inserts tend to be clustered
CI / CA split creates freespace
where it is needed, allows faster
inserts
Reorg gets rid of freespace,
causing more CI / CA splits

Recommendations
Avoid frequent reorgs
Once a split has occurred, the
processing cost has been paid
Understand your application
1 “hot spot”
Little distributed freespace – let it split
Many hot spots
Little distributed freespace – let it split
Even distribution – no hot spots
Use distributed freespace
Freespace
• 3% of each CI is empty
• 5% of CIs in each CA are empty
• 3% of 2048 = 61 bytes = 0 records (or, at most, 1)
• 5% of 315 CIs per CA = 16 CIs
Freespace
3% CI freespace where CISZ=2048 and average LRECL=120
No room in this CI for an average
length record
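The arithmetic behind this example is easy to check, and the same check shows how much CI freespace would be needed before a single average record fits. This is a sketch with hypothetical function names; it rounds freespace down to whole bytes and ignores the extra RDF bytes an inserted record may need.

```python
def free_bytes(ci_size, freespace_pct):
    """Bytes of CI freespace, rounded down to a whole byte (assumption)."""
    return ci_size * freespace_pct // 100


def min_freespace_pct_for_one_record(ci_size, lrecl):
    """Smallest whole CI freespace % that leaves room for one record."""
    for pct in range(0, 101):
        if free_bytes(ci_size, pct) >= lrecl:
            return pct
    return None  # record is larger than the CI


print(free_bytes(2048, 3))                          # 61
print(free_bytes(2048, 3) // 120)                   # 0 records fit
print(min_freespace_pct_for_one_record(2048, 120))  # 6
```

A freespace percentage that leaves less than one record's worth of room in each CI buys nothing for inserts.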

Altering freespace
Initial freespace set via DEFINE
eg 10% of CI and 5% of CA
If inserts are clustered, consider
DEFINE with 0% freespace, then
Load the file then
ALTER freespace to non-zero

Strings
VSAM allows multiple concurrent
processing e.g.
CICS transactions
Browsing
Updating
Placeholders (“strings”) hold file
location info

Shared / non-shared resources
Non-shared resources (NSR)
Each string has its own buffers
Multiple copies of a CI may be in
memory
Works well for batch
Local Shared Resources (LSR)
Many strings share a pool of buffers
Only 1 copy of a CI in the pool
Ideal for online
Recommendations - NSR
Non-shared resources
Each string must have enough index
buffers
Bad – 1 buffer (old default)
OK – 1 buffer per index level (new default)
Good – enough buffers for all high level
indexes + 1 more
Best – enough buffers to hold entire index

Recommendations - LSR
Local Shared Resource buffers
Same index buffer needs as NSR
(buffers are per pool, not per
string)
Monitor VSAM LSR stats to make
sure BUFNI keeps up with index
growth
Monitor data buffers for high hit
rates

IO with NSR
VSAM uses chained IO to read
ahead and write behind
Better to read many CIs in one IO
Block big
Large CI sizes
Be aware that VSAM will split CIs into
smaller blocks to save space
Eg 3390 with 32K CI gets written as 2 x
16K blocks giving 1.5 CIs = 48K/track
Buffer big
½ to 1 cyl of BUFND to minimize IO
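The numbers on this slide can be reproduced with a little arithmetic. This is a sketch with hypothetical helper names: the three 16K blocks per 3390 track are inferred from the 48K/track figure above, and the 180 CIs per cylinder is the 3390 / 4K figure used in the allocation calculations.

```python
def cis_per_track(block_size_k, blocks_per_track, ci_size_k):
    """CIs that fit on a track when each CI is written as smaller blocks."""
    return block_size_k * blocks_per_track / ci_size_k


def bufnd_range(cis_per_cyl):
    """BUFND values covering half a cylinder up to one cylinder of data CIs."""
    return cis_per_cyl // 2, cis_per_cyl


# 32K CI written as 16K blocks, 3 blocks (48K) per 3390 track
print(cis_per_track(block_size_k=16, blocks_per_track=3, ci_size_k=32))  # 1.5
# 4K CIs on a 3390: 180 CIs per cylinder
print(bufnd_range(180))  # (90, 180)
```

Buffer counts this large only pay off for sequential batch work, where chained IO can fill them in a few operations.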
IO with LSR
VSAM reads 1 CI at a time, even
for sequential processing

Monitor your stats
LISTCAT before and after critical job
Data & Index EXCPs – the fewer the better.
Index EXCPs should be close to number of
index CIs.
Job Accounting data
IO count by device
Overall CPU & IO activity
CICS stats
Shows logical / physical IO counts by file
LSR pool hits and misses
VSAM buffer stats – in VSE/ESA examples doc
LSR is in 31 bit – use LOTS but don’t page
Sharing VSAM datasets
VSAM can share files among partitions
And among VSE systems
BUT
TANSTAAFL (Robert Heinlein)
Sharing is not a performance
option (Dan Janda)
It’s your gun and your foot
(Steve Huggins)
Sharing is based on
The type of sharing you ask for
(SHAREOPTIONS)
VSE Lock Table within a single VSE
system
VSE Lock File when sharing across VSE
systems
VSE sharing mechanism is not
compatible with z/OS or z/VM
Sharing at OPEN / CLOSE time
Entries checked and placed in / removed
from lock table
If DASD volume is added as shared (ADD
cuu,SHR), it is added to lock file
VSE & VSAM allow concurrent
processing while protecting against
concurrent updates messing up the file

Integrity classes – your choice
NO INTEGRITY – VSE & VSAM provide no data
protection: it’s all up to you. Your data can be
messed up.
WRITE INTEGRITY – VSE & VSAM protect against
concurrent updates
READ INTEGRITY – VSE & VSAM make sure your
programs always see the latest version of a
record
The price
Higher levels & broader scopes of integrity lead
to more CPU and IO activity

SHAREOPTIONS
Ready – Fire – Aim
Set in DEFINE CLUSTER
Get it wrong & be prepared to suffer
If a disk drive isn’t shared between
VSEs, don’t ADD it with SHR as this
causes lock file IO

SHAREOPTIONS & Locking
SHR(1) 1 output OR many input
External lock at OPEN, unlock at CLOSE
SHR(2) 1 output AND many input
SHR(3) No checking or locking
Prepare for garbage data
SHR(4) Many output in one VSE & many
input OPENs across all VSEs
External lock at access, unlock at release
SHR(4 4) Many output OPENs across all
VSEs + many input OPENs
Locks same as SHR(4)
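The OPEN rules for each SHAREOPTIONS value can be summarised as a small decision function. This is a simplified model of the list above, not VSAM's actual lock table logic: it looks at a single VSE system only, so SHR(4) and SHR(4 4) behave the same here, and the function name and arguments are hypothetical.

```python
def open_allowed(shareoption, want_output, input_opens, output_opens):
    """Would a new OPEN be accepted, given the OPENs already active?
    Single-system toy model of SHR(1) to SHR(4)."""
    if shareoption == 1:
        # 1 output OR many input
        if want_output:
            return input_opens == 0 and output_opens == 0
        return output_opens == 0
    if shareoption == 2:
        # 1 output AND many input
        return (not want_output) or output_opens == 0
    if shareoption in (3, 4):
        # SHR(3): no checking at all; SHR(4): many output within one VSE
        return True
    raise ValueError("SHAREOPTIONS must be 1, 2, 3 or 4")


print(open_allowed(1, want_output=True, input_opens=1, output_opens=0))  # False
print(open_allowed(2, want_output=True, input_opens=5, output_opens=0))  # True
print(open_allowed(2, want_output=True, input_opens=0, output_opens=1))  # False
```

Being allowed to OPEN says nothing about integrity: with SHR(3) every OPEN succeeds and protecting the data is entirely up to you.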
Alternate indexes (AIX)
An AIX is a VSAM KSDS, acting as
a “pointer file” for another file
Target file (“Base Cluster”) can be
KSDS – pointers are KSDS key values
ESDS – pointers are Relative Byte Addrs
Great for multiple or non-unique keys
BUT
Processing via an AIX needs IO to both
the AIX and to the base cluster

Setting up an AIX
DEFINE CLUSTER for base cluster
DEFINE AIX for the alternate index
Give base cluster’s name & alternate key
Data & Index CI sizes
DEFINE PATH
Allows specifying of NOUPGRADE paths
BLDINDEX
Reads primary & alternate key info from base
cluster
Sorts into alternate key sequence
Loads alternate index
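The three BLDINDEX steps (read key pairs, sort, load) can be mimicked in a few lines. This is a conceptual sketch, not IDCAMS: base records are modelled as Python dictionaries, the pointers are primary key values as for a KSDS base cluster (an ESDS would use RBAs), and all names are invented for illustration.

```python
from collections import defaultdict


def build_alternate_index(base_records, primary_field, alternate_field):
    """Return {alternate key: [primary keys]} in alternate key sequence."""
    # 1. Read primary & alternate key info from the base cluster
    pairs = [(rec[alternate_field], rec[primary_field]) for rec in base_records]
    # 2. Sort into alternate key sequence
    pairs.sort()
    # 3. Load the alternate index; non-unique keys collect several pointers
    aix = defaultdict(list)
    for alt_key, prime_key in pairs:
        aix[alt_key].append(prime_key)
    return dict(aix)


base = [
    {"id": 3, "city": "Paris"},
    {"id": 1, "city": "Oslo"},
    {"id": 2, "city": "Paris"},
]
print(build_alternate_index(base, "id", "city"))
# {'Oslo': [1], 'Paris': [2, 3]}
```

Each lookup through the result still needs a second access to the base file to fetch the record itself.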
AIX recommendations
To process the base cluster in AIX
order, it is better to sort it and use
the SORTOUT file
Remember VSAM processes base
clusters directly based on AIX values
Base cluster will need lots of index
buffers for batch processing. Give
Base cluster large BUFFERSPACE on
DEFINE or ALTER
AIX and CICS
“SPHERE” – a base cluster and all the
AIXs related to it
Requirements
Each sphere must be wholly within one
LSR pool
Use Dataset Name Sharing
In CICS 2.3, add BASE= to FCT entry for
Base cluster file entry
Each related path file entry
This is automatic in CICS TS
SHR(2) is usually best
Make sure your CICS and VSAM service
is current!
Contacting the presenter
You can contact me by email at
johnm@e-vse.com
Dan Janda’s website has much of
this info –
http://business.epix.net/~theswami
And, if you want to find me this
evening…

You’ll find me here
