Performance Tuning
Mr. Somraj Chakrabarty (somrajob@gmail.com) 01 September 2016
Senior Consultant
CapGemini India
This article provides a detailed view of the DB2 (LUW) database SORT operation. SORT
operations can occur while application queries run, and they can degrade system performance
and cause high CPU usage. The article describes the different kinds of SORT operation and
SORT overflow, and shows, with test results, how avoiding SORT operations improves
performance and reduces system CPU usage.
• Piped and Non-Piped SORT: A piped SORT returns sorted data without storing it in a temporary
table, whereas a non-piped SORT stores the data in a temporary table before returning it. A piped
SORT keeps its SORTHEAP (a chunk of memory allocated each time a sort is performed)
occupied until the cursor associated with the SORT is closed. Piped sorts reduce disk I/O, so
allowing more piped sorts improves the performance of sort operations and, as a consequence,
of the database.
• Overflowed and Non-Overflowed SORT: If the amount of data to be sorted is larger than
the SORTHEAP, temporary tables are used to hold that data. This situation is called
SORT OVERFLOW. If the data fits in the SORTHEAP, the SORT is non-overflowed.
Because it incurs less I/O, a non-overflowed SORT generally performs better than an
overflowed SORT. An overflowed SORT also temporarily increases usage of the filesystem
or drive that holds the temporary table. Overflowed SORTs need special care because they
can fill the database filesystem very quickly. Frequent changes in the usage of the filesystem
holding the temporary tables are a sign of SORT overflow.
SORTHEAP defines the maximum number of private memory pages to be used for private sorts,
or the maximum number of shared memory pages to be used for shared sorts. It affects
agent private memory for private sorts and database shared memory for shared sorts. SORTHEAP
is the area where data is sorted. Each sort has a separate sort heap that the database manager
allocates as needed. The amount of SORTHEAP memory actually allocated depends on the
optimizer and can be less than the parameter value. When SORTHEAP is configured as
AUTOMATIC, the self-tuning memory manager (STMM) balances the number of private memory
pages used for private sorts and the number of shared memory pages used for shared sorts
within the SORTHEAP value.
If the SHEAPTHRES database manager configuration parameter is set to 0, all sort operations
are performed in shared memory. Otherwise, sort operations are performed in private sort memory.
The SHEAPTHRES configuration parameter sets an instance-wide limit on the total amount of
memory that private sorts can consume at any given time. This limit is soft: when it is reached,
further SORT operations are not prevented, but each new SORT request is assigned a smaller
sort heap until enough other SORTs complete that sort memory usage falls back below the
SHEAPTHRES value. If the SHEAPTHRES configuration parameter is assigned a non-zero value,
SORTHEAP cannot be configured for automatic tuning.
• The SORTHEAP (sort heap) and SHEAPTHRES_SHR (sort heap threshold for shared sorts)
database configuration parameters must not be set to AUTOMATIC, because self-tuning
memory management of sort memory is not currently supported for column-organized table
processing.
• For optimized performance of analytics workloads, SORTHEAP and SHEAPTHRES_SHR
should be set (hard coded) to significantly high values.
• Suggested SORT memory configuration:
Set SHEAPTHRES_SHR to the size of the buffer pool (across all buffer pools).
Set SORTHEAP to some fraction (e.g. 1/10) of SHEAPTHRES_SHR to enable concurrent sort
operations.
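The suggested configuration above can be turned into a small sizing calculation. The sketch below is only illustrative: the function name and the buffer pool figures are hypothetical, and it assumes that both parameters are expressed in 4 KB pages, so buffer pools with larger page sizes are converted first.

```python
# Illustrative sizing helper (hypothetical name and sample figures).
# Assumption: SHEAPTHRES_SHR and SORTHEAP are expressed in 4 KB pages.

PAGE_4K = 4096


def suggest_sort_memory(bufferpools, sortheap_divisor=10):
    """Return (sheapthres_shr, sortheap) in 4 KB pages.

    bufferpools: list of (number_of_pages, page_size_in_bytes) tuples,
    one per buffer pool in the database.
    """
    total_bytes = sum(pages * page_size for pages, page_size in bufferpools)
    sheapthres_shr = total_bytes // PAGE_4K          # size of all buffer pools
    sortheap = sheapthres_shr // sortheap_divisor    # e.g. 1/10 of the threshold
    return sheapthres_shr, sortheap


# Hypothetical database with one 8 KB and one 32 KB buffer pool.
pools = [(50000, 8192), (5000, 32768)]
threshold, heap = suggest_sort_memory(pools)
print(threshold, heap)  # 140000 14000
```

A divisor of 10 allows roughly ten full-size sorts to run concurrently before the shared threshold is reached; the right fraction depends on the workload.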
As a starting point, a database snapshot shows how the database is behaving with respect to
SORT operations.
From the snapshot data, Total SORT per Transaction can be calculated as below
Here the Total SORT per Transaction value comes to 3.65, which suggests the database is well
tuned with respect to SORT operations.
If the Total SORT per Transaction value is greater than 5, the number of SORTs happening per
transaction is high. There may be a few application queries that perform many sorts; those sorts
may not overflow, but they consume a large number of CPU cycles and hurt overall
performance.
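As a rough illustration of this metric, the sketch below assumes the commonly used definition of a transaction as a commit or a rollback, so that Total SORT per Transaction = total sorts / (commit statements + rollback statements). The function names and the sample counter values are hypothetical; only the threshold of 5 comes from the text above.

```python
# Sketch of the "Total SORT per Transaction" check.
# Assumption: transactions = commit statements + rollback statements.

SORTS_PER_TXN_LIMIT = 5  # guideline value used in this article


def sorts_per_transaction(total_sorts, commits, rollbacks):
    """Average number of sorts per transaction (0.0 if no transactions)."""
    transactions = commits + rollbacks
    if transactions == 0:
        return 0.0
    return total_sorts / transactions


def is_sort_heavy(ratio, limit=SORTS_PER_TXN_LIMIT):
    """True when the number of sorts per transaction exceeds the guideline."""
    return ratio > limit


# Hypothetical snapshot counters chosen to give the 3.65 quoted above.
ratio = sorts_per_transaction(total_sorts=3650, commits=900, rollbacks=100)
print(f"{ratio:.2f}", is_sort_heavy(ratio))  # 3.65 False
```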
From the snapshot data, Percentage of SORT Overflow can be calculated as below
Here the Percentage of SORT Overflow comes to 0.14%, which suggests the database is well
tuned with respect to SORT operations.
A Percentage of SORT Overflow above 3 percent for a database indicates severe SORT problems
in the application queries: a large share of SORT operations overflow, and database performance
is below optimum. To avoid this, both SORTHEAP and SHEAPTHRES can be increased. Ideally, a
database should have zero SORT overflows.
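The overflow check can be sketched in the same way. The calculation assumed here is sort overflows * 100 / total sorts, by analogy with the post threshold formula given later in this article; the function names and the sample counters are hypothetical, and the 3 percent limit is the guideline stated above.

```python
# Sketch of the "Percentage of SORT Overflow" check.
# Assumption: percentage = sort overflows * 100 / total sorts.

OVERFLOW_LIMIT_PCT = 3.0  # guideline value used in this article


def sort_overflow_pct(sort_overflows, total_sorts):
    """Percentage of sorts that spilled to temporary tables."""
    if total_sorts == 0:
        return 0.0
    return sort_overflows * 100 / total_sorts


def classify_overflow(pct, limit=OVERFLOW_LIMIT_PCT):
    """Return a short verdict for a given overflow percentage."""
    if pct == 0:
        return "ideal"
    return "severe" if pct > limit else "acceptable"


# Hypothetical counters chosen to give the 0.14% quoted above.
pct = sort_overflow_pct(sort_overflows=14, total_sorts=10000)
print(f"{pct:.2f}%", classify_overflow(pct))  # 0.14% acceptable
```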
Post threshold SORTs are sorts that requested heap memory after the sort heap threshold limit
had already been exceeded. Such a SORT does not get the optimum amount of memory (the
SORTHEAP amount or more) to execute.
Percentage of Post threshold sorts = ((Post threshold sorts * 100) / Total sorts) =
6921 * 100 / 1163139294 ≈ 0.0006%
Here the percentage of post threshold SORTs is very low. Where it is high, the sort heap
threshold limit must be tuned and the queries tuned to generate fewer sorts.
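The arithmetic above is easy to reproduce. The sketch below applies the formula as stated to the two counters quoted in the text; the function name is hypothetical.

```python
# Reproduce the post threshold sort percentage from the counters in the text.

def post_threshold_pct(post_threshold_sorts, total_sorts):
    """Percentage of sorts that started after the sort heap threshold was hit."""
    if total_sorts == 0:
        return 0.0
    return post_threshold_sorts * 100 / total_sorts


pct = post_threshold_pct(post_threshold_sorts=6921, total_sorts=1163139294)
print(f"{pct:.4f}%")  # 0.0006%
```

Because the total sort count is above one billion, even several thousand post threshold sorts amount to well under a thousandth of a percent.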
The SORTHEAP value can be increased within limits; however, a very large (hard coded)
SORTHEAP value can starve other memory parameters and, as a consequence, adversely
affect system performance. SORTHEAP needs to be tuned such that the SORT threshold is at
least twice as large as SORTHEAP. In our environment SHEAPTHRES_SHR is 5000, so we get
the warning below:
D:\>db2 update db cfg using SORTHEAP 32000
SQL5155W The update completed successfully. The current value of SORTHEAP may adversely affect performance.
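The "at least twice as large" guideline can be checked before issuing the update. The sketch below only encodes the rule of thumb stated above; it is not the logic DB2 itself uses to decide when to raise SQL5155W, and the function name is hypothetical.

```python
# Pre-check for the article's guideline: sort threshold >= 2 * SORTHEAP.
# This mirrors the rule of thumb only, not DB2's internal SQL5155W logic.

def threshold_covers_sortheap(sortheap_pages, threshold_pages, factor=2):
    """True when the sort threshold is at least `factor` times SORTHEAP."""
    return threshold_pages >= factor * sortheap_pages


# Values from the test environment: SORTHEAP 32000 with SHEAPTHRES_SHR 5000.
print(threshold_covers_sortheap(32000, 5000))   # False
# The same SORTHEAP with SHEAPTHRES_SHR raised to 70000.
print(threshold_covers_sortheap(32000, 70000))  # True
```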
The effect of changing the SORTHEAP value can be tested on a small table and a large table.
In the SAMPLE database of our test environment, it was tested on a small table, SC1.T1, with a
row count of 0.1 million and a large table, SC1.T2, with a row count of 10 million. Total system
memory is 4 GB. Testing was done with a SORTHEAP value of 16, then 1000, then 5000, then
32000, and finally AUTOMATIC.
SORT activity values were extracted with the query below
db2 "SELECT STMT_TEXT, TOTAL_SORTS, SORT_OVERFLOWS FROM TABLE (MON_GET_PKG_CACHE_STMT ('D', NULL, NULL, -2))" > SORT_STMT.txt
A simple query with DISTINCT and ORDER BY clauses was run on table SC1.T1 to force a SORT
operation
db2 "select DISTINCT COL A, COL B from SC1.T1 ORDER BY COL B"
Running the above query on the small table SC1.T1 with each of the SORTHEAP values gave the
same SORT_STMT.txt output every time, as below
STMT_TEXT TOTAL_SORTS SORT_OVERFLOWS
The same query with DISTINCT and ORDER BY clauses was run on table SC1.T2 to force a SORT
operation
db2 "select DISTINCT COL A, COL B from SC1.T2 ORDER BY COL B"
Running the above query on the large table SC1.T2 with each of the SORTHEAP values gave the
same SORT_STMT.txt output every time, as below
STMT_TEXT TOTAL_SORTS SORT_OVERFLOWS
This test result shows that, for both small and large tables, increasing the SORTHEAP value
while SHEAPTHRES_SHR stays fixed (at 5000) does not avoid SORT overflow.
Next, before increasing SORTHEAP, we increased SHEAPTHRES_SHR to a high value (70000)
and then increased SORTHEAP to 32000. This update completed without any warning.
We then ran the same queries on the tables again and were able to avoid SORT overflow,
although a very low SORTHEAP value such as 16 still cannot avoid it.
Running the above query on the small table SC1.T1 with a SORTHEAP value of 32000 gave the
output below
STMT_TEXT TOTAL_SORTS SORT_OVERFLOWS
With a SORTHEAP value of 16, the same query on the small table SC1.T1 gave the output below
STMT_TEXT TOTAL_SORTS SORT_OVERFLOWS
So, to get optimum performance, all SORT related parameters (SORTHEAP, SHEAPTHRES and
SHEAPTHRES_SHR) need to be tested and tuned. Tuning SORTHEAP alone will not provide
the desired result.
When a database suffers from poor performance, database SORT operations and SORT
overflows should be investigated.
The queries below can be used to monitor and identify database SORTs and SORT overflows.
db2 "SELECT STMT_TEXT, TOTAL_SORTS, SORT_OVERFLOWS FROM TABLE (MON_GET_PKG_CACHE_STMT ('D', NULL, NULL, -2))"
db2 "select agent_id, stmt_text as Statement, total_sort_time, sort_overflows from table (snapshot_statement ('',-1)) as T order by sort_overflows desc fetch first 10 rows only"
db2 "select agent_id, stmt_text as Statement, STMT_SORTS, TOTAL_SORT_TIME, SORT_OVERFLOWS from TABLE (SNAP_GET_STMT ('',-1)) AS T fetch first 10 rows only"
All identified queries that suffer from SORT overflow need to be analyzed with the
DB2ADVIS tool, which can recommend any index(es) that would help to avoid
SORT and/or SORT overflow. SQL tuning is another way to eliminate SORT problems.
A simple query with DISTINCT and ORDER BY clauses was run on table SC1.T1 to force a SORT
operation
db2 "select DISTINCT COL A, COL B from SC1.T1 ORDER BY COL B"
Running the above query on the small table SC1.T1 with the initial SORTHEAP value (256) gave
the same SORT_STMT.txt output every time, as below
STMT_TEXT TOTAL_SORTS SORT_OVERFLOWS
Analysis on this query with DB2ADVIS tool recommends creation of one Index as below
Recommending indexes...
total disk space needed for initial set [ 7.231] MB
total disk space constrained to [ 123.058] MB
Trying variations of the solution set.
1 indexes in current solution
[19705.0000] timerons (without recommendations)
[12836.0000] timerons (with current solution)
[34.86%] improvement
-- LIST OF RECOMMENDED INDEXES
-- ===========================
-- index[1], 7.231MB
CREATE INDEX "SC1"."IDX1602251456370" ON "SC1"."T1"
("COL B" ASC, "COL A" DESC) ALLOW REVERSE SCANS COLLECT SAMPLED DETAILED STATISTICS;
COMMIT WORK ;
The DB2ADVIS output suggests creating one index, with a query cost improvement of only 34.86%
(because the table is small, the new index does not improve the query cost to a large extent).
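The improvement percentage that DB2ADVIS prints can be recomputed from the two timeron figures in its output. The sketch below assumes the improvement is the relative cost reduction, (cost without recommendations - cost with the solution) / cost without recommendations; the function name is hypothetical. It uses the figures reported for SC1.T1 above and for SC1.T2 later in this article.

```python
# Recompute the DB2ADVIS improvement percentage from timeron estimates.
# Assumption: improvement = relative reduction in estimated cost.

def improvement_pct(timerons_before, timerons_after):
    """Relative cost reduction in percent, rounded to two decimals."""
    reduction = timerons_before - timerons_after
    return round(reduction * 100 / timerons_before, 2)


print(improvement_pct(19705.0, 12836.0))   # SC1.T1: 34.86
print(improvement_pct(133050.0, 18071.0))  # SC1.T2: 86.42
```

Both results match the percentages in the tool output, which supports reading the figure as a relative reduction in estimated cost rather than in measured run time.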
After the index IDX1602251456370 was created and RUNSTATS was run, we ran the same query
again, and the SORT_STMT.txt output came as below
STMT_TEXT TOTAL_SORTS SORT_OVERFLOWS
This shows that the newly created index helps the query avoid the SORT operation (and, in turn,
SORT overflow).
The same can be checked with the DB2 explain facility and the db2exfmt tool, as below
db2 set current explain mode explain
db2 "select DISTINCT COL A, COL B from SC1.T1 ORDER BY COL B"
db2exfmt -d SAMPLE -# 0 -w -1 -g TIC -n % -s % -o explnplan.out
db2 set current explain mode no
more explnplan.out
Access plan before the index was created (note the SORT operator):
Rows
RETURN
( 1)
Cost
I/O
|
394931
TBSCAN
( 2)
19705.4
10186
|
394931
SORT
( 3)
17948.4
8580
|
394931
TBSCAN
( 4)
6661.23
6974
|
394931
TABLE: SC1
T1
Q1
Access plan after the index was created (the SORT operator is replaced by an index scan):
Access Plan:
-----------
Total Cost: 2071.36
Query Degree: 1
Rows
RETURN
( 1)
Cost
I/O
|
394931
IXSCAN
( 2)
2071.36
1933.37
|
394931
INDEX: SC1
IDX1602251456370
Q1
The same query with DISTINCT and ORDER BY clauses was run on table SC1.T2 to force a SORT
operation
db2 "select DISTINCT COL A, COL B from SC1.T2 ORDER BY COL B"
Running the above query on the large table SC1.T2 with each of the SORTHEAP values gave the
same SORT_STMT.txt output every time, as below
STMT_TEXT TOTAL_SORTS SORT_OVERFLOWS
Analysis on this query with DB2ADVIS tool produced the below result with creation of one Index
Recommending indexes...
total disk space needed for initial set [ 9.993] MB
total disk space constrained to [ 118.008] MB
Trying variations of the solution set.
1 indexes in current solution
[133050.0000] timerons (without recommendations)
[18071.0000] timerons (with current solution)
[86.42%] improvement
-- LIST OF RECOMMENDED INDEXES
-- ===========================
-- index[1], 9.993MB
CREATE INDEX "SC1"."IDX1602291708390" ON "SC1"."T2"
("COL B" ASC, "COL A" DESC) ALLOW REVERSE SCANS COLLECT SAMPLED DETAILED STATISTICS;
COMMIT WORK ;
The DB2ADVIS output suggests creating one index, with a query cost improvement of 86.42%,
which is quite high.
After the index IDX1602291708390 was created and RUNSTATS was run, we ran the same query
again, and the SORT_STMT.txt output came as below
This shows that the newly created index helps the query avoid the SORT operation (and, in turn,
SORT overflow).
During our test we could see how a new index suggested by the DB2ADVIS tool improves
query performance by avoiding the SORT operation.
This approach actually resolves the database performance issue rather than just masking it.
Let's check the CPU time before and after the indexing activity on table SC1.T1. Benchmark
testing with the db2batch tool can be used to get the CPU usage during the query run before
and after index creation, as below
We ran the query below before and after creating index SC1.IDX1602251456370
D:\>db2batch -d SAMPLE -f db2batch_T1.txt -i complete > T1_Batch_Before.txt
D:\>db2batch -d SAMPLE -f db2batch_T1.txt -i complete > T1_Batch_After.txt
T1_Batch_Before.txt: 686404
T1_Batch_After.txt: 78000
A query with SORT that runs many times every day can have a negative effect on CPU usage;
similarly, creating a new index so that the query avoids the SORT operation can have a
cumulative positive effect and optimize system CPU usage to a great extent.
Now we have two tables with the same data, but one is row organized (SC1.T1) and the other is
column organized (SCC.T1C).
As already described, SORTHEAP and SHEAPTHRES_SHR should be high to get optimized
performance for analytics workloads. In our database SAMPLECOL they are automatically
configured as SHEAPTHRES_SHR 94820 and SORTHEAP 32000. We have now also changed
SHEAPTHRES_SHR to 94820 and SORTHEAP to 32000 in the SAMPLE database, which holds
the row organized table.
A simple query with DISTINCT and ORDER BY clauses was run on table SC1.T1 to force a SORT
operation
db2 "select DISTINCT COL A, COL B from SC1.T1 ORDER BY COL B"
Running the above query on the small table SC1.T1, we can see that the query avoids SORT
overflow but not SORT
STMT_TEXT TOTAL_SORTS SORT_OVERFLOWS
Analysis on this query with DB2ADVIS tool produced the below result with no indexing advice
Recommending indexes...
0 indexes in current solution
[6966.0000] timerons (without recommendations)
[6966.0000] timerons (with current solution)
[0.00%] improvement
-- LIST OF RECOMMENDED INDEXES
-- ===========================
-- no indexes are recommended for this workload.
So, for a row organized table, if the SORT related parameters are increased considerably,
DB2ADVIS does not recommend any index. The query cost also decreases, but SORT cannot be
avoided.
If we run the same query on table SCC.T1C in the SAMPLECOL database, we can see that the
query avoids SORT overflow but not SORT
STMT_TEXT TOTAL_SORTS SORT_OVERFLOWS
Testing this query with DB2ADVIS, we could see that the cost of the same query is much lower
on the column organized table
Recommending indexes...
0 indexes in current solution
[683.0000] timerons (without recommendations)
[683.0000] timerons (with current solution)
[0.00%] improvement
So, on column organized tables, we cannot avoid SORT for this query, nor can we try to avoid it
with a new index. However, the cost of the same query is much lower than on the row organized
table.
Points to remember
1. SORT is a database operation that can occur because of the nature of the queries running in
the database.
2. Too many SORT operations hurt database performance and cause high CPU utilization.
3. SORT data may overflow out of SORT memory into the temporary tablespace, which hurts
database performance and causes temporarily high storage utilization.
4. Increasing SORTHEAP may or may not improve performance by reducing the chance of
SORT overflow.
5. Queries causing SORT need to be identified, analyzed, and tuned with new indexes to
reduce SORT overflow and improve database performance.
6. New indexes can reduce the number of SORTs (overflowing and non-overflowing) and optimize
system CPU utilization.
Acknowledgments
Special thanks to Manish Makwana for review and advice towards writing this article.
Resources
• Learn more about database SORT and SORT overflow from the IBM infocenter.
Suvradeep Sensarma is a DB2 LUW DBA senior consultant with CapGemini India.
He has extensive experience working with customers in performance tuning and
support tips on DB2 on LUW. He is also certified in DB2 10.1 DBA for Linux, UNIX,
and Windows (Exam 611).
Abhinava Mukherjee is a DB2 LUW DBA with CapGemini India supporting multiple
projects in various domains. He has a Master's degree in Computer Application along
with hands-on IT experience in Datacenter Operation and System Administration
working on AIX, Unix, AS-400 and Windows environments. He is also certified in DB2
10.1 Fundamentals (Exam 610).
© Copyright IBM Corporation 2016
(www.ibm.com/legal/copytrade.shtml)
Trademarks
(www.ibm.com/developerworks/ibm/trademarks/)