
SAP SYBASE REPLICATION SERVER

INTERNALS AND PERFORMANCE TUNING


RDP364

Exercises / Solutions
Chris Brown/SAP
Jeff Tallman/SAP
Table of Contents
BEFORE YOU START....................................................................................................................................................... 3

Lab 1: Running Reports & Summary Report ..................................................................................................................... 4

Lab 2: RepAgent & Inbound Queue SQM Analysis ......................................................................................................... 21

Lab 3: SQMR, SQT & DIST Analysis............................................................................................................................... 26

Lab 4: Outbound Queue SQM, DSI & DSIEXEC Analysis .............................................................................................. 31

BEFORE YOU START

Prerequisites for this hands-on session assume that the student has the following skills or knowledge:

• Sybase ASE basic SQL and database administration.

• Basic understanding of data flow through Sybase Replication Server

Note that analysis of SRS Monitor Counter data (aka SRS MC) utilizes an ASE database of approximately 1GB in size
on an ASE instance using a minimum page size of 4KB. Other details of installation and configuration are included in
the release notes. The analysis package with scripts to build the environment can be obtained from
jeff.tallman@sap.com

Exercise Estimated Duration

1. Lab 1: Running Reports & Summary Report 15 minutes

2. Lab 2: RepAgent & Inbound Queue SQM Analysis 10 minutes

3. Lab 3: SQMR, SQT & DIST Analysis 10 minutes

4. Lab 4: Outbound Queue SQM, DSI & DSIEXEC Analysis 10 minutes

Lab 1: Running Reports & Summary Report
Estimated time: 15 minutes

Objective

The objective of this exercise is to

1. Learn how to load captured data for analysis


2. Get familiar with running reports

In addition, due to timing, the template installation needs to be patched to a later release as part of this lab. This is useful practice, as a patch can sometimes be obtained from the instructors.

Exercise Description

The following description serves only as a short overview of the objectives that will be carried out during the exercise.

For example:
• How to truncate tables of any residual data prior to loading new data (which also drops some indexes to speed the load)
• How to load new data into the schema
• How to update index statistics and recreate indexes dropped for data loading
• How to run the summary report and the key sections to review
• How to run the detail report

LAB 1 – SOLUTION

Explanation Screenshot

1) Navigate to student
session files and leave
open:

Select the Student (Share)


shortcut on the desktop
and navigate to
D:\Files\Session\RDP364

2) Open the "Computer" icon

3) Pick ToolsMap Network
Drive. This will cause the
network drive dialog to
popup

4) Return to the student share window, click in the address bar, select/copy (Ctrl-A) the link address, paste it into the network drive link address, and enter "Z:" for the drive letter

5) The result should be a file explorer window resembling this

6) Navigate to Z:\RS_157
folder, shift-right-click on
"setup" and select "Open
command window here" to
open a DOS command
window.

7) Right-click on the icon in the upper left to get the menu and select "Properties"

8) Make sure "QuickEdit
Mode" is selected (along
with "Insert Mode") (do not
press "OK" yet)

9) Select the Layout tab and change the screen buffer sizes to 255 & 3000 respectively, and then change the Window Size to 100 & 50 respectively (do not press "OK" yet)

10) Change the "Screen


Background" and "Screen
Text" to colors of your
preference. Then press
"OK"

11) Now type the DOS
command "dir *.sql" to get
a list of directory contents
of SQL scripts

12) Execute the rs_report_explanations.sql script - note in particular the -w999 and -D options:

isql -Usa -PAbcd1234 -STechEd_ASE -w999 -Drep_analysis_157 -irs_report_explanations.sql

13) Execute the rs_report_headings.sql script (hint: press the up arrow to retrieve the previous command, then use the back arrow and backspace to simply overtype rs_report_explanations.sql):

isql -Usa -PAbcd1234 -STechEd_ASE -w999 -Drep_analysis_157 -irs_report_headings.sql

14) Execute the rs_mc_analysis_procs_157.sql script:

isql -Usa -PAbcd1234 -STechEd_ASE -w999 -Drep_analysis_157 -irs_mc_analysis_procs_157.sql

The reason we executed these three scripts is that we need to patch the 15.7.1 v1.0 analysis package to 15.7.1 v1.0.1

15) Change to the lab directory via cd ..\..\lab

16) Use dir to get a list of
directory contents

17) Execute the truncate_tables.sql script to clear the analysis schema of any residual data:

isql -Usa -PAbcd1234 -STechEd_ASE -w999 -Drep_analysis_157 -itruncate_tables.sql

18) Using Windows Explorer, navigate to Z:\lab, then select bcp_in_rssd.bat and carefully shift-right-click, select "Send to" and "TextPad"

If prompted to run, select "Cancel"

If you accidentally run it, repeat the truncate_tables script and try this step again.

19) Verify the DSQUERY, user, and password at the top of the .bat file (do not close it yet)

Note that there are some commented lines to handle different RSSD schemas for older SRS versions such as 15.2, 15.5 (including 15.6), specifically for rs_databases (critical), rs_objects and rs_columns. For SRS 15.2, there are two versions - one of which covers situations in which SRS was upgraded to 15.2 (which has a slightly different rs_databases schema). If you get load errors on these tables, this is an area to check (make sure you are using the correct bcp line for your SRS version).

20) Verify each bcp line includes a "-Y" option.

Note: we need to do this as the template ASE install for TechEd uses utf8 vs. the default DOS/Windows charset of cp1252 or similar.

Note 2: If "-Y" is not there, the easiest approach is to use Search/Replace and replace all occurrences of -t"|" with -t"|" -Y

21) Back in the DOS window, execute the script. The output should resemble the screenshot below.

Note that if any errors occur, they will scroll to the screen and, in most cases, also get logged in the <table>.err file in the source data directory

22) Execute the update_statistics.sql script:

isql -Usa -PAbcd1234 -STechEd_ASE -w999 -Drep_analysis_157 -iupdate_statistics.sql

Note that any failure along the way can easily be fixed by rerunning the truncate_tables.sql script, the bcp_in_rssd.bat script, and this script again.

This is a critical step - skipping it will result in extremely long execution times as key indexes will be missing. This step also merges the various version schema differences into a single schema.

23) Open the sample_run.sql file. It is a template of the single form of execution for rs_perf_summary as well as the 3 key variants for rs_perf_details.

Since the first execution is just to get a summary to determine connections and backlog, we need to uncomment the 'exec rs_perf_summary' line and make sure all the 'exec rs_perf_details' executions are commented out.
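After this edit, the top of sample_run.sql should look roughly like the sketch below (the rs_perf_details parameter lists are deliberately omitted here - use the commented examples already present in the template rather than these placeholders):

exec rs_perf_summary
go
-- exec rs_perf_details ...   (variant 1: direct PDB to RDB - leave commented out for now)
-- exec rs_perf_details ...   (variant 2: PDB over a route to an RRS - leave commented out)
-- exec rs_perf_details ...   (variant 3: PRS to RDB - leave commented out)
go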

24) Execute the summary report via isql, specifying an output file as shown:

isql -Usa -PAbcd1234 -STechEd_ASE -w999 -Drep_analysis_157 -isample_run.sql -ostudent_sample.out

25) In Windows Explorer, open the file using TextPad. For better viewing, enable line numbers in TextPad by selecting View → Line Numbers.

26) Look at the data distribution and routing section just below the config values (line ~250). Take note of the different databases and how they participate (table repdefs & subscriptions vs. DB repdefs, etc.)

**************************************************************
*                                                            *
*       Data Distribution & Routing Report Sections          *
*                                                            *
**************************************************************

Databases in this Replication Server publish data to ...:

source database       destination database    RRS Name    #Tables #Repdef #Subscr
--------------------- ----------------------- ----------- ------- ------- -------
ASE1.tpcc (103)       ASE2.tpcc (102)         (local)          17      17      17

Databases in this Replication Server subscribe to data from ...:

destination database     source database      PRS Name    #Tables #Repdef #Subscr
------------------------ -------------------- ----------- ------- ------- -------
ASE2.tpcc (102)          ASE1.tpcc (103)      (local)          17      17      17

Answer: Note that there is a source & destination database listed - these will be the input to the detail report. Because there isn't a route, we will use the PDS.PDB, RDS.RDB variant of the detail report. If a route existed, we would need to use the PRS or RRS name as either the source or target input parameter instead of the database for the detail report.

27) Scroll down through the report looking at various sections until you get to ~line 350 where the backlog summary is listed. Note that this backlog is based on the number of 'active' segments and we would have to look closely at the real counter data to see if an actual backlog exists.

28) Scroll down to the next section (line ~380) which reports SRS memory allocation.

29) Considering the values to the right, does it appear that any areas are over- or under-configured memory-wise? Which configuration values should be adjusted?

ExecCmds   NRMSQMRqst SQMCmdCach SQMPgCache DIST SQT   MD Request DSI SQT
---------- ---------- ---------- ---------- ---------- ---------- ----------
       2.0       16.0      200.0      256.0      200.0        1.0       29.9
       2.0       16.0      200.0      256.0      200.0        1.0        0.0
       2.0       16.0      200.0      256.0      200.0        1.0       29.9
---------- ---------- ---------- ---------- ---------- ---------- ----------
       6.0       48.0      600.0      768.0      600.0        3.0       59.8

Answer: SQMCmdCache (sqm_cmd_cache_size) and SQMPageCache (sqm_cache_size * sqm_page_size * block_size) seem to be very high - it could have been due to trying to address the apparent inbound queue latency - or it could be a cause of it. In addition, the DSI SQT cache (dsi_sqt_max_cache_size) seems to be a tad low for a performance system

30) Note the following output. Does this RS look like it is over-configured for memory_limit?

Note the formulas for calculating memory usage - this should help users who may only be familiar with (or assume) that SQT cache is the key component, when in reality it is just a small piece.

Configured server memory_limit is 2621440MB
Maximum memory used was 2002.700MB or 0.000% of memory configured

ExecCmds   -> exec_max_cache_size
NRMSQMRqst -> exec_nrm_request_limit + exec_sqm_write_request_limit
SQMCmdCach -> sqm_cmd_cache_size
SQMPgCache -> sqm_cache_size * sqm_page_size * block_size * 1024 * 2 (IBQ & OBQ)
DIST SQT   -> dist_sqt_max_cache_size (sqt_max_cache_size)
MD Request -> md_sqm_write_request_limit
DSI SQT    -> dsi_sqt_max_cache_size (sqt_max_cache_size)
SQT PRS    -> sqt_max_prs_size
DSI CDB    -> dsi_cdb_max_size

In addition, other memory allocators such as dsi_cmd_batch_size and dsi_xact_group_size affect the final total, but are not included in the above values (so actual memory allocation/consumption may be higher).

Answer: Yes - this is typical when a DBA is throwing memory at the problem, but SRS really isn't using it. It could probably be reconfigured to only 8GB without impact (the reason it is this high vs. 2GB will be discussed later)

31) Now, re-edit the sample_run.sql script and comment out the summary execution and uncomment the first rs_perf_details execution.

Note the following use cases for the variants:

1 → used for direct PDB to RDB replication with no routes involved

2 → used for PDB replication over a route in which we want to analyze the PDB/inbound processing. Note that the RRS is specified vs. the RDB.

3 → used for PRS to RDB in which we want to analyze the outbound latency performance. Notice that the PRS is specified vs. the PDB.

32) Re-run the report script, but change the output file name for the detailed report:

isql -Usa -PAbcd1234 -STechEd_ASE -w999 -Drep_analysis_157 -isample_run.sql -ostudent_detail.out

33) Edit the output and re-enable line numbers if they are not enabled. Notice the Execution Syntax section which lists other optional parameters that can be used and the values (mostly defaults) that were used for this execution. Generally, the most common use is to narrow the analysis to a smaller date/time range where a problem is occurring.

34) Scroll down to the connection configurations - take note of connection-level changes from the server configs. Should we change any of these?

Source Connection Configuration Values: ASE1.tpcc

Configuration Option Name                         Value
------------------------------------------------- -------------------------------

Destination Connection Configuration Values: ASE2.tpcc

Configuration Option Name                         Value
------------------------------------------------- -------------------------------
dsi_num_large_xact_threads                        0
dsi_num_threads                                   1
dsi_serialization_method                          wait_for_commit
dsi_sqt_max_cache_size                            0
parallel_dsi                                      off

Answer: Yes - from earlier observations about dsi_sqt_max_cache_size being fairly low, we probably should change it…we just don't know to what yet (later discussion)
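When we do decide on a value, raising it is a connection-scope change applied with RCL along the lines of the sketch below (the 100MB value shown is purely illustrative - the appropriate number is discussed later in the session):

suspend connection to ASE2.tpcc
go
alter connection to ASE2.tpcc set dsi_sqt_max_cache_size to '104857600'
go
resume connection to ASE2.tpcc
go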

35) Look at the intervals section to get an idea of how many samples per interval as well as rough time alignments with intervals.

Report Intervals: 15 (duration in seconds)

Int Label                Samples Duration Start DateTime       End DateTime
--- -------------------- ------- -------- -------------------- --------------------
  1 14:49:01 -> 14:51:33       3      150 Jul 10 2013 14:50:32 Jul 10 2013 14:51:33
  2 14:51:34 -> 14:53:03       3       90 Jul 10 2013 14:51:34 Jul 10 2013 14:53:03
  3 14:53:04 -> 14:54:34       3       90 Jul 10 2013 14:53:04 Jul 10 2013 14:54:34
  4 14:54:35 -> 14:56:05       3       90 Jul 10 2013 14:54:35 Jul 10 2013 14:56:05
  5 14:56:06 -> 14:57:36       3       90 Jul 10 2013 14:56:06 Jul 10 2013 14:57:36
  6 14:57:37 -> 14:59:07       3       90 Jul 10 2013 14:57:37 Jul 10 2013 14:59:07
  7 14:59:08 -> 15:00:37       3       90 Jul 10 2013 14:59:08 Jul 10 2013 15:00:37
  8 15:00:38 -> 15:02:08       3       90 Jul 10 2013 15:00:38 Jul 10 2013 15:02:08
  9 15:02:09 -> 15:03:39       3       90 Jul 10 2013 15:02:09 Jul 10 2013 15:03:39
 10 15:03:40 -> 15:05:10       3       90 Jul 10 2013 15:03:40 Jul 10 2013 15:05:10
 11 15:05:11 -> 15:06:41       3       90 Jul 10 2013 15:05:11 Jul 10 2013 15:06:41
 12 15:06:42 -> 15:08:11       3       90 Jul 10 2013 15:06:42 Jul 10 2013 15:08:11
 13 15:08:12 -> 15:09:42       3       90 Jul 10 2013 15:08:12 Jul 10 2013 15:09:42
 14 15:09:43 -> 15:10:13       1       30 Jul 10 2013 15:09:43 Jul 10 2013 15:10:13

36) Look at the throughput/latency report (shortened version to the right). How do the throughput rates compare across the modules? Where does the rate first change? Is there any real latency in the inbound queue? Outbound queue? Why is there a jump in DSI throughput when inbound finishes?

**************************************************************
*                                                            *
*       Throughput Rate/Latency Summary Report Section       *
*                                                            *
**************************************************************

ASE1.tpcc (103) --> ASE2.tpcc (102)
RS Performance Summary (cmds/sec, backlog in MB):

Interval         RepAgent     SQM(w)  Active(i) Backlog(i)  Active(o)        DSI
-------------- ---------- ---------- ---------- ---------- ---------- ----------
(1)  14:49:01           0          0       2009          0         73          0
(2)  14:51:34        7553       7561          1          0          1       1395
(3)  14:53:04        7451       7472        177          0         59       1874
(4)  14:54:35        8631       8636        646          0        235       1896
(5)  14:56:06        8143       8150       1087          0        400       1924
(6)  14:57:37        8495       8502       1549          0        575       1862
(7)  14:59:08        8180       8186       1993          0        743       1922
(8)  15:00:38        7230       7234       2386          0        891       1930
(9)  15:02:09           0          0       2386          0        891       2318
(10) 15:03:40           0          0       2021          0        861       2371
(11) 15:05:11           0          0       1759          0        823       2381
(12) 15:06:42           0          0       1759          0        783       2368
(13) 15:08:12           0          0       1759          0        745       2402
(14) 15:09:43           0          0       1759          0        706       2544
-------------- ---------- ---------- ---------- ---------- ---------- ----------
                     6187       6193       2386          0        891       2091

Answer: The rates are nearly identical for all modules other than the DSI. This means that everything is keeping up - except, of course, the DSI. The inbound queue doesn't have any latency - just a delay in recovering disk space - we can tell this because the DIST has the same rate as the inbound writer SQM. If it didn't, then it would be possible that any backlog was being masked by the size of the SQT cache, and that would explain the active segments. The outbound queue has real latency driven by DSI throughput issues. The jump could be due to two possible causes: (1) when the inbound side becomes quiescent, there is more CPU available to outbound threads; (2) internal contention on STS structures between the DSI and the DIST/RepAgent User. #2 is more likely with this few connections.

37) You have completed the exercise! You are now able to:

• Load data into the analysis schema
• Run a sample report to identify key connections with latency
• Review memory utilization from the summary report
• Run a detailed report and review the throughput/latency summary

Lab 2: RepAgent & Inbound Queue SQM Analysis
Estimated time: 10 minutes

Objective

The objective of this exercise is to analyze potential bottlenecks in the RepAgent and inbound queue SQM writer. The
reason this is important is that if there is any latency in this section of processing, it will build up in the primary
database log - which could threaten the availability of the primary database.

Exercise Description

The following description serves only as a short overview of the questions the student will learn to answer during the exercise - questions that are commonly asked in a normal performance situation.

• Should we change any RepAgent configs (packet size, scan_batch_size)?
• How much would using the NRM thread help?
• Would async parsing help more or not so much?
• Were there any issues with write waits between the RepAgent User & SQM?
• Should sqm_recover_segs be adjusted?
• Would a larger block_size configuration help?
• How did the rate of segment allocations/deallocations appear?

EXERCISE 2 – SOLUTION

Explanation Screenshot

**************************************************************
1) Open the earlier detail * *
* RepAgent User (EXEC) Report Section *
report and scroll to first * *
**************************************************************
section (~line 510)…it
should look like what is to the right
(truncated).
RepAgent Processing Time Summary (time in seconds)
Source Connection: ASE1.tpcc (103)
How much of each sample
Interval RAExecTime ExecutorTm YieldTime WaitMemTm PcktRecvTm ParseTime
interval was RepAgent ------------------------------ ---------- ---------- ---------- ---------- ---------- ----------
(1) 14:49:01 -> 14:51:33 0.000 0.000 0.000 0.000 0.000 0.000
user running? (2) 14:51:34 -> 14:53:03
(3) 14:53:04 -> 14:54:34
28.265
27.706
1.678
1.982
0.000
0.000
0.000
0.000
0.774
0.803
12.982
12.801
(4) 14:54:35 -> 14:56:05 32.333 2.179 0.000 0.000 0.959 14.924
(5) 14:56:06 -> 14:57:36 30.707 2.073 0.000 0.000 0.899 14.151
Is the NRM thread (6) 14:57:37 -> 14:59:07
(7) 14:59:08 -> 15:00:37
32.285
31.591
2.162
2.532
0.000
0.000
0.000
0.000
0.959
0.931
14.956
14.432
enabled? If there was RA (8) 15:00:38 -> 15:02:08
(9) 15:02:09 -> 15:03:39
27.192
0.007
1.816
0.006
0.000
0.000
0.000
0.000
0.792
0.000
12.524
0.000
latency, would it help? (10) 15:03:40 -> 15:05:10
(11) 15:05:11 -> 15:06:41
0.002
0.000
0.002
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
Would async parsing help (12) 15:06:42 -> 15:08:11
(13) 15:08:12 -> 15:09:42
0.000
0.001
0.000
0.001
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
more? What other options (14) 15:09:43 -> 15:10:13
------------------------------
0.000
----------
0.000
----------
0.000
----------
0.000
----------
0.000
----------
0.000
----------
exist for reducing the 210.089 14.431 0.000 0.000 6.117 96.770

parse time without using


Answers:
async parsing? (1) ~30 secs out of 1.5 mins or ~33%
(2) no - cfg at line 144 shows nrm_thread=off
(3) Yes…NRM+PAK is 38+14=52 out of 210 secs or max improvement of 20-25%. However, async parsing
would help more as it is 96 secs of 210 (nearly 50%) just for parsing plus the 52 for total 96+52=148 out of 210
(70%) even without multiple PRS threads.
(4) The EXEC command cache (exec_max_cache_size and sp_config_repagent 'ltl metadata reduction') can
reduce parsing time by using the command cache to cache frequently replicated commands. Another option is to
use MPR and multiple rep agents to distribute the load between different paths, but this was not discussed in
class
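As an illustration of answer (4) only - the values and the exact scope of each parameter are assumptions that should be verified against your SRS/ASE versions - the RS-side command cache and the RepAgent-side LTL metadata reduction would be set roughly as follows:

-- In Replication Server (RCL); the 2MB value is illustrative:
alter connection to ASE1.tpcc set exec_max_cache_size to '2097152'
go

-- In the primary ASE (Transact-SQL), for the tpcc database:
sp_config_rep_agent tpcc, 'ltl metadata reduction', 'true'
go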

RepAgent Throughput Summary


2) Scroll to the next Source Connection: ASE1.tpcc (103)

RepAgent User/EXEC Interval


------------------------------
CmdsRecv
----------
Cmds/Sec
----------
MB Recv
----------
MB/sec
----------
RAScanSize
----------
UpdLocater
----------
Upd/Min
----------
(1) 14:49:01 -> 14:51:33 0 0.0 0.00 0.00 0 0 0.0
section - ~line 560. (2) 14:51:34 -> 14:53:03
(3) 14:53:04 -> 14:54:34
679787
670618
7553.2
7451.3
187.08
184.51
2.08
2.05
750
750
906
894
604.0
596.0
(4) 14:54:35 -> 14:56:05 776805 8631.2 213.89 2.38 750 1035 690.0
(5) 14:56:06 -> 14:57:36 732865 8142.9 201.59 2.24 751 975 650.0
(6) 14:57:37 -> 14:59:07 764537 8494.9 210.49 2.34 751 1018 678.7
What is the peak rate? (7) 14:59:08 -> 15:00:37
(8) 15:00:38 -> 15:02:08
736179
650695
8179.8
7229.9
202.52
179.08
2.25
1.99
751
750
980
867
653.3
578.0
(9) 15:02:09 -> 15:03:39 4 0.0 0.00 0.00 1 3 2.0
The apparent average (10) 15:03:40 -> 15:05:10
(11) 15:05:11 -> 15:06:41
6
0
0.1
0.0
0.00
0.00
0.00
0.00
3
0
2
0
1.3
0.0
rate? How many GB were (12) 15:06:42 -> 15:08:11
(13) 15:08:12 -> 15:09:42
0
0
0.0
0.0
0.00
0.00
0.00
0.00
0
0
0
1
0.0
0.7
(14) 15:09:43 -> 15:10:13 0 0.0 0.00 0.00 0 0 0.0
received total? How long ------------------------------ ----------
5011496
----------
6187.0
----------
1379.16
----------
1.70
----------
584
----------
6681
----------
494.8
did it take? Why are we
updating the locater? Answers:
(1) ~8631 cmds/sec
Does ~650 updates to the (2) Report says 6187, but that is due to low commands on intervals 9&10 - actual average is probably
locater per minute seem ~8000cmds/sec
(3) ~1.3GB
excessive? How could we (4) ~10 minutes (~10GB/hr)
decrease that? Were (5) Ensures RepAgent fwd'd cmds are in the inbound queue before RepAgent moves the 2nd truncation point.
there any "alerts" about (6) Yes…considering we would like PDB recovery to be within a few minutes (~5?), we are moving the secondary
truncation point 10 times per second. We could decrease this by a factor of 100 and still not impact database
this? recovery speed (it would be ~10 seconds at 6 times per minute)
(7) Increase the RepAgent's configured 'scan_batch_size'
(8) Yes - in fact, it suggests that we increase the 'scan_batch_size'.

RepAgent Packet Processing
3) Scroll to the next section Source Connection: ASE1.tpcc (103)

(Packet processing) @ Interval Packets PacketSize Cmds/Packe PcktRecvTm ms/Packet


------------------------------ ---------- ---------- ---------- ---------- ----------
~line 614 (1) 14:49:01 -> 14:51:33 0 0 0.0 0.000 0.0
(2) 14:51:34 -> 14:53:03 109936 1784 6.0 0.774 0.0
(3) 14:53:04 -> 14:54:34 115635 1673 6.0 0.803 0.0
(4) 14:54:35 -> 14:56:05 130983 1712 6.0 0.959 0.0
What is the derived (5) 14:56:06 -> 14:57:36 123973 1705 6.0 0.899 0.0
(6) 14:57:37 -> 14:59:07 128148 1722 6.0 0.959 0.0
packet size? Does this (7) 14:59:08 -> 15:00:37 124081 1711 6.0 0.931 0.0
(8) 15:00:38 -> 15:02:08 109437 1715 6.0 0.792 0.0
seem efficient? Given the (9) 15:02:09 -> 15:03:39 6 135 1.0 0.000 0.0
(10) 15:03:40 -> 15:05:10 6 188 1.0 0.000 0.0
packet size and (11) 15:05:11 -> 15:06:41 0 0 0.0 0.000 0.0
(12) 15:06:42 -> 15:08:11 0 0 0.0 0.000 0.0
commands per packet, (13) 15:08:12 -> 15:09:42
(14) 15:09:43 -> 15:10:13
1
0
36
0
0.0
0.0
0.000
0.000
0.0
0.0
what can we infer about ------------------------------ ----------
842206
----------
1238
----------
4.0
----------
6.117
----------
0.0
the relative size of the
commands being
Answers:
replicated? Compare to (1) ~1700 bytes - which implies the real packet size is probably 2048 (the default). Remember, we don't split
the LTL size in the next commands across packets, consequently the actual effective size is less than the configuration.
(2) no - not really. While we are getting about 6 commands per packet, this could be better
section. (3) The commands & relative width of tables being modified are likely fairly narrow. The next section shows that
we are getting <300 bytes per command (which including overhead suggest less than 200 bytes per command).

RepAgent Normalization & SQM Packing


4) Scroll down to the Source Connection: ASE1.tpcc (103)

Normalization and Interval


------------------------------
CmdsDirect
----------
WriteWaits
----------
WritWaitTm
----------
RAWaitNRM
----------
WaitNRMTm
----------
RANrmTime
----------
RAPackTime
----------
(1) 14:49:01 -> 14:51:33 0 0 0.000 0 0.000 0.000 0.000
Packing section (~line (2) 14:51:34 -> 14:53:03
(3) 14:53:04 -> 14:54:34
679168
670167
1
0
0.744
0.000
0
0
0.000
0.000
5.110
5.016
1.955
1.933
(4) 14:54:35 -> 14:56:05 776300 0 0.000 0 0.000 5.927 2.270
736) (5) 14:56:06 -> 14:57:36
(6) 14:57:37 -> 14:59:07
732332
763898
0
0
0.000
0.000
0
0
0.000
0.000
5.578
5.860
2.140
2.233
(7) 14:59:08 -> 15:00:37 735465 0 0.000 0 0.000 5.653 2.140
(8) 15:00:38 -> 15:02:08 650279 0 0.000 0 0.000 4.990 1.918
(9) 15:02:09 -> 15:03:39 4 0 0.000 0 0.000 0.000 0.000
Were there any write (10) 15:03:40 -> 15:05:10
(11) 15:05:11 -> 15:06:41
6
0
0
0
0.000
0.000
0
0
0.000
0.000
0.000
0.000
0.000
0.000
waits? Which (12) 15:06:42 -> 15:08:11
(13) 15:08:12 -> 15:09:42
0
0
0
0
0.000
0.000
0
0
0.000
0.000
0.000
0.000
0.000
0.000
(14) 15:09:43 -> 15:10:13 0 0 0.000 0 0.000 0.000 0.000
configuration would this ------------------------------ ----------
5007619
----------
1
----------
0.744
----------
0
----------
0.000
----------
38.134
----------
14.589
refer to? Why aren't there
any waits for Answers:
(1) Yes - 1. Hardly worth worrying about.
normalization? What do (2) exec_sqm_write_request_limit
you think CmdsDirect (3) Because nrm_thread is configured off, the NRM thread isn't running
(4) The number of commands the RepAgent user thread is placing into the SQM command cache to facilitate
refer to? direct replication. The trick will be comparing this to the DIST utilization to see how effective this is.

Inbound Queue SQM Writes (Cmds)


5) Scroll down to the first Source Connection: ASE1.tpcc (103)

SQM inbound section Interval


------------------------------
WrRequests
----------
PeakReqst
----------
CmdsWritn
----------
Cmds/Sec
----------
MBytesWr
----------
PAKCmdSize
----------
UnPackSize
----------
(1) 14:49:01 -> 14:51:33 0 0 0 0.0 0.00 0 0
(~line 786) (2) 14:51:34 -> 14:53:03
(3) 14:53:04 -> 14:54:34
679159
671906
312245
268064
680524
672438
7561.3
7471.5
388.88
384.05
475
474
599
599
(4) 14:54:35 -> 14:56:05 776314 276222 777249 8636.1 444.46 475 600
(5) 14:56:06 -> 14:57:36 732332 276002 733477 8149.7 418.83 474 599
(6) 14:57:37 -> 14:59:07 763906 273402 765142 8501.5 437.65 475 600
How many GB were (7) 14:59:08 -> 15:00:37
(8) 15:00:38 -> 15:02:08
735466
650279
277247
263459
736773
651077
8186.3
7234.1
420.73
372.02
474
475
599
599
(9) 15:02:09 -> 15:03:39 4 3 4 0.0 0.00 103 240
written to disk? How does (10) 15:03:40 -> 15:05:10
(11) 15:05:11 -> 15:06:41
6
0
3
0
6
0
0.0
0.0
0.00
0.00
104
0
245
0
this compare to RepAgent (12) 15:06:42 -> 15:08:11
(13) 15:08:12 -> 15:09:42
0
0
0
0
0
0
0.0
0.0
0.00
0.00
0
0
0
0
(14) 15:09:43 -> 15:10:13 0 0 0 0.0 0.00 0 0
User GB received? How ------------------------------ ----------
5009372
----------
312245
----------
5016690
----------
6193.3
----------
2866.62
----------
392
----------
519
effective is the command
packing? At ~8000 Answers:
cmds/sec and ~475 bytes (1) About 2.8GB.
per SQM cmd, how many (2) 2.8/1.3 = ~2.15x bigger
(3) Not very - it isn't done for this reason - the real reason packing is done is to put the binary command structure
commands will the into a format for writing to disk.
exec_sqm_write_request_ (4) ~17,500 commands
(5) slightly more than 2 seconds!!! Not very long - hence all the comments about near-synchronous processing
limit buffer? About how as it doesn't take long before RepAgent in the PDB is impacted if the inbound queue speed slows at all. If this
many seconds will the happens, the WriteWaits (SQM or NRM - discussed above) would reflect the RepAgent User waiting for the SQM
writer to catch up. The lack of WriteWaits indicates at this time the write speed to disk is not the issue - if the
buffer work until RepAgent is lagging, it is either due to the RepAgent itself, the network latency or the RepAgent User
RepAgent throughput is processing.
impacted?
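The arithmetic behind answers (4) and (5), assuming this template has exec_sqm_write_request_limit set to roughly 8MB (an assumption inferred from the ~17,500 figure - the shipped default is smaller):

8388608 bytes / ~475 bytes per packed command  =  ~17,600 commands buffered
~17,600 commands / ~8000 cmds/sec              =  ~2.2 seconds of buffering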

Inbound Queue SQM Writes (Blocks)
6) Scroll to the next SQM Source Connection: ASE1.tpcc (103)

writer section (Blocks) Interval BlocksWrtn Blocks/Sec Cmds/Block KBytes/Blk FullWrite TimerPop
------------------------------ ---------- ---------- ---------- ---------- ---------- ----------
near line 820 (1) 14:49:01 -> 14:51:33 0 0.0 0.0 0.00 0 0
(2) 14:51:34 -> 14:53:03 26162 290.6 26.0 15.20 26155 7
(3) 14:53:04 -> 14:54:34 25828 286.9 26.0 15.20 25822 6
(4) 14:54:35 -> 14:56:05 29885 332.0 26.0 15.20 29880 4
What is the current (5) 14:56:06 -> 14:57:36 28159 312.8 26.0 15.20 28151 8
(6) 14:57:37 -> 14:59:07 29441 327.1 25.9 15.20 29434 7
block_size? Is this (7) 14:59:08 -> 15:00:37 28303 314.4 26.0 15.20 28289 14
(8) 15:00:38 -> 15:02:08 25032 278.1 26.0 15.20 25018 14
efficient with respect to (9) 15:02:09 -> 15:03:39 3 0.0 1.3 0.30 0 3
(10) 15:03:40 -> 15:05:10 2 0.0 3.0 0.70 0 2
the number of commands (11) 15:05:11 -> 15:06:41 0 0.0 0.0 0.00 0 0
(12) 15:06:42 -> 15:08:11 0 0.0 0.0 0.00 0 0
per block? Consider the (13) 15:08:12 -> 15:09:42
(14) 15:09:43 -> 15:10:13
0
0
0.0
0.0
0.0
0.0
0.00
0.00
0
0
0
0
blocks per second written ------------------------------ ----------
192815
----------
237.9
----------
20.6
----------
11.93
----------
192749
----------
65
- does this seem high?
What does TimerPop Answers:
refer to? Should we (1) Probably 16K since we are getting ~15.2KB/block
(2) Well…sort of - we have small commands, so we are getting ~26 commands/block…but this is streaming data,
change that so the fewer writes and the more commands per write the better.
configuration? (3) Since there are 64 blocks per segment, we are running about 4 segments per second - ouch - a bit high.
Both this and #2 could be improved by increasing the block_size to 256 (moving to 32 would only help a little -
we are always looking for significant changes).
(4) blocks that were flushed to disk because the timer controlled by init_sqm_write_delay and
init_sqm_write_max_delay expired before the block filled.
(5) obviously NOT! If a lot of blocks were written due to TimerPop and we were concerned about disk space
utilization efficiency, then we could increase those configurations to allow writes to be delayed longer. In our
case, the writes due to time expiration are negligible - especially during peak processing which is when we
should be considering.
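For completeness, the timers referenced in answers (4) and (5) are server-level configurations; if they ever did need adjusting (no change is warranted in this exercise), the RCL would look roughly like this, with illustrative values:

configure replication server set init_sqm_write_delay to '2000'
go
configure replication server set init_sqm_write_max_delay to '10000'
go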

Inbound Queue SQM Write Speed (Time in secs)


7) Scroll down to ~line 850 Source Connection: ASE1.tpcc (103)

and the section on Write Interval


------------------------------
WriteTime
----------
ms/Write
----------
MB/Sec
----------
WaitSegTm
----------
ms/Alloc
----------
AddCmdTm
----------
CallbackTm
----------
(1) 14:49:01 -> 14:51:33 0.000 0.0 0.00 0.000 0.0 0.000 0.000
Speed (2) 14:51:34 -> 14:53:03
(3) 14:53:04 -> 14:54:34
6.098
7.810
0.2
0.3
63.77
49.17
1.023
1.095
0.3
0.3
12.485
12.029
12.485
12.029
(4) 14:54:35 -> 14:56:05 8.752 0.2 50.78 1.311 0.3 13.639 13.639
(5) 14:56:06 -> 14:57:36 8.062 0.2 51.95 1.156 0.3 13.007 13.007
(6) 14:57:37 -> 14:59:07 8.791 0.2 49.78 1.234 0.3 13.647 13.647
Does the disk speed (7) 14:59:08 -> 15:00:37
(8) 15:00:38 -> 15:02:08
8.642
7.215
0.3
0.2
48.68
51.56
1.222
1.033
0.3
0.3
13.513
11.332
13.513
11.332
(9) 15:02:09 -> 15:03:39 0.001 0.3 0.00 0.000 0.0 0.000 0.000
seem okay? What does (10) 15:03:40 -> 15:05:10
(11) 15:05:11 -> 15:06:41
0.000
0.000
0.0
0.0
0.00
0.00
0.000
0.000
0.0
0.0
0.000
0.000
0.000
0.000
WaitSegTm refer to? (12) 15:06:42 -> 15:08:11
(13) 15:08:12 -> 15:09:42
0.000
0.000
0.0
0.0
0.00
0.00
0.000
0.000
0.0
0.0
0.000
0.000
0.000
0.000
(14) 15:09:43 -> 15:10:13 0.000 0.0 0.00 0.000 0.0 0.000 0.000
How much of an impact ------------------------------ ----------
55.371
----------
0.2
----------
40.63
----------
8.074
----------
0.2
----------
89.652
----------
89.652
does that have per
segment? Is more time Answers:
spent filling out the blocks (1) Yes - nominally, 2-8ms/IO is acceptable - but with a SAN with write cache, we would expect writes to be
<2ms/IO. We are well below that.
by adding commands or (2) The amount of time spent waiting for a new segment to be allocated before we can write the next block to
writing the blocks to disk? disk (1 segment = 64 blocks)
(3) Very little - about 0.3ms/segment allocated - which means RSSD speed is not an issue. However, given the
How busy is the SQM volume of segments allocated, it does add up
(e.g. is it busy 100% or is (4) Adding commands to block…but not by much ~12 seconds vs. ~8 seconds.
(5) If we add up the times, we are using 8+1+13+13=35 seconds out of 1.5 minutes (90 seconds) - so we are
there more capacity) only using ~1/3rd of each interval - considerable capacity to spare.

Inbound Queue SQM Segment Processing (Time in secs)
8) Scroll down to the Source Connection: ASE1.tpcc (103)

Segment processing Interval


------------------------------
SegsActive
----------
Allocated
----------
Deallocatd
----------
TimeNewSeg
----------
AvgNewSeg
----------
Affinity
----------
UpdsRsOQID
----------
UpOQID/Min
----------
(1) 14:49:01 -> 14:51:33 2009 0 0 0.000 0.000 0 0 0.0
section (~line 920) (2) 14:51:34 -> 14:53:03
(3) 14:53:04 -> 14:54:34
1
177
408
404
2647
0
68.117
89.967
0.168
0.224
0
0
41
40
27.3
26.6
(4) 14:54:35 -> 14:56:05 646 467 0 89.734 0.193 0 46 30.6
(5) 14:56:06 -> 14:57:36 1087 439 0 89.573 0.205 0 44 29.3
(6) 14:57:37 -> 14:59:07 1549 459 0 89.682 0.196 0 46 30.6
What is the peak number (7) 14:59:08 -> 15:00:37
(8) 15:00:38 -> 15:02:08
1993
2386
441
390
0
0
89.432
75.851
0.204
0.195
0
0
45
39
30.0
26.0
(9) 15:02:09 -> 15:03:39 2386 0 0 0.000 0.000 0 0 0.0
of active segments? How (10) 15:03:40 -> 15:05:10
(11) 15:05:11 -> 15:06:41
2021
1759
0
0
627
0
0.000
0.000
0.000
0.000
0
0
0
0
0.0
0.0
does the rate of (12) 15:06:42 -> 15:08:11
(13) 15:08:12 -> 15:09:42
1759
1759
0
0
0
0
0.000
0.000
0.000
0.000
0
0
0
0
0.0
0.0
(14) 15:09:43 -> 15:10:13 1759 0 0 0.000 0.000 0 0 0.0
deallocation match the ------------------------------ ----------
2386
----------
3008
----------
3274
----------
592.356
----------
0.153
----------
0
----------
301
----------
22.3
rate of allocation? What
feature in RS 15.7 affects Answers:
(1) 2386.
this? What does (2) We are allocating ~400 segments per interval - yet deallocation seems practically stalled
UpdsRsOQID refer to? (3) SRS 15.7 added a new daemon to asynchronously delete segments. In this case, it doesn't seem to be
reacting very quickly. If the space consumption was an issue (e.g. we really don't have any latency), we could
Does the rate seem turn it off via sqm_async_seg_delete, but this increases the workload of the SQM thread - but we have excess
excessive? How can we capacity - not suggesting we do this - in fact at this point it may be simply patience is necessary as it appears as
change this? soon as activity starts in earnest, it does in fact clear old segments (so the async daemon may only be active
when the SQM itself is active). Consequently, the alert about disk space usage in this case is ignorable.
(4) For RS recovery processing, we record the segment allocation in rs_oqid table in the RSSD. The
UpdsRsOQID represents how often we are updating this table.
(5) Yes - nominally, we would like RS recovery to be similar to database recovery - e.g. a few (low) minutes per
queue for recovery is fine. At ~30 updates per minute, we are looking at ~2 second recovery speed.
(6) We could increase sqm_recover_segs - but since it is already set to 10 (the maximum recommended), we
probably need to instead increase the block_size to 256. This will have the effect of increasing the size of a
segment from 1MB to 16MB (256KB blocks are 16x larger…and we still have 64 blocks/segment), which means
we will allocate 16x fewer segments for the same data volume…and consequently have 16x lower updates
(which would still be <1 minute).
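A sketch of the block_size change recommended in answer (6). Treat it as illustrative only: block_size is a server-wide setting, and depending on the exact SRS 15.7.x version the change only takes effect after following the documented procedure to rebuild the stable queues and restart the server:

configure replication server set block_size to '256'
go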

9) You have completed the exercise! You are now able to:

• Analyze RepAgent latency
• Analyze SQM utilization and processing
• Determine which features will be likely to benefit the inbound processing time and reduce any latency that exists

Lab 3: SQMR, SQT & DIST Analysis


Estimated time: 10 minutes

Objective

The objective of this exercise is to analyze potential bottlenecks in the distribution phase of replication - specifically the
SQMR, SQT and DIST threads and modules. Due to time constraints, we will use an example without routes and the
RSI module will not be analyzed as part of this lab. The reason that analysis of performance for the distribution phase
is important is that if there is any latency in this section of processing, it will build up in the inbound queue - which
eventually could fill and then cause the primary log to start filling.

Exercise Description

The following description serves only as a short overview of the questions the student will learn to answer during the exercise - questions that are commonly asked in a normal performance situation.

• Is there any latency in reading from the inbound queue?
• Was the SQM page cache sufficient?
• Is there any latency in the SQT cache and what is the cause?
• Is the SQT cache size large enough - and how can we tell?
• Is there any latency in the DIST thread?
• Why is the DIST thread lagging?

EXERCISE 3 – SOLUTION

Explanation Screenshot

Inbound Queue SQM Reader (SQT/SQMR) Commands


1) Scroll to line ~1066 where Source Connection: ASE1.tpcc (103)

the SQM Reader starts. Interval Cmds Read Cmds/Sec BlocksRead ReadCached Cached Pct
------------------------------ ---------- ---------- ---------- ---------- ----------
(1) 14:49:01 -> 14:51:33 0 0.0 0 0 0.0
(2) 14:51:34 -> 14:53:03 680387 7559.8 26157 26157 100.0
Is the SQM page cache (3) 14:53:04 -> 14:54:34 672429 7471.4 25828 25828 100.0
(4) 14:54:35 -> 14:56:05 777227 8635.8 29883 29883 100.0
large enough? How can (5) 14:56:06 -> 14:57:36 733341 8148.2 28155 28155 100.0
(6) 14:57:37 -> 14:59:07 765141 8501.5 29440 29440 100.0
we tell?? In what other (7) 14:59:08 -> 15:00:37 736846 8187.1 28304 28304 100.0
(8) 15:00:38 -> 15:02:08 651122 7234.6 25031 25031 100.0
way can we tell the (9) 15:02:09 -> 15:03:39 4 0.0 3 3 100.0
(10) 15:03:40 -> 15:05:10 6 0.0 2 2 100.0
SQT/SQMR is keeping up (11) 15:05:11 -> 15:06:41 0 0.0 0 0 0.0
(12) 15:06:42 -> 15:08:11 0 0.0 0 0 0.0
with the RepAgent User (13) 15:08:12 -> 15:09:42
(14) 15:09:43 -> 15:10:13
0
0
0.0
0.0
0
0
0
0
0.0
0.0
thread? If the SQT was ------------------------------ ----------
5016503
----------
3981.3
----------
192803
----------
192803
----------
64.2
lagging behind by 2GB
would a 8MB SQM page
Answers:
cache help? (1) Yes.
(2) The BlocksReadCached is 100% across the board. Typically anything higher than 70% is easily acceptable.
(3) The SQM read rate (cmds/sec) is very similar
(4) No. The SQM page cache is ONLY effective if the latency does not extend beyond the cache. If latency
overruns the cache, then the cache is essentially useless until such a time the latency is reduced. For this
reason, extremely large page caches are pretty much useless.

Inbound Queue SQM Reader (SQT/SQMR) Processing (Time in secs)


2) Scroll down to the next Source Connection: ASE1.tpcc (103)

section (SQM Read Time) Interval


------------------------------
ReadTime
----------
msPerIO
----------
ReadTmSeg
----------
CacheTime
----------
SleepTime
----------
SleepWrite
----------
AvgSleepTm
----------
Sleep/Blck
----------
(1) 14:49:01 -> 14:51:33 0.000 0.0 0.000 0.000 89.850 359 0.250 0.0
~line 1102. (2) 14:51:34 -> 14:53:03
(3) 14:53:04 -> 14:54:34
0.000
0.000
0.0
0.0
0.000
0.000
0.114
0.097
65.404
69.355
479329
484541
0.000
0.000
18.3
18.7
(4) 14:54:35 -> 14:56:05 0.000 0.0 0.000 0.115 64.757 548279 0.000 18.3
(5) 14:56:06 -> 14:57:36 0.000 0.0 0.000 0.110 65.006 510591 0.000 18.1
(6) 14:57:37 -> 14:59:07 0.000 0.0 0.000 0.115 61.701 524560 0.000 17.8
Why is the ReadTime all (7) 14:59:08 -> 15:00:37
(8) 15:00:38 -> 15:02:08
0.000
0.000
0.0
0.0
0.000
0.000
0.112
0.097
61.930
67.897
494832
452977
0.000
0.000
17.4
18.0
(9) 15:02:09 -> 15:03:39 0.000 0.0 0.000 0.000 89.568 362 0.247 120.6
0's? What is CacheTime (10) 15:03:40 -> 15:05:10
(11) 15:05:11 -> 15:06:41
0.000
0.000
0.0
0.0
0.000
0.000
0.000
0.000
89.576
89.826
362
359
0.247
0.250
181.0
0.0
vs. ReadTime? Which (12) 15:06:42 -> 15:08:11
(13) 15:08:12 -> 15:09:42
0.000
0.000
0.0
0.0
0.000
0.000
0.000
0.000
89.578
89.569
358
358
0.250
0.250
0.0
0.0
(14) 15:09:43 -> 15:10:13 0.000 0.0 0.000 0.000 30.024 120 0.250 0.0
thread is sleeping for the ------------------------------ ----------
0.000
----------
0.0
----------
0.000
----------
0.760
----------
635.194
----------
3495833
----------
0.054
----------
47.6
SleepTime? What thread
will this impact? Why is Answers:
(1) Because all the reads were read from cache (above) - this represents time spent doing physical reads
there a huge order of (2) CacheTime represents the time reading from cache (e.g. a LogicalRead) vs. a PhysicalRead
magnitude shift if (3) The SQT thread since the SQMR is part of SQT. Note that it is sleeping 2/3rds of entire interval during
processing periods (~65 out of 90 seconds)
SleepWrites while there (4) The DIST thread - possibly.
was activity? Is it (5) It represents contention between the SQMR trying to read from a block that the SQM was trying to add cmds
excessive? How can we to. When there is no activity this will not happen nor will it occur when there is any latency larger than 1 block.
(6) Not really - if we were sleeping 100x per block during activity periods, it might be an issue.
control it? What is the (7) This is controlled via sqt_init_read_delay and sqt_max_read_delay
peak real backlog of the (8) 4MB
(9) SQT thread sleeping means SQMR can't read - when it is rewoken some time later, enough time has passed
SQM Reader? What is that it is lagging by a few MB. However, since the sqm_cache_size is 2048 pages and each SQM page is 4
the likely cause? blocks (each block is 16KB), we have 2048*4*16=131072KB of SQM page cache - or 128MB. Remember, we
are writing about 50-60MB/sec (#7 in last lab), so 4MB of backlog represents ~70ms. This needs to be
considered when sizing SQM page cache as well as configure SQT read delays.

Inbound SQT Cache Memory
3) Scroll down to line ~1164 Source Connection: ASE1.tpcc (103)

and the first part of the Interval CmdsRead CmdsPerSec CmdMaxTran CmdAvgTran CacheMem
------------------------------ ---------- ---------- ---------- ---------- ----------
SQT cache processing. (1) 14:49:01 -> 14:51:33 0 0.0 0 0 0.00
(2) 14:51:34 -> 14:53:03 680606 7562.2 3 2 58.14
(3) 14:53:04 -> 14:54:34 672542 7472.6 3 2 7.52
(4) 14:54:35 -> 14:56:05 777281 8636.4 4 2 6.66
Notice the command rate. (5) 14:56:06 -> 14:57:36 733595 8151.0 3 2 14.94
(6) 14:57:37 -> 14:59:07 765453 8505.0 4 2 29.27
How does this compare to (7) 14:59:08 -> 15:00:37 737000 8188.8 3 2 56.33
(8) 15:00:38 -> 15:02:08 651219 7235.7 4 2 10.26
RepAgent User? What is (9) 15:02:09 -> 15:03:39 4 0.0 1 1 0.00
(10) 15:03:40 -> 15:05:10 6 0.0 1 1 0.00
the largest transaction (11) 15:05:11 -> 15:06:41 0 0.0 0 0 0.00
(12) 15:06:42 -> 15:08:11 0 0.0 0 0 0.00
size - how many DML (13) 15:08:12 -> 15:09:42
(14) 15:09:43 -> 15:10:13
0
0
0.0
0.0
0
0
0
0
0.00
0.00
commands? What is the ------------------------------ ----------
5017706
----------
3982.2
----------
4
----------
1
----------
58.14
maximum SQT cache
used? How much is Answers:
configured? (1) Nearly identical - of course it is nearly identical with SQMR as it should be as both modules are in the same
thread and there isn't even a buffer between them. Note that if a transaction gets removed from cache, the
SQMR will re-read it - but not the SQT - so there can be differences.
(2) The largest transaction has 2 DML commands plus a begin & commit for 4 commands total
(3) ~58MB
(4) Since there are not connection overrides and server level dist_sqt_max_cache_size is 0, we then use the
server sqt_max_cache_size of 209715200 bytes - or ~200MB….about 4x what we actually used.
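Answer (4) notes that no connection-level override exists in this environment; if one were wanted, it would look something like the sketch below (this assumes the parameter is settable per connection on this SRS version, and the 64MB value is purely illustrative):

alter connection to ASE1.tpcc set dist_sqt_max_cache_size to '67108864'
go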

Inbound SQT Cache Processing (Time in secs)


4) Scroll down to SQT Source Connection: ASE1.tpcc (103)

Cache processing time Interval ReadSQMTm AddCacheTm DelCacheTm SQTWakeup RsyncPurge


------------------------------ ---------- ---------- ---------- ---------- ----------
(~line 1200). (1) 14:49:01 -> 14:51:33 89.604 0.000 0.000 0 0
(2) 14:51:34 -> 14:53:03 75.140 3.354 3.722 531348 0
(3) 14:53:04 -> 14:54:34 77.057 3.125 2.325 538413 0
(4) 14:54:35 -> 14:56:05 74.291 3.682 2.656 610045 0
Why do you think the (5) 14:56:06 -> 14:57:36 75.332 3.517 2.096 570310 0
(6) 14:57:37 -> 14:59:07 73.312 3.916 2.406 583070 0
ReadSQMTime is so high (7) 14:59:08 -> 15:00:37 73.733 3.852 2.321 552657 0
(8) 15:00:38 -> 15:02:08 77.011 3.054 1.867 504144 0
- where is most of that (9) 15:02:09 -> 15:03:39 89.572 0.000 0.000 2 0
(10) 15:03:40 -> 15:05:10 89.580 0.000 0.000 2 0
time spent? What does (11) 15:05:11 -> 15:06:41 89.581 0.000 0.000 0 0
(12) 15:06:42 -> 15:08:11 89.583 0.000 0.000 0 0
SQTWakeup refer to? (13) 15:08:12 -> 15:09:42
(14) 15:09:43 -> 15:10:13
89.573
30.025
0.000
0.000
0.000
0.000
0
0
0
0
and how is it related to ------------------------------ ----------
1093.394
----------
24.500
----------
17.393
----------
3889991
----------
0
SleepWrite back in #10?
Is this a problem? Answers:
(1) This time is cumulative - to read from the SQM includes SQMR times - of which a large chunk is SleepTime -
so a big factor in this time is the SQMR sleep time
(2) It refers to when a SQT client (DIST in this case) had to wake up the SQT thread so it could get the SQT to
forward a message from cache.
(3) The SQT could be sleeping due to 2 primary causes - first, it is awaiting a physical read - second, it is
sleeping due to contention on the inbound queue
(4) Yes. Although we have no waits on physical reads, we are sleeping a lot due to contention on inbound
queue. One of the reasons for the 0.00 average sleep times back in #2 is that as soon as SQT goes to sleep the
DIST is trying to wake it up so it can get the next transaction or transaction command from the SQT cache.

Inbound SQT Cache Txn Queues
5) Scroll down to ~line 1270 Source Connection: ASE1.tpcc (103)

to the SQT txn queues. Interval OpenTxnAdd TxnRemoved ClosedAdd EmptyTxnRm ReadTxnAdd TruncTrAdd
------------------------------ ---------- ---------- ---------- ---------- ---------- ----------
(1) 14:49:01 -> 14:51:33 0 0 0 0 0 0
(2) 14:51:34 -> 14:53:03 226375 0 226322 40 226157 226375
What can be inferred by (3) 14:53:04 -> 14:54:34 223971 0 223971 2 223942 223971
(4) 14:54:35 -> 14:56:05 258764 0 258767 1 258909 258764
the OpenTxnAdd, (5) 14:56:06 -> 14:57:36 244045 0 244051 2 242389 244045
(6) 14:57:37 -> 14:59:07 254640 0 254634 1 256206 254640
ClosedAdd, ReadTxnAdd (7) 14:59:08 -> 15:00:37 245149 0 245153 2 245113 245149
(8) 15:00:38 -> 15:02:08 216809 0 216660 169 216767 216809
values in interval #1? Are (9) 15:02:09 -> 15:03:39 2 0 1 2 1 2
(10) 15:03:40 -> 15:05:10 4 0 2 2 2 4
any transactions removed (11) 15:05:11 -> 15:06:41 0 0 0 0 0 0
(12) 15:06:42 -> 15:08:11 0 0 0 0 0 0
from cache (any interval)? (13) 15:08:12 -> 15:09:42
(14) 15:09:43 -> 15:10:13
0
0
0
0
0
0
0
0
0
0
0
0
From which queue would ------------------------------ ----------
1669759
----------
0
----------
1669561
----------
221
----------
1669486
----------
1669759
they always be removed?
How do empty Answers:
transactions occur? What (1) While they are nearly identical, ClosedAdd is ~53 less than OpenTxnAdd and ReadTxnAdd is 165 less than
Closed - but if we ignore the 40 empty transactions, we are only 125 transactions behind at the end of the
do more than 1 interval.
OpenTrans imply with (2) No
(3) Open - only OPEN transactions can be removed from cache. Once CLOSED, they remain in cache.
respect to primary (4) It could be due to iso3 activity at primary, procs executed in chain mode - or simply that a transaction with
database? What is the DML occurred, but none of the affected tables were replicated. This is normally not an issue unless extremely
maximum CloseTrans in high and then likely points to either ULC configuration issues (empty txns are not flushed to the txn log in ASE 15
if completely in ULC) or a lot of procedures executed in chain mode.
queue? Is this an issue? (5) It suggests the number of peak concurrent write transactions at the primary during that interval
Why are the number of (6) 1384 transactions - note that this is the peak value.
(7) Yes - Although small in comparison to the 226K transactions processed, it still reflects buffering due to DIST
ReadTrans so much lower latency in reading from the SQT cache. Ideally, we want ClosedTrans to be very low in non-DSI SQT caches.
than ClosedTrans? Why (8) Because once the transaction is read from cache, and all other transactions on the same block have been
read, the transaction can be truncated. As a result, the ReadTrans should always be small in comparison.
is TruncTrans so high? (9) Because a tran is added to the truncate list as soon as the begin tran is read

DIST Command Processing


6) Scroll to ~line 1350, DIST Source Connection: ASE1.tpcc (103)

command processing. Interval CmdsRead Cmds/Sec NoRepdef NoRepdef% Duplicates


------------------------------ ---------- ---------- ---------- ---------- ----------
(1) 14:49:01 -> 14:51:33 0 0.0 0 0.0 0
(2) 14:51:34 -> 14:53:03 678614 7540.0 0 0.0 0
How does the overall rate (3) 14:53:04 -> 14:54:34 671823 7465.0 0 0.0 0
(4) 14:54:35 -> 14:56:05 776826 8631.0 0 0.0 0
compare? What is the (5) 14:56:06 -> 14:57:36 727222 8080.0 0 0.0 0
(6) 14:57:37 -> 14:59:07 768787 8542.0 0 0.0 0
issue if no repdef exists? (7) 14:59:08 -> 15:00:37 735363 8171.0 0 0.0 0
(8) 15:00:38 -> 15:02:08 650364 7226.0 0 0.0 0
(9) 15:02:09 -> 15:03:39 0 0.0 0 0.0 0
(10) 15:03:40 -> 15:05:10 0 0.0 0 0.0 0
(11) 15:05:11 -> 15:06:41 0 0.0 0 0.0 0
(12) 15:06:42 -> 15:08:11 0 0.0 0 0.0 0
(13) 15:08:12 -> 15:09:42 0 0.0 0 0.0 0
(14) 15:09:43 -> 15:10:13 0 0.0 0 0.0 0
------------------------------ ---------- ---------- ---------- ---------- ----------
5008999 55655.0 0 0.0 0

Answers:
(1) Compares identically to SQT (or nearly so)
(2) Unless using ASE 15.7+ with RS 15.7+ and leveraging repdef elimination, the resulting updates & deletes will
have an inefficient where clause as the where clause will be formed from all non-LOB/CLOB columns

DIST Processing Time (Time in secs)


7) Scroll to ~line 1456, DIST Source Connection: ASE1.tpcc (103)

Processing Time Interval


------------------------------
ReadTime
----------
ParseTime
----------
SreTime
----------
TDDlvrTime
----------
TDPackTime
----------
MDDlvrTime
----------
MDProcTime
----------
MDWrMsgTm
----------
(1) 14:49:01 -> 14:51:33 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
(2) 14:51:34 -> 14:53:03 49.262 0.000 0.206 11.135 4.889 7.993 4.368 2.090
(3) 14:53:04 -> 14:54:34 70.640 0.000 0.201 10.980 4.823 7.912 4.311 2.103
(4) 14:54:35 -> 14:56:05 67.155 0.000 0.236 12.939 5.627 9.408 5.083 2.548
Where is most of the (5) 14:56:06 -> 14:57:36
(6) 14:57:37 -> 14:59:07
68.760
67.269
0.000
0.000
0.218
0.242
12.101
12.910
5.256
5.634
8.780
9.390
4.702
5.032
2.383
2.574
DIST time spent? Why? (7) 14:59:08 -> 15:00:37
(8) 15:00:38 -> 15:02:08
68.216
57.598
0.000
0.000
0.226
0.195
12.307
10.771
5.324
4.683
8.960
7.838
4.777
4.226
2.459
2.103
(9) 15:02:09 -> 15:03:39 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
Can we improve this? Is (10) 15:03:40 -> 15:05:10
(11) 15:05:11 -> 15:06:41
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
writing to the outbound (12) 15:06:42 -> 15:08:11
(13) 15:08:12 -> 15:09:42
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
(14) 15:09:43 -> 15:10:13 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
queue delaying DIST ------------------------------ ----------
448.900
----------
0.000
----------
1.524
----------
83.143
----------
36.236
----------
60.281
----------
32.499
----------
16.260
processing? What about
the SRE (and STS cache Answers:
(1) ReadTime - which means reading from the SQT cache
consideration)? Why do (2) Because the SQT was sleeping a lot due to contention with inbound queue's current block
you think the ParseTime (3) Yes - we can enable dist_direct_cache_read which will allow the DIST to read directly from the SQT cache
without going through the SQT thread to gain access. Note that increasing the SQT read delays will make the
is all 0's situation worse as the SQT will sleep longer after each read attempt and more likely will need to get woken up by
the DIST thread.
(4) No - the time to write the commands to the outbound queue essentially is contained in MDWrMsgTm - which
is fairly negligible.
(5) No - the SRE time is the smallest of all other than the ParseTime
(6) Because it is leveraging the SQM command cache
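A sketch of the change suggested in answer (3). dist_direct_cache_read is a connection-scope parameter on the primary connection; whether a suspend/resume of the connection (or a restart of the distributor) is needed for it to take effect depends on the SRS version, so check the documentation first:

alter connection to ASE1.tpcc set dist_direct_cache_read to 'on'
go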

DIST SQM Command Cache Utilization
8) Scroll to ~line 1495, DIST Source Connection: ASE1.tpcc (103)

SQM Command Cache Interval DctRepRcv RADctRep DctRepPct DctRepSend


------------------------------ ---------- ---------- ---------- ----------
(1) 14:49:01 -> 14:51:33 0 0 0.0 0
(2) 14:51:34 -> 14:53:03 678468 679168 99.8 678467
What thread sends the (3) 14:53:04 -> 14:54:34 671823 670167 100.2 671822
(4) 14:54:35 -> 14:56:05 776720 776300 100.0 776717
SQM command cache its (5) 14:56:06 -> 14:57:36 727164 732332 99.2 727163
(6) 14:57:37 -> 14:59:07 768609 763898 100.6 768605
commands? What other (7) 14:59:08 -> 15:00:37 735336 735465 99.9 735335
(8) 15:00:38 -> 15:02:08 650293 650279 100.0 650290
threads than DIST are (9) 15:02:09 -> 15:03:39 0 4 0.0 0
(10) 15:03:40 -> 15:05:10 0 6 0.0 0
capable of utilizing or (11) 15:05:11 -> 15:06:41 0 0 0.0 0
(12) 15:06:42 -> 15:08:11 0 0 0.0 0
appending to SQM (13) 15:08:12 -> 15:09:42
(14) 15:09:43 -> 15:10:13
0
0
0
0
0.0
0.0
0
0
command cache? Does ------------------------------ ----------
5008413
----------
5007619
----------
49.9
----------
5008399
having
dist_cmd_direct_replicate Answers:
enabled make sense if (1) RepAgent User thread (it is part of the PAK step)
(2) Only the RepAgent User thread, the DIST thread and the DSI threads interact with the SQM command cache
using a route? (other than SQM managing the cache). Both the RepAgent User and DIST threads send commands to the SQM
command cache, while both the DIST and DSI read from the command cache.
(3) No. While using cmd_direct_replicate is an option as the path from RepAgent → DIST can leverage the SQM
command cache, remember, a route is to a different RS, most likely in a different host - so direct memory sharing
is not easily doable. In addition, the route protocol (RTL) is used which requires reparsing at the RRS.

9) You have completed the exercise! You are now able to:

• Analyze SQT cache utilization and processing
• Analyze DIST processing and latency
• Determine which features will be likely to benefit the inbound processing time and reduce any latency that exists

Lab 4: Outbound Queue SQM, DSI & DSIEXEC Analysis


Estimated time: 10 minutes

Objective

The objective of this exercise is to analyze potential bottlenecks in the delivery phase, including the outbound queue
reader (SQMR), the DSI and its various phases and finally the DSIEXEC. Since this area is the most common area of
performance issues, students should pay close attention to key points highlighted by these exercises to help learn the
techniques that will be useful in isolating the cause of latency in their own environments.

Exercise Description

The following description serves only as a short overview of the questions the student will learn to answer during the exercise - questions that are commonly asked in a normal performance situation.

• Is the DSI SQT cache sized correctly?
• Are we grouping transactions effectively?
• Where is time spent by the DSIEXEC thread?
• What options exist for improving latency?

EXERCISE 4 – SOLUTION

Explanation Screenshot

Outbound (Inbound WS) Queue SQM Reader (DSI/SQMR & WS-DSI/SQMR) Commands
1) Scroll down to ~line 1815 and the outbound queue SQMR reader section.

   Notice that the combination of page cache and SQT cache keeps physical reads at bay until interval #6 despite the DSI latency.

   Destination Connection: ASE2.tpcc (102)

   Interval                        Cmds Read   Cmds/Sec BlocksRead ReadCached Cached Pct
   ------------------------------ ---------- ---------- ---------- ---------- ----------
   (1)  14:49:01 -> 14:51:33               0        0.0          0          0        0.0
   (2)  14:51:34 -> 14:53:03          226795     2519.9       3280       3280      100.0
   (3)  14:53:04 -> 14:54:34          168960     1877.3       2441       2441      100.0
   (4)  14:54:35 -> 14:56:05          170880     1898.6       2468       2468      100.0
   (5)  14:56:06 -> 14:57:36          173373     1926.3       2510       2510      100.0
   (6)  14:57:37 -> 14:59:07          167820     1864.6       3370       1518       45.0
   (7)  14:59:08 -> 15:00:37          173040     1922.6       4462        538       12.1
   (8)  15:00:38 -> 15:02:08          174000     1933.3       4360        700       16.1
   (9)  15:02:09 -> 15:03:39          208794     2319.9       5492        538        9.8
   (10) 15:03:40 -> 15:05:10          213608     2373.4       5493        685       12.5
   (11) 15:05:11 -> 15:06:41          214620     2384.6       5445        767       14.1
   (12) 15:06:42 -> 15:08:11          213420     2371.3       5520        646       11.7
   (13) 15:08:12 -> 15:09:42          216360     2404.0       5395        879       16.3
   (14) 15:09:43 -> 15:10:13           76440     2548.0       2072        136        6.6
   ------------------------------ ---------- ---------- ---------- ---------- ----------
                                      2398110     2024.5      52308      17106       38.8

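The explanation above credits the SQM page cache (together with the DSI SQT cache) for keeping the reads fully cached through interval #5. A hedged sketch of the server-level parameters that govern that page cache is shown below - the values are purely illustrative, and the exact defaults and scope should be confirmed for your SRS version:

   -- Hedged sketch: SQM page-cache sizing (illustrative values only)
   -- sqm_cache_size = number of cache pages kept per queue
   -- sqm_page_size  = number of 16K blocks per cache page
   configure replication server set sqm_cache_size to '64'
   go
   configure replication server set sqm_page_size to '16'
   go
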
Outbound (Inbound WS) Queue SQM Reader (DSI/SQMR & WS-DSI/SQMR) Processing (Time in secs)
2) Scroll down to ~line 1880 and SQMR processing times.

   Note that once the SQM page cache is full the SQMR has to do physical reads, and hence ReadTime starts in interval 6. Note also the increasing backlog until it peaks at interval 8.

   Destination Connection: ASE2.tpcc (102)

   Interval                         ReadTime    msPerIO  ReadTmSeg  CacheTime BacklogSeg  BacklogMB
   ------------------------------ ---------- ---------- ---------- ---------- ---------- ----------
   (1)  14:49:01 -> 14:51:33           0.000        0.0      0.000      0.000          0       0.00
   (2)  14:51:34 -> 14:53:03           0.000        0.0      0.000      0.013        102     103.00
   (3)  14:53:04 -> 14:54:34           0.000        0.0      0.000      0.009        216     217.00
   (4)  14:54:35 -> 14:56:05           0.000        0.0      0.000      0.010        354     355.00
   (5)  14:56:06 -> 14:57:36           0.000        0.0      0.000      0.012        480     481.00
   (6)  14:57:37 -> 14:59:07           0.185        0.0      0.000      0.007        617     618.00
   (7)  14:59:08 -> 15:00:37           0.308        0.0      0.000      0.006        745     746.00
   (8)  15:00:38 -> 15:02:08           0.295        0.0      0.000      0.006        859     860.00
   (9)  15:02:09 -> 15:03:39           0.291        0.0      0.000      0.006        853     854.00
   (10) 15:03:40 -> 15:05:10           0.269        0.0      0.000      0.006        806     807.00
   (11) 15:05:11 -> 15:06:41           0.266        0.0      0.000      0.007        758     759.00
   (12) 15:06:42 -> 15:08:11           0.254        0.0      0.000      0.006        709     710.00
   (13) 15:08:12 -> 15:09:42           0.260        0.0      0.000      0.007        661     662.00
   (14) 15:09:43 -> 15:10:13           0.101        0.0      0.000      0.002        611     612.00
   ------------------------------ ---------- ---------- ---------- ---------- ---------- ----------
                                        2.229        0.0      0.000      0.097        859     860.00
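
The BacklogSeg/BacklogMB columns come from the captured counters; on a live system the same outbound-queue backlog can be checked directly from RS with standard admin commands (output columns vary slightly by version), for example:

   -- Segment usage per stable-device partition (growing "Used Segs" = on-disk backlog)
   admin disk_space
   go
   -- Per-queue detail, including first/last segment.block and the next read point
   admin who, sqm
   go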
Outbound DSI (Inbound WS-DSI) SQT Cache Memory
3) Scroll to ~line 1953, SQT Cache Memory, to look at the DSI SQT cache utilization.

   Note that SQT cache is full (200MB used) from interval #2 on. Note that it is using the server's default sqt_max_cache_size (200MB) setting vs. the server's dsi_sqt_max_cache_size value (31MB).

   Destination Connection: ASE2.tpcc (102)

   Interval                         CmdsRead CmdsPerSec CmdMaxTran CmdAvgTran   CacheMem
   ------------------------------ ---------- ---------- ---------- ---------- ----------
   (1)  14:49:01 -> 14:51:33               0        0.0          0          0       0.00
   (2)  14:51:34 -> 14:53:03          226888     2520.9          3          3     200.00
   (3)  14:53:04 -> 14:54:34          168960     1877.3          3          3     200.00
   (4)  14:54:35 -> 14:56:05          170940     1899.3          3          3     200.00
   (5)  14:56:06 -> 14:57:36          173400     1926.6          3          3     200.00
   (6)  14:57:37 -> 14:59:07          167880     1865.3          3          3     200.00
   (7)  14:59:08 -> 15:00:37          173040     1922.6          3          3     200.00
   (8)  15:00:38 -> 15:02:08          174000     1933.3          3          3     200.00
   (9)  15:02:09 -> 15:03:39          208854     2320.6          3          3     200.00
   (10) 15:03:40 -> 15:05:10          213776     2375.2          3          3     200.00
   (11) 15:05:11 -> 15:06:41          214620     2384.6          3          3     200.00
   (12) 15:06:42 -> 15:08:11          213420     2371.3          3          3     200.00
   (13) 15:08:12 -> 15:09:42          216390     2404.3          3          3     200.00
   (14) 15:09:43 -> 15:10:13           76440     2548.0          3          3     200.00
   ------------------------------ ---------- ---------- ---------- ---------- ----------
                                      2398608     2024.9          3          2     200.00
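
Given the note above that the connection is falling back to the server-wide sqt_max_cache_size, a hedged isql sketch of pinning the DSI SQT cache explicitly at the connection follows; the connection name is the one used in this lab and the value is illustrative only:

   -- Hedged sketch: size the DSI SQT cache for this connection explicitly (value in bytes)
   suspend connection to ASE2.tpcc
   go
   alter connection to ASE2.tpcc
     set dsi_sqt_max_cache_size to '32505856'   -- 31MB, illustrative
   go
   resume connection to ASE2.tpcc
   go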

32
Outbound DSI (Inbound WS-DSI) SQT Cache Txn Queues
4) Scroll down to ~line 2100 - the outbound DSI SQT txn queues.

   Is the SQT cache too large, too small, or just right? Why are there so many transactions in the CLOSED queue? How many transactions should there be?

   Destination Connection: ASE2.tpcc (102)

   Interval                       TxnRemoved  OpenTrans ClosedTran  ReadTrans TruncTrans
   ------------------------------ ---------- ---------- ---------- ---------- ----------
   (1)  14:49:01 -> 14:51:33               0          0          0          0          0
   (2)  14:51:34 -> 14:53:03               0          1      29376          8      29385
   (3)  14:53:04 -> 14:54:34               0          1      34113         20      34134
   (4)  14:54:35 -> 14:56:05               0          1      34113         19      34134
   (5)  14:56:06 -> 14:57:36               0          1      34113         20      34134
   (6)  14:57:37 -> 14:59:07               0          1      34113         20      34134
   (7)  14:59:08 -> 15:00:37               0          1      34113         20      34134
   (8)  15:00:38 -> 15:02:08               0          1      34113         20      34134
   (9)  15:02:09 -> 15:03:39               0          1      34114         18      34134
   (10) 15:03:40 -> 15:05:10               0          1      34116         16      34133
   (11) 15:05:11 -> 15:06:41               0          1      34113         20      34134
   (12) 15:06:42 -> 15:08:11               0          1      34113         20      34134
   (13) 15:08:12 -> 15:09:42               0          1      34114         15      34130
   (14) 15:09:43 -> 15:10:13               0          1      34113         20      34134
   ------------------------------ ---------- ---------- ---------- ---------- ----------
                                            0         13     438737        236     438988

Answers:
(1) Too large.
(2) Because DSIEXEC latency is causing the SQT cache to simply buffer pending transactions.
(3) Nominally, we only need as many transactions as dsi_max_xacts_in_group or other configurations such as
    HVAR require. Generally speaking, more than 100 txns in CLOSED is usually a sign of DSIEXEC latency.
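
The OpenTrans/ClosedTran/ReadTrans/TruncTrans columns above are the captured counterparts of what can be seen live; a hedged sketch using a standard RS admin command (column layout varies by version):

   -- Live view of the SQT transaction queues - the Closed, Read, Open and Trunc
   -- columns correspond to the counters shown above
   admin who, sqt
   go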

DSI Transaction Processing (1)


5) Scroll to line ~2230, DSI Transaction processing.

   Why might there be differences between the number of groups read vs. groups sent/committed?

   Destination Connection: ASE2.tpcc (102)

   Interval                       ReadGroups ReadUngrpd GroupsSent UngrpdSent XactsInGrp GrpsCommit  UngrpdCmt
   ------------------------------ ---------- ---------- ---------- ---------- ---------- ---------- ----------
   (1)  14:49:01 -> 14:51:33               0          0          0          0        0.0          0          0
   (2)  14:51:34 -> 14:53:03            2095      41875       2095      41875       19.9       2094      41855
   (3)  14:53:04 -> 14:54:34            2812      56240       2812      56240       20.0       2812      56240
   (4)  14:54:35 -> 14:56:05            2845      56900       2845      56900       20.0       2845      56900
   (5)  14:56:06 -> 14:57:36            2886      57720       2886      57720       20.0       2887      57740
   (6)  14:57:37 -> 14:59:07            2794      55880       2794      55880       20.0       2794      55880
   (7)  14:59:08 -> 15:00:37            2883      57660       2883      57660       20.0       2883      57660
   (8)  15:00:38 -> 15:02:08            2896      57920       2896      57920       20.0       2896      57920
   (9)  15:02:09 -> 15:03:39            3477      69540       3477      69540       20.0       3477      69540
   (10) 15:03:40 -> 15:05:10            3558      71160       3558      71160       20.0       3558      71160
   (11) 15:05:11 -> 15:06:41            3572      71440       3572      71440       20.0       3572      71440
   (12) 15:06:42 -> 15:08:11            3553      71060       3553      71060       20.0       3553      71060
   (13) 15:08:12 -> 15:09:42            3603      72060       3603      72060       20.0       3602      72040
   (14) 15:09:43 -> 15:10:13            1272      25440       1272      25440       20.0       1272      25440
   ------------------------------ ---------- ---------- ---------- ---------- ---------- ---------- ----------
                                       38246     764895      38246     764895       18.5      38245     764875

Answers:
(1) Due to processing failures, a group may be retried in smaller groups.

DSI Transaction Group Closures (1)


6) Scroll to ~line 2297 to look at Transaction Group closure reasons.

   Why is MaxTrans so high? Is this an issue?

   Destination Connection: ASE2.tpcc (102)

   Interval                         MaxBytes   NoneOrig  MixedUser  MixedMode  PartnRule   MaxTrans
   ------------------------------ ---------- ---------- ---------- ---------- ---------- ----------
   (1)  14:49:01 -> 14:51:33               0          0          0          0          0          0
   (2)  14:51:34 -> 14:53:03               0          3          0          0          0       3765
   (3)  14:53:04 -> 14:54:34               0          0          0          0          0       2812
   (4)  14:54:35 -> 14:56:05               0          0          0          0          0       2845
   (5)  14:56:06 -> 14:57:36               0          0          0          0          0       2886
   (6)  14:57:37 -> 14:59:07               0          0          0          0          0       2794
   (7)  14:59:08 -> 15:00:37               0          0          0          0          0       2883
   (8)  15:00:38 -> 15:02:08               0          0          0          0          0       2896
   (9)  15:02:09 -> 15:03:39               0          0          0          0          0       3478
   (10) 15:03:40 -> 15:05:10               0          0          0          0          0       3557
   (11) 15:05:11 -> 15:06:41               0          0          0          0          0       3572
   (12) 15:06:42 -> 15:08:11               0          0          0          0          0       3553
   (13) 15:08:12 -> 15:09:42               0          0          0          0          0       3603
   (14) 15:09:43 -> 15:10:13               0          0          0          0          0       1272
   ------------------------------ ---------- ---------- ---------- ---------- ---------- ----------
                                            0          3          0          0          0      39916

DSI Transaction Group Closures (2)


Destination Connection: ASE2.tpcc (102)

Interval HQNoneOrig HQCDBCmd HQCDBSize HQSQTSize SQTSize Dispatch


------------------------------ ---------- ---------- ---------- ---------- ---------- ----------
(1) 14:49:01 -> 14:51:33 0 0 0 0 0 0
(2) 14:51:34 -> 14:53:03 0 0 0 0 0 2
(3) 14:53:04 -> 14:54:34 0 0 0 0 0 0
(4) 14:54:35 -> 14:56:05 0 0 0 0 0 0
(5) 14:56:06 -> 14:57:36 0 0 0 0 0 0
(6) 14:57:37 -> 14:59:07 0 0 0 0 0 0
(7) 14:59:08 -> 15:00:37 0 0 0 0 0 0
(8) 15:00:38 -> 15:02:08 0 0 0 0 0 0
(9) 15:02:09 -> 15:03:39 0 0 0 0 0 0
(10) 15:03:40 -> 15:05:10 0 0 0 0 0 0
(11) 15:05:11 -> 15:06:41 0 0 0 0 0 0
(12) 15:06:42 -> 15:08:11 0 0 0 0 0 0
(13) 15:08:12 -> 15:09:42 0 0 0 0 0 0
(14) 15:09:43 -> 15:10:13 0 0 0 0 0 0
------------------------------ ---------- ---------- ---------- ---------- ---------- ----------
0 0 0 0 0 2

Answers:
(1) Because transaction groups were almost always closed due to exceeding dsi_max_xacts_in_group
(2) No.
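
Since nearly every group closure above is MaxTrans, grouping is being bounded by dsi_max_xacts_in_group (20 in this run, per the XactsInGrp column in step 5). A hedged sketch of adjusting that ceiling is below; the value is illustrative, and raising it only helps if grouping, rather than DSIEXEC execution time, is the actual bottleneck:

   -- Hedged sketch: raise the per-group transaction ceiling for this connection
   suspend connection to ASE2.tpcc
   go
   alter connection to ASE2.tpcc
     set dsi_max_xacts_in_group to '60'    -- illustrative; the default is typically 20
   go
   resume connection to ASE2.tpcc
   go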

33
DSIEXEC Command Processing
7) Scroll to line ~2917; DSIEXEC command processing.

   When could a single input command result in multiple output commands?

   Destination Connection: ASE2.tpcc (102)

   Interval                       CmdSucceed CmdsPerSec MBytesSent  Bytes/Cmd   InCmdCnt  OutCmdCnt
   ------------------------------ ---------- ---------- ---------- ---------- ---------- ----------
   (1)  14:49:01 -> 14:51:33               0        0.0       0.00        0.0          0          0
   (2)  14:51:34 -> 14:53:03          125565     1395.1      53.57      447.3      46022      50210
   (3)  14:53:04 -> 14:54:34          168720     1874.6      71.93      447.0      61864      67488
   (4)  14:54:35 -> 14:56:05          170700     1896.6      72.78      447.0      62590      68280
   (5)  14:56:06 -> 14:57:36          173220     1924.6      73.86      447.0      63492      69264
   (6)  14:57:37 -> 14:59:07          167640     1862.6      71.50      447.2      61468      67056
   (7)  14:59:08 -> 15:00:37          172980     1922.0      73.74      447.0      63426      69192
   (8)  15:00:38 -> 15:02:08          173760     1930.6      74.07      446.9      63712      69504
   (9)  15:02:09 -> 15:03:39          208620     2318.0      88.93      446.9      76494      83448
   (10) 15:03:40 -> 15:05:10          213480     2372.0      91.01      447.0      78254      85368
   (11) 15:05:11 -> 15:06:41          214320     2381.3      91.39      447.1      78584      85728
   (12) 15:06:42 -> 15:08:11          213180     2368.6      90.92      447.2      78166      85272
   (13) 15:08:12 -> 15:09:42          216120     2401.3      92.15      447.0      79244      86448
   (14) 15:09:43 -> 15:10:13           76320     2544.0      32.53      446.8      27984      30528
   ------------------------------ ---------- ---------- ---------- ---------- ---------- ----------
                                      2294625    27191.3     978.37     5811.4     841300     917786

Answers:
(1) When modifying identity or timestamp columns, RS first has to send the appropriate set option and then
    disable/unset it when done with each command. This is because some commands (such as set identity_insert)
    are only allowed to be on for a single table at a time.
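
To illustrate the answer, a minimal T-SQL sketch (hypothetical table and column names) of why one replicated insert into an identity table becomes three output commands at the replicate:

   -- Hedged illustration with a hypothetical table: because set identity_insert may
   -- only be ON for one table per session at a time, RS must bracket each such command
   set identity_insert tpcc..demo_tab on
   insert into tpcc..demo_tab (id_col, data_col) values (1001, 'abc')
   set identity_insert tpcc..demo_tab off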

DSIEXEC Processing Times (1) (time in secs)


8) Scroll down to line ~3049, DSIEXEC processing times.

   Despite dist_cmd_direct_replicate not being effective due to latency, is the command read and parse time an issue? When would you expect FSMapTime to be high? Why is SendTime low compared to ResultTime? What are the two largest time consumers?

   Destination Connection: ASE2.tpcc (102)

   Interval                        GetTranTm   ReadTime  ParseTime  FSMapTime  BatchTime  PrepareTm   SendTime
   ------------------------------ ---------- ---------- ---------- ---------- ---------- ---------- ----------
   (1)  14:49:01 -> 14:51:33           0.000      0.000      0.000      0.000      0.000      0.000      0.000
   (2)  14:51:34 -> 14:53:03           0.945      0.330      0.000      1.431     63.460      0.172      0.049
   (3)  14:53:04 -> 14:54:34           1.305      0.448      1.710      1.817     87.290      0.034      0.066
   (4)  14:54:35 -> 14:56:05           1.350      0.429      2.939      1.780     87.078      0.035      0.067
   (5)  14:56:06 -> 14:57:36           1.214      0.411      3.260      1.787     87.179      0.037      0.069
   (6)  14:57:37 -> 14:59:07           1.398      0.425      3.133      1.739     87.176      0.035      0.068
   (7)  14:59:08 -> 15:00:37           1.435      0.435      3.350      1.786     86.852      0.037      0.069
   (8)  15:00:38 -> 15:02:08           1.314      0.427      3.259      1.795     86.933      0.038      0.069
   (9)  15:02:09 -> 15:03:39           0.888      0.423      3.836      2.051     86.658      0.044      0.083
   (10) 15:03:40 -> 15:05:10           0.913      0.430      4.003      2.093     86.592      0.045      0.085
   (11) 15:05:11 -> 15:06:41           0.915      0.429      3.983      2.107     86.650      0.046      0.084
   (12) 15:06:42 -> 15:08:11           0.910      0.429      4.031      2.095     86.768      0.046      0.084
   (13) 15:08:12 -> 15:09:42           0.838      0.431      4.063      2.123     86.730      0.046      0.086
   (14) 15:09:43 -> 15:10:13           0.326      0.153      1.426      0.747     28.837      0.016      0.034
   ------------------------------ ---------- ---------- ---------- ---------- ---------- ---------- ----------
                                       13.751      5.200     38.993     23.351   1048.203      0.631      0.913

   DSIEXEC Processing Times (2) (time in secs)

   Destination Connection: ASE2.tpcc (102)

   Interval                         BulkTime ResultTime  ExecCmdTm FinishTran  RelTranTm   TranTime
   ------------------------------ ---------- ---------- ---------- ---------- ---------- ----------
   (1)  14:49:01 -> 14:51:33           0.000      0.000      0.000      0.000      0.000      0.000
   (2)  14:51:34 -> 14:53:03           0.000     61.106     62.850      0.036      0.038     64.437
   (3)  14:53:04 -> 14:54:34           0.000     82.558     84.808      0.058      0.049     88.603
   (4)  14:54:35 -> 14:56:05           0.000     81.167     83.400      0.075      0.049     88.413
   (5)  14:56:06 -> 14:57:36           0.000     81.121     83.371      0.071      0.049     88.544
   (6)  14:57:37 -> 14:59:07           0.000     81.065     83.255      0.046      0.050     88.491
   (7)  14:59:08 -> 15:00:37           0.000     80.483     82.734      0.054      0.052     88.225
   (8)  15:00:38 -> 15:02:08           0.000     80.764     83.027      0.066      0.051     88.316
   (9)  15:02:09 -> 15:03:39           0.000     80.349     82.935      0.021      0.056     88.407
   (10) 15:03:40 -> 15:05:10           0.000     80.048     82.688      0.021      0.057     88.400
   (11) 15:05:11 -> 15:06:41           0.000     80.122     82.778      0.022      0.058     88.474
   (12) 15:06:42 -> 15:08:11           0.000     80.178     82.820      0.021      0.057     88.554
   (13) 15:08:12 -> 15:09:42           0.000     80.180     82.858      0.022      0.058     88.542
   (14) 15:09:43 -> 15:10:13           0.000     26.472     27.419      0.008      0.021     29.459
   ------------------------------ ---------- ---------- ---------- ---------- ---------- ----------
                                        0.000    975.613   1004.943      0.521      0.645   1066.865

Answers:
(1) No - together the two total ~44 seconds, which is fairly small.
(2) If there were custom function strings - especially larger/multi-statement ones.
(3) It is measuring strictly the time for ct_send(), which is not execution time - just the time to send the commands
    to ASE.
(4) BatchTime and ResultTime - note that ExecCmdTm and TranTime are aggregate values for time counters in
    preceding columns (e.g. ExecCmdTime = PrepareTime + SendTime + ResultTime; TranTime = ResultTime +
    ExecCmdTime + FinishTranTime + RelTranTime).

DSIEXEC Batch Flush Causes (1)


9) Scroll to line ~3130, DSIEXEC Batch Flush causes.

   What was the maximum size of each command batch? What does this suggest for dsi_cmd_batch_size? Could this be having an effect on the BatchTime in the above (#8)?

   Destination Connection: ASE2.tpcc (102)

   Interval                       NumBatches AvgBatchSz MaxBatchSz Cmds/Batch ResultProc CommitNext    MaxCmds
   ------------------------------ ---------- ---------- ---------- ---------- ---------- ---------- ----------
   (1)  14:49:01 -> 14:51:33               0          0          0        0.0          0          0          0
   (2)  14:51:34 -> 14:53:03            4187       2743       8748       29.9          0       4187          0
   (3)  14:53:04 -> 14:54:34            5624       2721       8271       30.0          0       5624          0
   (4)  14:54:35 -> 14:56:05            5690       2723       7751       30.0          0       5690          0
   (5)  14:56:06 -> 14:57:36            5772       2724       7828       30.0          0       5772          0
   (6)  14:57:37 -> 14:59:07            5588       2741       8031       30.0          0       5588          0
   (7)  14:59:08 -> 15:00:37            5765       2725       7931       30.0          0       5765          0
   (8)  15:00:38 -> 15:02:08            5792       2723       7957       30.0          0       5792          0
   (9)  15:02:09 -> 15:03:39            6955       2714       8134       29.9          0       6955          0
   (10) 15:03:40 -> 15:05:10            7114       2724       8115       30.0          0       7114          0
   (11) 15:05:11 -> 15:06:41            7144       2732       8342       30.0          0       7144          0
   (12) 15:06:42 -> 15:08:11            7106       2733       8321       30.0          0       7106          0
   (13) 15:08:12 -> 15:09:42            7205       2727       8087       29.9          0       7205          0
   (14) 15:09:43 -> 15:10:13            2544       2707       8066       30.0          0       2544          0
   ------------------------------ ---------- ---------- ---------- ---------- ---------- ---------- ----------
                                        76486       2531       8748       27.8          0      76486          0

Answers:
(1) ~8000 +/- bytes.
(2) dsi_cmd_batch_size is likely at its default config of 8192 vs. the recommended setting of 65536.
(3) Yes - however, it is also possible that dynamic_sql and/or dsi_bulk_copy were enabled, which would show up as
    high batch time. (Looking at the configs and reports, both are disabled.)
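
Answer (2) points at the default dsi_cmd_batch_size. A hedged isql sketch of raising it to the recommended value for this connection follows; dynamic_sql and dsi_bulk_copy are left commented out since the lab notes they were disabled in this run and should be evaluated separately:

   -- Hedged sketch: raise the command batch size so batches are no longer cut off at ~8K
   suspend connection to ASE2.tpcc
   go
   alter connection to ASE2.tpcc
     set dsi_cmd_batch_size to '65536'
   go
   -- Optional features referenced in answer (3); evaluate before enabling:
   -- alter connection to ASE2.tpcc set dynamic_sql to 'on'
   -- alter connection to ASE2.tpcc set dsi_bulk_copy to 'on'
   resume connection to ASE2.tpcc
   go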

34
10) You have completed the exercise!

    You are now able to:
     Analyze DSI SQT cache and SQMR latency
     Isolate when the RDB or RS configs are driving latency

35
© 2013 by SAP AG or an SAP affiliate company. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP AG.
The information contained herein may be changed without prior notice.
Some software products marketed by SAP AG and its distributors contain proprietary software components of other software vendors. National product specifications
may vary.
These materials are provided by SAP AG and its affiliated companies (“SAP Group”) for informational purposes only, without representation or warranty of any kind, and
SAP Group shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP Group products and services are those that are set forth in
the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.
SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and other
countries.
Please see
http://www.sap.com/corporate-en/legal/copyright/index.epx#trademark
for additional trademark information and notices.
