SAP BW on HANA & HANA Smart Data Access

Virtual Table Statistics


SAP BW ON HANA & HANA SMART DATA ACCESS – VIRTUAL TABLE STATISTICS

TABLE OF CONTENTS
DOCUMENT VERSION
DESCRIPTION
WHY DO WE NEED STATISTICS ON VIRTUAL TABLES FOR HANA SMART DATA ACCESS?
HOW CAN STATISTICS ON VIRTUAL TABLES BE CREATED?
  Creation of statistics with ABAP program
  Creation of statistics with SQL console
HOW ABOUT AN EXAMPLE?

Document history

DOCUMENT VERSION   DESCRIPTION                                   DATE
1.0                First Version                                 14.03.2014
1.1                Version with the following updates:           12.08.2014
                   New default cardinality for virtual tables as of HANA revision 74.01
                   Slight changes in phrasing

WHY DO WE NEED STATISTICS ON VIRTUAL TABLES FOR HANA SMART DATA ACCESS?

Virtual tables are used in the context of HANA Smart Data Access to connect to a remote source. To
create an optimized query execution plan, HANA should have database statistics for the virtual table. The
simplest statistic is just the number of records in the source table. If there are no statistics, a default
value is used (see the article “How can I create an Open ODS View of type Virtual Table?” on this SCN page).

Note: There is currently no way to derive the statistics for the virtual table from the DB statistics of the table
in the remote source. Instead, they are essentially computed by individual COUNT statements sent to the remote
source. Steps in this direction are planned, but no concrete timeline can be provided. Of course, the remote
database still optimizes the query execution using its own techniques.

As of HANA Revision 74.01, a new default cardinality for virtual tables has been introduced. If database
statistics are not available for the virtual table, HANA assumes a cardinality of 1 million records for the
virtual table (formerly 10,000 records). This should better “protect” the source database against expensive
queries caused by suboptimal query optimization. The default cardinality is set by the parameter
virtual_table_default_cardinality in indexserver.ini (section smart_data_access).

HOW CAN STATISTICS ON VIRTUAL TABLES BE CREATED?

Statistics can be created with the HANA Studio SQL console. Alternatively, BW provides the program
RSSDA_CREATE_TABLE_STAT, which creates statistics and can also be used to refresh them periodically.

Creation of statistics with ABAP program

Execute the program RSSDA_CREATE_TABLE_STAT with the following selections (see also note 1990181):

InfoProvider

Name of the Open ODS View with SAP HANA Smart Data Access,
or name of the InfoProvider with Near-line Storage using SDA for read access.

Fieldname

Enter a fieldname if statistics should be created for selected fields only. If no field is provided, the statistics
are created for all fields of the virtual table.

Checkbox “With Histograms?”

HANA can optionally create histogram statistics to better evaluate the costs. Please note that this option
causes a higher workload on the remote source during statistics creation than simple statistics.

Creation of statistics with SQL console

(1) Simple Statistics on one field

create statistics on "<Virtual Table Name>" ("<Field Name>") type simple;

(2) Simple Statistics on all fields

In order to better evaluate the costs of semi-join optimizations, simple statistics should however be created
on all fields which are potentially in a join condition.

create statistics on "<Virtual Table Name>" type simple;

(3) Statistics on all columns including histograms

The best possible query optimizations however rely on the full set of statistics, which also include histogram
information. These statistics can be created as follows:

create statistics on "<Virtual Table Name>" type all;

As a starting point, we recommend creating simple statistics on one low-cardinality field. For further
fine-tuning, histograms, for example, could also be used. See the HANA SQL reference (section 1.8.1.12) and
note 1872652 for more information.

Note: As with other classic DB statistics, it is not necessary to re-create or refresh the statistics on virtual
tables after each change of the data in the remote source, but only after significant changes, e.g. massive
growth or a different value distribution.
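
When statistics have to be created for several virtual tables or fields, the statements can be generated rather than typed by hand. The following is a minimal Python sketch, not part of BW or HANA: the helper names quote_identifier and create_statistics_sql are assumptions made for this illustration, and the sketch only covers the "simple" and "all" variants shown in this section. It builds the statement text; executing it is left to the SQL console.

```python
from typing import Optional


def quote_identifier(name: str) -> str:
    """Wrap a name in double quotes, doubling any embedded double quote."""
    return '"' + name.replace('"', '""') + '"'


def create_statistics_sql(virtual_table: str,
                          field: Optional[str] = None,
                          stat_type: str = "simple") -> str:
    """Build one of the CREATE STATISTICS variants shown in this section.

    Hypothetical helper: it only assembles text and never contacts a database.
    """
    if stat_type not in ("simple", "all"):
        raise ValueError(f"unsupported statistics type: {stat_type}")
    target = quote_identifier(virtual_table)
    if field is not None:
        # Restrict the statistics to a single field, as in variant (1).
        target += f" ({quote_identifier(field)})"
    return f"create statistics on {target} type {stat_type};"


if __name__ == "__main__":
    table = "/BIC/EXSB_01B_BZH"  # virtual table name used in the example section
    print(create_statistics_sql(table, "DOC_CURRENCY"))   # variant (1)
    print(create_statistics_sql(table))                   # variant (2)
    print(create_statistics_sql(table, stat_type="all"))  # variant (3)
```

Quoting the identifiers matters here because BW-generated table names such as "/BIC/EXSB_01B_BZH" contain characters that are not valid in unquoted SQL names.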

HOW ABOUT AN EXAMPLE?

The following example shows how query behavior differs with and without statistics on a virtual table. The
query is built on top of an Open ODS View called “XSB_01B_BZH”. To be sure that no statistics are available
for the virtual table, we drop the statistics in the SQL console:

DROP STATISTICS ON "/BIC/EXSB_01B_BZH";

Then we execute the query:

Picture 1: Query Result

The BW query statistics show that the database time for the query was 128 seconds.

Picture 2: BW query statistics

The SQL statement sent to the remote database does not contain any filter condition, which means that neither
a semi-join nor a join relocation is applied.

SELECT
    "W1"."PRODUCT",
    "W1"."STORE",
    "W1"."DOC_CURRENCY",
    COUNT(*),
    SUM("W1"."COSTWT")
FROM
    "SAPKIT"."YSB_50MIO" "W1"
GROUP BY
    "W1"."PRODUCT",
    "W1"."STORE",
    "W1"."DOC_CURRENCY"

We execute this statement with select count (*) in the remote source (here a remote HANA DB) to find
out how many records are selected in the source. This information can also be found in the local HANA
under Provisioning Smart Data Access:

Picture 3: Number of records selected in the remote source

Now statistics are created for the virtual table of the Open ODS View with program
RSSDA_CREATE_TABLE_STAT. The HANA query optimizer is then aware of the number of rows in the
source table and can use it to improve the query plan.

Picture 4: Selection screen of program RSSDA_CREATE_TABLE_STAT

To compute the statistics, a SQL query is executed in the remote database as shown below (visible when the
federation trace is set to “debug”):

SELECT
    "/BIC/EXSB_01B_BZH"."DOC_CURRENCY",
    COUNT(*)
FROM
    "SAPKIT"."YSB_50MIO" "/BIC/EXSB_01B_BZH"
GROUP BY
    "/BIC/EXSB_01B_BZH"."DOC_CURRENCY"
ORDER BY
    "/BIC/EXSB_01B_BZH"."DOC_CURRENCY" ASC

The number of selected records corresponds to the cardinality of the field. Therefore, the smaller the
cardinality, the faster the statistics are created. As mentioned at the beginning of this document, it is planned
to change this with the implementation of a new statistics concept.

Picture 5: Number of records per DOC_CURRENCY value

Now the same query is executed again with statistics for the virtual table of InfoProvider XSB_01B_BZH:

Picture 6: BW query statistics after execution with virtual table statistics

The query database read time decreased from 128 seconds to 1 second.

Under Provisioning Smart Data Access in the HANA Studio, for example, the SQL statement is shown as it is
sent to the remote source (with IN clause):

SELECT
    SQ.*
FROM (
    SELECT
        "W1"."PRODUCT" AS "PRODUCT",
        "W1"."STORE" AS "STORE",
        "W1"."DOC_CURRENCY" AS "DOC_CURRENCY",
        COUNT(*) AS COL0,
        SUM("W1"."COSTWT") AS COL1
    FROM
        "SAPKIT"."YSB_50MIO" "W1"
    GROUP BY
        "W1"."PRODUCT",
        "W1"."STORE",
        "W1"."DOC_CURRENCY"
) SQ
WHERE
    SQ."STORE" IN ('CH05');

This statement is executed again in the remote source with select count (*) to find out how many records are
selected in the source database. This information can also be found in the local HANA under Provisioning
Smart Data Access:

Picture 7: Number of records selected in the remote source

If HANA does not have statistics about the size of the remote fact table, the query optimizer can only
generate a plan by using default values. These defaults may not be suitable in many scenarios
and may therefore lead to suboptimal query performance. In our example, the optimizer may decide to send
the 27 million records from the remote source to the local HANA if no optimizations such as a semi-join are
performed (see picture 3). For details about the query execution optimizations, please see the document
“How does a BEx Query execution with SDA look like?” on this SCN page.

After statistics had been created, which provide the information that the fact table to be joined is big, the
optimizer was able to decide for the semi-join execution.
In addition, the local HANA knows, via the selective filter on field STORE (selection of one characteristic
value for STORE), that the semi-join would reduce the result set dramatically. The cardinality of field
STORE is 406 in our example and the remote fact table has approximately 54 million rows. Therefore,
assuming an even distribution, about 133,000 records could be expected to be transferred from the remote
source when filtering on one characteristic value. In fact, 77,827 rows had to be transferred from the remote
source (see picture 7).
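
The expected transfer volume quoted above is simply the row count divided by the field cardinality. The following minimal Python sketch reproduces that arithmetic with the figures from this example. The function name expected_rows_per_value is an assumption made for illustration; it is not how HANA computes its estimates internally.

```python
def expected_rows_per_value(total_rows: int, distinct_values: int) -> float:
    """Expected rows for a single-value filter if all values occur equally often."""
    if distinct_values <= 0:
        raise ValueError("distinct_values must be positive")
    return total_rows / distinct_values


if __name__ == "__main__":
    remote_rows = 54_000_000   # approximate size of the remote fact table (from the text)
    store_cardinality = 406    # distinct values of field STORE (from the text)
    transferred_rows = 77_827  # rows actually transferred (from the text)

    estimate = expected_rows_per_value(remote_rows, store_cardinality)
    print(f"estimated rows per STORE value: {estimate:,.0f}")
    print(f"rows actually transferred:      {transferred_rows:,}")
    print(f"actual / estimate:              {transferred_rows / estimate:.2f}")
```

The estimate and the observed value differ because the rows are not spread evenly across the stores, which is exactly the kind of deviation that histograms are meant to capture.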

So, besides the number of rows in a source table, the optimizer should also have statistics for fields which
are filtered (directly or indirectly via the join condition).
If the optimizer knows the cardinality of a field, it can better judge the selectivity of a filter condition.

Example 1:
A table has 10,000 rows and 10,000 distinct values in a field. A single-value filter on this field will return 0 or
1 record, which means that this is a very selective filter.

Example 2:
A table has 10,000 rows and 1 distinct value in a field. A single-value filter on this field can return up to
10,000 records, which means that the filter might not be selective.
In this case histograms are important. With histograms on this field, HANA could recognize whether no records
or all records are returned according to the filter condition. If all records have to be returned, it makes
sense not to execute a semi-join optimization.
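
Example 2 can be illustrated with a small Python sketch. It models a histogram as an exact value-to-count table, which is a simplification and not a description of HANA's actual histogram implementation. The function names and the values "EUR" and "USD" are assumptions made for this illustration.

```python
from collections import Counter


def rows_for_filter(histogram: Counter, value: str) -> int:
    """Rows a single-value filter returns according to the histogram."""
    return histogram[value]  # a Counter yields 0 for values it has never seen


def semi_join_worthwhile(histogram: Counter, value: str) -> bool:
    """Hypothetical rule: skip the semi-join when the filter keeps every row."""
    total_rows = sum(histogram.values())
    return rows_for_filter(histogram, value) < total_rows


if __name__ == "__main__":
    # Example 2 from the text: 10,000 rows and one distinct value in the field.
    column = ["EUR"] * 10_000
    histogram = Counter(column)

    for value in ("EUR", "USD"):
        print(value,
              rows_for_filter(histogram, value),
              semi_join_worthwhile(histogram, value))
```

With only the row count and the cardinality, both filters would look identical to the optimizer. The per-value counts are what allow the two cases to be told apart.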

With optimizations like semi-join or join relocation, there is always a trade-off between the cost of executing
the optimization and the savings in reading and transferring data from the source to the local
HANA.

www.sap.com

© 2014 SAP SE. All rights reserved.
