Course Guide
Version: ADVPD-941-MAR14-CG
© 2000–2014 MicroStrategy Incorporated. All rights reserved.
This Course (course and course materials) and any Software are provided “as is” and without express or implied
warranty of any kind by either MicroStrategy Incorporated (“MicroStrategy”) or anyone who has been involved in the
creation, production, or distribution of the Course or Software, including, but not limited to, the implied warranties of
merchantability and fitness for a particular purpose. The entire risk as to the quality and performance of the Course
and Software is with you. Should the Course or Software prove defective, you (and not MicroStrategy or anyone else
who has been involved with the creation, production, or distribution of the Course or Software) assume the entire cost
of all necessary servicing, repair, or correction.
In no event will MicroStrategy or any other person involved with the creation, production, or distribution of the Course
or Software be liable to you on account of any claim for damage, including any lost profits, lost savings, or other
special, incidental, consequential, or exemplary damages, including but not limited to any damages assessed against or
paid by you to any third party, arising from the use, inability to use, quality, or performance of such Course and
Software, even if MicroStrategy or any such other person or entity has been advised of the possibility of such damages,
or for the claim by any other party. In addition, MicroStrategy or any other person involved in the creation, production,
or distribution of the Course and Software shall not be liable for any claim by you or any other party for damages
arising from the use, inability to use, quality, or performance of such Course and Software, based upon principles of
contract, warranty, negligence, strict liability, indemnity or contribution, the failure of any remedy
to achieve its essential purpose, or otherwise.
The Course and the Software are copyrighted and all rights are reserved by MicroStrategy. MicroStrategy reserves the
right to make periodic modifications to the Course or the Software without obligation to notify any person or entity of
such revision. Copying, duplicating, selling, or otherwise distributing any part of the Course or Software without prior
written consent of an authorized representative of MicroStrategy is prohibited.
U.S. Government Restricted Rights. It is acknowledged that the Course and Software were developed at private
expense, that no part is public domain, and that the Course and Software are Commercial Computer Software and/or
Commercial Computer Software Documentation provided with RESTRICTED RIGHTS under Federal Acquisition
Regulations and agency supplements to them. Use, duplication, or disclosure by the U.S. Government is subject to
restrictions as set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at
DFAR 252.227-7013 et seq. or subparagraphs (c)(1) and (2) of the Commercial Computer Software—Restricted Rights
at FAR 52.227-19, as applicable. The Contractor is MicroStrategy, 1850 Towers Crescent Plaza, Vienna, Virginia 22182.
Rights are reserved under copyright laws of the United States with respect to unpublished portions of the Software.
Copyright Information
Trademark Information
All other company and product names may be trademarks of the respective companies with which they are associated.
Specifications subject to change without notice. MicroStrategy is not responsible for errors or omissions.
MicroStrategy makes no warranties or commitments concerning the availability of future products or versions that
may be planned or under development.
Patent Information
This product is patented. One or more of the following patents may apply to the product sold herein: U.S. Patent Nos.
6,154,766, 6,173,310, 6,260,050, 6,263,051, 6,269,393, 6,279,033, 6,567,796, 6,587,547, 6,606,596, 6,658,093,
6,658,432, 6,662,195, 6,671,715, 6,691,100, 6,694,316, 6,697,808, 6,704,723, 6,741,980, 6,765,997, 6,768,788,
6,772,137, 6,788,768, 6,798,867, 6,801,910, 6,820,073, 6,829,334, 6,836,537, 6,850,603, 6,859,798, 6,873,693,
6,885,734, 6,940,953, 6,964,012, 6,977,992, 6,996,568, 6,996,569, 7,003,512, 7,010,518, 7,016,480, 7,020,251,
7,039,165, 7,082,422, 7,113,993, 7,127,403, 7,174,349, 7,181,417, 7,194,457, 7,197,461, 7,228,303, 7,260,577, 7,266,181,
7,272,212, 7,302,639, 7,324,942, 7,330,847, 7,340,040, 7,356,758, 7,356,840, 7,415,438, 7,428,302, 7,430,562,
7,440,898, 7,486,780, 7,509,671, 7,516,181, 7,559,048, 7,574,376, 7,617,201, 7,725,811, 7,801,967, 7,836,178, 7,861,161,
7,861,253, 7,881,443, 7,925,616, 7,945,584, 7,970,782, 8,005,870, 8,051,168, 8,051,369, 8,094,788, 8,130,918,
8,296,287, 8,321,411 and 8,452,755. Other patent applications are pending.
How to Contact Us
TABLE OF CONTENTS
Course Description
• Project architects
Course Prerequisites
Before starting this course, you should know all topics covered in the following
courses:
Follow-up Courses
After taking this course, you might consider taking the following courses:
Related Certifications
This course does not have any recommended follow-up certifications.
Course Objectives
After completing this course, you will be able to:
• Describe the project design process, and describe the basic and advanced
schema objects you can create with MicroStrategy Architect. (Page 18)
• Define the primary and secondary database instance, use the Architect
graphical interface to maintain project tables, describe how the
MicroStrategy SQL Engine is aggregate aware, and create aggregate fact
tables using data marts. (Page 28)
• Describe how you can use MultiSource Option to access heterogeneous data
sources, associate tables in a project to multiple database instances, create
objects for multisource reports, and create multisource reports. (Page 74)
Content Descriptions
Each major section of this course begins with a Description heading. The
Description introduces you to the content contained in that section.
Learning Objectives
Learning objectives enable you to focus on the key knowledge and skills you
should obtain by successfully completing this course. Objectives are provided
for you at the following three levels:
Lessons
Each lesson sequentially presents concepts and guides you with step-by-step
procedures. Illustrations, screen examples, bulleted text, notes, and definition
tables help you to achieve the learning objectives.
• Review
• Case Study
• Business Scenario
• Exercises
Typographical Standards
Following are explanations of the font style changes, icons, and different types
of notes that you see in this course.
Actions
References to screen elements and keys that are the focus of actions are in bold
Arial font style. The following example shows this style:
Code
Sum(Sales)/Number of Months
Data Entry
References to literal data you must type in an exercise or procedure are in bold
Arial font style. References to data you type that could vary from user to user or
system to system are in bold italic Arial font style. The following example
shows this style:
Keyboard Keys
Press CTRL+B.
New Terms
New terms to note are in regular italic font style. These terms are defined when
they are first encountered in the course. The following example shows this
style:
MicroStrategy Courses
Core Courses
• Implementing MicroStrategy: Development and Deployment
Advanced Courses
• MicroStrategy Administration: Configuration and Security
Lesson Description
In this lesson, you will first review the project design process and learn about
the tools and components that enable you to manage the project schema. Next,
you will review the basic schema objects that you can create in MicroStrategy
Architect. Finally, you will learn about additional schema objects that enable
you to perform advanced functions.
Lesson Objectives
After completing the topics in this lesson, you will be able to:
• Describe the project design process and learn about the tools and
components that enable you to manage the project schema. (Page 19)
• Describe the basic schema objects that you can create in MicroStrategy
Architect and learn about additional schema objects that enable you to
perform advanced functions. (Page 22)
Recall that project design involves more than just creating a project in
MicroStrategy Architect. Understanding how users want to report on
information in the data warehouse, how data in the warehouse is related, and
how that data is stored are all fundamental parts of the project design process.
When you design the logical data model, you need to determine the
information that users want to see in reports and determine what information
is actually available in the source systems. Finally, you design the model that
incorporates both.
When you design the data warehouse schema, you need to first consider the
advantages and disadvantages of various structures for storing data in the data
warehouse. You then determine the optimal schema design that balances the
reporting requirements, performance requirements, and maintenance
overhead. Finally, you create the data warehouse using this schema design or
modify the existing data warehouse to use this schema design.
You need to have a solid design for the logical data model and data warehouse
schema before you move on to creating the actual project. Both of these
components can directly affect how you query data in a project, what data you
can query, how fast queries run, and so forth.
Managing the project schema is the final and ongoing step in the project design
process. Over the life of the project, your logical data model or data warehouse
may change, or your reporting needs may change, which can necessitate
changes to schema objects.
In the MicroStrategy Architect: Advanced Project Design course, you will learn
about different strategies and tools that help you manage your project’s
schema. In particular, you will learn about the following:
Recall that schema objects are logical objects that relate application objects to
data warehouse content. They are the bridge between your reporting
environment and your data warehouse. As such, you have to create the basic
schema objects a project requires before you can complete any other tasks,
such as creating templates, filters, reports, or documents.
In the MicroStrategy Architect: Advanced Project Design course, you will learn
about additional properties of facts. You will learn how to create different types
of fact extensions that will enable you to report on facts at additional levels,
beyond how they are stored in the data warehouse.
You can also create other types of schema objects in MicroStrategy that you use
for more advanced functions:
OR
If you are modifying an existing fact, in the Schema Objects folder, select
the Facts folder.
• Column Alias—This tab enables you to modify the column alias for a fact.
• Extensions—This tab enables you to create, modify, and delete level
extensions for a fact.
All facts have a definition and column alias, but level extensions for facts are
optional. You already know about the first two tabs from the Project Design
Essentials course. You will learn more about extensions later in this course.
Lesson Summary
In this lesson, you learned:
• The project design process involves the following steps: designing the
logical data model, designing the data warehouse schema, creating the
project in MicroStrategy Architect, and managing the project schema.
• As part of managing the project schema, you will learn how to define
primary and secondary database instances, understand
aggregation-awareness, create data marts, enable multisource report
execution, and use fact editors.
• You can also create other types of schema objects in MicroStrategy—such
as transformations and partition mappings—that you use for more
advanced functions.
• The Fact Editor is one of the schema object editors available in Developer.
It enables you to create or modify any type of fact or fact expression and
configure a variety of fact-related settings.
Lesson Description
This lesson covers a variety of advanced topics that enable you to maintain a
MicroStrategy project as it changes over time and help you optimize
performance within your project.
In this lesson, you will review the concept of primary and secondary database
instances. You will then learn about options that enable you to maintain
project tables. You will also learn how the MicroStrategy SQL Engine is
aggregate aware. Finally, you will learn how to create aggregate fact tables
using data marts. You will learn what a data mart is, how to create data mart
reports and data mart tables, and how to incorporate data mart tables into a
project.
Lesson Objectives
After completing the topics in this lesson, you will be able to:
• Describe the primary and secondary database instances and their use in a
MicroStrategy project. (Page 29)
• Describe how the Warehouse Tables pane options in the Architect graphical
interface can help you maintain a project over time. (Page 32)
• Describe how MicroStrategy Architect calculates logical table size, and how
the SQL Engine uses logical table size to select the optimal table for a
query. (Page 35)
• Define a data mart, and list and define data mart objects; create a data mart
table by creating and executing a data mart report; list and define data mart
column creation options, and use a data mart table in a project. (Page 40)
If you are using security features such as warehouse authentication or
connection mapping, different users may access the same data
warehouse using different DSNs or logins. However, even in these cases,
the project database instance is still associated with a default DSN and
login.
Although a project uses a single primary database instance to access the data
warehouse, you can create any number of secondary database instances that
point to a variety of data sources. You can then use these database instances for
other tasks such as creating data marts and Freeform SQL or Query Builder
reports.
To create and configure database instances, you must have the
appropriate administrative privileges.
2 Using the Database Instances manager, create the database instance that
points to the secondary data source.
To learn how to create database instances, refer to the MicroStrategy
Architect: Project Design Essentials or MicroStrategy
Administration: Configuration and Security courses.
This window asks you if you want to configure the data mart
optimization for the database instance. You do not have to configure
data mart optimization if you do not plan to use that database
instance to create data marts. For information on database
optimization, see “Data Mart Optimization” starting on page 44.
The following image shows the Project Configuration Editor with a single
primary database instance and multiple secondary database instances
associated with the MicroStrategy Tutorial project:
Primary and Secondary Database Instances
Maintaining Warehouse Tables
You already learned how to use the Warehouse Tables pane in the Architect
graphical interface to select the data warehouse tables you want to use in a
MicroStrategy project. Now, you will learn about other options available that
enable you to maintain a project.
Maintaining Tables
After you add warehouse tables to a project and create schema and application
objects, you may find that the warehouse schema changes over time. The
database administrator may alter the structure of a table, for example, by
adding additional columns. Some table sizes may grow over time, while others
remain the same, making them better candidates for aggregate queries. Some
tables may become obsolete and may be removed from the warehouse.
The Warehouse Tables pane provides the following options for individual
tables:
• Update Structure—If the table structure has changed since you added the
table to the project, you can click Update Structure to force MicroStrategy
Architect to recognize the changes.
• Show Sample Data—This option enables you to view the first 100 rows of
data in a table.
You can view a table’s structure before you add it to the project. To show
the columns in a table, click the expand button.
Aggregation Awareness
The FACT_SALES table stores dollar and unit sales data at the lowest possible
level of detail—by item, employee, and date. Therefore, it is the base fact table
for these two facts. The FACT_SALES_AGG table stores dollar and unit sales
data at a higher level of detail—by category, region, and month. Therefore, it is
an aggregate fact table for these two facts.
Because they store data at a higher level, aggregate fact tables reduce query
time. For example, if you want to view a report that shows unit sales by region,
you can obtain the result set more quickly using the FACT_SALES_AGG table
than the FACT_SALES table.
In a data warehouse, you often have multiple aggregate fact tables for the same
fact or set of facts to enable you to more quickly analyze fact data at various
levels of detail.
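As a sketch of why aggregate fact tables reduce query time, the following example uses SQLite as a stand-in warehouse (the table names, columns, and data are hypothetical, not the Tutorial project's): a region-level query against the aggregate table returns the same result as aggregating the base fact table, without the join to the lookup table or the scan of every detail row.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE LU_STORE (store_id INTEGER, region_id INTEGER);
    CREATE TABLE FACT_SALES (store_id INTEGER, item_id INTEGER, units INTEGER);
    CREATE TABLE FACT_SALES_AGG (region_id INTEGER, units INTEGER);

    INSERT INTO LU_STORE VALUES (10, 1), (11, 1), (20, 2);
    INSERT INTO FACT_SALES VALUES (10, 7, 5), (11, 7, 3), (20, 8, 4);
    -- The aggregate table stores the same data rolled up to region level.
    INSERT INTO FACT_SALES_AGG VALUES (1, 8), (2, 4);
""")

# Against the base table, the engine must join to the lookup table and
# scan every detail row ...
base = db.execute("""
    SELECT s.region_id, SUM(f.units)
    FROM FACT_SALES f JOIN LU_STORE s ON f.store_id = s.store_id
    GROUP BY s.region_id ORDER BY s.region_id
""").fetchall()

# ... while the aggregate table already stores the answer.
agg = db.execute(
    "SELECT region_id, units FROM FACT_SALES_AGG ORDER BY region_id"
).fetchall()

print(base == agg)  # the two tables yield the same region-level result
```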
There are two actions that you need to perform to integrate an aggregate fact
table into an existing project:
1 Add the table to the project using the Warehouse Tables pane in the
Architect graphical interface.
2 If necessary, map the existing attributes and facts to the aggregate table.
If your aggregate fact table structure is consistent with your base fact table
structure, MicroStrategy Architect will automatically add the table to the
definitions of your existing attributes and facts. However, if your aggregate fact
table structure contains new columns that have not been mapped to existing
attribute form expressions and fact expressions, you must manually map the
new table to the desired attributes and facts.
If the aggregate fact table structure matches the base fact table,
MicroStrategy Architect can automatically map the new table to existing
attributes and facts as long as automatic mapping is used for the
corresponding attribute form expressions and fact expressions.
MicroStrategy Architect assigns a size to every table when you initially add
it to a project. These size assignments are stored in the metadata.
MicroStrategy Architect assigns sizes based on the columns in the tables and
the attributes to which those columns correspond. Because MicroStrategy
Architect uses the logical attribute definitions to assign a size to each table in
the project, this measurement is referred to as logical table size.
Logical table size is the sum of the weight for each attribute contained in the
table. Attribute weight is defined as the position of an attribute in its hierarchy
divided by the number of attributes in the hierarchy, multiplied by a factor of
10. Using this formula, MicroStrategy Architect calculates the respective
weight of each attribute as shown in the illustration above. The logical table
size of each fact table is simply the sum of its respective attribute weights.
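The calculation described above can be sketched as follows. The hierarchies and fact table layouts here are hypothetical examples chosen for illustration, not the Tutorial project's actual schema.

```python
# Each hierarchy lists its attributes from highest to lowest level.
HIERARCHIES = {
    "Time":      ["Year", "Quarter", "Month", "Date"],
    "Geography": ["Region", "Store"],
    "Product":   ["Category", "Class", "Item"],
}

def attribute_weight(attribute):
    """Position of the attribute in its hierarchy, divided by the number
    of attributes in that hierarchy, multiplied by a factor of 10."""
    for attrs in HIERARCHIES.values():
        if attribute in attrs:
            position = attrs.index(attribute) + 1  # 1-based; highest level = 1
            return position / len(attrs) * 10
    raise KeyError(attribute)

def logical_table_size(table_attributes):
    """Logical table size = sum of the weights of the table's attributes."""
    return sum(attribute_weight(a) for a in table_attributes)

# A base fact table keyed by the lowest-level attributes ...
base = logical_table_size(["Date", "Store", "Item"])        # 10 + 10 + 10 = 30
# ... and an aggregate fact table keyed by higher-level attributes.
agg = logical_table_size(["Month", "Region", "Category"])   # 7.5 + 5 + 3.33...

# Given a choice, the SQL Engine generates SQL against the table with
# the smaller logical size.
print(base, round(agg, 2))
```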
You can view the logical table size for each table in the Logical Table Editor.
When the SQL Engine can obtain data from two or more tables in the
warehouse, it looks at the logical table size and generates SQL against the table
with the smallest logical table size. This process helps the SQL Engine select
the optimal table for a query.
At times, you may need to reassign the logical table size for a table. For
example, in the previous illustration for logical table sizes, there are two
aggregate fact tables that both have the same logical table size of 15. However,
one of these tables contains item and region information, and the other one has
class and store information. Clearly, based on the attributes they contain, the
table with item and region information is larger. There are many more items
than classes to which items belong. In this example, where the logical table size
is the same but the physical size is actually very different, you can change the
logical table size automatically assigned by MicroStrategy Architect.
Generally, smaller logical size does equate to smaller physical size. Tables with
higher-level attributes usually have a smaller logical table size than tables with
lower-level attributes. However, there are times when this is not the case due to
the particular combination of attributes in a table. In such cases, you have to
change the logical table size to force the SQL Engine to use the table that you
know has a smaller physical size.
2 In the Properties pane, in the Definition section, click the Logical Size box,
and type the new logical table size value.
To lock the logical size of a table, you need to access the Logical Size
Editor. You cannot lock the size of the table from the Properties pane.
To change logical table sizes and lock the table sizes using the Logical Size
Editor:
1 On the Design tab, in the Editors section, click the Edit logical size of
tables button.
2 In the Logical Size Editor, for the table you want to modify, in the Size value
box, type a new logical table size value.
3 If you want to preserve the logical table size of a table, select its Size locked
check box.
You should select the Size locked option if you want to ensure that the
logical size you have selected is not overwritten by MicroStrategy
Architect during updates of the project schema. When you update
the project schema, you can choose to update logical table sizes. You
may need to perform this action for other tables. Selecting this
option allows you to update the logical sizes of other tables while
preserving the sizes of tables that you have manually assigned.
Data Marts
• Creating tables for very large result sets and then using other applications
such as Microsoft Excel or Microsoft Access to access the data
In this lesson, you will use data marts to create aggregate fact tables.
You can use data marts in other usage scenarios. Combining data marts
with MicroStrategy data mining features and combining them with
Freeform SQL reports are two such scenarios.
In this example, forecasting data is stored at the employee and date level in the
FORECAST_SALES base fact table. However, you want to report on the
Forecast Units Sold metric at the Region level. This requires three joins from the fact
table to the LU_REGION lookup table. In addition, the FORECAST_SALES
table may have millions of rows. This query may be very costly, especially if
users request it often.
What if you could create an aggregate table that limits the number of joins and
the number of rows in the fact table? You can achieve this by creating a data
mart table. You can then bring this table into your project, map the Forecast
Units Sold metric to it, and have your region-level reports automatically
use it, as shown below:
Aggregate Fact Table Created as Data Mart
• Data mart report—This is a metadata object that you create in the Report
Editor. When executed, the data mart report creates the data mart table in
the warehouse of your choice. The data mart report contains attributes,
metrics, and other application objects that translate into columns in the
data mart table.
• Data mart table—This is the relational table created after the execution of a
data mart report.
When you create a data mart report, you must specify a database instance in
which to create the data mart table.
You create a data mart in a database instance in one of the following ways:
• Option 1—Use the primary project database instance.
• Option 2—Use a secondary project database instance that exists in the same
warehouse as the primary project database instance.
• Option 3—Use a different database instance than the project, and one that
is in a different warehouse than the primary project database instance.
The following figure illustrates each of these data mart database instance
options:
Data Mart Database Instance Options
If you use the primary project database instance, then you do not need to take
any additional steps to create a data mart. You simply select the primary data
mart database instance as a target when you create the data mart report.
If you plan to use a secondary project database instance, then you must create
that database instance before creating the data mart. You then associate this
database instance to the project in the Project Configuration Editor.
This message does not display if you have enabled data mart
optimization for the data mart database instance before you associated
this database instance to a project.
When you click Yes, the Database Instances editor for the data mart database
instance opens with the Advanced tab automatically selected.
Data mart optimization occurs when you create a data mart in the
primary project database instance or in a database instance that points
to the same data warehouse as the primary project database instance.
1 In the Database Instances editor, on the Advanced tab, under Data mart
optimization, select the This database instance is located in the same
warehouse as check box.
If the data mart database instance does not reside in the same
warehouse as the project database instance, do not select this check
box.
3 Click OK.
The following image shows the data mart optimization option for the Forecast
Data database instance residing in the same warehouse as the Tutorial Data:
Data Mart Optimization Option
Why Optimize?
When you create a data mart using the primary project database instance or
using a database instance that resides in the same warehouse as the primary
project database instance, you simplify the SQL that is generated to create the
data mart. You also conserve the Intelligence Server machine resources by
minimizing the memory footprint.
The following table displays sample SQL generated when creating a data mart
using a database instance in the same data warehouse as the primary project
database instance, and when using a different data warehouse:
Sample SQL
When you create the data mart in the same data warehouse, MicroStrategy
Intelligence Server creates the data mart table in the project warehouse and
then inserts the result data rows directly into the table.
When you create the data mart in a different data warehouse, MicroStrategy
Intelligence Server extracts the results from the project data warehouse with a
SELECT statement and brings the result set into the Intelligence Server
machine’s memory. It then creates the data mart table in the different data
warehouse and inserts the results.
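The two execution paths can be sketched as follows, with SQLite standing in for the warehouses; the table and column names are made up for illustration.

```python
import sqlite3

project_wh = sqlite3.connect(":memory:")  # primary project warehouse
other_wh = sqlite3.connect(":memory:")    # a different warehouse

project_wh.executescript("""
    CREATE TABLE FORECAST_SALES (region_id INTEGER, units INTEGER);
    INSERT INTO FORECAST_SALES VALUES (1, 100), (1, 50), (2, 75);
""")

# Same warehouse (optimized): a single CREATE TABLE ... AS SELECT runs
# entirely inside the warehouse; no result rows pass through the server.
project_wh.execute("""
    CREATE TABLE REGION_FORECAST AS
    SELECT region_id, SUM(units) AS total_units
    FROM FORECAST_SALES GROUP BY region_id
""")

# Different warehouse: the server must SELECT the result set into its
# own memory, then create the table and insert the rows remotely.
rows = project_wh.execute("""
    SELECT region_id, SUM(units) FROM FORECAST_SALES
    GROUP BY region_id ORDER BY region_id
""").fetchall()
other_wh.execute(
    "CREATE TABLE REGION_FORECAST (region_id INTEGER, total_units INTEGER)"
)
other_wh.executemany("INSERT INTO REGION_FORECAST VALUES (?, ?)", rows)
other_wh.commit()
```

Both paths produce the same data mart table; the difference is where the rows travel, which is why the same-warehouse path conserves Intelligence Server memory.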
1 In the Report Editor, on the Data menu, select Configure Data Mart.
2 In the Report Data Mart Setup window, on the General tab, in the Data
mart database instance drop-down list, select the database instance in
which you want to create the data mart table.
3 In the Table name box, type the name of the data mart table you want to
create.
The table name you type is not validated by the system at this point.
By default, the This table name contains placeholders check box is selected.
When this check box is selected, the data mart table name can include
placeholders. Placeholders enable you to modify table names dynamically.
The following table lists the placeholders available for naming data mart tables.
Data Mart Placeholders
• !U—User name
• !O—Report name
• !j—Job ID
• !r—Report GUID
• !t—Timestamp
• !p—Project name
• !z—Project GUID
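Conceptually, placeholder expansion is simple token substitution into the table name. The following sketch illustrates the idea with hypothetical values; in practice, Intelligence Server supplies the actual values at report execution time.

```python
import datetime

def expand_placeholders(pattern, context):
    """Replace each placeholder token in the table name pattern."""
    for token, value in context.items():
        pattern = pattern.replace(token, value)
    return pattern

# Hypothetical execution-time values for some of the placeholders.
context = {
    "!U": "jsmith",                                      # user name
    "!O": "REGION_YEAR_FORECAST",                        # report name
    "!j": "1042",                                        # job ID
    "!t": datetime.date(2014, 3, 1).strftime("%Y%m%d"),  # timestamp
    "!p": "FORECASTING",                                 # project name
}

print(expand_placeholders("!p_!O_!t", context))
# FORECASTING_REGION_YEAR_FORECAST_20140301
```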
5 On the Advanced tab, specify data mart governors and table creation
properties.
6 On the SQL Statements tab, specify SQL statements that can be inserted
before and after the table is created or before data is inserted in the table.
You may see a warning that data mart tables created in common table
spaces may overwrite someone else’s data mart table. If you want to
proceed, click OK.
9 Execute the data mart report to create the data mart table.
For example, using the Forecasting Project, you can create a data mart report
that contains the Region and Year attributes and the Forecast Units Sold
metric, as shown in the image below:
Data Mart Report Definition
You can then convert this report into a data mart. The following image shows
the Report Data Mart Setup window with the data mart report configured to
create the REGION_YEAR_FORECAST_UNIT_SALES table in the Forecast
Data database instance:
Report Data Mart Setup Window
The following image shows a message displayed by the data mart report when
it is executed:
Data Mart Execution Complete Message
A data mart table has the same structure as any other data warehouse table. By
default, it contains columns corresponding to all attribute forms and metric
columns present on the report template.
You can control the structure of a data mart table in the following ways:
• You can control what attribute columns are included in the data mart table.
• You can determine the names for the columns that contain the metric
calculations.
Attribute Columns
A data mart table contains an attribute ID column for each attribute selected in
the data mart report. Additionally, depending on the default display for each
attribute in the data mart report, the data mart table can also include attribute
description columns.
Generally, you would remove any non-ID form descriptions from the
data mart report display to avoid storing duplicate attribute
descriptions in the data mart table.
Consider the data mart report from the previous example that has the Region
and Year attributes and the Forecast Units Sold metric on the template.
Assuming that the default display for the Region attribute is ID and
description, and for Year is ID, when the data mart report is executed, the data
mart table contains the following columns:
• REGION_ID
• REGION_NAME
• YEAR_ID
• WJXBFS1
If you do not want to include attribute description columns in your data mart
table (for example, to improve query performance), you must modify the
attribute display and the forms available in report objects for each attribute in
the data mart report.
2 In the Attribute Display window, in the Attribute drop-down list, select the
attribute whose display you want to modify.
3 Under Select one of the display options below, click Use the following
attribute forms.
5 Click the upper > button to move the ID form to the Displayed forms list.
7 Click the upper < button to remove the non-ID forms from the report
display.
9 Click the lower < button to remove the non-ID forms from the report
objects.
10 Click OK.
The following image shows the Attribute Display window with the Region
attribute configured to use only the ID form:
Attribute Display Options
Using the previous example, after you change the display for the Region
attribute from description to ID only, and then execute the data mart report,
the data mart table contains the following columns:
• REGION_ID
• YEAR_ID
• WJXBFS1
A data mart table contains a column that corresponds to each metric selected
in the data mart report. These columns, created from metric calculations,
become the fact columns.
If you want to use a different name, you can create a column alias for the fact
column that contains the metric calculation.
You specify the column alias in the Metric Editor of the metric on which the
column is based.
1 In the Metric Editor, on the Tools menu, point to Advanced Settings and
select Metric Column Options.
2 In the Metric Column Alias Options window, in the Column Name used in
table SQL creation box, type a name for the metric column.
3 In the Data type drop-down list, select the data type and, if appropriate,
define other relevant parameter setting(s).
The following image shows a custom Total_Unit_Sales column alias for the
Forecast Units Sold metric:
Metric Column Alias Options Window
Using the same example, after you change the column alias for the Forecast
Units Sold metric, and then execute the data mart report, the data mart table
contains the following columns:
• REGION_ID
• YEAR_ID
• Total_Unit_Sales
To use a data mart table as a source table in the project in which the data mart
was created, you must first add the table to the project, then update the
appropriate fact, and finally update the project schema.
2 On the Warehouse Tables pane, expand the Forecast Data database
instance. You can now see the data mart table.
3 Right-click the data mart table and select Add Table to Project.
4 The Results Preview window shows attributes and facts that will be created.
In the Results Preview window, in the Fact tab, clear the check box for the
facts.
There might be scenarios where you want to keep the facts. If you want
to create facts, ensure you select the appropriate fact check box in the
Fact tab of the Results Preview window.
5 Click OK.
To use the data mart table as a source table from which to execute reports, you
must update the fact on which the metric used to create the data mart table is
based.
1 In Architect graphical interface, edit the fact that is used in the metric of the
data mart report.
2 In the Project Tables view tab, locate the table that contains the fact and
right-click to edit the fact on which the data mart metric is based.
3 In the Fact Editor, create a new fact expression that uses the data mart table
as a source table.
If the fact column in the data mart table is named the same as the
underlying fact, you may select the Automatic mapping method for the
new fact expression. By default, when you create a data mart table, the
fact has a unique name that is different from the other facts in the
project.
4 Click OK.
The following image shows the Forecast Units Sold fact mapped to the data
mart aggregate table:
Mapping a Fact to a Data Mart Table
After you add the data mart table to a project and update the appropriate fact
object, you must also update the schema logical information in the metadata.
Lesson Summary
In this lesson, you learned:
• The database instance you select during the project creation process
becomes this project’s primary database instance.
• You can associate any number of secondary database instances with a single
project. You use secondary database instances to create data marts,
Freeform SQL, Query Builder, and MDX reports. You associate secondary
database instances with a project using the Project Configuration Editor.
• Base fact tables are tables that store a fact or set of facts at the lowest
possible level of detail. Aggregate fact tables are tables that store a fact or
set of facts at a higher, or summarized, level of detail.
• To use aggregate tables in the project, you first add the table to the project
using the Warehouse Tables pane in Architect graphical interface. If
necessary, you also map the existing attributes and facts to the aggregate
table.
• Logical table size is the sum of the weights of the attributes contained in the
table. Attribute weight is defined as the position of an attribute in its
hierarchy, divided by the number of attributes in the hierarchy, multiplied
by a factor of 10.
• You can change the logical table size either in the Logical Size Editor or
from the Properties pane in Architect graphical interface.
• A data mart is a relational table containing a report result set. A data mart
consists of two objects: the data mart report and the data mart table.
• You can use data marts to create aggregate tables and tables based on large
report result sets.
• Data mart reports are created in the Report Editor of a new or existing
report. After you execute the data mart report, the data mart table is created
in the chosen warehouse.
• You can choose to create only attribute ID columns in the data mart table by
removing all non-ID forms from the attribute display and report objects in
the data mart report.
• For the columns in the data mart table that contain metric calculations, you
can use the existing metric name as the column name or you can create a
new column alias.
• To use a data mart table in a project, you must add it to the project using the
Warehouse Tables pane in Architect graphical interface, update the
corresponding fact, and then update the project schema.
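The weight formula summarized above can be sketched in Python. This is a minimal illustration: the attribute positions and hierarchy sizes below are hypothetical, and any rounding MicroStrategy applies to the final size is not modeled here.

```python
def attribute_weight(position, hierarchy_size):
    # weight = (position of the attribute in its hierarchy
    #           / number of attributes in the hierarchy) * 10
    return position / hierarchy_size * 10

def logical_table_size(attributes):
    # Logical table size is the sum of the weights of the table's attributes.
    return sum(attribute_weight(pos, size) for pos, size in attributes)

# Hypothetical table keyed on Region (level 2 of 5 in Geography)
# and Year (level 1 of 3 in Time):
size = logical_table_size([(2, 5), (1, 3)])
print(f"{size:.2f}")  # 7.33
```

A table keyed on higher-level (less detailed) attributes therefore gets a smaller logical size, which is why the Engine prefers it when resolving a report.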
Exercises: Managing Project Schema
You will also use the Forecasting Project to complete other exercises
throughout this course.
The schema for this project consists of the following lookup tables:
Lookup Tables
The schema for this project consists of the following fact tables:
Fact Tables
Before you create a data mart, you will first modify the Forecast Revenue
metric to use a custom Forecast_Revenue column alias. You will then create a
data mart report with Region ID and Quarter ID attribute forms and the
Forecast Revenue and Forecast Units Sold metrics on the template. Name
the data mart table REGION_FORECAST_SALES. Save the data mart report
as Regional Forecast Revenue in the Public Objects\Reports folder.
Next, you will add the table to the Forecasting Project. Then, you will create a
new fact expression for the Forecast Revenue fact that uses the
Forecast_Revenue column in the REGION_FORECAST_SALES table. You
will also create and run a Data Mart Test report with the Region attribute and
the Forecast Revenue metric on the template to confirm that the Forecast
Revenue fact uses the REGION_FORECAST_SALES table.
Finally, you will change the logical table size for the
REGION_FORECAST_SALES table to 30 and run the Data Mart Test report
to view the impact of your change on the report SQL.
Detailed Instructions
Modify metric alias
3 In the Public Objects folder, in the Metrics folder, edit the Forecast
Revenue metric.
4 In the Metric Editor, on the Tools menu, point to Advanced Settings and
select Metric Column Options.
5 In the Metric Column Alias Options window, in the Column Name used in
table SQL creation box, type Forecast_Revenue.
8 If you did not edit the Forecast Units Sold metric during the lesson with
your instructor, repeat steps 3 to 7 for the Forecast Units Sold metric so
that the Column Name used in table SQL creation box contains
Total_Unit_Sales.
9 In the Public Objects folder, in the Reports folder, create the following
report:
You can access the Region attribute from the Geography hierarchy.
You can access the Quarter attribute from the Time hierarchy. The
Forecast Revenue and Forecast Units Sold metrics are located in the
Metrics folder.
12 Under Select one of the display options below, click Use the following
attribute forms.
14 Click the upper > button to move the ID form to the Displayed forms list.
16 Click the upper < button to remove the DESC form from the Displayed
forms list.
18 Click the lower < button to remove the DESC form from the Report objects
forms list.
20 Under Select one of the display options below, click Use the following
attribute forms.
22 Click the upper > button to move the ID form to the Displayed forms list.
24 Click the upper < button to remove the DESC form from the Displayed
forms list.
26 Click the lower < button to remove the DESC form from the Report objects
forms list.
27 Click OK.
28 In the Report Editor, on the Data menu, select Configure Data Mart.
29 In the Report Data Mart Setup window, on the General tab, in the Data
mart database instance drop-down list, ensure the Forecast Data database
instance is selected.
31 Click OK.
After the report executes, you see a message that the result data has been
stored in the REGION_FORECAST_SALES table, as shown below:
37 In the Read Only window, select Edit: This will lock all schema objects in
this project from other users.
38 Open the Architect graphical interface and click the Project Tables View tab.
41 In the Results Preview window, in the Fact tab, clear the Forecast Revenue
and Total Unit Sales check boxes.
42 Click OK.
43 In the Project Tables View tab, find the FORECAST_SALES table and select
the Forecast Revenue fact.
45 In the Fact Editor, create a new fact expression that uses the
Forecast_Revenue column in the REGION_FORECAST_SALES table as
a source table.
47 Click OK.
49 In the Project Tables View tab, find the FORECAST_SALES table and select
the Forecast Units Sold fact.
51 In the Fact Editor, create a new fact expression that uses the
Total_Unit_Sales column in the REGION_FORECAST_SALES table as a
source table.
53 Click OK.
58 Click OK.
Ensure that the Recalculate table logical sizes check box is selected
in the Schema Update window.
You can access the Region attribute from the Geography hierarchy.
The Forecast Revenue metric is located in the Metrics folder.
62 Run the report. The result set should resemble the following:
63 View the report in SQL View. The SQL should look like the following:
In the FROM clause, notice that the data is retrieved from the new
aggregate table REGION_FORECAST_SALES.
64 Save the report in the Public Objects/Reports folder as Data Mart Test and
close the report.
66 In Architect graphical interface, on the Design tab, click the Edit logical
size of tables button as shown below:
When you select this check box, the logical size for the table will not be
recalculated when you update the schema.
69 Click OK.
Ensure that the Recalculate table logical sizes check box is selected
in the Schema Update window.
Test the change of the logical table size on the report SQL
72 In Developer, in the Reports folder, right-click the Data Mart Test report
and select View SQL. The SQL should look like the following:
Notice that this time, the Engine picked the base FORECAST_SALES table
over the aggregate REGION_FORECAST_SALES table because it has a
smaller logical table size.
Lesson Description
In this lesson, you will first learn how MultiSource Option works, including the
use of primary and secondary database instances at the table level, support for
duplicate tables, SQL generation for multisource reports, and common use
cases. Then, you will learn how to configure a project for heterogeneous data
access, including associating tables with multiple database instances, creating
objects to use in multisource reports, and creating multisource reports.
Lesson Objectives
After completing the topics in this lesson, you will be able to:
• Associate tables with single or multiple data sources, change the primary
database instance for a table, and remove a database instance from a
table. (Page 88)
• Create a report that contains objects from multiple data sources and
describe how the Engine processes the SQL for such reports. (Page 104)
By default, the objects in a standard report come from a single data source.
However, you can use the MultiSource Option—an add-on component to
Intelligence Server—to overcome this limitation. MultiSource Option enables
you to define a single project schema that uses multiple data sources. As a
result, you can create a standard report that executes SQL against multiple
data sources.
For example, consider the following scenario in which actual revenue data is
stored in one data warehouse, while forecast revenue data is stored in a second
data warehouse:
Report with Objects from Multiple Data Sources
If you want to create a report that includes the revenue and forecast revenue
data for each region, you must execute SQL against both data warehouses to
retrieve the result set. You obtain the data for each of the metrics from their
respective data warehouses. And you can also obtain the region data from
either data warehouse, since it exists in both databases. MultiSource Option
enables you to create this type of report.
With MultiSource Option, you have the ability to define primary and secondary
database instances at the table level and connect to them directly within the
MicroStrategy platform. This capability enables you to define a project schema
across multiple relational data sources.
• You can add tables to the project from different database instances, not just
the primary database instance for the project.
Any SQL database instance that exists within the project source is
available to the project as a secondary database instance from which
you can select tables.
• You can associate a single project table with multiple database instances,
which essentially creates duplicate tables.
However, keep in mind that MultiSource Option has the following limitations:
• You can use MultiSource Option to connect to any data source that you
access using an ODBC driver, including Microsoft Excel® files and text
files. You cannot use MultiSource Option to connect to MDX or other
non-relational data sources.
• MultiSource Option does not support fact tables partitioned across multiple
data sources.
You can have lookup, relationship, and fact tables duplicated across multiple
data sources. However, because of how the Engine selects the data source for
fact tables, there is benefit to using duplicate tables only for lookup and
relationship tables.
For information on how the Engine selects the data sources for tables,
see “Selecting the Optimal Data Source for Fact Tables” starting on
page 80 and “Selecting the Optimal Data Source for Lookup Tables”
starting on page 81.
When you bring duplicate tables to the project, you must consider the
following guidelines required by MultiSource Option:
• Corresponding columns in duplicate tables must either have the same data
type or compatible data types.
To maintain data consistency, the Engine applies data type
compatibility rules when it joins columns in tables from different
database instances. For information on these rules, see “Joining
Data from Different Data Sources” starting on page 81.
• The number of columns in the table associated with the primary database
instance has to be less than or equal to the number of columns in the table
associated with the secondary database instance. Any extra columns in the
secondary table are not imported into the project.
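The column-count guideline can be expressed as a small check. This is a sketch only; table structures are represented simply as lists of column names, and the function name is invented for illustration.

```python
def import_duplicate_table(primary_cols, secondary_cols):
    # The primary table may not have more columns than the secondary table.
    if len(primary_cols) > len(secondary_cols):
        raise ValueError("primary table has more columns than the secondary table")
    # Extra columns in the secondary table are not imported into the project,
    # so only the primary table's columns define the project table.
    return list(primary_cols)

cols = import_duplicate_table(
    ["REGION_ID", "REGION_DESC"],
    ["REGION_ID", "REGION_DESC", "LOAD_DATE"],  # extra column is ignored
)
print(cols)  # ['REGION_ID', 'REGION_DESC']
```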
With reports that use multiple data sources, much of the work of the SQL
Engine remains the same. However, the Engine performs two additional tasks:
Every project table has a primary database instance. The primary database
instance is the first one to which it is mapped. If you have duplicate tables, the
same table can have both primary and secondary database instances. The
primary table is the one that exists in the primary database instance. The
secondary table is the one that exists in the secondary database instance.
You can change which database instance is the primary one for a table.
For information on changing the primary database instance for a table,
see “Changing the Primary Database Instance for a Table” starting on
page 95.
You can have multiple secondary tables if the table is mapped to more
than one secondary database instance.
In this example, the two fact tables each map to only one database. The
primary database instance for the REGION_SALES table is the Sales Data
Warehouse. The primary database instance for the FORECAST_SALES table is
the Forecast Data Warehouse.
However, the LU_REGION table exists in both data warehouses, so you can
map it to both database instances as duplicate lookup tables. You can assign
either data warehouse as the primary or secondary database instance for this
table.
If you designate the Sales Data Warehouse as the primary database instance
for this table, the LU_REGION table from that database is the primary table.
The LU_REGION table from the Forecast Data Warehouse is the secondary
table.
If a table is available in a single data source, that source is the only one the
Engine can use in the report SQL to obtain the necessary data. However, if a
table is available in multiple data sources, the Engine uses specific logic to
select the optimal data source.
SQL generation for reports is focused on metric data. When the Engine needs
to calculate a metric, it first has to determine the best source for the underlying
fact. After taking into account attributes in the template, the metric's
dimensionality, and report or metric filters, the Engine uses the following logic
to select the optimal data source for a fact:
• If the fact comes from a fact table that is available in the primary database
instance for the project, the Engine calculates the metric using the primary
database instance for the project.
• If the fact comes from a fact table that is not available in the primary
database instance for the project, the Engine calculates the metric using a
secondary database instance. If the fact table is available in more than one
secondary database instance, the Engine selects the database instance with
the smallest GUID (alphabetically).
In essence, the Engine considers only the primary and secondary database
instance designation at the project level when selecting the data source for a
fact. When you have a fact table available in multiple sources, it does not
matter which sources have primary versus secondary designation for the table.
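The fact-source selection rules can be sketched as a small function. This is a simplification: database instances are represented here by their GUID strings, which sort alphabetically.

```python
def fact_source(fact_table_instances, project_primary):
    # If the fact table exists in the project's primary database instance,
    # the metric is calculated there.
    if project_primary in fact_table_instances:
        return project_primary
    # Otherwise, use the secondary instance with the smallest GUID
    # (alphabetically).
    return min(fact_table_instances)

print(fact_source({"guid-b", "guid-c"}, "guid-a"))  # guid-b
print(fact_source({"guid-a", "guid-c"}, "guid-a"))  # guid-a
```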
After selecting the optimal data source for a fact, the Engine also has to select
the best source for any corresponding attributes. The Engine uses the following
logic to select the optimal data source for an attribute:
• If the attribute comes from a lookup table that exists in the same data
source as the one selected for the fact, the Engine obtains the attribute data
from this same database instance.
• If the attribute comes from a lookup table that does not exist in the same
data source as the one selected for the fact, the Engine obtains the attribute
data from the primary database instance for the lookup table and moves it
to the database instance used as the fact source.
In essence, when you have a lookup table available in multiple sources, it can
matter which sources have primary versus secondary designation for the table.
This same logic also applies if the Engine has to retrieve attribute
information from a relationship table.
However, if you are just browsing attribute elements, the Engine treats lookup
tables for attributes like fact tables. If the lookup table exists in the primary
database instance for a project, the Engine queries that database instance.
Otherwise, it uses the secondary database instance with the smallest GUID
(alphabetically).
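Similarly, the lookup-table rules, and the element-browsing exception, can be sketched as follows, under the same simplifying assumption that instances are GUID strings. The second element of the first function's return value indicates whether the Engine must move the attribute data to the fact source.

```python
def attribute_source(lookup_instances, fact_instance, lookup_primary):
    # If the lookup table exists in the instance chosen for the fact,
    # read the attribute data there; no data movement is needed.
    if fact_instance in lookup_instances:
        return fact_instance, False
    # Otherwise, read from the lookup table's primary instance and
    # move the data to the instance used as the fact source.
    return lookup_primary, True

def element_browse_source(lookup_instances, project_primary):
    # For element browsing, lookup tables are treated like fact tables:
    # primary instance first, then the smallest GUID among secondaries.
    if project_primary in lookup_instances:
        return project_primary
    return min(lookup_instances)
```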
When the Engine needs to join data from different data sources, it selects data
from the first data source into the memory of the Intelligence Server. Then, it
creates a temporary table in the second data source and inserts the data into
this table to continue processing the result set.
You may have a data source that either does not support the creation of
temporary tables or in which you do not want to create temporary
tables. If so, you can configure the CREATE and INSERT support
VLDB property for the corresponding database instance to not support
CREATE and INSERT statements. This action makes the database
instance read only and forces the Engine to always move data out of this
source.
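The temporary-table mechanics described above can be sketched with two SQLite databases standing in for the two data sources. This is an illustration only; the table names and columns are invented, and the real SQL is generated by the Engine.

```python
import sqlite3

# Two in-memory databases stand in for the two data sources.
src_a = sqlite3.connect(":memory:")   # holds the first pass's result
src_b = sqlite3.connect(":memory:")   # where the join will be completed

src_a.execute("CREATE TABLE pass1 (region_id INTEGER, revenue REAL)")
src_a.executemany("INSERT INTO pass1 VALUES (?, ?)", [(1, 100.0), (2, 250.0)])

# 1. Select the first source's rows into (Intelligence Server) memory.
rows = src_a.execute("SELECT region_id, revenue FROM pass1").fetchall()

# 2. Create a temporary table in the second source and insert the rows,
#    so later passes can join against them there.
src_b.execute("CREATE TABLE temp_pass1 (region_id INTEGER, revenue REAL)")
src_b.executemany("INSERT INTO temp_pass1 VALUES (?, ?)", rows)

print(src_b.execute("SELECT COUNT(*) FROM temp_pass1").fetchone()[0])  # 2
```

Marking a database instance as read only via the CREATE and INSERT support VLDB property corresponds to forbidding step 2 against that source, which forces the Engine to always move data out of it instead.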
To work with tables from different data sources, the Engine joins table columns
based on specific data type compatibility rules. The following table lists these
rules:
Data Type          Compatible Data Type
BigDecimal         BigDecimal
Binary             Binary
CellFormatData     CellFormatData
LongVarBin         LongVarBin
LongVarChar        LongVarChar
NChar              NChar
NVarChar           NVarChar
Unsigned           Unsigned
VarBin             VarBin
The data type of a column is based on the Engine data type definition, not the
data type definition in the physical database.
These two data type definitions may not always be the same.
Often, different data sources are optimal for different passes in the SQL for a
report. For such queries, the Engine follows the data type compatibility rules
and moves data between the different data sources until it finishes processing
the result set.
You can obtain the region data from either data warehouse. However, revenue
data is available only in the Sales Data Warehouse, and forecast revenue data is
available only in the Forecast Data Warehouse.
You can store lookup and relationship tables in a single data source, but it is
also common for lookup and relationship tables to be split across data sources.
Therefore, you may have a report that requires you to join data in one data
source using relationships stored in another data source.
The lookup tables that relate the Category, Subcategory, and Item attributes
are stored in the Sales Data Warehouse. Forecast revenue is stored at the item
level in the Forecast Data Warehouse. You need to aggregate this data to the
category level to calculate the forecast revenue for the report. However, the
Forecast Data Warehouse stores only the relationship between Item and
Subcategory. Therefore, this aggregation requires you to join the item-level
forecast revenue data to the category data using the lookup tables in the Sales
Data Warehouse.
Filtering Qualifications
You may also have reports where you need to filter data from one data source
by qualifying on data that comes from another data source.
The report contains a filter based on the Forecast Revenue metric. This fact
data is stored in the Forecast Data Warehouse. However, you use this filter to
qualify on the revenue for each category, which is stored in the Sales Data
Warehouse.
You can have this same scenario with other types of filter qualifications. The
following image shows an example that involves an attribute qualification:
Attribute Qualification
The report contains a filter based on the Category attribute. This attribute is
stored in the Sales Data Warehouse. However, you use this filter to determine
which elements are included in the result set for the Product attribute, which is
stored in the Forecast Data Warehouse.
These cases are just a few examples of how you can use MultiSource Option to
combine data from multiple sources in a single report. You can use
MultiSource Option to help meet a variety of reporting needs, including the
following:
• Using separate data sources for simple versus more complex queries
Now that you have learned how MultiSource Option works, you are ready to
learn how to configure a project for heterogeneous data access.
Although you can use MultiSource Option with standard reports, you
cannot use it in conjunction with Freeform SQL or Query Builder
reports. The Engine has to use multipass SQL to access multiple data
sources and move data between them. Since Freeform SQL and Query
Builder reports allow only a single pass of SQL, they cannot take
advantage of this feature.
For example, in the following scenario, you have two databases that you want
to use as data sources for the MicroStrategy Tutorial project:
Data Sources for the MicroStrategy Tutorial Project
Tutorial Data is the primary data warehouse for the project. It contains a
variety of lookup, relationship, and fact tables. It stores the actual revenue
data.
In this lesson, you will analyze two scenarios for joining data from
multiple sources. One scenario uses the aggregate fact table and
generates a relatively simple SQL statement. The second scenario uses
the base fact table, and therefore generates a more complex SQL
statement.
For any report where you want to include information about actual and
forecast revenue, you need to access tables in both of these data warehouses
and join the data to produce the result set. To make a report like this possible,
you need to add the tables from Forecast Data to the MicroStrategy Tutorial
project.
You can add tables to a project from other database instances using the
Architect graphical interface.
In the sample scenario, there are many tables that map only to Tutorial Data,
which is the primary database instance for the project. These tables are already
part of the project.
However, there are two tables that map to a data source other than Tutorial
Data:
Table with a Single Data Source
1 In Developer, open the project to which you want to add the tables.
OR
If the secondary database instance is not associated with the project, in the
Warehouse Tables pane, right-click anywhere and select Select Database
Instance.
In the Select Database Instance window, select the database instance you
want to associate with the project.
Click OK.
4 In the Warehouse Tables pane, expand the database instance that contains
the tables you want to add.
The tables display on the Project Tables View tab. The color mapping
for the tables indicates their association with the selected database
instance.
The following image shows the Architect graphical interface with the
FORECAST_SALES and REGION_FORECAST_SALES tables added to the
project:
FORECAST_SALES and REGION_FORECAST_SALES Tables Added to
Project
The color mapping shows that these tables are associated with Forecast Data,
not Tutorial Data.
In the sample scenario, there are two tables that map to both data sources:
Tables with Multiple Data Sources
These two tables are already part of the project and associated with the Tutorial
Data database instance. However, you need to add them to the project for the
Forecast Data database instance as well.
OR
If the secondary database instance is not associated with the project, in the
Warehouse Tables pane, right-click anywhere and select Select Database
Instance.
In the Select Database Instance window, select the database instance you
want to associate with the project.
Click OK.
4 In the Warehouse Tables pane, expand the database instance that contains
the tables you want to add.
6 In the warning window, under Available options, ensure the Indicate that
<Table Name> is also available from the current DB Instance option is
selected.
If you click the Make no changes to <Table Name> option, the
table is not mapped to the selected database instance, and no
duplicate table is created.
If you want to view and respond to the warnings for each duplicate table
individually, click OK.
OR
If you want to respond to the warnings for all duplicate tables at the same
time, click OK for All.
The tables display on the Project Tables View tab. The color mapping
for the tables indicates their association with multiple database
instances. The color that corresponds to the primary database
instance is used for the table header, while the color that
corresponds to the secondary database instance is used to shadow
the table.
The following image shows the Architect graphical interface with the
LU_COUNTRY and LU_REGION tables mapped to multiple data sources:
LU_COUNTRY and LU_REGION Mapped to Multiple Data Sources
The primary database instance for these tables is Tutorial Data. However, the
color mapping behind the table indicates that they are also mapped to the
Forecast Data database instance.
For example, in the sample scenario, there are two tables that map to both data
sources:
Tables with Multiple Data Sources
LU_COUNTRY and LU_REGION exist in both the Tutorial Data and Forecast
Data data warehouses. These tables were first added to the project using
Tutorial Data, so it is the primary database instance for these tables. However,
you can change the primary database instance for either of these tables from
Tutorial Data to Forecast Data.
You can use the Architect graphical interface to change the primary
database instance for a table.
To change the primary database instance for a table in the Architect graphical
interface:
3 In the Architect graphical interface, in the Properties pane, click the Tables
tab.
4 On the Tables tab, in the drop-down list, select the table for which you want
to change the primary database instance.
6 Beside the current primary database instance, click the Browse button:
9 Click OK.
The changes you made are reflected in the Properties pane, in the
database instances associated with the Primary DB Instance and
Secondary DB Instances properties.
When you change the primary database instance for a table, the color mapping
in the Architect graphical interface also changes to reflect the appropriate
associations.
However, when a table maps to multiple data sources, you may want to remove
one database instance, while still keeping the table associated with other
database instances.
In such cases, you do not want to remove the table from the project,
since that removes its association with all database instances.
The same Available Database Instance window that you used to change the
primary database instance of a table in Architect graphical interface can be
used to remove a database instance from a table.
If the database instance you want to remove is the primary database
instance, you must change the primary database instance first. This
action removes the association between the database instance and the
table.
After you have associated tables with the appropriate database instances, the
next step in configuring a project to access heterogeneous data sources is to
create the necessary objects for multisource reports. When you add tables from
other data sources, you may need to create additional attributes, facts, or
metrics that you can use in reports to access that data.
For example, in the sample scenario, you have four tables in the Forecast Data
data warehouse that are used in the project:
Forecast Data Tables Used in Project
Since the LU_COUNTRY and LU_REGION tables also exist in the Tutorial
Data data warehouse, the project already contains the corresponding
attributes. If these attributes use the Automatic mapping method, Architect
automatically maps them to the duplicate tables from Forecast Data.
The following image shows the Forecast Revenue fact mapped to the
REGION_FORECAST_SALES table:
Mapping of Forecast Revenue Fact
Later in this lesson, you will modify this fact to point to the
FORECAST_SALES base fact table instead of the aggregate table.
Now that you have created the necessary objects for multisource reports, the
final step in configuring a project to access heterogeneous data sources is to
create a report that uses objects from multiple data sources.
In this report, you can obtain data for the Region attribute from either the
Tutorial Data or Forecast Data data warehouses. You can obtain data for the
Revenue metric only from Tutorial Data. You can obtain data for the Forecast
Revenue metric only from Forecast Data.
2 Aggregate the forecast revenue for each region using Forecast Data.
3 Obtain the region descriptions using either Tutorial Data or Forecast Data.
4 Consolidate the region, revenue, and forecast revenue data to produce the
final result set.
Based on the logic the Engine uses to select the optimal data source for each
pass, it moves data between the two databases to process the query.
The following images show the primary SQL passes for the sample report and
explain what happens in each pass.
This first image shows the SQL passes the Engine uses to calculate the Revenue
metric in the report:
SQL for Calculating the Revenue Metric
Since revenue data is stored only in Tutorial Data, the Engine performs this
calculation in that database. The first SQL pass creates a temporary table in
which to store the revenue for each region. The second SQL pass aggregates the
revenue for each region using a fact table in Tutorial Data and inserts it into the
temporary table.
This second image shows the SQL passes the Engine uses to calculate the
Forecast Revenue metric in the report:
SQL for Calculating the Forecast Revenue Metric
Since forecast revenue data is stored only in Forecast Data, the Engine
performs this calculation in that database. The first SQL pass calculates the
forecast revenue for each region using the aggregate fact table in Forecast Data.
The second SQL pass creates a temporary table in Tutorial Data in which to
store the forecast revenue for each region. The third SQL pass inserts the
forecast revenue data for each region into the temporary table in Tutorial Data.
When data is moved from one data source to another, the SQL view of a
report does not show the entire INSERT statement. Rather than
showing all the values that are inserted into a temporary table, it shows
only a single set of values. However, it does show the number of rows
inserted into the table above the INSERT statement.
Now that the Engine has calculated all the metrics for the report, it uses
Tutorial Data to obtain the region descriptions and consolidate the results of
both metric calculations.
To minimize the movement of data, the Engine performs the final
consolidation pass using the database instance in which the most
temporary tables were created while processing the report. In this
example, both temporary tables were created in the Tutorial Data
database instance, so the Engine selects the primary database instance.
If there is a tie between two or more database instances, the Engine
selects the primary database instance for the project. If none of them
is the primary database instance for the project, the Engine selects the
database instance with the smallest GUID (alphabetically).
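The selection logic for the consolidation pass can be sketched as a small function. The inputs (temp-table counts per instance, the project's primary instance, and per-instance GUIDs) are hypothetical stand-ins for state the Engine tracks internally:

```python
def pick_consolidation_instance(temp_counts, primary, guids):
    """Pick the instance with the most temp tables; break ties with the
    project's primary instance, then the smallest GUID (alphabetically)."""
    best = max(temp_counts.values())
    tied = [db for db, n in temp_counts.items() if n == best]
    if len(tied) == 1:
        return tied[0]
    if primary in tied:
        return primary
    return min(tied, key=lambda db: guids[db])

# Both temporary tables were created in Tutorial Data, so it wins outright.
print(pick_consolidation_instance(
    {"Tutorial Data": 2, "Forecast Data": 0},
    primary="Tutorial Data",
    guids={"Tutorial Data": "A1B2", "Forecast Data": "C3D4"}))
# Tutorial Data
```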
This last image shows the SQL pass the Engine uses to produce the final result
set for the report:
SQL for Consolidating the Result Set
This SQL pass retrieves the region descriptions from a lookup table in Tutorial
Data. Then, it retrieves the region IDs and revenue and forecast revenue for
each region from the appropriate temporary tables to produce the final result
set.
After the consolidation pass, the Engine also generates SQL to drop each
temporary table created while processing the query unless you configure
the VLDB properties to change this behavior.
The following image shows the result set for the report:
Result Set for Sample Report
In the first scenario, the forecast revenue data was readily available in the
secondary data source at the requested report level. Both the Region attribute
and the Forecast Revenue fact were mapped to the same aggregate table.
Therefore, the report SQL was fairly straightforward.
But what happens if the relationship between the attribute on the report
template and the attributes at which the fact table is stored is not defined in
the secondary data source? This is the case when you map the Forecast
Revenue fact to the FORECAST_SALES base fact table from the Forecast Data
database instance instead of the aggregate table. This base fact table stores the
forecast data at the employee, item, date, and order levels. However, in the
sample report, you want to aggregate the forecast data to the region level.
To use the FORECAST_SALES fact table, you need to modify the Forecast
Revenue fact to map to that table. You need to use the following expression
that combines three fact columns from this table:
FORECAST_QTY_SOLD * (FORECAST_UNIT_PRICE
- FORECAST_DISCOUNT)
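Applied to a single hypothetical FORECAST_SALES row, the expression evaluates as follows (the column values here are made up for illustration):

```python
# One illustrative row from the FORECAST_SALES base fact table.
row = {"FORECAST_QTY_SOLD": 4,
       "FORECAST_UNIT_PRICE": 25.0,
       "FORECAST_DISCOUNT": 5.0}

# Forecast Revenue = quantity * (unit price - discount).
forecast_revenue = row["FORECAST_QTY_SOLD"] * (
    row["FORECAST_UNIT_PRICE"] - row["FORECAST_DISCOUNT"])
print(forecast_revenue)  # 80.0
```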
You then have to remove the fact expression that points to the
REGION_FORECAST_SALES aggregate table from the fact definition. The
following image shows the Forecast Revenue fact mapped to the
FORECAST_SALES table:
Mapping of Forecast Revenue Fact
When you now run the sample report to retrieve the forecast data, the Engine
must aggregate the forecast data from the employee level to the region level.
The process of determining the relationship between these two attributes takes
place in the report SQL.
After updating the project schema, you may have to purge the report
cache to view the new SQL.
The following images show the primary SQL passes for the sample report and
explain what happens in each pass.
This first image shows the SQL passes the Engine uses to calculate the Revenue
metric in the report:
SQL for Calculating the Revenue Metric
Since revenue data is stored only in Tutorial Data, the Engine performs this
calculation in that database. The first SQL pass creates a temporary table in
which to store the revenue for each region. The second SQL pass aggregates the
revenue for each region using a fact table in Tutorial Data and inserts it into the
temporary table.
This second image shows the SQL passes the Engine uses to determine the
relationships between call centers and employees:
SQL for Determining Call Center and Employee Relationships
Call center and employee relationships are stored only in Tutorial Data. The
FORECAST_SALES fact table stores its facts at the employee level. However,
for this report, the Engine has to calculate the Forecast Revenue metric at the
region level. Before the Engine can calculate this metric at the region level, it
first has to determine the relationships between employees, call centers, and
regions to accurately aggregate the forecast revenue data. Since the forecast
revenue data is stored only in Forecast Data, the Engine selects the necessary
call center and employee data from Tutorial Data and moves it to Forecast
Data.
The first SQL pass selects the call centers and employees from Tutorial Data.
The second SQL pass creates a temporary table in Forecast Data in which to
store the call centers and employees. The third SQL pass inserts the call centers
and employees from Tutorial Data into the temporary table in Forecast Data.
This third image shows the SQL passes the Engine uses to determine the
relationships between regions and call centers:
SQL for Determining Region and Call Center Relationships
Region and call center relationships are also stored only in Tutorial Data. Just
as the Engine has to determine the relationships between employees and call
centers, it also has to determine the relationships between call centers and
regions before aggregating the forecast revenue data to the region level. Since
the forecast revenue data is stored only in Forecast Data, the Engine selects the
necessary region and call center data from Tutorial Data and moves it to
Forecast Data.
The first SQL pass selects the regions and call centers from Tutorial Data. The
second SQL pass creates a temporary table in Forecast Data in which to store
the regions and call centers. The third SQL pass inserts the regions and call
centers from Tutorial Data into the temporary table in Forecast Data.
Now, the Engine has all the information it needs to aggregate the forecast
revenue data from the employee level to the region level.
This fourth image shows the SQL passes the Engine uses to calculate the
Forecast Revenue metric in the report:
SQL for Calculating the Forecast Revenue Metric
Since forecast revenue data is stored only in Forecast Data, the Engine
performs this calculation in that database. The first SQL pass aggregates the
forecast revenue for each region using the appropriate fact table in Forecast
Data. The second SQL pass creates a temporary table in Tutorial Data in which
to store the forecast revenue for each region. The third SQL pass inserts the
forecast revenue data for each region into the temporary table in Tutorial Data.
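The aggregation pass can be sketched in SQLite: the employee-level fact table is joined to the two relationship temporary tables that were moved over from Tutorial Data, then rolled up to the region level. All table, column, and temp-table names are illustrative assumptions:

```python
import sqlite3

# Stand-in for the Forecast Data database after the relationship data
# has been moved over from Tutorial Data.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE FORECAST_SALES (EMP_ID INTEGER, FC_REV REAL);
    INSERT INTO FORECAST_SALES VALUES (10, 40.0), (11, 60.0), (12, 20.0);

    -- relationship temp tables populated from Tutorial Data
    CREATE TEMP TABLE ZZT_EMP (EMP_ID INTEGER, CALL_CTR_ID INTEGER);
    INSERT INTO ZZT_EMP VALUES (10, 100), (11, 100), (12, 200);
    CREATE TEMP TABLE ZZT_CC (CALL_CTR_ID INTEGER, REGION_ID INTEGER);
    INSERT INTO ZZT_CC VALUES (100, 1), (200, 2);
""")

# Aggregate employee-level forecast revenue up to the region level.
region_totals = dict(db.execute("""
    SELECT cc.REGION_ID, SUM(f.FC_REV)
    FROM FORECAST_SALES f
    JOIN ZZT_EMP e ON f.EMP_ID = e.EMP_ID
    JOIN ZZT_CC cc ON e.CALL_CTR_ID = cc.CALL_CTR_ID
    GROUP BY cc.REGION_ID
"""))
print(region_totals)  # {1: 100.0, 2: 20.0}
```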
Now that the Engine has calculated all the metrics for the report, it uses
Tutorial Data to obtain the region descriptions and consolidate the results of
both metric calculations.
This last image shows the SQL pass the Engine uses to produce the final result
set for the report:
SQL for Consolidating the Result Set
This SQL pass retrieves the region descriptions from a lookup table in Tutorial
Data. Then, it retrieves the region IDs and revenue and forecast revenue for
each region from the appropriate temporary tables to produce the final result
set.
After the consolidation pass, the Engine also generates SQL to drop each
temporary table created while processing the query unless you configure
the VLDB properties to change this behavior.
The following image shows the result set for the report:
Result Set for Sample Report
Lesson Summary
In this lesson, you learned:
• With MultiSource Option, you can create a standard report that executes
SQL against multiple data sources.
• With MultiSource Option, you can connect to any data source that you
access using an ODBC driver, including Microsoft Excel and text files. You
cannot connect to MDX or non-relational data sources.
• With MultiSource Option, you can define primary and secondary database
instances at the table level and connect to them directly within the
MicroStrategy platform.
• You create duplicate tables when you map the same project table to more
than one database instance.
• When the Engine generates SQL for multisource reports, it determines the
optimal database instance for each SQL pass and identifies when joins need
to occur across database instances.
• If you have duplicate tables, the primary table is the one that is mapped to
the primary database instance. The secondary table is the one that is
mapped to the secondary database instance.
• The Engine uses specific logic to select the optimal data source for fact and
lookup tables.
• When the Engine needs to join data from different data sources, it moves
data between them as it processes the result set for a report. It joins table
columns using specific data type compatibility rules.
• You can change the primary database instance for a table in the Architect
graphical interface.
• You can remove a database instance from a table in the Architect graphical
interface.
Exercises: Using MicroStrategy MultiSource
Option
You should complete the following exercises using the MicroStrategy Tutorial
project, which is found in the MicroStrategy Analytics Modules project source.
Overview
You may have already completed the exercise depicted below if you followed
along with your instructor during the demonstration for this chapter. If you
already brought the FORECAST_SALES, REGION_FORECAST_SALES,
LU_REGION, and LU_COUNTRY tables to the Tutorial project from the
Forecast Project, and created the Forecast Revenue fact and metric, you may
skip this exercise. The subsequent exercises depend on completing these
tasks first.
In this exercise, you will use the Warehouse Tables pane in the Architect
graphical interface to add the following tables from the Forecast Data
database instance to the MicroStrategy Tutorial project:
FORECAST_SALES
LU_COUNTRY
LU_REGION
Detailed Instructions
Add the LU_REGION and LU_COUNTRY tables to the project and change the
Database Instance
5 In the Warehouse Tables pane, in the list of tables available in the Forecast
Data database instance, right-click the LU_REGION table and select Add
Table to Project.
6 In the Options window, under Available options, keep the Indicate that
LU_REGION is also available from the current DB Instance option
selected.
7 Click OK.
8 In the Architect graphical interface, in the Properties pane, click the Tables
tab.
11 Beside the current primary database instance, click the Browse button:
13 Click OK.
17 In the Results Preview window, on the Fact tab, keep Forecast Unit Price
only and clear all other facts. On the Attribute tab, keep the default
selection.
18 Click OK.
21 In the Create New Fact Expression window, in the Source table list, select
FORECAST_SALES.
24 Click OK.
27 Click OK.
Overview
In this exercise, you will use the Warehouse Tables pane in the Architect
graphical interface to add the following tables from the Forecast Data
database instance to the MicroStrategy Tutorial project:
LU_PRODUCT
LU_SUBCATEG
You will also add the LU_GROUP table from the Inventory Data database
instance. As you add these tables, you should configure them as follows:
You need the Product attribute for the multisource report you will create later
in these exercises.
• Make sure the ID and DESC attribute forms for the Product attribute map
the following as source tables:
LU_PRODUCT
The primary lookup table for the Product attribute displays in bold
text. You should use Automatic mapping for both attribute form
expressions.
After you have completed these tasks, save and update the project schema,
and close Architect.
Detailed Instructions
13 In the Modify Attribute Form window, ensure that the LU_PRODUCT and
LU_GROUP source tables are selected and click OK.
14 In the Warehouse Tables pane, in the list of tables available in the Forecast
Data database instance, right-click the LU_SUBCATEG table and select
Add Table to Project.
15 In the Options window, under Available options, keep the Indicate that
LU_SUBCATEG is also available from the current DB Instance option
selected.
16 Click OK.
Create Product and Category relationship and update the Product user
hierarchy
17 On the Hierarchy View tab, select the Product attribute and drag it to the
Category attribute. A one-to-many relationship is created with Product as
the parent attribute.
18 In the Properties pane, in the Location property, click Browse and save the
Product attribute in the Schema Objects\Attribute\Products folder.
19 On the Home tab, in the Hierarchy section, in the drop-down list, select
Products.
21 In the Select Objects window under Available objects, select the Product
attribute and click the > button to move Product to the Selected objects list.
22 Click OK.
24 Click OK.
27 Close Architect.
Overview
In this exercise, you will create a multisource report. The report should contain
the following attributes and metrics: Product, Category, Revenue, and
Forecast Revenue. The report should also contain an attribute element filter
with the following condition: Product In list (Entertainment).
Save the report in the Public Objects\Reports folder as Revenue and Forecast
Revenue for Entertainment Categories.
Run the report. The result set should look like the following:
Detailed Instructions
You can access the Product and Category attributes from the
Products hierarchy.
You can create a local filter on the report. You do not need to create
the filter separately in the Filter Editor.
5 Compare your results to the expected report in the Overview section at the
beginning of this exercise.
Lesson Description
This lesson describes the various ways you can extend fact levels using
MicroStrategy Architect.
In this lesson, you will first learn about the three types of fact level
extensions—degradation, extension, and disallow. Then, you will learn how to
create each of these types of fact level extensions.
Lesson Objectives
After completing the topics in this lesson, you will be able to:
• Describe how the level at which a fact is stored affects reports and describe
the three types of fact level extensions available in MicroStrategy
Architect. (Page 131)
• Describe the purpose of fact extensions and create fact extensions to extend
the levels of facts to other hierarchies. (Page 143)
• Describe the purpose of fact disallows and disallow fact levels to prevent
unnecessary cross joins from occurring. (Page 156)
Facts are stored in data warehouse tables at particular levels. The level of a fact
is defined by the attribute IDs present in the fact table. The level at which a fact
is stored directly affects how you can report on that fact. You can report on a
fact at the level at which it is stored, or you can aggregate it to a higher level.
For example, the following illustration shows a fact table in which the Unit
Sales fact is stored at the item and week level:
Fact Level
It is possible for the same fact to be stored at different attribute levels
within a hierarchy. For example, you could have another fact table that
stores unit sales by item and date, rather than item and week. This fact
table would store unit sales for items at a lower level within the Time
hierarchy.
Based on this fact table, you can report on unit sales at the item or week levels.
You can also aggregate the unit sales data to the month, quarter, or year levels
within the Time hierarchy and to the subcategory and category levels within
the Product hierarchy. However, you cannot create a report that shows unit
sales by date because the fact is stored at the higher level of week. You cannot
determine the unit sales at the date level using this fact table.
What if you want to report on unit sales by call center or region? Since the fact
is stored only at the item and week levels and these two attributes have no
direct relationship to attributes in the Geography hierarchy, it is impossible to
perform this type of analysis.
You may also have a scenario in which a metric consists of more than one fact.
For example, the following illustration shows a Sales metric that consists of the
Unit Price and Quantity Sold facts:
Different Fact Levels in a Metric Definition
The Unit Price fact is stored at the item and week levels, while the Quantity
Sold fact is stored at the item and day levels. However, for these two facts to be
used together in a metric expression, they must be stored at the same level.
Otherwise, the calculation can cause errors, depending on the level at which
you need to report on the metric.
As you can see, the levels at which facts are stored limit the levels at which you
can report on facts. If you have a metric that consists of multiple facts, they
have to be stored at the same level to correctly calculate the metric.
Sometimes, facts may not be available in the data warehouse at the levels you
want to analyze in reports. You may not be able to change the levels at which
they are stored in data warehouse tables, but you can change the levels of facts
in MicroStrategy Architect by using fact level extensions. Fact level extensions
enable facts stored in the data warehouse at one level to be reported on at a
different level.
Fact level extensions are not commonly applied to facts, but they are very
useful in special cases.
OR
In the New Fact-Create New Fact Expression window, create the fact
expression and click OK.
You can create fact level extensions using this process. The remainder of this
lesson describes each of these fact level extensions in detail.
A fact degradation enables you to lower the level of a fact within a hierarchy to
which it is already related.
The following illustration shows a fact table with facts stored at the month level
and a report that requires one of these facts to be displayed at the day level:
Fact Degradation Scenario
In this example, the report contains the Units Received metric that uses the
Units Received fact in its definition.
The Units Received fact is stored at the item and month levels. However, you
need the Units Received fact to be available at the day level. If you try to run
this report, you receive the following error message:
Report Error Without the Fact Degradation
The error message notifies users that the Units Received fact is not available at
the day and item levels. For this report to work, you need to lower the level of
the Units Received fact from month to day using a fact degradation.
To create a fact degradation, you perform the following general steps:
1 Select the attribute level to which you want to lower the fact.
2 Select the attribute the SQL Engine can use to join the fact to the attribute
to which you want to lower the fact.
3 Determine the join direction between the join attribute and the fact.
When you create a fact degradation, the fact is already stored in a fact table at a
higher level within the hierarchy than the level at which you want to analyze
the fact. Therefore, you choose the lower-level attribute within the same
hierarchy to which you want to degrade the fact.
For the Units Received fact degradation, you need to lower the level of the
Units Received fact from month to day, so you would select the Day attribute.
When you create a fact degradation, you have to do so precisely because the
fact is not related to the desired attribute level within a hierarchy. Therefore,
you need to select an attribute that the SQL Engine can use to join the fact to
the desired attribute level. With a fact degradation, since you are lowering a
fact for a hierarchy to which it is already related, the join attribute is always a
higher-level attribute from that hierarchy.
For the Units Received fact degradation, the Units Received fact is stored at the
item and month level, so it is related to both the Item and Month attributes.
Since you need to lower the fact to the day level, you select Month as the join
attribute because it is directly related to the Day attribute. The Item attribute is
not directly related to the Day attribute, so you would not use it as the join
attribute.
After you select the join attribute you want to use to relate the desired attribute
and fact, you have to determine how you want the SQL Engine to perform the
join between the join attribute and the fact. There are two possible join
directions. You can join to the fact using only the join attribute itself, or you
can allow the fact to also join to children of the join attribute.
For the Units Received fact degradation, the Units Received fact is stored only
at the item and month level. It is not stored at any other levels of time.
Therefore, you would choose to join only against the attribute itself (Month).
However, if the Units Received fact were stored at another level of time
between month and day, such as week, you may want to join to the fact at the
month or week level. In this case, you could choose to allow the SQL Engine to
join against the attribute and its children (Month and Week). If you do not
want to allow the join at both levels, you could still choose to perform the join
only against the attribute itself.
If you allow the SQL Engine to join against the join attribute and any of
its children, you need to ensure that the allocation expression you use
for the fact degradation returns values that are valid at any of those
attribute levels.
Some facts are static and do not change value from one level to another. Such
facts do not require an expression to allocate the fact at the lower level. Other
facts do change from one level to another, and you need to define an expression
that correctly allocates the fact data at the lower level. An allocation expression
can include attributes, facts, constants, and any standard expression syntax,
including mathematical operators, pass-through functions, and so forth.
Although the Units Received fact is stored at the month level, its value may be
different depending on the level of time at which you are reporting on the units
received. The units received on a particular day are different from units
received during a month. Therefore, you need an allocation expression to
translate units received at the month level into day-level values. For the Units
Received degradation, you could create an allocation expression to divide the
monthly units received by the duration of the month to get a rough
approximation of units received at the day level:
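As a hedged sketch of such an allocation (the function name and sample values are hypothetical; the actual allocation expression in the product is defined in the Level Extension Wizard):

```python
import calendar

def allocate_daily(units_received_month, year, month):
    """Roughly allocate a monthly fact value to the day level by
    dividing it by the number of days in that month."""
    days_in_month = calendar.monthrange(year, month)[1]
    return units_received_month / days_in_month

# 900 units received in June (30 days) -> 30 units per day.
print(allocate_daily(900, 2014, 6))  # 30.0
```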
However, if you were creating a degradation for a fact like Unit Price, its value
might be the same regardless of whether you report on it at the month level or
the day level, since the unit price does not change during a month. Therefore,
you would not need an allocation expression for this fact degradation.
5 In the General Information window, in the Name box, type a name for the
extension.
8 Click Next.
10 Click Next.
11 In the Join Type window, in the Join attributes list, select the check box for
the attribute you want to use in the join.
12 Click Next.
13 In the Join Attributes Direction window, in the Join attributes list, in the
Join against column, click the arrow icon to set the join direction for the
join attribute.
14 Click Next.
OR
16 Click Next.
You can click Back to go back through the Level Extension Wizard
and make changes.
When you run a report with the Item and Day attributes and the Units
Received metric on the template, the report returns the following result set:
Report Result Set with the Fact Degradation
The image above displays only the first few rows of the result set for the
report.
The report returns the same value for each day within the same month for the
Units Received metric.
While a fact degradation enables you to lower the level of a fact within a
hierarchy to which it is already related, a fact extension enables you to extend
the level of a fact to a level in a different hierarchy to which it is currently
unrelated.
Consider the following simplified data model for the MicroStrategy Tutorial
project:
Fact Extension Data Model
In this data model, you store the Freight fact at the employee, order, and day
level. However, freight data is not stored at the item level. The Freight fact is
not related to any attribute in the Product hierarchy.
If you run a report that contains the Item attribute and a Freight metric that
uses the Freight fact, the result set looks like the following:
Item and Freight—Report Result Set
The image above displays only the first few rows of the result set for the
report.
The report returns the same freight value for each item. This result set is
meaningless because of how the query joins the lookup table for the Item
attribute and the fact table for Freight. The following image shows the SQL for
this report:
Item and Freight—Report SQL
Because there is no relationship between Item and Freight, the SQL Engine
performs a cross join between the fact and lookup tables to retrieve the data for
the report.
Therefore, if you want to view freight information by item, you have to extend
the Freight fact to the item level. You cannot use a fact degradation because
Item is an attribute from a different hierarchy than the attributes that are
already related to the Freight fact.
As an alternative to creating a fact extension, you could add a new level
to the fact table in the data warehouse itself. However, changing the fact
table is not always an option. The fact data may not be captured at that
level in the source system, or there may be other organizational or
environmental restrictions on changing the table structure.
When you create a fact extension, you can define the join using one of the
following methods:
• Table relation—You select a particular table to use for the join. You should
select this option if you always want the SQL Engine to use the same table
to join the desired attribute and fact.
• Fact relation—Instead of selecting a single table, you select a fact to use for
the join. This option enables the SQL Engine to use any table containing
that fact to join the desired attribute and fact. You should select this option
if you want to allow the SQL Engine to choose the optimal table for a
particular query.
• Cross product—You choose to have the SQL Engine perform a cross join
between the lookup table of the desired attribute and the fact table.
You should use the cross product option only as a last resort. If there
are no tables in your data warehouse that you can use to join the
desired attribute and fact, then this is the only option you have for
creating a fact extension. However, keep in mind that a cross join
requires a great deal of processing overhead, and the resulting data
may not be meaningful.
To learn more about the fact relation and cross product methods, refer
to the Project Design Guide product manual.
To create a fact extension, you perform the following general steps:
1 Select the attribute level to which you want to extend the fact.
2 Select the table you want the SQL Engine to use to join the fact to the
attribute to which you want to extend the fact.
3 Select the attribute or set of attributes the SQL Engine can use to join the
fact to the attribute to which you want to extend the fact.
4 Determine the join direction between the join attributes and the fact.
When you create a fact extension, the fact is completely unrelated to any
attributes in the given hierarchy. The attribute to which you want to extend the
fact depends on how you want to analyze the fact. If you want to report on the
fact at any level in the hierarchy, you should select the lowest-level attribute in
that hierarchy.
For the Freight fact extension, if you want to report on the Freight fact at any
attribute level in the Product hierarchy, you select the Item attribute, which is
the lowest-level attribute in the hierarchy. Selecting a higher-level attribute
from the Product hierarchy, such as Subcategory, only extends the fact to that
attribute level or any attribute above it in the hierarchy. Extending the Freight
fact to the item level enables you to create reports that analyze freight data
using any attribute from the Product hierarchy.
The SQL Engine needs to join the table that contains the fact you are extending
and the lookup table that stores the attribute to which you are extending the
fact. Because these two tables are not related, you have to select another data
warehouse table to serve as a relationship table between the fact and lookup
tables.
After you select the attribute to which you want to extend the fact,
MicroStrategy Architect searches the project warehouse catalog and returns a
list of all tables that contain the ID column of that attribute. Using this list of
candidate tables, you can then select the optimal table for the join. In selecting
a table, you should consider several factors, including the number of possible
join paths, the optimal join path for a given allocation expression, and any
other characteristics specific to your data warehouse environment. For
example, you may want to use a table for the join that you know has better
indexes or is updated more frequently.
In the previous example, the Freight fact is stored in the ORDER_FACT table.
The following image shows the logical view for the ORDER_FACT table:
ORDER_FACT Table—Logical View
The ORDER_FACT table is the only table in the data warehouse that
contains the Freight fact. Therefore, you have to join the ORDER_FACT
table to the LU_ITEM table to relate the Freight fact to the Item
attribute.
When you extend the Freight fact to Item using a table relation, MicroStrategy
Architect returns the following list of candidate tables:
List of Candidate Tables
The remaining tables are all fact tables that contain the Item attribute.
Most of them, however, have only one or two attributes in common with the
ORDER_FACT table. In contrast, the ORDER_DETAIL table contains many of
the same attributes as the ORDER_FACT table, including Employee, Day,
Customer, and Order. The following image shows the logical view for the
ORDER_DETAIL table:
ORDER_DETAIL Table—Logical View
You could use the ORDER_DETAIL table to join the LU_ITEM and
ORDER_FACT tables using any of the common attribute columns. Because the
ORDER_DETAIL table provides multiple join paths, it is the best table to use
to join the Freight fact to the Item attribute.
The ORDER_DETAIL table is the optimal join table provided you can
use it in conjunction with the allocation expression for the fact
extension. You also have to consider any characteristics of the table that
might render it less optimal because of factors unrelated to its structure.
For example, if this table is only updated monthly and you want reports
that provide the most current data, it would not be the best table to use
for the join.
After you select the table for the join, you then need to select an attribute or set
of attributes from that table on which the SQL Engine should join.
MicroStrategy Architect lists any attributes whose ID columns are present in
the join table. You can either manually select the join attributes, or you can
allow the SQL Engine to select the join attributes dynamically on a
query-by-query basis.
For the Freight fact extension, you select Order as the join attribute. You could
use other attributes in the ORDER_FACT table as the join attribute. These
attributes are listed in the Level Extension Wizard, as shown below:
Possible Join Attributes
However, the allocation expression for this fact extension uses facts that are
related to individual orders, so Order is the optimal join attribute.
If you allow the SQL Engine to dynamically select the join attributes, you do
not perform this step. However, if you manually select the join attributes, you
have to determine how you want the SQL Engine to perform the join between
the join attributes and the fact. Just as with fact degradations, there are two
possible join directions. You can join to the fact using only the join attributes
themselves, or you can allow the fact to also join to the children of the join
attributes.
If you allow the SQL Engine to join against the join attributes and any of their
children, you need to ensure that the allocation expression you use for the fact
extension returns values that are valid at any of those attribute levels.
For the Freight fact extension, you allow the SQL Engine to join only against
the Order attribute itself. Since the Order attribute is already the lowest-level
attribute in the Customers hierarchy, it does not have any child attributes you
could use to join to the fact.
Just as with fact degradations, when you create a fact extension, you may need
to define an expression to allocate the fact data at the extended attribute level.
Some facts are static and do not change value from one attribute level to
another, while other facts have values that do change, depending on the
attribute level.
For the Freight fact extension, you could create the following allocation
expression:
(Freight * [Item-level Units Sold]) /
[Order-level Units Sold]
In the allocation expression, the value of the Freight fact at the order level is
proportionally distributed among items sold in this particular order according
to the units sold. If there are 3 units of the same item in an order of 10 items
total and the freight for this order was $100, then the item-level freight for that
particular item is $30.
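The arithmetic behind this allocation can be sketched in a few lines of Python (the function name and values are hypothetical; in practice the SQL Engine evaluates the expression in the generated report SQL):

```python
# Sketch of the allocation expression:
# (Freight * [Item-level Units Sold]) / [Order-level Units Sold]
def allocate_freight(order_freight, item_units, order_units):
    """Distribute order-level freight to one item in proportion to units sold."""
    return (order_freight * item_units) / order_units

# 3 of 10 units in a $100-freight order receive $30 of the freight.
print(allocate_freight(100, 3, 10))  # 30.0
```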
5 In the General Information window, in the Name box, type a name for the
extension.
8 Click Next.
10 Click Next.
11 In the Extension Type window, click Specify the relationship table used
to extend the fact.
12 Click Next.
13 In the Table Selection window, in the Available tables list, select the table
that you want to use to extend the fact.
14 Click Next.
If you want the SQL Engine to select the optimal set of attributes for the
join based on the SQL query, click Dynamically select the best set of
attributes (Best Fit).
OR
If you want the SQL Engine to always use a particular attribute or set of
attributes for the join, click I will select the set of attributes.
In the Join attributes list, select the check boxes for the attributes you want
to use in the join.
The list of attributes includes all attributes present in the join table
you selected.
16 Click Next.
If you chose to select the join attributes, the Join Attributes
Direction window opens. Continue to step 17. If you chose to let the
SQL Engine determine the join attributes, the Allocation window
opens. Continue to step 19.
17 In the Join Attributes Direction window, in the Join attributes list, in the
Join against column, click the arrow icon to set the join direction for each
join attribute.
18 Click Next.
OR
20 Click Next.
You can click Back to go back through the Level Extension Wizard
and make changes.
You can then run the following report that displays invoice information for all
items in a particular order:
Invoice Report—Freight at the Item Level
In this example, order 148247 includes 4 items. Since only one unit of each
item was sold, the Freight for each item is calculated evenly as $5. This type of
analysis would not be possible without a fact extension.
A fact disallow functions very differently from other types of fact level
extensions. You use fact degradations and extensions to relate the fact to
additional attributes. However, a fact disallow actually prevents unnecessary
cross joins between fact and lookup tables that would otherwise occur by
default.
For example, a project may contain the following hierarchies and fact table:
Sample Data Model
However, what if you want to run a report that displays a Units Received
metric (built from the Units Received fact) as well as the Item and Call Center
attributes? The Item attribute is related to the Units Received fact since it is
part of the Product hierarchy, but the Call Center attribute is not related to this
fact since it is part of the Geography hierarchy. By default, the result set for this
report looks like the following:
Report Result Set—Unrelated Attribute and Fact
The image above only displays the first few rows of the result set for the
report.
Because there is no way to relate call center data to units received data, this
report result displays the units received for every item paired with every call
center.
In an attempt to return data for the report, the SQL Engine generates SQL that
results in a cross join between the lookup table for the Call Center attribute and
the fact table that contains the units received data. The SQL for this report
looks like the following:
Report SQL—Unrelated Attribute and Fact
The report SQL includes the LU_CALL_CTR table in the FROM clause, but it
cannot join this table to the INVENTORY_ORDERS fact table in the WHERE
clause. The cross join makes it possible for the report to return data, but it can
only produce a cross product, which does not yield a meaningful result set.
If you have no need for this cross join and the result set it produces, you can
disallow the Call Center attribute for the Units Received fact. This fact disallow
prevents the SQL Engine from executing the cross join.
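The cross product described above can be reproduced with a minimal sketch, assuming a two-row call center lookup table and a three-row fact table (in-memory SQLite stands in for the warehouse; the data is hypothetical):

```python
import sqlite3

# Two call centers and three item-level fact rows with no join path between
# them: the only way to combine the tables is a cross join.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE LU_CALL_CTR (CALL_CTR_ID INTEGER)")
cur.executemany("INSERT INTO LU_CALL_CTR VALUES (?)", [(1,), (2,)])
cur.execute("CREATE TABLE INVENTORY_ORDERS (ITEM_ID INTEGER, QTY_RECEIVED INTEGER)")
cur.executemany("INSERT INTO INVENTORY_ORDERS VALUES (?, ?)",
                [(10, 5), (11, 7), (12, 9)])

# No WHERE clause can relate the tables, so every item pairs with every
# call center: 2 x 3 = 6 rows, a pure cross product.
rows = cur.execute(
    "SELECT c.CALL_CTR_ID, f.ITEM_ID, f.QTY_RECEIVED "
    "FROM LU_CALL_CTR c, INVENTORY_ORDERS f"
).fetchall()
print(len(rows))  # 6
```

Each units-received value appears once per call center, which is why the result set is not meaningful.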
5 In the General Information window, in the Name box, type a name for the
extension.
8 Click Next.
10 Click Next.
You can click Back to go back through the Level Extension Wizard
and make changes.
After disallowing the Call Center attribute for the Units Received fact, if you
run the same report, the report fails and the following error message displays:
Report Error with the Fact Disallow
The error message notifies users that they cannot extend the Units
Received fact to the Call Center attribute using a cross join.
You cannot use fact disallows to prevent normal joins from occurring,
only cross joins. For example, for the fact table referenced in this topic,
disallowing an attribute from the Time or Product hierarchies for the
Units Received fact would not prevent you from running a report with
attributes from either hierarchy. The Units Received fact is related to
attributes from each of these hierarchies, so a cross join to lookup tables
would not occur.
Lesson Summary
In this lesson, you learned:
• Fact level extensions enable facts stored in the data warehouse at one level
to be reported on at a different level.
• There are three types of fact level extensions: degradation, extension, and
disallow.
• A fact degradation enables you to lower the level of a fact within a hierarchy
to which it is already related.
• You can create a fact extension using the following three methods: table
relation, fact relation, and cross product.
• Using the table relation method to create a fact extension forces the SQL
Engine to always join the desired attribute and fact using a particular table.
• Using the fact relation method to create a fact extension allows the SQL
Engine to join the desired attribute and fact using any table that contains
the fact you select.
• Using the cross product method to create a fact extension allows the SQL
Engine to perform a cross join between the desired attribute and fact.
Exercise: Fact Level Extensions
Overview
Run the report. Add the Month attribute to the template and run the report
again. After reviewing the error message, save the report as Degradation
Example in the Public Objects\Reports folder.
Next, you will create a degradation for the Forecast Cost fact to enable you to
report on that fact at the Month level. In the data warehouse, this fact exists
only at the Quarter level. You should use the following allocation expression for
the fact degradation: [Forecast Cost] / 3. After creating the fact degradation,
you should update the project schema.
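The effect of the [Forecast Cost] / 3 allocation can be sketched as follows (the quarter value and month IDs are hypothetical):

```python
# A quarter-level fact value is spread evenly across the quarter's three
# months, so the three month-level rows carry identical estimates.
quarter_forecast_cost = 90000
months = [201201, 201202, 201203]  # hypothetical month IDs in one quarter
degraded = {m: quarter_forecast_cost / 3 for m in months}
print(degraded)  # each month shows 30000.0
```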
Run the report. The result set should look like the following:
The image above only displays the first few rows of the result set for the
report.
Detailed Instructions
Create a report
You can access the Quarter attribute from the Time hierarchy. You
can find the Forecast Cost metric in the Metrics folder.
2 Run the report. The result set should look like the following:
You can report on the Forecast Cost fact at the quarter level.
4 Add the Month attribute to the report template to the right of Quarter.
You can access the Month attribute from the Time hierarchy.
5 Run the report. You should see the following error message:
The error message states that the Forecast Cost fact does not exist at the
month level in the data warehouse. If you want to report on that fact at a
level lower than quarter, you must create a fact degradation.
Create a fact degradation at the month level for the Forecast Cost fact
8 In the Schema Objects, in the Facts folder, open the Forecast Cost fact in
the Fact Editor.
13 Under What would you like to do?, click Lower the fact entry level.
14 Click Next.
15 In the Extended Attributes window, select the Show all attributes check
box.
16 In the Available attributes list, select the Month attribute and click the >
button to add it to the Selected attributes list.
17 Click Next.
19 Click Next.
20 In the Join Attributes Direction window, in the Join attributes list, in the
Join against column, keep the default setting.
21 Click Next.
25 Click Next.
You can click Back to go back through the Level Extension Wizard
and make changes.
29 Run the Degradation Example report. The result set should look like the
following:
The image above only displays the first few rows of the result set for
the report.
You can now report on the Forecast Cost fact at the month level. The
monthly values are only estimates based on the allocation expression you
provided in the definition of the fact degradation. Notice that for each
month within a quarter, the forecast cost value is the same.
Lesson Description
This lesson covers transformations, schema objects that enable you to compare
metric values across time periods.
You will learn about two different types of transformations, and the different
components of a transformation. You will also learn how to create a
transformation in MicroStrategy Architect and how to use transformations in
transformation metrics. Finally, you will examine common uses for
transformations in reporting.
Lesson Objectives
After completing the topics in this lesson, you will be able to:
What Is a Transformation?
Types of Transformations
There are two types of transformations:
Table-Based Transformations
This table also has columns for the parent IDs at all levels. These
columns are not displayed in the image above.
Each date in the LU_DAY table has a transformed value for the previous day,
last month’s date, last quarter’s date, and last year’s date. For example, for
January 5, 2013, the previous day was January 4, 2013, while the last quarter’s
date was October 5, 2012.
In the MTD_DAY transformation table, each date has one or more records for
all dates within its month before and including that date. For example, there is
only one record for the January 1, 2013 date, but there are three records for the
January 3, 2013 date.
This type of transformation data cannot be stored in the lookup table for
the Day attribute because lookup tables store each unique date only
once.
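The rule that produces the MTD_DAY rows can be sketched in Python (the function name is hypothetical; the real table is populated during the warehouse load process):

```python
from datetime import date, timedelta

def mtd_rows(day):
    """Yield one (DAY_DATE, MTD_DAY_DATE) pair for every date in the month
    up to and including the given date."""
    d = day.replace(day=1)
    while d <= day:
        yield (day, d)
        d += timedelta(days=1)

# January 1, 2013 has a single row; January 3, 2013 has three.
print(len(list(mtd_rows(date(2013, 1, 1)))))  # 1
print(len(list(mtd_rows(date(2013, 1, 3)))))  # 3
```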
After you associate a transformation object with a metric, the Engine uses the
transformation to generate SQL for that metric. The following illustration
shows how transformation tables act as intermediaries in the metric join path
when you use transformation metrics on a report:
Transformation Tables in Metric Join Path
Depending on the database you are using for your data warehouse, a
table-based transformation may be required when performing a
many-to-many transformation such as a year-to-date calculation. Table-based
transformations are also required any time the business rules for a
transformation cannot be accounted for in an expression.
Expression-Based Transformations
For example, you could create a Last Quarter or Last Month transformation
using QUARTER_ID-1 or MONTH_ID-1, respectively. You can also create
expression-based transformations using pass-through functions such as
ApplySimple. These types of expressions enable you to take advantage of
database-specific functions that you can use to calculate certain types of
transformations.
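The self-join implied by an expression such as MONTH_ID-1 can be sketched with in-memory SQLite (the table name, column names, and data are hypothetical, and the sketch assumes consecutive integer month IDs):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE FACT_REV (MONTH_ID INTEGER, REVENUE INTEGER)")
cur.executemany("INSERT INTO FACT_REV VALUES (?, ?)",
                [(1, 100), (2, 150), (3, 120)])

# Each month is matched to the fact row whose ID is MONTH_ID - 1,
# which is how a Last Month transformation shifts the metric.
rows = cur.execute(
    "SELECT t.MONTH_ID, p.REVENUE AS LAST_MONTH_REVENUE "
    "FROM FACT_REV t JOIN FACT_REV p ON p.MONTH_ID = t.MONTH_ID - 1 "
    "ORDER BY t.MONTH_ID"
).fetchall()
print(rows)  # [(2, 100), (3, 150)]
```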
Transformation Components
All transformations have the following components:
• Member attributes
• Member expressions
• Member tables
• Mapping type
Member Attributes
Member Expressions
A single transformation can use a combination of table-based and
expression-based transformations. For example, you could create a Last
Year transformation based on the Year, Month, and Day attributes. Year
could use an expression such as YEAR_ID-1. However, Month and Day
could map to columns in a transformation table because their IDs are
not conducive to expression-based transformation.
Member Tables
The member tables store the data for the member attributes. For
expression-based transformations, the member tables are generally lookup
tables that correspond to the attribute being transformed, such as LU_DAY,
LU_QUARTER, and so forth. For table-based transformations, it is the
transformation table that stores the relationship.
Mapping Type
The mapping type determines the way the transformation is created based on
the nature of the data. The mapping type can be one of the following:
Creating Transformations
You create expression-based and table-based transformations using the
Transformation Editor.
3 In the Select a Member Attribute window, select the attribute for which you
want to create a transformation.
4 Click Open.
6 In the Available columns list, drag the column you want to use for the
transformation expression into the Member attribute expression pane.
9 Click OK.
The following image shows the Transformation Editor for the Last Year’s
transformation in the MicroStrategy Tutorial project:
Last Year’s Transformation—Transformation Editor
The Last Year’s transformation has four member attributes mapped to the
transformation columns in their respective lookup tables. It has a one-to-one
mapping type.
The following image shows the Transformation Editor for the Month to Date
transformation in the MicroStrategy Tutorial project:
Month to Date Transformation—Transformation Editor
The Month to Date transformation has a single Day member attribute mapped
to the MTD_DAY_DATE column in the MTD_DAY table. It has a
many-to-many mapping type.
1 In the Metric Editor, in the Definition pane, specify the formula for the
metric.
3 Drag the transformation object you want to use from the Object Browser to
the Transformations pane.
The following image shows the Last Year’s Revenue metric that uses the Last
Year’s transformation:
Last Year’s Revenue Metric
The following image shows the MTD Revenue metric that uses the Month to
Date transformation:
MTD Revenue Metric
The following image shows the LY-MTD Revenue metric that uses both the
Last Year’s and the Month to Date transformations:
LY-MTD Revenue Metric
Transformation Examples
Transformations are typically used to display time-based trends.
Recall that the Last Year’s transformation applied to the Last Year’s Revenue
metric is defined as follows:
Last Year’s Transformation
Notice that the Last Year’s transformation has multiple member attributes, all
from the Time hierarchy, and all pointing to transformation tables for the last
(or previous) year. Creating a transformation that is very generic in nature and
encompasses multiple attributes from a single dimension enables report
developers to create a single transformation metric to satisfy a multitude of
reporting needs. The transformation applied to the metric is dynamically
evaluated at report run time to the appropriate attribute level based on the
report definition.
For example, if you modify the report from the previous example to display
only the December 2012 data, the Last Year’s Revenue metric values reflect
only the previous year’s December data, and not the aggregated data for entire
2012 year. This is shown in the image below:
Last Year’s Revenue for December 2012
Recall that the Last Year’s Revenue for all 2012 for The Rugrats Movie
was $16,450.
The MTD Revenue metric uses the Month to Date transformation and
therefore returns the revenue generated in the current month—September
2012.
The YTD Revenue metric uses the Year to Date transformation. It returns the
total revenue generated between 1/1/2012 and 9/30/2012.
The LY-YTD Revenue metric uses two transformations—Year to Date and Last
Year’s. It returns the last year’s year-to-date revenue generated between
1/1/2012 and 9/30/2012.
The BOH - QTD and EOH - QTD metrics are beginning-on-hand and
end-on-hand inventory counts for the quarter ending on 09/30/2012. Both of
these metrics use the Quarter to Date transformation.
The MTD Units Sold and YTD Units Sold metrics return the number of units
sold in the current month and year, respectively. They use the Month to Date
and Year to Date transformations.
The YTD Average Inventory metric uses the Year to Date transformation and
returns the average inventory for the 1/1/2012 to 09/30/2012 time period.
The YTD Inventory Turnover metric is a compound metric that returns the
ratio between the number of units sold this year to the average inventory count
for the year.
Transformations are useful when you want to analyze data with respect to time.
However, transformation use is not limited to time-based transformations. For
example, you can use transformations to compare current catalog or
promotion versus previous catalog or promotion, and so forth.
Lesson Summary
In this lesson, you learned:
• The member tables store the data for the member attributes.
• The mapping type determines the way the transformation is created based
on the nature of the data. You can use either a one-to-one or a many-to-many
mapping type.
Exercise: Transformation
Overview
In this exercise, you will first create a Last Year’s transformation that has a
Year member attribute with a [YEAR_ID] - 1 expression defined on the
LU_YEAR table.
After updating the schema, you will then create the Last Year's Forecast
Revenue transformation metric.
Run the report. Add the Quarter attribute to the template and run the report
again. After reviewing the result set, save the report as Transformation
Example in the Public Objects\Reports folder.
Next, you will modify the Last Year’s transformation by adding three more
member expressions. The transformation should be defined as follows:
Finally, after updating the schema, you will run the Transformation Example
report and drill from 2011 Q1 to Day to test the new member expression.
Detailed Instructions
6 Click Open.
9 Click OK.
13 In the Public Objects folder, in the Metrics folder, create a new metric.
You can access the Year attribute from the Time hierarchy. You can
find the Forecast Revenue metric in the Metrics folder.
21 Run the report. The report result set should look like the following:
Only data for the 2012 and 2013 years is displayed, even though there is
Forecast Revenue data for 2011 in the data warehouse. Notice that both
metrics return different values for each row, but the Forecast Revenue data
for 2012 is identical to the Last Year’s Forecast Revenue for 2013, as
expected.
23 Add the Quarter attribute to the report template to the right of Year.
25 Run the report. The report result set should look like the following:
Since the Last Year’s transformation is not defined for the Quarter
attribute, the transformation metric is not evaluated correctly. The metric
values are identical for both metrics in each row. This data cannot be
correct, since you already know from the previous result set that forecast
revenue for each year is different.
30 Click Open.
33 Click OK.
36 Click Open.
39 Click OK.
42 Click Open.
45 Click OK.
48 Run the Transformation Example report. The report result set should look
like the following:
49 Right-click the 2012 Q1 quarter element, point to Drill, point to Down, and
select Day. The drill down report result set should resemble the following:
The image above only displays the first few rows of the result set for
the report.
Lesson Description
Lesson Objectives
After completing the topics in this lesson, you will be able to:
Partitioning is the division of a larger table into smaller tables. You often
implement partitioning in a data warehouse to improve query performance by
reducing the number of records that queries must scan to retrieve a result set.
You can also use partitioning to decrease the amount of time necessary to load
data into data warehouse tables and perform batch processing.
Databases differ dramatically in the size of the data files and physical tables
they can manage effectively. Partitioning support varies by database. Most
database vendors today provide some support for partitioning at the database
level. Regardless, the use of some partitioning strategy is essential in designing
a manageable data warehouse. Like all data warehouse tuning techniques, you
should periodically re-evaluate your partitioning strategy.
• Server level
• Application level
The following illustration shows the difference between these two types of
partitioning:
Two Types of Partitioning
Overview
With warehouse partition mapping, you do not include the original fact table
or the partition base tables in the project. Rather, you create and maintain a
partition mapping table, which MicroStrategy Architect uses to identify the
partitioned base tables as part of a logical whole. You only bring the partition
mapping table to the project.
The partition mapping table has several features that must be included for it to
work appropriately in MicroStrategy Architect:
• You can use any name for the partition mapping table.
• It has a column for each attribute ID by which you partitioned the table.
These attribute IDs represent the partitioning attributes.
• There is a row for each of the partition base tables in the logical whole.
5 In the Warehouse Partition Mapping Editor, on the Logical View tab, click
Add.
7 Click OK.
The Warehouse Partition Mapping Editor has four tabs, each displaying
different information about the partition mapping.
The following image shows the Logical View tab of the Warehouse Partition
Mapping Editor:
Logical View of the Partition Mapping
The Logical View tab shows the attribute by which the base tables are
partitioned. In the image above, the base tables are partitioned by the Quarter
attribute.
The following image shows the Physical View tab of the Warehouse Partition
Mapping Editor:
Physical View of the Partition Mapping
The Physical View tab shows the actual columns in the partition mapping table.
In this example, the partition mapping table contains two columns, PBTNAME
and QUARTER_ID.
The following image shows the logical view of the associated partition base
tables on the Base Table(s) Logical View tab of the Warehouse Partition
Mapping Editor:
Logical View of the Partition Base Tables
The Base Table(s) Logical View tab shows the attributes mapped to the base
partition tables. In this example, the Item and Month attributes are mapped to
the base partition tables.
The following image shows the physical view of the base partition tables on the
Base Table(s) Physical View tab of the Warehouse Partition Mapping Editor:
Physical View of the Partition Base Tables
The Base Table(s) Physical View tab shows the actual columns in the partition
base tables. In this example, the Month attribute is mapped to the MONTH_ID
column, and the Item attribute is mapped to the ITEM_ID column in the base
partition tables. In addition, you can map the Beginning on Hand and End on
Hand facts to the BOH_QTY and EOH_QTY columns, respectively.
When you run a report that requires information from one of the partition base
tables, the Query Engine first runs a prequery to the partition mapping table to
determine which partition to access to obtain the data for the report. The
prequery requests the partition base table names associated with the attribute
IDs from the filtering criteria. Next, the SQL Engine generates SQL against the
appropriate partition base tables to retrieve the report results.
The partition base tables may contain either the same column as the
partitioning attribute or a column corresponding to any child of the
partitioning attribute.
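The prequery flow can be sketched with in-memory SQLite (the mapping table name, base table names, and quarter IDs are hypothetical; the PBTNAME column follows the example above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE PARTITION_MAP (PBTNAME TEXT, QUARTER_ID INTEGER)")
cur.executemany("INSERT INTO PARTITION_MAP VALUES (?, ?)",
                [("SALES_Q1_2012", 20121), ("SALES_Q2_2012", 20122)])

# Prequery: resolve the filter's quarter ID to a partition base table name.
(pbt,) = cur.execute(
    "SELECT PBTNAME FROM PARTITION_MAP WHERE QUARTER_ID = ?", (20122,)
).fetchone()

# The report SQL is then generated against that table alone.
report_sql = f"SELECT QUARTER_ID, SUM(REVENUE) FROM {pbt} GROUP BY QUARTER_ID"
print(pbt)  # SALES_Q2_2012
```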
Overview
In metadata partition mapping, the application running against the database
still manages the physically partitioned fact tables, but the execution is
different. Metadata partition mapping does not require a partition mapping
table in the data warehouse. Instead, you define the data contained in each
partition base table in a partition mapping object in MicroStrategy Architect.
This object is stored in the metadata. You update the partition mapping as new
partition base tables are created.
When you execute a report that runs against the partitioned tables, the Query
Engine sends the necessary prequeries to the metadata to determine which
partition base tables need to be included in the report SQL. The SQL Engine
then generates SQL against the appropriate partition base tables to retrieve the
report results.
With metadata partition mapping, you can also add partition base tables from
multiple data sources for a single partition definition. This feature can improve
performance significantly and can be cost effective, especially when you are
dealing with large amounts of data in a project. For example, all 2012 partition
base tables are in Tutorial Data, but the 2011 partition base tables are in
Archived Tutorial Data. You can add the 2011 and 2012 partition base tables
from both data sources to a project and define the data slice for each partition
base table. The following illustration shows a metadata partition mapping table
from multiple data sources:
6 In the Partition Tables Selection window, in the Available tables list, select
the partition base tables and click the > button to add them to the Selected
tables list.
7 Click OK.
9 Click Define.
10 In the Data Slice Editor, in the Data Slice Definition pane, define the data
contained in the partition base table.
12 Repeat steps 8 to 11 for each partition base table in the partition mapping.
13 If you want to define a logical size for the partition mapping, in the
Metadata Partition Mapping Editor, in the Partition mapping logical size
box, type a value.
This example shows 12 partition base tables partitioned by quarter. You can see
that Q1 2012 is the data slice defined for the INVENTORY_Q1_2012 table.
Similarly, the remaining tables correspond to their respective quarters.
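Because the data slices live in the metadata rather than in a warehouse table, resolving the right base table is a metadata lookup. A minimal sketch, with hypothetical table names and quarter IDs:

```python
# The metadata records one data slice per partition base table.
data_slices = {
    "INVENTORY_Q1_2012": ("Quarter", 20121),
    "INVENTORY_Q2_2012": ("Quarter", 20122),
}

def tables_for_filter(quarter_id):
    """Return the partition base tables whose data slice matches the filter."""
    return [name for name, (_, q) in data_slices.items() if q == quarter_id]

print(tables_for_filter(20122))  # ['INVENTORY_Q2_2012']
```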
Lesson Summary
In this lesson, you learned:
• There are two types of partitioning: server level and application level.
• When you use warehouse partition mapping, rather than bringing the
original fact table or the partition base tables, you bring the partition
mapping table to the MicroStrategy project.
• The partition mapping table must have a column for each attribute ID by
which you partitioned the fact table. It must also have an additional column
named PBTName whose contents refer to the names of the various
partition base tables.
• When you execute a report that requires information from one of the
partition base tables, before the SQL Engine generates the report SQL, the
Query Engine first runs a prequery to the partition mapping table to
determine which partition to access to obtain the data for the report.
• When you use metadata partition mapping, you do not need a partition
mapping table in the data warehouse. Instead, you define the data
contained in each partition base table in a partition mapping object in
MicroStrategy Architect.
• When you use metadata partition mapping, you add the partition base
tables to the project using the Architect graphical interface.
• When you execute a report that runs against the partitioned tables, before
the SQL Engine generates the report SQL, the Query Engine first sends the
necessary prequeries to the metadata to determine which partition base
tables need to be included in the report SQL.
Appendix Description
The Warehouse Catalog provides the following options for individual tables:
• Remove—This option enables you to remove a table from the project.
• Show Sample Data—This option enables you to view the first 100 rows of
data in a table.
You can also choose to automatically calculate the row counts for all
project tables using a project-wide option in the Warehouse Catalog.
• Table Prefix—Selecting this option opens the Table Prefix window, which
enables you to add and remove prefixes and assign a prefix to a table.
Assigning a prefix to a table means that any time the SQL Engine uses that
table, it includes the prefix when referencing the table.
You can also choose to automatically display the prefixes for all
project tables in the warehouse catalog using a project-wide option
in the Warehouse Catalog.
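The effect of a prefix on generated table references can be sketched as follows (the prefix value is hypothetical):

```python
def qualify(table, prefix=""):
    """Return the table reference the SQL Engine would emit for a table,
    including its assigned prefix, if any."""
    return f"{prefix}{table}" if prefix else table

print(qualify("LU_DAY", "dbo."))  # dbo.LU_DAY
print(qualify("LU_DAY"))          # LU_DAY
```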
The following image shows the Table Prefixes subcategory in the View
category:
View Category in the Warehouse Catalog
As you can see, the Warehouse Catalog has a host of options that provide
flexibility in maintaining your project over time.
1 In Developer, open the project to which you want to add the tables.
This drop-down list defaults to the primary database instance for the
project.
4 In the Tables available in the database instance list, select the tables you
want to add.
The tables display in the Tables being used in the project list, along
with the primary database instance for each table.
As you can see, the primary database instance for both of these tables is
Forecast Data, not Tutorial Data.
1 In Developer, open the project to which you want to add the tables.
4 In the Tables available in the database instance list, select the tables you
want to add.
6 In the warning window, under Available options, keep the Indicate that
<Table Name> is also available from the current DB Instance option
selected.
If you click the Make no changes to <Table Name> option, the
table is not mapped to the selected database instance, and no
duplicate table is created.
If you want to view and respond to the warnings for each duplicate table
individually, click OK.
OR
If you want to respond to the warnings for all duplicate tables at the same
time, click OK for All.
The tables display in the Tables being used in the project list along
with the primary database instance for each table. The icons beside
the tables indicate that they are mapped to multiple data sources.
The following image shows the warning you see when you add duplicate tables:
Duplicate Tables Warning
The following image shows the Warehouse Catalog with the LU_COUNTRY
table mapped to multiple data sources:
LU_COUNTRY Mapped to Multiple Data Sources
The following image shows the Warehouse Catalog with the LU_REGION table
mapped to multiple data sources:
LU_REGION Mapped to Multiple Data Sources
The primary database instance for both of these tables is Tutorial Data.
However, the icons beside the tables indicate that they are also mapped to
another database instance.
To change the primary database instance for a table in the Warehouse Catalog:
3 In the Warehouse Catalog, in the Tables being used in the project list,
right-click the table for which you want to change the primary database
instance and select Table Database Instances.
6 Click OK.
The following image shows the option for accessing the database instances for
a table:
Option for Accessing the Database Instances for a Table
The following image shows the Available Database Instances window with the
default database instance configuration for the LU_COUNTRY table:
LU_COUNTRY with Default Database Instance Configuration
Tutorial Data is the primary database instance for the LU_COUNTRY table,
while Forecast Data is a secondary database instance for the table.
The following image shows the Available Database Instances window with the
modified database instance configuration for the LU_COUNTRY table.
Forecast Data is now the primary database instance for the LU_COUNTRY
table, and Tutorial Data is a secondary database instance for the table.
LU_COUNTRY with Modified Database Instance Configuration
INDEX