
White Paper

Understanding Kalido Data Architecture


A Technical Overview of a Kalido-Driven Business Model and Data Warehouse

Dr. Hakan Sarbanoglu


Chief Solutions Architect, Kalido

White Paper: Understanding Kalido Data Architecture

Table of Contents

1. Introduction
2. Business model elements in support of business intelligence
   2.1 Kalido Modeler User Interface
   2.2 Modeling Classes and Class Associations
   2.3 Modeling Measures and Transactions
   2.4 Grouping Classes and Transactions
3. Deploying Business Models from Kalido Modeler to the Warehouse
4. How Business Models & changes are stored in a Warehouse
   4.1 Adaptive Data Store
   4.2 Staging Schema and Warehouse Schema
   4.3 Storing Reference and Transaction data in Kalido DIW
   4.4 Adaptive Services Core
   4.5 Optimum de-normalization of Warehouse Schema
   4.6 Business Model elements directly created in Kalido DIW 8.4
   4.7 Re-Deploying Updated Business Models to the Warehouse
5. Data Mart Generation & Reporting from Kalido DIW
6. Benefits: Business-Model-Driven Data Warehouse Architecture

1. Introduction
A key component of the Kalido Information Engine is the Kalido Business Information Modeler, a graphical tool for defining and managing business models. A business model defined in the Modeler can be deployed to a Kalido Dynamic Information Warehouse (Kalido DIW, or the warehouse). Model deployment creates the corresponding warehouse object definitions and multiple physical schemas to store meta, reference and transaction data.

The Kalido Modeler is composed of two parts:

- Modeler: A rich, graphical user interface for the creation of business models. It runs on the Windows .NET 3.5 WPF (Windows Presentation Foundation) platform and is available as a free download at www.kalido.com/bmcf.
- Model Exchange Services (MXS): An adapter between the Modeler and the Kalido warehouse. It is hosted by MS IIS (Internet Information Services) 5.1 or 6.0 as a Web service application and runs on the Kalido DIW 8.4 Application Server.


Business model definitions made in the Modeler are stored (serialized) in a binary file called the AIM file. When this model is deployed to a Kalido DIW instance, all model elements and their properties are represented and stored as meta data in the Kalido DIW Adaptive Data Store. The model elements created in Kalido Business Information Modeler, version 1.0, include classes, transactions, identifiers, attributes, associations, measures and groups with their detailed properties. These model elements should be considered as forming the core model for a data warehouse.

In order to manage the full life cycle of a Kalido data warehouse instance, we need to create some additional object definitions, such as file or feed definitions and summary transaction data set definitions, directly in the Kalido DIW user interface after the deployment of the core model. These are data warehouse-specific object types that cannot be defined in the Modeler. We can also define the core business model object types, for instance classes and transactions, directly in Kalido DIW, but this would cause the corresponding model stored in the AIM (.aim) file developed by the Kalido Modeler to become out of sync with the warehouse. Although there are features to resolve such out-of-sync states, the best practice is to make all changes to the core model elements in the Kalido Modeler, not in Kalido DIW, and re-deploy the differences.

This white paper describes what happens to the business model during its full life cycle, starting from its creation in the Modeler, and where and how the model meta data is stored within the end-to-end (Modeling-to-Business Intelligence) Kalido architecture.


2. Business model elements in support of business intelligence


A business model, in the information management context, is the description of your business. It represents your business environment in the real world, so it can be used as a reference model for your business analysis and decision-making purposes. As with any reference model, it is the best possible approximation to the real world, and it should be close enough to support accurate conclusions about what has actually happened, or what would happen in the future if certain changes occurred, so that the business can make effective decisions.

A business model can represent and support various aspects of your business, so the types of elements seen in a business model depend on the planned use and scope of the business model. If your business model is intended to support your business analysis and decision-making process, then the types of model elements should be relevant to business intelligence (BI) needs. In this case, your business model should drive the development, operation and maintenance of integrated data warehouses as the source for BI. A business model supporting data warehouses and BI should include four components: Activities, Measures, Business context and the Business rules that bind these things together.


A business model in support of master data management and data governance aspects of a business would additionally require, for example, data governance workflows, data derivation processes and validation rules which are not available in version 1.0 of the Kalido Modeler. A business activity type (for example, a procurement order) is perceived as a separate entity type. Each occurrence of a business activity is a business transaction. A business transaction, such as a particular procurement order transaction, is an event that happened at a point in time. In each event, we need to measure some quantities and qualities, like number of units purchased, list price, agreed discount, etc. In each event, a number of business entities are involved, for instance, the product item or material ordered, the supplier, the purchaser, the delivery location, the date of the order procurement, etc. These are the parties involved in a business transaction event. While we carry out the business activities with pre-defined business processes, including exception handling rules, we capture certain facts of each business transaction (measure values and the identifiers of involved business entities) with a transaction record.
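As a minimal sketch of what such a transaction record might look like, the Python fragment below models one procurement order line as a set of involved-entity identifiers plus measure values, and then sums one measure per supplier. The class name, field names and codes such as `MAT-1` are illustrative assumptions, not Kalido structures.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PurchaseOrderLine:
    """One business transaction: involved entities plus measured values."""
    # Identifiers of the involved business entities (hypothetical codes).
    material_id: str
    supplier_id: str
    purchaser_id: str
    delivery_location_id: str
    order_date: date
    # Measures captured for the event.
    units_purchased: int
    list_price: float
    agreed_discount: float  # fraction of list price, e.g. 0.10 for 10%


def units_by_supplier(lines):
    """Sum the units_purchased measure for each involved supplier."""
    totals = defaultdict(int)
    for line in lines:
        totals[line.supplier_id] += line.units_purchased
    return dict(totals)


orders = [
    PurchaseOrderLine("MAT-1", "SUP-A", "PUR-1", "LOC-1", date(2009, 3, 2), 10, 5.0, 0.10),
    PurchaseOrderLine("MAT-2", "SUP-A", "PUR-1", "LOC-1", date(2009, 3, 2), 4, 20.0, 0.0),
    PurchaseOrderLine("MAT-1", "SUP-B", "PUR-2", "LOC-2", date(2009, 3, 5), 7, 5.0, 0.05),
]
print(units_by_supplier(orders))
```

The record holds only identifiers for the involved parties; their descriptive properties would live with the business entities themselves.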

This part of the business would be represented with the graphical model in the Modeler shown in Figure 1.

Figure 1: A simple business model created in Kalido Modeler

Here the rounded rectangle represents the transactions of the purchase order activity, the orange balls represent the measures captured in each transaction, and the blue rectangles represent the classes of business entities involved in these transactions. Materials in the Material class are classified with a two-level hierarchy of categories as defined by the procurement department. The lines between the classes represent classification associations with many-to-one cardinality, with the arrow side indicating the parent class. Finally, the lines from the transaction to the classes indicate the classes of business entities involved. It is easy to see how the measures can be aggregated by following the associations through the arrows.

2.1 Kalido Modeler User Interface

The Kalido Modeler user interface consists of three functional tabs: Diagram, Table and DIW Operations. Diagram is the main interface used by the architect/modeler to define a business model with maximum use of graphics. Figure 2 shows the functional sections of this interface.

Figure 2: The Kalido Business Information Modeler user interface


The modeling process starts in the main window by graphically defining the objects (classes, transactions, the associations between them, groups, etc.). A best practice is to first close the detailed panels on the right and, together with business representatives, develop a high-level model in the graphical main window, and then go into the Properties panel for each object to provide more detailed definitions and properties.

All these definitions are validated in real time against the Kalido DIW model validation rules. The set of rules executed here is a subset of the validation rules which the warehouse itself applies to its own business models. Validation rule failures are displayed in the Messages panel as they occur. It is easiest to resolve validation errors as they appear rather than attempting a big clean-up at the end. As the model is modified to correct these errors, the messages are removed from the Messages panel automatically.

When the model gets bigger, you can enter a search string into the Find tool bar to locate the objects that include this string and center them in the diagram. The meta data created to represent the business model elements and their properties can also be seen, in real time, in the Table tab (Figure 3).

Figure 3: Business model meta data in the tabular view of the Kalido Modeler

This tabular view of meta data is organized by the object types in the model. In addition to viewing all objects here, we can also edit some of the properties, especially fields like descriptions. However, we cannot create a new object in this tabular interface.

2.2 Modeling Classes and Class Associations

A class can be easily created with a single gesture of right-click drag to the right in the graphical view. An association between two classes is created by clicking inside an object and dragging (with the right button down) to the other class. This gesture-based diagram capability requires no additional movement (such as opening menus) and therefore minimum effort.

Figure 4 shows a Business Party dimension for a B2B company. In this model, we are working with many Companies that we are associated with as Customer and/or Supplier (both subtypes of the Company class). We might have multiple accounts opened for each Customer Company. These third-party Companies may also form unbalanced (ragged) hierarchies between different levels in their organizations, as modeled with an involuted association (shown as a dotted, curly arrow), and they optionally (shown with a dotted line) belong to a Global Company Group. In this example, we have also determined that some of these Global Company Groups are considered Key Accounts. Each company is given a Credit Rating and classified by its Standard Industry Classification (SIC), which is also represented as a variable-depth (unbalanced) taxonomy. We also need the Delivery Points (Ship-To) of our customers. Each Customer Account is serviced through one of our Sales Channels. Finally, we have our suppliers modeled as a subtype of Company. Supplier companies (and only Supplier companies) are classified by their Service Levels.

Figure 4: A simple Business Party Model

After creating an object in the graphical view, we can use the property panel for that object to enter detailed properties like the type of the object and other descriptive information, including Attributes and Naming Schemes. Naming schemes, which are identifiers or codes, are shown in Figure 5. The software is data-driven and intelligent, so it requires, expects or offers information to be entered based on the modeler's inputs thus far.

Figure 5: Property Panel

The detailed properties of associations, such as the semantic type of the association, the roles played by the business entities at the two ends and the cardinality of the end points, are entered in the property panel after selecting that association in the diagram (Figure 6).

By default, an association is defined as aggregative, meaning it can be used to roll up the measure values to the primary (parent) end point. If there are multiple aggregative associations coming off a class, and if the associations create multiple aggregate paths (loops) to roll up to any other class, then the extra aggregative associations should be flagged as cut to obtain a single default aggregation path between any two classes (in Kalido DIW it is still possible to navigate the aggregations through cut associations).
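To illustrate why extra aggregative associations need to be cut, the sketch below counts the roll-up paths between two classes over the associations that are not flagged as cut. It is a plain-Python illustration, not the Modeler's validation logic; the class names (Outlet, City, Sales Area, Region) are hypothetical, and the sketch assumes the associations contain no cycles (involuted associations are left out).

```python
from collections import defaultdict


def count_paths(associations, start, target):
    """Count aggregation paths from start to target.

    associations: iterable of (child, parent, is_cut) tuples.
    Associations flagged as cut are ignored for the default roll-up.
    """
    parents = defaultdict(list)
    for child, parent, is_cut in associations:
        if not is_cut:
            parents[child].append(parent)

    def walk(node):
        if node == target:
            return 1
        return sum(walk(parent) for parent in parents[node])

    return walk(start)


# Two aggregative routes from Outlet to Region form a loop.
looped = [
    ("Outlet", "City", False),
    ("City", "Region", False),
    ("Outlet", "Sales Area", False),
    ("Sales Area", "Region", False),
]
# Flagging one association as cut leaves a single default path.
resolved = [
    ("Outlet", "City", False),
    ("City", "Region", True),
    ("Outlet", "Sales Area", False),
    ("Sales Area", "Region", False),
]
print(count_paths(looped, "Outlet", "Region"))
print(count_paths(resolved, "Outlet", "Region"))
```

With two paths, a measure rolled up from Outlet to Region could be counted along either route; after the cut there is exactly one default route.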

Figure 6: Association end points


Attributes can be created for each class using the attributes property panel. They can be declared in many available data types as shown in Figure 7 and each can be defined as mandatory and/or primary. The Kalido Modeler offers capabilities to add semantic definitions to the model elements. Although there will not be significant differences in functional treatments based on those semantic descriptions within the Kalido warehouse, these definitions are transferred to the BI Meta Data Layer for access by BI developers and users, so it is worth taking the time to fill in those semantic description fields.

Figure 7: Class Attributes

2.3 Modeling Measures and Transactions

Measures are first created at the model level, in the property panel of the model, before they are included in transactions. Each Measure must have a Unit of Measure type (Figure 8). These Measures are called Stored Measures in Kalido DIW, which means we intend to load these measure values from the transaction records in the Staging Area. Measures created at the whole-model level can be included in one or many Transactions.

Figure 8: Properties of Measures

A Transaction is easily modeled graphically with a right-click gesture to the left, which creates a rounded-corner rectangular block. Then, the classes involved in the Transaction are linked by creating associations to those classes (see Figure 1). Next, we include the measures in the transaction definition from the existing measures list. When we define a Transaction entity, we can also add any number of Attributes. These Attributes in Transactions are typically used for further qualifying a Transaction entity, such as the invoice number of an invoice line item transaction, or additional dates. These Attributes, however, should not be confused with the identifiers of the involved business entities, which are modeled as involved Classes.

2.4 Grouping Classes and Transactions

Each Class or Transaction must belong to a Group. A Group represents a logical collection of Classes and/or Transactions. A Group containing only Classes represents a Dimension in Kalido DIW. We normally include in a Dimension Group a number of classes that are clustered with associations between them. Associations that lie entirely within a Dimension Group are used to create denormalized dimension tables (Mapping Tables) in Kalido DIW. After grouping the classes, there can be associations left between the groups. Such associations are transferred to Kalido DIW as Cross Dimensional Associations and can still be used as aggregate paths.

In Dimension Groups we can also define several Roles. Roles are required in cases where a Transaction refers to Classes within the same Group multiple times. This situation would result in multiple associations, hence multiple joins, between the Transaction and the Mapping Table that represents the Dimension Group. An example would be where Business Parties (Companies) can play several roles in Transactions, like Sell To, Ship To or Supplier roles. Assigning a role to each of those multiple associations between the same Transaction and Dimension Group enables separate aggregation and avoids misallocations.

A Group containing only Transactions represents a Class of Transaction in Kalido DIW. A Class of Transaction is a logical grouping of similar business transactions that need to be merged to get the full business view. For example, Legacy and ERP sales invoice lines may have different granularities or different measures and probably come from different source systems (Figure 10).

Figure 9: Dimension Group

Figure 10: Grouped Classes and Transactions


3. Deploying Business Models from Kalido Modeler to the Warehouse


When a business model is used to create and manage a data warehouse in Kalido DIW, most of the business model objects are semantically perceived as specific to data warehousing. Table 1 shows this semantic conversion between the Kalido Modeler and the Kalido Warehouse.

Kalido Modeler Object Type | Kalido DIW Object Type
Class Group | Dimension
Role | Role
Transaction Group | Class of Transaction
Class | Class of Business Entity (CBE)
Naming Scheme | Naming Scheme
Attribute | Attribute
Instance (Business Entity) | Business Entity (BE)
Measure | Measure
Transaction | Transaction Dataset (TDS)
Super-type Class, 1...n subtype classes without any peer associations | Super/sub type CBE
Super-type Class where subclasses form a single hierarchy | Coding Structure Definition (CSD)
Class with two or more mapped associations from other classes | Mapped CBE
Association | Association
Transaction Class Association | Transaction Dataset Dimension Column
Transaction Attribute | Transaction Dataset Measure Column
Transaction Attribute | Transaction Dataset User Defined Column
Warehouse Section | Warehouse Section

Table 1: Modeler to Warehouse (Kalido Modeler to Kalido DIW) object semantic conversion

There are also objects which can only be defined in either the Kalido Modeler or Kalido DIW as listed in Table 2.

Kalido Modeler (AIM file) Model Element | Kalido DIW Model Element | Remarks
Annotation | - | No equivalent in Kalido DIW; this is the equivalent of a post-it note which may be attached anywhere within an AIM model.
Generic Group | - | A group may contain any number of other top-level AIM types, including other groups.
- | File Definition | File definitions are not defined in the Modeler.
- | Feed Definition | Feed definitions are not defined in the Modeler.
- | Aggregated, Calculated and Converted Measures | Aggregated, Calculated and Converted Measures are not defined in the Modeler.
- | Summary Transaction Dataset | Summary TDSs can only be defined in Kalido DIW.
- | Query Definition and Result Set Definition | Query Definitions and Result Set Definitions are not defined in the Modeler.
- | Mapped BE | Actual Mapped Business Entity definitions are defined only in the Kalido DIW user interface.

Table 2: Objects that are specific to either the Kalido Modeler or Kalido DIW (current as of Kalido DIW 8.4 and Kalido Business Information Modeler 1.0)

The process of deploying a business model created in the Kalido Modeler is shown in Figure 11. The deployment process steps are:

- The model created in the Kalido Modeler (AIM file) is mapped onto a warehouse model by semantically converting it to warehouse model components and mapping or assigning their object identifiers.
- The warehouse model is transformed into an XML file in KMX format (a meta data definition format standard within the Kalido Information Engine).
- Kalido DIW imports this XML-formatted meta data file and creates the business model elements. During this import, the incoming object definitions are validated against further warehouse model validation rules; for example, each CBE must belong to one and only one Dimension (whereas a Class in an AIM model can exist in many Groups).
- A report of all warehouse objects created (including generated system labels) and their object identifiers is returned.
- The AIM model is updated with the warehouse object identifier and system label information.
- The AIM model is re-serialized, and the resulting form is stored in DIW for future synchronization between the Modeler and the Warehouse.

Figure 11: Deployment Process (from Kalido Modeler to the Warehouse)
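The import-time rule that each CBE must belong to one and only one Dimension can be pictured with a small check like the one below. This is only a sketch of the idea in Python; the function name and the sample group names are assumptions, and the real validation is performed inside Kalido DIW during the meta data import.

```python
def dimension_rule_violations(classes, groups):
    """Return classes that do not sit in exactly one group.

    classes: iterable of class names in the model.
    groups:  dict mapping a group name to the set of class names it contains.
    The result maps each offending class to the sorted list of its groups
    (an empty list means the class belongs to no group at all).
    """
    membership = {name: [] for name in classes}
    for group, members in groups.items():
        for name in members:
            membership.setdefault(name, []).append(group)
    return {name: sorted(found) for name, found in membership.items() if len(found) != 1}


# Hypothetical model: Company sits in two groups, Credit Rating in none.
classes = ["Company", "Customer", "Material", "Credit Rating"]
groups = {
    "Business Party": {"Company", "Customer"},
    "Product": {"Material"},
    "Finance": {"Company"},
}
print(dimension_rule_violations(classes, groups))
```

A model that passes this kind of check in the Modeler can still fail other warehouse rules, which is why validating against a warehouse instance before deployment is useful.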

One best practice in deploying a core business model from the Modeler to a Warehouse is to validate the AIM model against a warehouse instance. This is offered as an automated function by the Kalido Modeler. The validation operation invokes the meta data import step in test mode on the warehouse and reports any issues back to the Modeler.

Deployment of a business model from the Modeler to the Warehouse results in importing all model objects into the Warehouse. After a successful deployment, we can immediately view the model objects within the Kalido DIW user interface. Figure 12 shows this user interface in Kalido DIW and highlights how some of the different parts of the business model are represented.

Once a model is deployed into a warehouse, it should not be deployed to other warehouses, as this would result in a loss of the Object IDs associated with the first deployment. If we need to deploy a model to another warehouse, we should instead create a copy of the model file (AIM file) or copy and paste the objects as needed.

Figure 12: Object types seen in the Kalido DIW User Interface

4. How Business Models and their changes are stored in a Kalido Warehouse

Meta data stored as data

In the traditional data warehouse design and development process, very few designers start with a conceptual model. Most create a logical data model first and then convert it to a physical schema design. This physical design is then captured as a database creation script which is run on the RDBMS (e.g. Oracle). The physical design encapsulates data and database objects, including tables, columns and indexes. In contrast, in the Kalido Modeler, we do not design tables, columns and indexes; we instead design the objects that represent types of real-world entities, including classes of business entities, transactions, measures and business associations. Instead of a database creation script, the result of this initial design is meta data exported into an XML file, which is then deployed to a Kalido DIW instance.

When a business model is deployed, Kalido DIW imports the business model as meta data and at the same time automatically creates not one but multiple database schemas which, in the end, comprise multiple relational tables for each object. The following sections describe how Kalido DIW automatically creates such a data warehouse comprising multiple schemas, including an Adaptive Data Store, a Staging Schema and a Warehouse Schema. The design of each of these auto-generated schemas is optimized for its main usage.

4.1 Adaptive Data Store

The Adaptive Data Store is used by Kalido DIW to store Meta and Reference data. Its design is very similar to a Triple Store design, where information about any type of entity can be normalized into triple records (Subject-Predicate-Object), and which can be extended to Quad Store records with a Name Space. With this approach, also known as Associative Modeling, it is possible to decompose any complex object type into multiple triple records, each representing an atomic component of its descriptive information (identifiers, attributes and associations), so that whole objects of any kind can be defined in a single table.

Kalido's version of this design is slightly different, with the use of an additional entity type column and start and end date-time stamp columns. Also, every object is first assigned an internally unique object identifier serving as a surrogate key. Accordingly, references (relational row joins) are made through these object identifiers, and all other field values (texts, numbers, etc.) are pushed out and stored in a second table. Kalido's version of the extended triple record is referred to as a generic record in Kalido documentation.

Figure 13 shows a logical meta model, in Entity-Relationship style, including two entity types: Class of Business Entity (CBE) and Association Between Classes. When we define our model in Kalido DIW, or deploy it from the Kalido Modeler, the Adaptive Data Store will understand this model as two CBEs and one Association Between Classes. When storing them as meta data, it won't make a difference whether they are Department and Employee classes linked with an assignment association or Facility and Equipment classes linked with an installation association.
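The decomposition idea can be sketched in a few lines of Python. This is an illustrative toy, not Kalido's actual generic record layout: the column names, the predicate strings and the way the entity type column is used here are all assumptions made for the example. It keeps links between object identifiers in one table and pushes literal values into a second one, as described above.

```python
from dataclasses import dataclass
from datetime import datetime
from itertools import count
from typing import Optional


@dataclass(frozen=True)
class GenericRecord:
    subject_id: int                  # surrogate key of the object being described
    predicate: str                   # which property or link this record states
    object_id: int                   # surrogate key of the target object or value
    entity_type: str                 # extra column: what kind of thing the subject is
    start: datetime                  # record is valid from this time
    end: Optional[datetime] = None   # open-ended while the record is current


class GenericStore:
    def __init__(self):
        self._ids = count(1)
        self.records = []   # single table of generic records (identifiers only)
        self.values = {}    # second table: identifier -> text or number

    def new_id(self):
        return next(self._ids)

    def describe(self, subject_id, predicate, literal, entity_type, start):
        """Store a literal in the value table and link the subject to it."""
        value_id = self.new_id()
        self.values[value_id] = literal
        self.records.append(GenericRecord(subject_id, predicate, value_id, entity_type, start))

    def link(self, subject_id, predicate, target_id, entity_type, start):
        """Link the subject to another object by identifier."""
        self.records.append(GenericRecord(subject_id, predicate, target_id, entity_type, start))


def define_class(store, name, start):
    class_id = store.new_id()
    store.describe(class_id, "name", name, "CBE", start)
    return class_id


def define_association(store, name, child_id, parent_id, start):
    assoc_id = store.new_id()
    kind = "Association Between Classes"
    store.describe(assoc_id, "name", name, kind, start)
    store.link(assoc_id, "child class", child_id, kind, start)
    store.link(assoc_id, "parent class", parent_id, kind, start)
    return assoc_id


store = GenericStore()
t0 = datetime(2009, 1, 1)
department = define_class(store, "Department", t0)
employee = define_class(store, "Employee", t0)
define_association(store, "assignment", employee, department, t0)

for record in store.records:
    print(record.subject_id, record.predicate, record.object_id, record.entity_type)
```

Replacing the three names with Facility, Equipment and installation would produce records of exactly the same shape, which is the point of the generic design.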

Figure 13: Logical meta data model to represent Classes and Associations in Adaptive Data Store

Figure 14 then shows a highly simplified version of how this simple model is decomposed into generic records.

Figure 14: Logical meta data model to represent Classes and Associations in Adaptive Data Store

In practice, there will be many records to define a whole object. For example, a Class of Business Entity is described with detailed properties, such as the type of the class (normal, period of time, currency, and so on), the semantic description of the class, and whether the class is a non-aggregative or enumeration class, each defined with separate generic records associated with each other. Using this decomposition method, a complex business model comprising various entity types (classes, associations, class attributes, naming schemes, measures, transactions, transaction attributes, class groups and transaction groups) will be defined as meta data in generic records.


An interesting point is that regardless of the business model subject area, whether it describes a pharmaceutical, insurance, finance or manufacturing business, the physical schema of the Adaptive Data Store will not change.

Figure 15: Business Model stored as meta data in Kalido DIW

4.2 Staging Schema and Warehouse Schema

As soon as this model is defined using the Kalido DIW interface or automatically deployed from the Kalido Modeler, Kalido DIW will also create a Staging Table in the Staging Schema and an Attribute Table in the Warehouse Schema for each class in the model. Whereas the staging tables are loading friendly, an attribute table includes columns for each attribute (including the name) and naming scheme (identifier) defined as a part of that Class, as well as the object identifier and start/end date-time-stamp columns. In addition, Kalido DIW will also create a Mapping Table to represent all the associations between Classes within a Group. In the Warehouse Schema, the Attribute Tables are referenced from the Mapping Tables. Finally, for each Transaction in the Model, Kalido DIW will create a Transaction Table or Warehouse Section (in warehouse design, these are often called Fact Tables) in the Warehouse Schema, which results in a BI-friendly snowflake-like schema by default (Figure 16).

Figure 16: Generation of Warehouse Schema

4.3 Storing Reference and Transaction data in Kalido DIW

We can start loading reference and transaction data immediately after the business model is deployed to Kalido DIW (or defined directly in Kalido DIW). There is no need for logical data modeling, logical-to-physical design conversion or creation of the database objects. This immediacy is a key differentiator for Kalido, and it also helps avoid any disconnect between the designed and as-built schemas.

In order to load reference data, the load records brought from the source systems should be loaded into the corresponding staging table for that CBE. Kalido DIW takes over the incoming reference data and loads it first into the Adaptive Data Store. During this load, Kalido DIW creates or maps natural keys (naming schemes) to object identifiers as surrogate keys, validates the incoming data against the defined model, and detects the new and changed data for a delta load. Next, Kalido DIW populates the Warehouse Schema records automatically. This is an incremental refresh of the time-variant mapping and attribute tables. If the model is changed, Kalido DIW automatically recreates these tables with additional columns and repopulates them with all existing data. Where there is a high volume of business entities (e.g. several million), the reference data is loaded directly into the Warehouse Schema, bypassing the Adaptive Data Store. These are called Custom CBEs.

Figure 17: Loading reference and transaction data
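The key-mapping and delta-detection part of a reference data load can be sketched as follows. This is a simplified Python illustration under stated assumptions: the `ReferenceLoader` class, its fields and the sample customer codes are hypothetical, and validation against the model is left out entirely. It only shows how natural keys might be mapped to surrogate object identifiers and how unchanged rows drop out of the delta.

```python
from itertools import count


class ReferenceLoader:
    def __init__(self):
        self._next_id = count(1)
        self.key_map = {}   # (class name, natural key) -> surrogate object identifier
        self.current = {}   # object identifier -> attributes as last loaded

    def load(self, class_name, staged_rows, key_field):
        """Process one batch of staged rows; return (new_ids, changed_ids)."""
        new_ids, changed_ids = [], []
        for row in staged_rows:
            natural_key = (class_name, row[key_field])
            attrs = {k: v for k, v in row.items() if k != key_field}
            oid = self.key_map.get(natural_key)
            if oid is None:
                # First time this natural key is seen: assign a surrogate key.
                oid = next(self._next_id)
                self.key_map[natural_key] = oid
                new_ids.append(oid)
            elif self.current[oid] != attrs:
                changed_ids.append(oid)
            else:
                continue  # unchanged row: not part of the delta
            self.current[oid] = attrs
        return new_ids, changed_ids


loader = ReferenceLoader()
first = loader.load(
    "Customer",
    [{"code": "C1", "name": "Acme"}, {"code": "C2", "name": "Bolt"}],
    key_field="code",
)
second = loader.load(
    "Customer",
    [
        {"code": "C1", "name": "Acme"},      # unchanged
        {"code": "C2", "name": "Bolt Ltd"},  # changed attribute
        {"code": "C3", "name": "Core"},      # new business entity
    ],
    key_field="code",
)
print(first)
print(second)
```

Only the identifiers returned in each batch would need to flow on to the incremental refresh of the warehouse tables.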


The reference data (business entities like customers, products, equipment and employees, and their properties) are loaded into the Adaptive Data Store with the same decomposition logic. Figure 18 shows the simplified logical model used for storing reference data in the Adaptive Data Store. Here, each business entity is created as an unknown business entity first, and then declared as a member of a Class of Business Entity in a second record (Figure 19).

Figure 18: The Adaptive Data Store's highly simplified logical model

For example, we create Eastern Sales and Samuel Hirsch records first, and then declare them as being a Department and an Employee. If we have an assignment association between the Department and Employee CBEs as defined earlier in the model, then by referring to that particular Association Between Classes, we create another record to actually associate Samuel Hirsch with the Eastern Sales department. In Kalido DIW, each association is actually a separate entity with a simple life (start and end dates) and is physically stored as a separate record (in contrast to the value-based foreign key relationship in the relational model).

Figure 19: Storing reference data changes in Adaptive Data Store

More importantly, when an association changes, for instance when a new department called Central Sales is created and Samuel Hirsch moves to that new department in 2004, Kalido DIW end-dates the existing association record and creates a new one with the appropriate effective date.

The same method is used to define a change to the model. More than 30 complex entity types, like Attributes, Measures, Transaction Data Sets, additional Naming Schemes (Identifiers), Feed Definitions and their many subtypes, are described as meta data in similar ways. Any new or changed model object types (CBE, association between CBEs, attributes, measures, transactions, etc.) are introduced as incremental meta data in the Adaptive Data Store.

Figure 20: Storing reference data changes in Adaptive Data Store
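The end-dating behaviour described for associations can be sketched as below. This is an illustration only: the record layout, the function names, the exact dates and the half-open validity interval (start inclusive, end exclusive) are assumptions, not Kalido internals. The sketch shows that a change closes the current record and appends a new one, so earlier states remain queryable.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class AssociationRecord:
    child_id: str
    parent_id: str
    start: date
    end: Optional[date] = None  # None while the association is current


def reassign(records, child_id, new_parent_id, effective):
    """End-date the current association of child_id and add the new one."""
    for rec in records:
        if rec.child_id == child_id and rec.end is None:
            rec.end = effective
    records.append(AssociationRecord(child_id, new_parent_id, effective))


def parent_as_of(records, child_id, as_of):
    """Return the parent that was valid for child_id on the given date."""
    for rec in records:
        if rec.child_id != child_id:
            continue
        if rec.start <= as_of and (rec.end is None or as_of < rec.end):
            return rec.parent_id
    return None


# Assumed dates: the original assignment start and the exact 2004 move date.
assignments = [AssociationRecord("Samuel Hirsch", "Eastern Sales", date(2001, 1, 1))]
reassign(assignments, "Samuel Hirsch", "Central Sales", date(2004, 1, 1))

print(parent_as_of(assignments, "Samuel Hirsch", date(2003, 6, 30)))
print(parent_as_of(assignments, "Samuel Hirsch", date(2004, 6, 30)))
```

Nothing is overwritten: both records stay in the list, which is what makes time-variant queries over past organizational structures possible.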


Moreover, all historical changes to the model and reference data are kept to provide an auditable record, or corporate memory, of the past and are available for time-variant querying. Model changes are therefore defined only at the conceptual level. Because no business-specific schema is defined for storing customers, brands and so on, the logical and physical layers remain stable.

4.4 Adaptive Services Core

Decomposing meta and reference data objects into such generic records is one thing; re-forming those objects from generic records is another. Many people, systems and organizations have proposed flavors of generic designs to store any kind of information with decomposed and highly normalized records, but almost all failed to make them work. This is because such a design is not easy to work with, since all the records are tightly linked. If users are left to manage these records, they have to become experts in handling generic structures. For instance, to retrieve all information about a business entity, one would need to navigate through multiple generic records with self joins and compose the object back as a whole.

Kalido's patented version of generic design comes with a thick object layer on top of the simple and stable physical tables. This object layer is implemented via several layers of database view definitions and functional methods that manipulate the primitive, intermediate and compound object types through an API. This engine is called the Adaptive Services Core. Figure 21 illustrates the conceptual, logical and physical data model layers of the Kalido Adaptive Services Core.

The Kalido Adaptive Services Core provides the ultimate level of model abstraction by identifying and storing every piece of information as an object in its own right (Levels 0 and 1 in the data model stack). This enables any number of attributes and relationships to be defined for each individual entity.
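To give a feel for the recomposition work that such an object layer hides from the user, the sketch below gathers the generic records of one business entity back into a single object. It is a plain-Python analogy, not the Adaptive Services Core API; the tuple layout, the predicate names and the identifiers are assumptions chosen for the example.

```python
def compose_object(object_id, records, values):
    """Rebuild one object from generic records.

    records: iterable of (subject_id, predicate, target_id) tuples.
    values:  dict mapping an identifier to its literal (text, number).
    A target found in the value table is treated as a descriptor;
    any other target is kept as a reference to another object.
    """
    composed = {"id": object_id}
    for subject_id, predicate, target_id in records:
        if subject_id != object_id:
            continue
        composed[predicate] = values.get(target_id, target_id)
    return composed


# Hypothetical records: object 10 is an employee, object 11 a department.
records = [
    (10, "name", 100),
    (10, "member of class", 2),
    (10, "assigned to", 11),
    (11, "name", 101),
    (11, "member of class", 1),
]
values = {100: "Samuel Hirsch", 101: "Eastern Sales"}

print(compose_object(10, records, values))
```

In a relational store the same result would require a self join for every property, which is the burden the text attributes to unmanaged generic designs.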

Figure 21: Conceptual, logical and physical data model layers of the Kalido Adaptive Data Store shown together with the corresponding Meta Model layers

At Level 2 there are two general-purpose models, the Object Model and the Descriptor Model. Metadata objects such as CBEs, Measures and Transaction Datasets, as well as individual Business Entities (BEs), all conform to the Object Model. All attributes, names and identifiers conform to the Descriptor Model. It is easy to add, remove or change these things by linking a new object or descriptor into the system. A meta object property is represented in the same way as an attribute of a BE.

At Level 3 the objects are more specific, and separate models are defined for each type of object supported by the data warehouse, so there is a separate model for a CBE, another for a Measure and yet another for a BE. There are also some additional models at this level for more general items such as users and security access control lists. In terms of meta modeling, i.e. modeling other models, Levels 2 and 3 correspond to the Meta Model Layer (M2).

The business model is defined in terms of these Level 3 objects and is mapped to the physical design through all these levels. There is also the Application Reference Model, which defines the Kalido data warehousing and master data management applications and makes them perform the functions they are designed for (e.g., data that drives the behavior of the Kalido applications, wizards, etc.). The business model itself corresponds to the Business Model Layer (M1) in meta modeling. Finally, the Business Model should represent the real world (M0 level of meta modeling).

The functions provided through the Kalido Adaptive Services Core (ASC), the physical Adaptive Data Store schema and the SQL queries between these two have been highly tuned over the years. Overall, the ASC is the most important component of the Kalido DIW architecture.

4.5 Optimum de-normalization of Warehouse Schema

There are various configuration options in Kalido DIW designed to make the auto-generated warehouse schema more BI friendly as well as scalable and fast performing.
These options include pre-joining selected attribute tables onto their mapping tables to create an extended mapping table (a dimension table), having cross-dimensional relationships, multi-granular transaction data sets, or custom CBE tables (each a combined mapping and attribute table aimed at high-volume reference data).

At one extreme, we can pre-join all columns of all attribute tables onto their mapping tables and use no cross-dimensional associations. The result: the Warehouse Schema becomes a pure star schema. At the other extreme, we can create all CBEs as Custom CBEs, put each of them in its own group (a single class in each group), have as many associations as needed between the classes and, if needed, use foreign key attributes to relate one transaction to another. The Kalido Warehouse Schema will then look like a third normal form (3NF) schema (with additional surrogate keys and time variance columns). Best practice is to configure an optimum level of de-normalization using the options available in Kalido DIW.

Figure 22: Typical Kalido Warehouse Schema structure
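The pre-join option can be sketched with an in-memory SQLite database. This illustrates the general technique only: the table and column names (`map_product`, `attr_product`, `attr_brand`, `dim_product`) are invented for the example and are not the names Kalido DIW generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE map_product (product_key INTEGER, brand_key INTEGER);
    CREATE TABLE attr_product (product_key INTEGER, product_name TEXT);
    CREATE TABLE attr_brand (brand_key INTEGER, brand_name TEXT);
    INSERT INTO map_product VALUES (10, 1), (11, 1);
    INSERT INTO attr_product VALUES (10, 'Cola 330ml'), (11, 'Cola 1l');
    INSERT INTO attr_brand VALUES (1, 'Cola');
    -- Pre-join the attribute tables onto the mapping table.
    CREATE TABLE dim_product AS
        SELECT m.product_key, p.product_name, m.brand_key, b.brand_name
        FROM map_product m
        JOIN attr_product p ON p.product_key = m.product_key
        JOIN attr_brand b ON b.brand_key = m.brand_key;
""")

# A BI query can now read names straight from the dimension table.
rows = conn.execute(
    "SELECT product_name, brand_name FROM dim_product ORDER BY product_key"
).fetchall()
print(rows)
```

The trade-off is the usual one: the pre-joined table saves joins at query time but repeats the brand name on every product row, which is why the choice is left as a configuration option per table.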


4.6 Business Model elements directly created in Kalido DIW 8.4

When a model is deployed to the warehouse, Kalido DIW issues a checkpoint in the warehouse to manage the synchronization. The business model objects deployed from a corresponding Modeler file (.aim file) are marked as managed by the Modeler tool. If we attempt to change such an object directly within Kalido DIW, it gives a warning message as shown in Figure 23. We may still change that object, for instance add another attribute to an existing Class of Business Entity, but this would make the actual model in Kalido DIW out of sync with its corresponding AIM conceptual business model file. In this case, when we re-deploy a new version of the model from the corresponding AIM file, the Kalido DIW instance will be rolled back to that checkpoint, and any changes made directly in Kalido DIW to existing core model objects that were imported from the Modeler, as well as any brand new core object types created directly in Kalido DIW since the last deployment, will be lost.

Figure 23: Protection of objects imported into Kalido DIW from the Kalido Modeler

The best practice in managing a DIW model is to maintain a corresponding AIM file for each DIW instance and to make all changes to the core object types first in the Modeler and then re-deploy, which will result in a delta import into the Kalido DIW instance.

After the core business model elements (in the AIM file) are imported from the Kalido Modeler into the Warehouse, we may need to add some more meta model objects in the warehouse. These are either specific to data warehousing, like Calculated Measures, Aggregated Measures, Business Rule Tables (Mapped CBEs) and Summary Transaction Datasets, or needed for end-to-end integrations, like Staging Table Definitions, File and Feed Definitions and Data Mart Definitions (Query and Result Set Definitions). These kinds of meta model objects are created directly in Kalido DIW using its tree-browser-style GUI.
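The checkpoint behavior described in this section can be sketched as a toy model. The `WarehouseModel` class and its methods are hypothetical and greatly simplified: they treat every object as a core model object and ignore the warehouse-specific objects (such as Calculated Measures) that are legitimately created in Kalido DIW.

```python
import copy


class WarehouseModel:
    """Toy stand-in for a warehouse instance that holds a deployed model."""

    def __init__(self):
        self.objects = {}       # object name -> set of attribute names
        self.checkpoint = None  # snapshot taken at the last deployment

    def deploy(self, model):
        # Re-deployment first returns to the last checkpoint, which discards
        # direct edits, then applies the new model version and re-checkpoints.
        if self.checkpoint is not None:
            self.objects = copy.deepcopy(self.checkpoint)
        for name, attrs in model.items():
            self.objects[name] = set(attrs)
        self.checkpoint = copy.deepcopy(self.objects)

    def change_directly(self, name, attribute):
        # Allowed, but the warehouse is now out of sync with the model file.
        self.objects.setdefault(name, set()).add(attribute)

    def in_sync(self):
        return self.objects == self.checkpoint


wh = WarehouseModel()
wh.deploy({"Customer": ["Name"]})
wh.change_directly("Customer", "Segment")   # direct edit in the warehouse
out_of_sync = not wh.in_sync()
wh.deploy({"Customer": ["Name", "Region"]})  # next version from the model file
print(out_of_sync, wh.objects)
```

After the second deployment the directly added `Segment` attribute is gone, which is why changes to core objects belong in the model file first.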
4.7 Re-Deploying Updated Business Models to the Warehouse

Once an initial business model is successfully deployed into a warehouse instance, a snapshot of the AIM model itself is also stored in the warehouse together with a checkpoint. This allows control over subsequent model updates, whether they are received from the originating AIM model or made directly in Kalido DIW (which is not recommended for any business model element types that could be defined in Kalido Modeler).

We would certainly need to update the models, either during the iterative model development phase of the implementation project or after the first go-live, to support changing requirements that result from actual business changes or from the need for additional analytics. The best practice is to define model changes in Kalido Modeler, for instance new classes, new attributes on existing classes, new associations, or any elements that will be removed from the model's next version. The new version of the business model (AIM file) should then be re-deployed to the existing warehouse instance. The model itself is time-variant meta data.


During the re-deployment, changes are detected for the existing elements. For example, there can be a new Attribute defined for an existing CBE, a new Association between two existing CBEs, or an association changed from mandatory to optional. There may also be new objects (new Classes, Transactions, Measures etc.) in the new version of the AIM model. These all result in incremental model changes in the warehouse. If there are objects tagged as removed in the new version, like a deleted CBE, then the CBE definition in Kalido DIW is expired after some validations, such as checking whether there are any dependent and live model elements.

One best practice for incremental model changes to a warehouse instance for which no previous AIM file is available is to extract the as-built warehouse business model into the Kalido Modeler before starting any updates. This is essentially reverse engineering an as-built schema back into a model using tools available in the Kalido Modeler. Once the as-built model is converted to an AIM file, the modeler can work on the updates to create the next version and can re-deploy it to the development warehouse instance after a validation test. In cases where there are clashing model changes in both the Kalido Modeler and the Warehouse, there are some advanced techniques to synchronize them again, such as rolling back the warehouse to the checkpoint taken at the last deployment, but such advanced methods are not discussed in this paper.
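The change detection described above can be sketched as a comparison of two model versions. This shows the idea only, not the actual re-deployment logic: the model representation (a dictionary of class names to attribute sets) and the functions `diff_models` and `can_expire` are assumptions made for the example.

```python
def diff_models(old, new):
    """Compare two model versions given as {class_name: set_of_attributes}."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = {}
    for name in old.keys() & new.keys():
        new_attrs = sorted(new[name] - old[name])
        dropped_attrs = sorted(old[name] - new[name])
        if new_attrs or dropped_attrs:
            changed[name] = {"added": new_attrs, "removed": dropped_attrs}
    return {"added": added, "expired": removed, "changed": changed}


def can_expire(name, dependencies):
    """A class may be expired only if no live element still depends on it."""
    return not dependencies.get(name)


old_version = {"Customer": {"Name"}, "Brand": {"Name"}}
new_version = {"Customer": {"Name", "Segment"}, "Product": {"Name"}}
delta = diff_models(old_version, new_version)
print(delta)
print(can_expire("Brand", {"Brand": ["Sales Transactions"]}))
```

Only the delta needs to be applied to the warehouse. A removed class is expired rather than deleted, and only once the dependency check passes.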
While the physical schema under the Adaptive Services Core remains unchanged for any type of business model, the physical schema that is automatically created for a defined model will be composed of Fact Tables (one for each Warehouse Section), Mapping Tables (one for each Dimension, representing the associations between all Business Entities in that dimension in a time-variant structure), Attribute Tables (one for each CBE, storing Names, Attribute and Identifier values) and, optionally, Custom CBE and Virtual Mapping Tables.

Business models defined for financial reporting of an oil company, for corporate performance management of a CPG company, for profitability analysis of an insurance company and for a clinical trial warehouse of a pharmaceutical company will result in different Warehouse Schemas, as if they had been designed by an expert Data Warehouse Architect and created by expert DBAs. The major difference here is that Kalido creates the warehouse automatically from the defined business model, removing the conventional steps of converting the conceptual data model first to a logical model, then to a physical schema design, and finally creating the actual physical schema.

More importantly, model changes are reflected in the Warehouse Schema automatically, without any SQL changes or any disruptive or manual modifications, and the generated design includes advanced solutions for handling time variance, multiple identifications, complex business rules and derived information. This applies not only to the initial creation of the data warehouse but also to managing its full life cycle, i.e. daily operations like loading data and refreshing data marts, as well as making model changes without costly and risky re-design and manual schema modifications.
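The time-variant structure of a mapping table can be illustrated with a minimal as-of lookup. The row layout below (validity dates with an exclusive end date) is a common convention assumed for this sketch; the paper does not specify the actual time variance columns of the generated Kalido tables.

```python
from datetime import date

# Hypothetical time-variant mapping rows: (child, parent, valid_from, valid_to).
# valid_to is exclusive; None means the row is still current.
MAPPING = [
    ("Product A", "Brand X", date(2006, 1, 1), date(2008, 1, 1)),
    ("Product A", "Brand Y", date(2008, 1, 1), None),
]


def parent_as_of(child, as_of):
    """Return the parent that was valid for `child` on the date `as_of`."""
    for row_child, parent, valid_from, valid_to in MAPPING:
        if row_child != child:
            continue
        if valid_from <= as_of and (valid_to is None or as_of < valid_to):
            return parent
    return None


print(parent_as_of("Product A", date(2007, 6, 30)))
print(parent_as_of("Product A", date(2008, 6, 30)))
```

Because old rows are closed rather than overwritten, the same transaction data can be reported against the hierarchy as it stood at any chosen date.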



5. Data Mart Generation and Reporting from Kalido DIW using Business Model Elements
While this white paper does not focus on the reporting facilities of Kalido DIW, a brief section is included here to show how they relate to the business model. The Query Definition Wizard (Figure 24) in Kalido DIW uses the business model to determine which reference data is available to summarize the measures by. It dynamically adjusts as different measures are included in the query. It can also deliver calculated measures, currency conversion, year-to-date and other relative time periods. So within a matter of hours a simple data warehouse can be designed and implemented and, because of the rigor of the modeling structures and the iterative development approach they enable, it can evolve into an industrial-strength data warehouse within weeks.

Query generation is defined by the business model but runs from the Warehouse Schema. Model changes may change the format of the tables in this schema, but because the tables are automatically generated from the business model, no programming is required. The Data Marts also include the derived information based on the rules defined in the Kalido Business Model (like calculated measures) and in the Query Definitions (like filters). BI tools can pull data from the Warehouse Schema or from the data marts. An advanced option for the BI tools is to access the data in Kalido DIW through the associated meta data.
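One plausible reading of how the available reference data adjusts as measures are added is a set intersection over the model metadata. The sketch below rests on that assumption: the paper does not describe the wizard's internal rule, and the `MEASURE_DIMENSIONS` metadata and the `available_dimensions` function are hypothetical.

```python
# Hypothetical model metadata: which classes each measure can be analysed by.
MEASURE_DIMENSIONS = {
    "Sales Value": {"Customer", "Product", "Time"},
    "Sales Volume": {"Customer", "Product", "Time"},
    "Budget": {"Product", "Time"},
}


def available_dimensions(selected_measures):
    """Classes that every selected measure can be summarized by."""
    if not selected_measures:
        return set()
    dimension_sets = [MEASURE_DIMENSIONS[m] for m in selected_measures]
    return set.intersection(*dimension_sets)


print(sorted(available_dimensions(["Sales Value"])))
print(sorted(available_dimensions(["Sales Value", "Budget"])))
```

Adding `Budget` to the query removes `Customer` from the choices, since the budget in this example is not recorded by customer.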

Figure 24: The Kalido Query Definition Wizard

Figure 25: Time variance selection in a Kalido DIW Query Definition

Kalido also provides the Kalido Universal Information Director (Kalido UID) tool, which automatically creates BI meta data using the scope of interest defined from a subset of the business model. The BI tool's object definitions are mapped to the generated dynamic SQL definitions on the Warehouse Schema or on the Data Marts, making a seamless pull of integrated and changing data as a source to BI possible. Kalido UID is available for the creation of Business Objects Universes, Cognos Framework Packages and Microsoft SSAS Unified Dimensional Models.

Kalido UID is also capable of reflecting Kalido Business Model changes in the meta data layers of these bridged BI tools. When the business model changes, for instance when new transactions and new hierarchies are added or an additional attribute is defined in a Class of Business Entity, Kalido UID detects such model changes and identifies the impact for each scope of interest. Then it offers to update the BI semantic layer (i.e. Business Objects Universe, Cognos Framework or Microsoft SSAS) to apply those changes. In addition, Kalido DIW can be linked directly to QlikTech's QlikView product for an end-to-end, business-model-driven solution for users of QlikView.

Figure 26: End-to-end Kalido architecture including an MDM repository


6. Benefits of the Business-Model-Driven Kalido Data Warehouse Architecture


• Ensures the delivery of accurate, consistent, actionable, transparent and flexible business intelligence by enabling end-to-end (staging-to-BI) meta data integration and semantic consistency.
• A Kalido business model is easily understood by the business and therefore encourages their active involvement in the projects and in the use of the resulting business intelligence.
• Serves as a common abstraction layer and eliminates the disconnects between the different role players and platforms of a data warehousing project.
• Provides a high degree of flexibility, enabled through generic modeling, which simplifies change and avoids major re-development effort, thus:
  - Fostering an iterative data warehouse development approach
  - Enabling the warehouse to evolve as the business evolves
  - Providing the ability to extend the scope or improve the design during any phase of the warehouse life cycle, including the operational phase
• Speed of development and management of change through:
  - The automation of warehouse creation
  - Multiple schemas managed through the business model for their best usage
  - Rapid development achieved by abstracting and eliminating data level modeling
  - Automatic maintenance of the history of models and data


For More Information


If you would like to learn more about business modeling or Kalido, please visit www.kalido.com to read other white papers and to see Kalido product demos. You can also contact Kalido and request a meeting to discuss the principles of business-model-driven data warehousing and find out whether you can benefit from it. We also encourage you to evaluate Kalido's enterprise-scale Master Data Management application (Kalido MDM), which is developed based upon the same generic modeling and adaptive data store principles.

About the Kalido Information Engine


The Kalido Information Engine is a suite of products that gives customers the ability to manage their entire BI infrastructure. Unlike traditional approaches, the Kalido Information Engine puts information management in the hands of the business, while satisfying the strict data management requirements of IT. Working in unison with existing transaction systems, data warehouses and front-end BI tools, the Kalido Information Engine brings new levels of flexibility and insight to corporate decision-making that change how, and the cost at which, business gets done. The Kalido Information Engine is composed of the Kalido suite of products: Kalido Business Information Modeler, Kalido Dynamic Information Warehouse, Kalido Master Data Management and Kalido Universal Information Director.

About Kalido
Kalido delivers active information management for business. Developed through years of successful best practice implementations, Kalido's robust, business-model-driven information management engine automatically feeds information to end users through their BI tools, making them more productive far more quickly and reducing internal costs. With Kalido, users never again have to wait for months to answer fundamental questions about business performance: which products are selling, which customers are profitable, and which markets are most promising. Kalido software is installed at over 250 locations in more than 100 countries with market leading companies. More information about Kalido can be found at: http://www.kalido.com.


Contact Information US Tel: +1 781 202 3290 Eur Tel: +44 (0)845 224 1236 Email: info@kalido.com or visit our website at www.kalido.com

Copyright 2008 Kalido. All rights reserved. Kalido, the Kalido logo and Kalido's product names are trademarks of Kalido. References to other companies and their products use trademarks owned by the respective companies and are for reference purposes only.

WP-UKDA09081-HS
