[Max Marks: 60
d. Explain OLAP Terminologies.
(2 Hours)
The welcome screen will offer us four tasks that we can perform with this
assistant. We'll select the first one to configure the listener, as shown here:
Suppose we have already created the following (students might have taken another example):
Project: ACME_DW_PROJECT
Module: ACME_POS
We are going to define source metadata for the following table columns:
ITEMS_KEY number(22)
ITEM_NAME varchar2(50)
ITEM_CATEGORY varchar2(50)
ITEM_VENDOR number(22)
ITEM_SKU varchar2(50)
ITEM_BRAND varchar2(50)
ITEM_LIST_PRICE number(6,2)
ITEM_DEPT varchar2(50)
Before we can continue building our data warehouse, we must have all our source
table metadata created. It is not a particularly difficult task. However, attention to
detail is important to make sure what we manually define in the Warehouse Builder
actually matches the source tables we're defining. The tool the Warehouse Builder
provides for creating source metadata is the Data Object Editor, which is the tool
we can use to create any object in the Warehouse Builder that holds data such as
database tables. The steps to manually define the source metadata using Data
Object Editor are:
1. To start building our source tables for the POS transactional SQL Server
database, let's launch the OWB Design Center if it's not already running.
Expand the ACME_DW_PROJECT node and take a look at where we're
going to create these new tables. We have imported the source metadata
into the SQL Server ODBC module so that is where we will create the
tables. Navigate to the Databases | Non-Oracle | ODBC node, and then
select the ACME_POS module under this node. We will create our source
tables under the Tables node, so let's right-click on this node and select
New, from the pop-up menu. As no wizard is available for creating a table,
we are using the Data Object Editor to do this.
2. Upon selecting New, we are presented with the Data Object Editor
screen. It's a clean slate that we get to fill in, and will look similar to the
following screenshot:
There are a number of facets to this interface but we will cover just what
we need now in order to create our source tables. Later on, we'll get a
chance to explore some of the other aspects of this interface for viewing
and editing a data object. The fields to be edited in this Data Object Editor
are as follows:
The first tab it presents to us is the Name tab where we'll give a
name to the first table we're creating. We should not make up
table names here, but use the actual name of the table in the
SQL Server database. Let's start with the Items table. We'll just
enter its name into the Name field replacing the default,
TABLE_1, which it suggested for us. The Warehouse Builder
will automatically capitalize everything we enter for consistency,
so there is no need to worry about whether we type it in
uppercase or lowercase.
Let's click on the Columns tab next and enter the information that
describes the columns of the Items table. How do we know what
to fill in here? Well, that is easy because the names must all
match the existing names as found in the source POS
transactional SQL Server database. For sizes and types, we just
have to match the SQL Server types that each field is defined
as, making allowances for slight differences between SQL
Server data types and the corresponding Oracle data types.
The following are the columns, types, and sizes we'll use for the Items
table, based on what we found in the Items source table in the POS
transactional database:
ITEMS_KEY number(22)
ITEM_NAME varchar2(50)
ITEM_CATEGORY varchar2(50)
ITEM_VENDOR number(22)
ITEM_SKU varchar2(50)
ITEM_BRAND varchar2(50)
ITEM_LIST_PRICE number(6,2)
ITEM_DEPT varchar2(50)
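Because the manually entered definitions must match the source table exactly, it can help to hold the column list in a structured form and check it before typing it into the Columns tab. The following is a minimal Python sketch of that idea and is not part of OWB; the `ColumnSpec` type and the `parse_column` helper are hypothetical names introduced only for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Column definitions copied from the Items table listing above.
ITEMS_COLUMNS = """\
ITEMS_KEY number(22)
ITEM_NAME varchar2(50)
ITEM_CATEGORY varchar2(50)
ITEM_VENDOR number(22)
ITEM_SKU varchar2(50)
ITEM_BRAND varchar2(50)
ITEM_LIST_PRICE number(6,2)
ITEM_DEPT varchar2(50)
"""


@dataclass(frozen=True)
class ColumnSpec:
    """Hypothetical container for one column definition."""
    name: str
    data_type: str
    length: int                  # length for varchar2, precision for number
    scale: Optional[int] = None  # only present for number(p,s)


# Matches lines such as "ITEM_LIST_PRICE number(6,2)".
_PATTERN = re.compile(r"^(\w+)\s+(\w+)\((\d+)(?:,(\d+))?\)$")


def parse_column(line: str) -> ColumnSpec:
    """Parse one 'NAME type(size[,scale])' line into a ColumnSpec."""
    match = _PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized column definition: {line!r}")
    name, data_type, length, scale = match.groups()
    # Names are upper-cased, mirroring the automatic capitalization described earlier.
    return ColumnSpec(name.upper(), data_type.upper(), int(length),
                      int(scale) if scale is not None else None)


columns = [parse_column(line) for line in ITEMS_COLUMNS.splitlines() if line.strip()]
for col in columns:
    print(col)
```

A malformed line raises `ValueError`, which gives an early warning before a wrong size or type is entered by hand.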
c. Write a procedure to create a new project in OWB. What is the difference
between a module and a project?
Answer:
Steps for creating a new project:
Step 1: Launch the Design Center.
Step 2: Right-click on the project name in the Project Explorer and select
Rename from the resulting pop-up menu. Alternatively, we can select the project
name, then click on the Edit menu entry, and then on Rename.
Module:
Modules are grouping mechanisms in the Projects Navigator that correspond to
locations in the Locations Navigator. A single location can correspond to one or
more modules. However, a given module can correspond to only one metadata
location and data location at a time.
The association of a module to a location enables you to perform certain actions
more easily in Oracle Warehouse Builder. For example, group actions such as
creating snapshots, copying, validating, generating, deploying, and so on, can be
performed on all the objects in a module by choosing an action on the context
menu when the module is selected.
All modules, including their source and target objects, must have locations
associated with them before they can be deployed. You cannot view source data
or deploy target objects unless there is a location defined for the associated
module.
A project contains one or more modules.
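The relationship described above (a project contains modules, a module points to one location at a time, and one location may serve several modules) can be sketched as a small data structure. This is only an illustrative Python model, not OWB's API: the class names are assumptions, a single `location` field stands in for the module's metadata and data locations, and `ACME_FILES_ARCHIVE` is a hypothetical module name added to show two modules sharing one location.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Location:
    name: str


@dataclass
class Module:
    name: str
    # Simplification: one field stands in for the module's metadata/data location.
    location: Optional[Location] = None

    def is_deployable(self) -> bool:
        # A module without a location cannot be deployed.
        return self.location is not None


@dataclass
class Project:
    name: str
    modules: List[Module] = field(default_factory=list)

    def deployable_modules(self) -> List[str]:
        return [m.name for m in self.modules if m.is_deployable()]


files_location = Location("ACME_FILES_LOCATION")
project = Project(
    "ACME_DW_PROJECT",
    modules=[
        Module("ACME_POS"),                            # no location assigned yet
        Module("ACME_FILES", files_location),
        Module("ACME_FILES_ARCHIVE", files_location),  # hypothetical: shares the same location
    ],
)
print(project.deployable_modules())
```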
Design Center
The Design Center provides the graphical interface for defining sources and
designing targets and ETL processes.
Control Center Service
The Control Center Service is the component that enables you to register
locations. It also enables deployment and execution of the ETL logic you design
in the Design Center such as mappings and process flows.
Target Schema
The target schema is the target to which you load your data and the data objects
that you designed in the Design Center such as cubes, dimensions, views, and
mappings. The target schema contains Warehouse Builder components such as
synonyms that enable the ETL mappings to access the audit/service packages in
the repository. The repository stores all information pertaining to the target
schema such as execution and deployment information.
Warehouse Builder Repository
The repository schema stores metadata definitions for all the sources, targets,
and ETL processes that constitute your design metadata. In addition to
containing design metadata, a repository can also contain the runtime data
generated by the Control Center Manager and Control Center Service.
Workspaces
In defining the repository, you create one or more workspaces, with each
workspace corresponding to a set of users working on related projects.
Repository Browser
The Repository Browser is a web browser interface for reporting on the
repository.
III. Answer any two of the following:
a. Short note on cube and dimensions. (2 marks diagram+ 3 marks
description)
Answer:
Here, sales indicate data about products sold and to be sold in a company.
The dimensions become the business characteristics about the sales, for
example:
A time dimension: users can look back in time and check various time periods.
A store dimension: information can be retrieved by store and location.
A product dimension: various products for sale can be broken out.
Think of the dimensions as the edges of a cube, and the intersection of the
dimensions as the measure we are interested in for that particular combination of
time, store, and product. A picture is worth a thousand words, so let's look at what
we're talking about in the following image:
Notice what this cube looks like. How about a Rubik's Cube?
Think of the width of the cube, or a row going across, as the product dimension.
Every piece of information or measure in the same row refers to the same product,
so there are as many rows in the cube as there are products. Think of the height
of the cube, or a column going up and down, as the store dimension. Every piece
of information in a column represents one single store, so there are as many
columns as there are stores. Finally, think of the depth of the cube as the time
dimension, so any piece of information in the rows and columns at the same depth
represents the same point in time. The intersection of each of these three
dimensions locates a single individual cube in the big cube, and that represents
the measure amount we're interested in. In this case, it's dollar sales for a single
product in a single store at a single point in time.
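The cube described above can be pictured in code as a lookup keyed by one member from each dimension. The sketch below is a minimal Python illustration under that assumption; the product names, store names, periods, and amounts are invented sample data, and `measure_at` and `product_row` are hypothetical helper names.

```python
from typing import Dict, Tuple

# Key: (product, store, time period). Value: dollar sales (the measure).
CellKey = Tuple[str, str, str]

# Invented sample data for illustration only.
sales_cube: Dict[CellKey, float] = {
    ("Widget", "Store 1", "2024-01"): 120.0,
    ("Widget", "Store 2", "2024-01"): 80.0,
    ("Gadget", "Store 1", "2024-01"): 45.5,
    ("Widget", "Store 1", "2024-02"): 150.0,
}


def measure_at(cube: Dict[CellKey, float], product: str, store: str, time: str) -> float:
    """Return the measure at the intersection of the three dimensions (0.0 if empty)."""
    return cube.get((product, store, time), 0.0)


def product_row(cube: Dict[CellKey, float], product: str) -> Dict[CellKey, float]:
    """Return every cell that belongs to one product, across all stores and times."""
    return {key: value for key, value in cube.items() if key[0] == product}


print(measure_at(sales_cube, "Widget", "Store 1", "2024-01"))
print(len(product_row(sales_cube, "Widget")))
```

Each key plays the role of one small cube inside the big cube: fixing all three dimensions locates exactly one measure value.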
b. Explain the steps for importing the metadata for a flat file.
Answer:
Use the Import Metadata Wizard to import metadata definitions into modules.
The steps involved in creating the module and importing the metadata for a flat
file are:
1. The first task is to create a new module to contain our file definition.
If we look in the Project Explorer under our project, we'll see that there is
a Files node right below the Databases node. Right-click on the Files node
and select New from the pop-up menu to launch the wizard.
2. When we click on the Next button on the Welcome screen, we notice a
slight difference already. The Step 1 of the Create Module wizard only asks
for a name and description. The other options we had for databases above
are not applicable for file modules. We'll enter a name of ACME_FILES and
click on the Next button to move to Step 2.
3. We need to edit the connection in Step 2, so we'll click on the Edit button.
As we see in the following image, it only asks us for a name, a description, and
the path to the folder where the files are.
4. The Name field is prefilled with the suggested name based on the module
name. As it did for the database module location names, it adds that number
1 to the end. So, we'll just edit it to remove the number and leave it set to
ACME_FILES_LOCATION.
5. Notice the Type drop-down menu. It has two entries: General and FTP. If
we select FTP (File Transfer Protocol, used for getting a file over the
network), it will ask us for slightly more information.
6. The simplest option is to store the file on the same computer on which we
are running the database. This way, all we have to do is enter the path to
the folder that contains the file. We should have a standard path we can use
for any files we might need to import in the future. So we create a folder
called GettingStartedWithOWB_files, which we'll put in the D: drive. Choose
any available drive with enough space and just substitute the appropriate
drive letter. We'll click on the Browse button on the Edit File System
Location dialog box, choose the file path, and click on the OK button.
7. We'll then check the box for Import after finish and click on the Finish
button.
That's it for the Create Module Wizard for files.
d. List and explain the functionalities that can be performed by OWB in order
to create data warehouse.
Answer:
The Oracle Warehouse Builder is a tool provided by Oracle, which can be used at
every stage of the implementation of a data warehouse, from initial design and
creation of the table structure to the ETL process and data-quality auditing. So,
the answer to the question of where it fits in is: everywhere.
We can choose to use any or all of the features as needed for our project, so we
do not need to use every feature. Simple data warehouse implementations will
use a subset of the features and as the data warehouse grows in complexity, the
tool provides more features that can be implemented. It is flexible enough to
provide us a number of options for implementing our data warehouse.
List of Functions:
i. Data modelling
ii. Extraction, Transformation, and Load (ETL)
iii. Data profiling and data quality
iv. Metadata management
v. Business-level integration of ERP application data
vi. Integration with Oracle business intelligence tools for reporting purposes
vii. Advanced data lineage and impact analysis
Oracle Warehouse Builder is also an extensible data integration and data quality
solutions platform. Oracle Warehouse Builder can be extended to manage
metadata specific to any application, and can integrate with new data source and
target types, and implement support for new data access mechanisms and
platforms, enforce your organization's best practices, and foster the reuse of
components across solutions.
IV. Answer any two of the following:
a. What is a staging area? What are the advantages and disadvantages of
staging? (2+3 marks)
Answer:
A staging area is the place where source data is stored temporarily in a table in
our target database. Here we can perform any transformations that are required
before loading the source data into the final target table.
Advantages:
1) It provides you a single platform even though you have heterogeneous source
systems.
2) This is the layer where the cleansed and transformed data is temporarily
stored. Once the data is ready to be loaded to the warehouse, we load it in
the staging database. The advantage of using the staging database is that we
add a point in the ETL flow from which we can restart the load. The other
advantages of using a staging database are that we can directly utilize the bulk
load utilities provided by the databases and ETL tools while loading the data
in the warehouse/mart, and that it provides a point in the data flow where we can
audit the data.
3) In the absence of a staging area, the data load will have to go from the OLTP
system to the OLAP system directly, which will severely hamper the
performance of the OLTP system. This is the primary reason for the existence
of a staging area. Pushing data into staging takes less time because no
business rules or transformations are applied to it.
Disadvantages:
1. It takes more space in the database, which may not be cost effective for the client.
2. Disk space is consumed because we have to dump the data into a local
area.
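The restart-point advantage can be illustrated with a small Python sketch. It is a toy model built on assumptions, not a real ETL tool: `SourceSystem` is a hypothetical stand-in for the OLTP system that counts how often it is read, and the failed first load is provoked deliberately with a faulty price rule.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


class SourceSystem:
    """Hypothetical stand-in for the OLTP system; counts how often it is queried."""

    def __init__(self, rows: List[Row]) -> None:
        self._rows = rows
        self.reads = 0

    def read_all(self) -> List[Row]:
        self.reads += 1
        return [dict(row) for row in self._rows]


def load(staging: List[Row], target: List[dict], parse_price: Callable[[str], float]) -> None:
    """Transform every staged row first, then append to the target in one step."""
    transformed = [
        {"item_name": row["item_name"].strip().upper(),
         "list_price": parse_price(row["list_price"])}
        for row in staging
    ]
    target.extend(transformed)  # reached only if every row transformed cleanly


source = SourceSystem([
    {"item_name": " widget ", "list_price": "9.99"},
    {"item_name": "gadget", "list_price": "24.50"},
])

staging = source.read_all()  # single pass over the source, no business rules applied
target: List[dict] = []

try:
    load(staging, target, int)    # faulty rule: int("9.99") raises ValueError
except ValueError:
    load(staging, target, float)  # restart from staging; the source is not read again

print(source.reads, target)
```

The source is read exactly once even though the load runs twice, which is the point made in advantages 2 and 3 above. The staged copy is also where the extra disk space goes.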
b. List and explain the use of various windows available in mapping editor. (1
mark each)
Answer:
(i) Mapping: The Mapping window is the main working area on the right, where we
design the mapping. This window is also referred to as the canvas.
(ii) Explorer: This window is similar to the Project Explorer in the Design Center. It has two
tabs: the Available Objects tab and the Selected Objects tab.
(iii) Mapping Properties: The Mapping Properties window displays the various
properties that can be set for objects in our mapping. When an object is selected on
the canvas, its properties are displayed in this window.
(iv) Palette: The Palette contains each of the objects that can be used in our
mapping. We can click on the object we want to place in the mapping and drag it
onto the canvas.
(v) Bird's Eye View: This window displays a miniature version of the entire canvas and
allows us to scroll around the canvas without using the scroll bars.
c. Write the steps for validating and generating in Data Object Editor
Answer:
(I) Validating in the Data Object Editor:
When we validate from the Data Object Editor, it is on an object-by-object basis for objects appearing in the editor canvas. But when we
validate a mapping in the Mapping editor, the mapping as a whole is
validated all at once. Let's close the Data Object Editor and move on to
discuss validating in the Mapping Editor.
But as with the generation from the Design Center, we'll have the
additional information available. The procedure for generating from the
editors is the same as for validation, but the contents of the results
window will be slightly different depending on whether we're in the Data
Object Editor or the Mapping Editor. Let's discuss each individually as
we previously did.
-------------------------------------------------------------------------------
(II) Generating in the Data Object Editor:
Go back to the Data Object Editor and open our POS_TRANS_STAGE table in the
editor by double-clicking on it in the Design Center.
To review the options we have for generating, there is:
(i) the Generate... menu entry under the Object main menu, or
(ii) the Generate entry on the pop-up menu when we right-click on an
object, or
(iii) the Generate icon on the general toolbar, right next to the Validate icon,
as shown in the following image:
Result:
Transform is the process of converting the extracted data from its previous form
into the form it needs to be in so that it can be placed into another database.
Transformation occurs by using rules or lookup tables or by combining the data
with other data. After data is extracted, it has to be physically transported to the
target system or to an intermediate system for further processing.
Load is the process of writing the data into the target database.
ETL is used to migrate data from one database to another, to form data marts
and data warehouses and also to convert databases from one format or type to
another.
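The three ETL stages described above can be sketched end to end in a few lines of Python. This is an illustrative sketch only, using the standard `csv` and `sqlite3` modules with an in-memory database; the sample CSV data, the department lookup table, and the `items` target table are assumptions made up for the example.

```python
import csv
import io
import sqlite3
from typing import Dict, List, Tuple

# Assumed sample source data and lookup table, for illustration only.
SOURCE_CSV = "item_name,dept_code\nwidget,10\ngadget,20\n"
DEPT_LOOKUP = {"10": "HARDWARE", "20": "ELECTRONICS"}


def extract(text: str) -> List[Dict[str, str]]:
    """Extract: read the rows out of the source in their original form."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows: List[Dict[str, str]], lookup: Dict[str, str]) -> List[Tuple[str, str]]:
    """Transform: apply a rule (upper-case names) and a lookup table (code -> name)."""
    return [(row["item_name"].upper(), lookup[row["dept_code"]]) for row in rows]


def load(conn: sqlite3.Connection, rows: List[Tuple[str, str]]) -> None:
    """Load: write the transformed rows into the target database."""
    conn.execute("CREATE TABLE IF NOT EXISTS items (item_name TEXT, item_dept TEXT)")
    conn.executemany("INSERT INTO items VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV), DEPT_LOOKUP))
result = conn.execute(
    "SELECT item_name, item_dept FROM items ORDER BY item_name"
).fetchall()
print(result)
```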
Answer:
MOLAP stands for Multidimensional Online Analytical Processing. MOLAP is the
most used storage type. It is designed to offer maximum query performance to the
users. The data and aggregations are stored in a multidimensional format,
compressed and optimized for performance. When a cube with MOLAP storage is
processed, the data is pulled from the relational database, the aggregations are
performed, and the data is stored in the AS database in the form of binary files.
The data inside the cube is refreshed only when the cube is processed, so latency
is high.
Advantages:
1) Since the data is stored on the OLAP server in an optimized format, queries (even
complex calculations) are faster than with ROLAP.
2) The data is compressed, so it takes up less space.
3) Because the data is stored on the OLAP server, you don't need to keep the
connection to the relational database.
4) Cube browsing is fastest using MOLAP.
Disadvantages:
1) It doesn't support real-time analysis, i.e. newly inserted data will not be available for
analysis until the cube is processed.
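The trade-off between fast queries and high latency can be seen in a toy model. The Python sketch below is not how any real OLAP server stores data; `MolapCube` is a hypothetical class that precomputes aggregates when `process` is called and answers queries only from that stored result, and the fact rows are invented.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Fact = Tuple[str, str, float]  # (product, store, sales amount)


class MolapCube:
    """Toy model: aggregates are computed at processing time and then only read."""

    def __init__(self) -> None:
        self._aggregates: Dict[Tuple[Optional[str], Optional[str]], float] = {}

    def process(self, fact_rows: List[Fact]) -> None:
        """Pull the facts and precompute every aggregation level (None means 'all')."""
        totals: Dict[Tuple[Optional[str], Optional[str]], float] = defaultdict(float)
        for product, store, amount in fact_rows:
            totals[(product, store)] += amount
            totals[(product, None)] += amount
            totals[(None, store)] += amount
            totals[(None, None)] += amount
        self._aggregates = dict(totals)

    def query(self, product: Optional[str] = None, store: Optional[str] = None) -> float:
        """Answer from stored aggregates; the relational source is not touched."""
        return self._aggregates.get((product, store), 0.0)


# Invented sample facts standing in for the relational database.
facts: List[Fact] = [("Widget", "Store 1", 100.0), ("Widget", "Store 2", 50.0)]

cube = MolapCube()
cube.process(facts)

facts.append(("Widget", "Store 1", 25.0))        # new row arrives in the source
before_reprocess = cube.query(product="Widget")  # still the old total

cube.process(facts)                              # reprocess the cube
after_reprocess = cube.query(product="Widget")

print(before_reprocess, after_reprocess)
```

The query is a single dictionary lookup, which mirrors the speed advantage, while the stale value before reprocessing mirrors the stated disadvantage.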
b. Short note on
(i) Metadata Snapshots
(ii) The Import Metadata Wizard
Answer:
(i) Metadata Snapshots
A snapshot captures all the metadata information about the selected objects and
their relationships at a given point in time. While an object can only have one
current definition in a workspace, it can have multiple snapshots that describe it at
various points in time. Snapshots are stored in the Oracle Database, in contrast to
Metadata Loader exports, which are stored as separate disk files. You can,
however, export snapshots to disk files. Snapshots are also used to support the
recycle bin, providing the information needed to restore a deleted metadata object.
When you take a snapshot, you capture the metadata of all or specific objects in
your workspace at a given point in time. You can use a snapshot to detect and
report changes in your metadata. You can create snapshots of any objects that
you can access from the Projects Navigator.
A snapshot of a collection is not a snapshot of just the shortcuts in the collection
but a snapshot of the actual objects.
(ii) The Import Metadata Wizard
The Import Metadata Wizard automates importing metadata from a database into
a module in Oracle Warehouse Builder. You can import metadata from Oracle
Database and non-Oracle databases. Each module type that stores source or
target data structures has an associated Import Wizard, which automates the
process of importing the metadata to describe the data structures. Importing
metadata saves time and avoids keying errors, for example, by bringing metadata
definitions of existing database objects into Oracle Warehouse Builder.
The Welcome page of the Import Metadata Wizard lists the steps for importing
metadata from source applications into the appropriate module. The Import
Metadata Wizard for Oracle Database supports importing of tables, views,
materialized views, dimensions, cubes, external tables, sequences, user-defined
types, and PL/SQL transformations directly or through object lookups using
synonyms.
When you import an external table, Oracle Warehouse Builder also imports the
associated location and directory information for any associated flat files.
c. Explain multidimensional database architecture with suitable diagram.
Answer:
One of the design objectives of the multidimensional server is to provide fast, linear
access to data regardless of the way the data is being requested. The simplest
request is a two-dimensional slice of data from an n-dimensional hypercube. The
objective is to retrieve the data equally fast, regardless of the requested
dimensions. The requested data is a compound slice in which two or more
dimensions are nested as rows or columns.
The second role of the server is to provide calculated results. By far the most
common calculation is aggregation; but more complex calculations, such as ratios
and allocations, are also required. In fact, the design goal should be to offer a
complete algebraic ability where any cell in the hypercube can be derived from any
of the others, using all standard business and statistical functions, including
conditional logic.
Measures: Measures are the numeric values in an OLAP database cube that are
available for analysis. The measures could be margin, cost of goods sold, unit
sales, budget amount, and so on.
Multidimensional: Multidimensional databases create cubes of aggregated data
that anticipate how users think about business models. These cubes also deliver
this information efficiently and quickly. Cubes consist of dimensions and measures.
Dimensions are categories of information. For example, locations, stores and
products are typical dimensions. Measures are the content values in a database
that are available for analysis.
Members: In an OLAP database cube, members are the content values for a
dimension. In the location dimension, they could be Mumbai, Thane, Mulund and
so on. These are all values for location.
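The three terms can be tied together with a short Python sketch. It assumes a flat list of fact rows with invented values; `members` and `total_by` are hypothetical helper names. The location values come from the example above, while the product names and unit sales figures are made up.

```python
from collections import defaultdict
from typing import Dict, List

# Invented fact rows: two dimensions (location, product) and one measure (unit_sales).
facts: List[dict] = [
    {"location": "Mumbai", "product": "Widget", "unit_sales": 10},
    {"location": "Thane", "product": "Widget", "unit_sales": 4},
    {"location": "Mumbai", "product": "Gadget", "unit_sales": 7},
    {"location": "Mulund", "product": "Gadget", "unit_sales": 3},
]


def members(rows: List[dict], dimension: str) -> List[str]:
    """Members are the distinct content values found for a dimension."""
    return sorted({row[dimension] for row in rows})


def total_by(rows: List[dict], dimension: str, measure: str) -> Dict[str, int]:
    """Aggregate a measure for each member of the chosen dimension."""
    totals: Dict[str, int] = defaultdict(int)
    for row in rows:
        totals[row[dimension]] += row[measure]
    return dict(totals)


print(members(facts, "location"))
print(total_by(facts, "location", "unit_sales"))
```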