
Master of Business Administration - MBA Semester III, MI0036 Business Intelligence Tools - 4 Credits, Assignment Set 1 (60 Marks)

Note: Each question carries 10 marks. Answer all the questions.

Q.1 Define the term business intelligence tools. Briefly explain how data from one end gets transformed into information at the other end. [10 Marks]

Business Intelligence (BI) is a generic term for leveraging an organization's internal and external data and information to make the best possible business decisions. The field of business intelligence is very diverse and comprises the tools and technologies used to access and analyze various types of business information. These tools gather and store data and allow users to view and analyze the information from a wide variety of dimensions, thereby helping decision-makers make better business decisions. BI systems and tools therefore play a vital role in helping organizations make improved decisions in today's cut-throat competitive scenario.

In simple terms, business intelligence is an environment in which business users receive reliable, consistent, meaningful and timely information. This information enables business users to conduct analyses that yield an overall understanding of how the business has been performing, how it is performing now and how it is likely to perform in the near future. BI tools also monitor the financial and operational health of the organization through various types of reports, alerts, alarms, key performance indicators and dashboards.

Business intelligence tools are a type of application software designed to help in making better business decisions. These tools aid the analysis and presentation of data in a more meaningful way and so play a key role in an organization's strategic planning process. They deliver business intelligence in areas such as market research and segmentation, customer profiling, customer support, profitability, and inventory and distribution analysis, to name a few. Various types of BI systems, such as Decision Support Systems, Executive Information Systems (EIS), multidimensional analysis software or OLAP (On-Line Analytical Processing) tools, and data mining tools, are discussed further on. Whatever the type, the business intelligence capability of the system is to let its users slice and dice the information from their organization's numerous databases without having to wait for their IT departments to develop complex queries and elicit answers.

Although it is possible to build BI systems without the benefit of a data warehouse, in practice most BI systems form the user-facing end of a data warehouse. In fact, one can hardly think of building a data warehouse without BI systems on top of it; that is why the terms data warehousing and business intelligence are sometimes used interchangeably.

The figure referenced in the original text (not reproduced here) depicts how data at one end gets transformed into information at the other end for business use.
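In place of that figure, the same extract-transform-load-and-summarise flow can be sketched in a few lines of Python. The record layout, field names and cleansing rule below are assumptions made purely for illustration; they are not part of any specific BI product.

```python
# A minimal sketch of the data-to-information flow in a BI environment.
# The source records, field names and rules below are illustrative assumptions.

raw_sales = [  # "extract": rows pulled from an operational system
    {"region": "North", "product": "A", "units": 120, "unit_price": 9.5},
    {"region": "North", "product": "B", "units": 80,  "unit_price": 12.0},
    {"region": "South", "product": "A", "units": None, "unit_price": 9.5},  # dirty record
]

def transform(records):
    """Cleanse and enrich raw records (transform step)."""
    clean = []
    for r in records:
        units = r["units"] or 0                      # establish a default for missing data
        clean.append({**r, "units": units, "revenue": units * r["unit_price"]})
    return clean

def load_and_summarise(records):
    """Load into a consolidated structure and aggregate it into information."""
    revenue_by_region = {}
    for r in records:
        revenue_by_region[r["region"]] = revenue_by_region.get(r["region"], 0) + r["revenue"]
    return revenue_by_region

information = load_and_summarise(transform(raw_sales))
print(information)   # e.g. {'North': 2100.0, 'South': 0.0}
```

Here the raw operational rows are the "data" end, and the aggregated revenue per region is the "information" end that a decision-maker can act on.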

Roles in a Business Intelligence project: A typical BI project consists of the following roles, with the responsibilities of each detailed below:

- Project Manager: monitors progress on a continuous basis and is responsible for the success of the project.
- Technical Architect: develops and implements the overall technical architecture of the BI system, from the back-end hardware/software to the client desktop configurations.
- Database Administrator (DBA): keeps the database available so that applications run smoothly, and is also involved in planning and executing a backup/recovery plan as well as performance tuning.
- ETL Developer: plans, develops and deploys the extraction, transformation and loading routines that feed the data warehouse from the legacy systems.
- Front-End Developer: develops the front end, whether it is client-server or web-based.
- OLAP Developer: develops the OLAP cubes.
- Data Modeler: takes the data structures that exist in the enterprise and models them into a schema suitable for OLAP analysis.
- QA Group: ensures the correctness of the data in the data warehouse.
- Trainer: works with the end users to familiarise them with how the front end is set up, so that they can get the most benefit out of the system.

Q.2 What do you mean by a data warehouse? What are the major concepts and terminology used in the study of data warehousing? [10 Marks]

Q.3 What are the data modeling techniques used in a data warehousing environment? [10 Marks]
Q.4 Discuss the categories into which data is divided before structuring it into a data warehouse. [10 Marks]
Q.5 Discuss the purpose of an executive information system in an organization. [10 Marks]
Q.6 Discuss the challenges involved in the data integration and coordination process. [10 Marks]

Master of Business Administration - MBA Semester III, MI0036 Business Intelligence Tools - 4 Credits, Assignment Set 2 (60 Marks)

Note: Each question carries 10 marks. Answer all the questions.

Q.1 Explain the business intelligence life cycle in detail. [10 Marks]

The Business Intelligence (BI) lifecycle refers to the computer-based techniques used in gathering and evaluating business information, such as sales revenue by product or division and the associated costs and profits. The BI lifecycle ultimately aims to enable superior business decision-making. The lifecycle model mainly highlights the iterative approach that is necessary to extract the greatest return from an investment in business intelligence; the approach is iterative because the BI solution needs to evolve as the business develops. A quick example is a customer who developed a sound metric for the standard selling cost per unit. This made sense and enabled them to discover price demands in their market and take suitable action. However, when they acquired a company whose standard selling cost was 100 times greater, the metric became distorted and required a rethink.

The BI lifecycle model begins with a Design phase, in which suitable key performance indicators are chosen, and an Implementation phase, in which the structure is built with the proper methodologies and tools. The Utilise phase then involves its own performance cycle: the Plan step decides what values the key performance indicators should take, the Monitor step measures what they actually are, and the Analyse step identifies the variations. Monitoring uses dashboards and scorecards to present the information quickly and clearly, while analysis takes advantage of BI tools such as Excel, Microsoft ProClarity, QlikView or other specialist software to investigate the information and genuinely understand the trends in the underlying data. A significant step is then to take stock of the process: to find out how it is performing and whether it requires any alterations. This Refine phase is critical because it takes BI to the next significant stage, and it is often missing from a BI programme in the excitement of a successful implementation.

The business intelligence lifecycle works alongside the Data Maturity Lifecycle to offer a complete maturation model for information intelligence. The model also gives a roadmap of rules that help in developing a business intelligence programme. A BI programme is a long-term scheme, not a short-term one. As the business keeps changing, and the knowledge that supports analytics improves, this lifecycle can repeat and form a completely new round of performance management.

Note: Key performance indicators (KPIs) are very significant factors in business intelligence. A KPI is an aspect that gives information about the present status of the business and informs the future action plan to develop the business.
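As a rough, hypothetical illustration of the Plan/Monitor/Analyse steps described above, the following Python sketch compares a planned KPI value against a monitored actual value and flags the variation; the KPI name, target and tolerance are assumed figures, not taken from the text.

```python
# Illustrative sketch of the Plan / Monitor / Analyse steps of the Utilise phase.
# KPI name, target and tolerance are assumed values, not from any real system.

planned_kpi = {"name": "average_selling_price_per_unit", "target": 100.0, "tolerance": 0.10}

def monitor(revenue, units_sold):
    """Monitor: compute the actual KPI value from operational figures."""
    return revenue / units_sold

def analyse(actual, plan):
    """Analyse: flag the variation between the actual and the planned KPI value."""
    variation = (actual - plan["target"]) / plan["target"]
    status = "within tolerance" if abs(variation) <= plan["tolerance"] else "needs refinement"
    return variation, status

actual = monitor(revenue=118_000.0, units_sold=1_000)
variation, status = analyse(actual, planned_kpi)
print(f"{planned_kpi['name']}: actual={actual:.2f}, variation={variation:+.1%} ({status})")
```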

Q.2 Discuss the various components of a data warehouse. [10 Marks]

The data warehouse architecture is based on a relational database management system server that functions as the central repository for informational data. Operational data and processing are completely separated from data warehouse processing. This central information repository is surrounded by a number of key components designed to make the entire environment functional, manageable and accessible, both by the operational systems that source data into the warehouse and by end-user query and analysis tools.

Data Warehouse Database
The central data warehouse database is the cornerstone of the data warehousing environment. This database is almost always implemented on relational database management system (RDBMS) technology. Common approaches include:
- Parallel relational database designs for scalability, including shared-memory, shared-disk or shared-nothing models implemented on various multiprocessor configurations (symmetric multiprocessors or SMP, massively parallel processors or MPP, and/or clusters of uni- or multiprocessors).
- Innovative approaches that speed up a traditional RDBMS by using new index structures to bypass relational table scans.
- Multidimensional databases (MDDBs) based on proprietary database technology; conversely, a dimensional data model can also be implemented using a familiar RDBMS. Multidimensional databases are designed to overcome the limitations placed on the warehouse by the nature of the relational data model.

Sourcing, Acquisition, Cleanup and Transformation Tools
A significant portion of the implementation effort is spent extracting data from operational systems and putting it in a format suitable for the informational applications that run off the data warehouse. The data sourcing, cleanup, transformation and migration tools perform all of the conversions, summarizations, key changes, structural changes and condensations needed to transform disparate data into information that can be used by the decision support tools. These tools also maintain the meta data. Their functionality includes (see the cleanup sketch after the meta data description below):
- Removing unwanted data from operational databases
- Converting to common data names and definitions
- Establishing defaults for missing data
- Accommodating source data definition changes

The data sourcing, cleanup, extraction, transformation and migration tools have to deal with some significant issues, including:
- Database heterogeneity: DBMSs differ greatly in data models, data access languages, data navigation, operations, concurrency, integrity, recovery and so on.
- Data heterogeneity: differences in the way data is defined and used in different models, such as homonyms, synonyms, unit incompatibility (U.S. vs. metric), different attributes for the same entity and different ways of modeling the same fact.
These tools can save a considerable amount of time and effort; however, significant shortcomings do exist.

Meta data
Meta data is data about data that describes the data warehouse. It is used for building, maintaining, managing and using the data warehouse. Meta data can be classified into:
- Technical meta data, which contains information about warehouse data for use by warehouse designers and administrators when carrying out warehouse development and management tasks.
- Business meta data, which contains information that gives users an easy-to-understand perspective of the information stored in the data warehouse.
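The cleanup and transformation functionality listed above can be sketched, under assumed column names and rules, with a few pandas operations; this is an illustrative outline, not how any particular ETL tool works.

```python
# Minimal sketch of typical cleanup / transformation steps described above.
# Column names, mappings and defaults are assumptions for illustration only.
import pandas as pd

source = pd.DataFrame({
    "cust_no":   [101, 102, 103, 103],
    "CUST_NAME": ["Acme", "Beta Ltd", None, None],
    "sales_usd": [1500.0, None, 320.0, 320.0],
    "test_flag": ["N", "N", "Y", "N"],          # operational-only column
})

clean = (
    source[source["test_flag"] == "N"]          # remove unwanted (test) records
    .drop(columns="test_flag")
    .rename(columns={"cust_no": "customer_id",  # convert to common data names
                     "CUST_NAME": "customer_name",
                     "sales_usd": "sales_amount"})
    .fillna({"customer_name": "UNKNOWN",        # establish defaults for missing data
             "sales_amount": 0.0})
    .drop_duplicates(subset="customer_id")      # condense duplicate source rows
)
print(clean)
```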

Access Tools
The principal purpose of data warehousing is to provide information to business users for strategic decision-making. These users interact with the data warehouse using front-end tools. Many of these tools require an information specialist, although many end users develop expertise in the tools themselves. The tools fall into four main categories: query and reporting tools, application development tools, online analytical processing tools, and data mining tools.

Query and reporting tools can be divided into two groups: reporting tools and managed query tools. Reporting tools can be further divided into production reporting tools and report writers. Production reporting tools let companies generate regular operational reports or support high-volume batch jobs such as calculating and printing paychecks. Report writers, on the other hand, are inexpensive desktop tools designed for end users.

A critical success factor for any business today is the ability to use information effectively. Data mining is the process of discovering meaningful new correlations, patterns and trends by digging into large amounts of data stored in the warehouse, using artificial intelligence, statistical and mathematical techniques.

Data Marts
As data warehouses come to contain larger amounts of data, organizations often create data marts that are precise and specific to a department or product line. A data mart is thus a physical and logical subset of an enterprise data warehouse, and is also termed a department-specific data warehouse. Generally, data marts are organized around a single business process. There are two types of data marts: independent and dependent. In an independent data mart the data is fed directly from the legacy systems, whereas in a dependent data mart the data is fed from the enterprise data warehouse. In the long run, dependent data marts are much more stable architecturally than independent data marts.

Q.3 Discuss the data extraction process. What are the various methods used for data extraction? [10 Marks]

Data extraction is the act or process of extracting data out of data sources, which are usually unstructured or badly structured, for further data processing, data storage or data migration. This data can also be extracted from the web: internet pages in HTML, XML and similar formats can be considered unstructured data sources because of the wide variety of coding styles, including exceptions and violations of standard coding practices.

Logical Extraction Methods
There are two kinds of logical extraction methods:
- Full extraction
- Incremental extraction

Full Extraction
In full extraction the data is extracted in its entirety from the source system. Since this extraction reflects all the data currently available on the source system, there is no need to keep track of changes to the data source since the previous successful extraction.

The source data is provided as it is, and no additional logical information (for example, timestamps) is required on the source site.

Incremental Extraction
At a particular point in time, only the data that has been altered since a well-defined event back in history is extracted. This event might be the time of the last extraction, or a more complex business event such as the last booking day of a fiscal period. To recognise this delta change there must be a way to identify all the information that has changed since this particular time event. This information can either be provided by the source data itself, for example an application column that reflects the last-changed timestamp, or by a change table in which an additional mechanism keeps track of the changes apart from the originating transactions. In most cases, using the latter method means adding extraction logic to the source system. (A small illustrative sketch follows at the end of this answer.)

Physical Extraction Methods
The data can be extracted either online from the source system or from an offline structure. Such an offline structure may already exist or it may be created by an extraction routine. The methods of physical extraction are:
- Online extraction
- Offline extraction

Online Extraction
In online extraction the data is extracted directly from the source system itself. The extraction process connects directly to the source system to access the source tables themselves, or to an intermediate system that keeps the data in a preconfigured manner, for example snapshot logs or change tables.

Offline Extraction
The data is not extracted directly from the source system but is staged explicitly outside the original source system. The data already has an existing structure, for example redo logs, archive logs or transportable tablespaces. The following structures can be considered:
- Flat files: the data is in a defined, generic format; additional information about the source object is required for further processing.
- Dump files: information about the containing objects is included.
- Redo and archive logs: the information is held in a special, additional dump file.
- Transportable tablespaces: a powerful way to extract and move large volumes of data between Oracle databases. Oracle Corporation suggests that transportable tablespaces be used whenever possible, because they can provide significant advantages in performance and manageability over the other extraction techniques.
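The following hedged sketch illustrates incremental (delta) extraction using an in-memory SQLite table as a stand-in for the source system and a last-extraction timestamp as the delta marker; the table and column names are assumptions for illustration only.

```python
# Illustrative sketch of incremental extraction using a "last extracted"
# timestamp watermark. The sqlite3 source, table and column names are assumed;
# a real source could be any operational database.
import sqlite3

def extract_increment(conn, last_extracted_at):
    """Pull only the rows changed since the previous successful extraction."""
    cursor = conn.execute(
        "SELECT order_id, amount, last_changed_at "
        "FROM orders WHERE last_changed_at > ?",
        (last_extracted_at,),
    )
    return cursor.fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, amount REAL, last_changed_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 250.0, "2024-01-01"), (2, 90.0, "2024-01-15"), (3, 410.0, "2024-02-02")],
)

# Only the rows changed after the stored watermark are extracted.
delta = extract_increment(conn, last_extracted_at="2024-01-10")
print(delta)   # [(2, 90.0, '2024-01-15'), (3, 410.0, '2024-02-02')]
```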

Q.4 Discuss the need for developing OLAP tools in detail. [10 Marks]

Online Analytical Processing (OLAP) is the ability to store and manage data in such a way that it can be used effectively to generate actionable information. OLAP sits between the data warehouse and the end-user tools. OLAP makes business intelligence happen by enabling the following:
- Changing the data into multidimensional cubes
- Summarising the pre-aggregated and delivered data
- Establishing strong query management
- Providing modelling functions

Within the BI architecture, OLAP storage can take the following forms:
- Multidimensional Online Analytical Processing (MOLAP): OLAP data is stored in multidimensional form, with one array for each combination of dimensions and the associated measures. In this method there is no link between the MOLAP database and the data warehouse database for query purposes, which means a user cannot drill down from the MOLAP summary data to the transaction-level data of the data warehouse.
- Relational Online Analytical Processing (ROLAP): the OLAP data is stored in relational form in a dimensional model, that is, a de-normalised relational data structure. The ROLAP database of the OLAP server can be linked to the data warehouse database.
- Hybrid Online Analytical Processing (HOLAP): the aggregated data is stored in a multidimensional model in the OLAP database, while the transaction-level data is kept in relational form in the data warehouse database. There is a link between the summary MOLAP database of the OLAP server and the relational transactional database of the data warehouse.

OLAP Defined
OLAP can be stated in terms of just five keywords: Fast Analysis of Shared Multidimensional Information. Fast means that even the most complex queries should be processed in no more than about five seconds. Analysis is the process of analysing all the relevant kinds of information in order to process complex queries and to set up clear criteria for the results of such queries. The information used for analysis is normally obtained from a shared source, such as a data warehouse. Presented in such multidimensional detail, the data becomes useful and important to managerial decision-making.

OLAP Techniques
Online analytical processing can be implemented in many different ways, but the most common way is to stage the information obtained from various corporate databases, for example data warehouses, temporarily in OLAP multidimensional databases for retrieval by the front-end systems. The multidimensional database can be optimised for fast retrieval, and several techniques for speeding up data retrieval and analysis are implemented on the procedural side of database management. OLAP can be implemented using the following techniques:

Consolidation or Roll-up
Consolidation involves data aggregation, which can range from simple roll-ups to complex groupings of inter-related data. For example, sales for an individual sales office can be rolled up to the district level, and districts can in turn be rolled up to regions.
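The roll-up just described can be sketched with a simple grouped aggregation; the office, district and region figures below are assumed sample data, not a prescribed OLAP implementation.

```python
# Minimal sketch of consolidation (roll-up): office-level sales are aggregated
# up to district and then to region. The figures are assumed sample data.
import pandas as pd

sales = pd.DataFrame({
    "region":   ["East", "East", "East", "West"],
    "district": ["D1",   "D1",   "D2",   "D3"],
    "office":   ["O11",  "O12",  "O21",  "O31"],
    "sales":    [100,    150,    200,    300],
})

by_district = sales.groupby(["region", "district"], as_index=False)["sales"].sum()
by_region   = by_district.groupby("region", as_index=False)["sales"].sum()

print(by_district)   # roll-up from office to district
print(by_region)     # further roll-up from district to region
```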

Drill-down
OLAP can also go in the reverse direction and display the detailed data of which the consolidated data consists. This is known as drill-down. For example, the sales made by individual products, or by the sales representatives that make up a region's total sales, can be easily accessed.

Slicing and Dicing
Slicing and dicing refers to the ability to look at the database from different viewpoints. One slice of the sales database may show all the sales of a product type within a region; another slice may show all the sales by sales channel within each product type. Slicing and dicing is regularly performed along the time axis in order to analyse trends and find patterns.

Benefits of using OLAP
OLAP offers several benefits to businesses: it increases end-user productivity; it lets users become more self-sufficient thanks to the inbuilt flexibility it gives them; it makes better use of the analysis capabilities of shared databases; and it reduces the application backlog, leading to faster information retrieval and a reduction in query drag.

Q.5 What do you understand by the term statistical analysis? Discuss the most important statistical techniques. [10 Marks]

Data mining is a relatively new data analysis technique. It is very different from query-and-reporting and multidimensional analysis in that it uses what is called a discovery technique: you do not ask a particular question of the data, but rather use specific algorithms that analyse the data and report what they have discovered. Unlike query-and-reporting and multidimensional analysis, where the user has to create and execute queries based on hypotheses, data mining searches for answers to questions that may never have been asked before. The discovery could take the form of finding significance in relationships between certain data elements, a clustering together of specific data elements, or other patterns in the usage of specific sets of data elements. After finding these patterns, the algorithms can infer rules. These rules can then be used to generate a model that can predict a desired behaviour, identify relationships among the data, discover patterns, and group clusters of records with similar attributes.

Data mining is most typically used for statistical data analysis and knowledge discovery. Statistical data analysis detects unusual patterns in data and applies statistical and mathematical modelling techniques to explain the patterns; the models are then used to forecast and predict. Types of statistical data analysis techniques include linear and non-linear analysis, regression analysis, multivariate analysis, and time series analysis. Knowledge discovery extracts implicit, previously unknown information from the data, which often results in uncovering unknown business facts.
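As one small, illustrative example of such statistical techniques, the sketch below fits a least-squares linear trend to a short, assumed sales series and forecasts the next period; it stands in for the regression and time series methods mentioned above rather than for any specific data mining product.

```python
# Illustrative sketch of a simple statistical technique: fitting a linear trend
# (least-squares regression) to a short sales series and forecasting the next
# period. The sales figures are assumed sample data.
import numpy as np

months = np.arange(1, 7)                           # periods 1..6
sales  = np.array([120, 132, 128, 141, 150, 158])  # observed sales per period

slope, intercept = np.polyfit(months, sales, deg=1)   # linear regression fit
forecast_next = slope * 7 + intercept                  # predict period 7

print(f"trend: {slope:.1f} units per month")
print(f"forecast for month 7: {forecast_next:.0f}")
```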

Data mining is data driven. There is a high level of complexity in the stored data and in the data interrelations in the data warehouse that is difficult to discover without data mining. Data mining offers new insights into the business that may not be discovered with query-and-reporting or multidimensional analysis; it can help discover new insights about the business by giving us answers to questions we might never have thought to ask.

Even within the scope of a data warehouse project, when mining data you want to define a data scope, or possibly multiple data scopes. Because patterns are based on various forms of statistical analysis, you must define a scope in which a statistically significant pattern is likely to emerge. For example, buying patterns that show different products being purchased together may differ greatly between geographical locations; simply lumping all of the data together may hide the patterns that exist in each location. Of course, by imposing such a scope you are defining some, though not all, of the business rules. It is therefore important that data scoping be done in concert with someone knowledgeable in both the business and in statistical analysis, so that artificial patterns are not imposed and real patterns are not lost. Data architecture modelling and advanced modelling techniques, such as those suitable for multimedia databases and statistical databases, are beyond the scope of this discussion.

Q.6 What are the methods for determining executive needs? [10 Marks]

An Executive Information System (EIS) is a set of management tools supporting the information and decision-making needs of management by combining information available within the organisation with external information in an analytical framework. EIS are targeted at managers who need to quickly assess the status of a business or a section of the business. These packages are aimed firmly at the type of business user who needs an instant and up-to-date understanding of critical business information to aid decision-making.

The idea behind an EIS is that information can be collated and displayed to the user without manipulation or further processing. The user can then quickly see the status of their chosen department or function, enabling them to concentrate on decision-making. Generally an EIS is configured to display data such as order backlogs, open sales, purchase order backlogs, shipments, receipts and pending orders. This information can then be used to make executive decisions at a strategic level. The emphasis of the system as a whole is an easy-to-use interface and integration with a variety of data sources. It offers strong reporting and data mining capabilities which can provide all the data the executive is likely to need. Traditionally the interface was menu-driven, with either reports or text presentation. Newer systems, and especially the newer business intelligence systems that are replacing EIS, have a dashboard or scorecard type of display.

Before these systems became available, decision-makers had to rely on disparate spreadsheets and reports, which slowed down the decision-making process. Now massive amounts of relevant information can be accessed in seconds.

The two main aspects of an EIS are integration and visualisation. The newest methods of visualisation are the dashboard and the scorecard. The dashboard is a single screen that presents key data and organisational information on an almost real-time, integrated basis. The scorecard is another single-screen display with measurement metrics which can give a percentile view of whatever criteria the executive chooses. Behind these two front-end screens there can be an immense data-processing infrastructure or a couple of integrated databases, depending entirely on the organisation that is using the system. The backbone of the system is traditional server hardware and a fast network; the EIS software itself runs from here and is presented to the executive over this network. The databases need to be fully integrated into the system and have real-time connections both in and out. This information then needs to be collated, verified, processed and presented to the end user, so a real-time connection into the EIS core is necessary.

Executive Information Systems come in two distinct types: those that are data driven and those that are model driven. Data-driven systems interface with databases and data warehouses; they collate information from different sources and present it to the user in an integrated, dashboard-style screen. Model-driven systems use forecasting, simulations and decision-tree-like processes to present the data.

As with any emerging and progressive market, service providers are continually improving their products and offering new ways of doing business. Modern EIS systems can also present industry trend information and competitor behaviour trends if needed. They can filter and analyse data; create graphs, charts and scenario generations; and offer many other options for presenting data. There are a number of ways to link decision-making to organisational performance. From a decision-maker's perspective these tools provide an excellent way of viewing data: outcomes displayed include single metrics, trend analyses, demographics, market shares and a myriad of other options. The simple interface makes it quick and easy to navigate and call up the information required.

For a system that seems to offer business so much, it is used by relatively few organisations. Current estimates indicate that as few as 10% of businesses use EIS systems. One of the reasons for this is the complexity of the system and its support infrastructure.

It is difficult to create such a system and populate it effectively. Combining all the necessary systems and data sources can be a daunting task, and seems to put many businesses off implementing an EIS. System vendors have addressed this issue by offering turnkey solutions for potential clients: companies like Actuate and Oracle both offer complete out-of-the-box Executive Information Systems, and they are not the only ones. Expense is also an issue. Beyond the initial cost, there is the additional cost of the support infrastructure, training, and the means of making the company data meaningful to the system.

Does an EIS warrant all of this expense? Green King certainly thinks so: they installed a Cognos system in 2003 and their first few reports illustrated business opportunities in excess of 250,000. The AA is also using a Business Objects variant of an EIS system and expects a return of 300% in three years (Guardian, 31/7/03).

An effective Executive Information System is not something you can simply set up and leave to do its work. Its success depends on support and on the timely, accurate data it receives in order to provide something meaningful. It can supply the information executives need to make educated decisions quickly and effectively, and it can give business strategy a competitive edge that pays for itself in a very short space of time.
