
1. What are non-additive facts in detail?
-------------------------------------------------------------------------------
A fact may be a measure, a metric, or a dollar value. Dollar values are typically additive facts: if we want to find the amount for a particular place over a particular period of time, we can add the dollar amounts and come up with the total amount. Measures and metrics are often non-additive. For example, consider a measure such as the height of citizens by geographical location: when we roll up city-level data to state-level data, we should not add the heights of the citizens; rather, we may want to use the values to derive a count or an average.

2. What are non-additive facts?
-------------------------------------------------------------------------------
Additive: additive facts are facts that can be summed up through all of the dimensions in the fact table.
Semi-additive: semi-additive facts are facts that can be summed up for some of the dimensions in the fact table, but not the others (an account balance is a classic example: it can be summed across accounts, but not across time).
Non-additive: non-additive facts are facts that cannot be summed up for any of the dimensions present in the fact table (see the SQL sketch after question 3).

3. What is ODS (operational data store)?
-------------------------------------------------------------------------------
ODS stands for Operational Data Store. The ODS sits between the staging area and the data warehouse, and the data in the ODS is held at a low level of granularity. Once data has been populated in the ODS, aggregated data is loaded into the EDW through the ODS.
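To make questions 1 and 2 concrete, here is a minimal SQL sketch. The table and column names (sales_fact, sale_amount, profit_amount) are hypothetical, chosen only for illustration: a dollar amount can be summed across any dimension, while a ratio derived from it cannot.

-- Additive fact: summing sale_amount across any dimension is meaningful.
SELECT state, SUM(sale_amount) AS total_sales
FROM sales_fact
GROUP BY state;

-- Non-additive fact: a margin ratio must not be summed across rows.
-- Re-derive it from its additive components at each level instead.
SELECT state,
       SUM(profit_amount) / SUM(sale_amount) AS profit_margin
FROM sales_fact
GROUP BY state;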

4. What are parameter files? Where do we use them?

-------------------------------------------------------------------------------
A parameter file is a text file in which you can define values for the parameters defined in an Informatica session. The parameter file is referenced in the session properties; when the Informatica session runs, the values for the parameters are fetched from the specified file. For example, if $$ABC is defined in an Informatica mapping, the value for this variable can be defined in a file called abc.txt as:
[foldername_session_name]
$$ABC='hello world'
In the session properties, you then enter abc.txt in the parameter file name field.

5. Can Informatica load heterogeneous targets from heterogeneous sources?
-------------------------------------------------------------------------------
Yes, Informatica can load heterogeneous targets from heterogeneous sources.

6. What is Full load & Incremental or Refresh load?
-------------------------------------------------------------------------------
A full load is the complete data dump that takes place the very first time. After that, to keep the target data synchronized with the source data, there are two further techniques:
Refresh load - the existing data is truncated and reloaded completely.
Incremental load - only the delta (the difference between the target and source data) is loaded at regular intervals. The timestamp of the previous delta load has to be maintained.
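As a rough illustration of question 6, here is a hedged SQL sketch of the two synchronization techniques; the names (dw_orders, stg_orders, etl_control, last_load_ts) are hypothetical.

-- Refresh load: truncate the target and reload it completely.
TRUNCATE TABLE dw_orders;
INSERT INTO dw_orders SELECT * FROM stg_orders;

-- Incremental load: move only the rows changed since the previous run,
-- then advance the stored timestamp for the next delta.
INSERT INTO dw_orders
SELECT *
FROM stg_orders s
WHERE s.updated_at > (SELECT last_load_ts FROM etl_control WHERE job = 'orders');

UPDATE etl_control
SET last_load_ts = CURRENT_TIMESTAMP
WHERE job = 'orders';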

7. What is a staging area? Do we need it? What is the purpose of a staging area?
-------------------------------------------------------------------------------
A staging area is a place where you hold temporary tables on the data warehouse server. Staging tables are connected to the work area or fact tables. We need a staging area to hold the data and to perform data cleansing and merging before loading the data into the warehouse. In the absence of a staging area, the data load would have to go from the OLTP system to the OLAP system directly, which would severely hamper the performance of the OLTP system. This is the primary reason for the existence of a staging area; in addition, it offers a platform for carrying out data cleansing.

A staging area is a temporary schema used to:
1. Do flat mapping, i.e. dumping all the OLTP data into it without applying any business rules. Pushing data into staging takes less time because no business rules or transformations are applied to it.

2. Perform data cleansing and validation, for example using First Logic. A staging area is like a large table holding data separated from its sources, to be loaded into the data warehouse in the required format. If we attempted to load data directly from OLTP, we might disrupt the OLTP system because of the format differences between the warehouse and OLTP; keeping the OLTP data intact is very important for both the OLTP system and the warehouse. Depending on the complexity of the business rules, we may require a staging area; its basic purpose is to clean the OLTP source data and gather it in one place. Basically it is a temporary database area: staging data is used for further processing and can be deleted afterwards (a minimal SQL sketch of this pattern follows question 9 below).

8. Do we need an ETL tool? When do we go for the tools in the market?
-------------------------------------------------------------------------------
ETL tools are meant to extract, transform and load data into the data warehouse for decision making. Before the evolution of ETL tools, the ETL process was done manually with SQL code written by programmers. This task was tedious and cumbersome in many cases, since it involved many resources, complex coding and more work hours. On top of that, maintaining the code posed a great challenge. ETL tools eliminate these difficulties, since they are very powerful and offer many advantages in all stages of the ETL process - extraction, data cleansing, data profiling, transformation, debugging and loading into the data warehouse - when compared with the old method.

9. How can we use mapping variables in Informatica? Where do we use them?
-------------------------------------------------------------------------------
After creating a variable, we can use it in any expression in a mapping or a mapplet. Variables can also be used in source qualifier filters, user-defined joins and extract overrides, and in the expression editor of reusable transformations. Their values can change automatically between sessions.
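Referring back to question 7, this is a minimal sketch of the staging-then-load pattern, assuming hypothetical table names (oltp_customer, stg_customer, dw_customer): raw OLTP rows land in staging untransformed, and cleansing happens on the warehouse server before the final insert.

-- 1. Flat mapping: dump OLTP data into staging with no business rules.
INSERT INTO stg_customer (cust_id, cust_name, city)
SELECT cust_id, cust_name, city
FROM oltp_customer;

-- 2. Cleanse and validate while loading into the warehouse table.
INSERT INTO dw_customer (cust_id, cust_name, city)
SELECT cust_id,
       TRIM(UPPER(cust_name)),
       TRIM(city)
FROM stg_customer
WHERE cust_id IS NOT NULL;   -- reject rows that fail validation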

10. Techniques of Error Handling - Ignore, Rejecting bad records to a flat file, loading the records a --------------------------------------------------------------------------------

Records can be rejected either at the database, due to constraint-key violations, or by the Informatica server when writing data into the target table. These rejected records can be found in the bad files folder, where a reject file is created for each session, and we can check why a record has been rejected. In this bad file, the first column is a row indicator and the second column is a column indicator. The row indicators are of four types: D - valid data, O - overflowed data, N - null data, T - truncated data. Depending on these indicators, we can make changes so that the data loads successfully into the target.

11. How do we call shell scripts from Informatica?
-------------------------------------------------------------------------------
You can use a Command task to call shell scripts, in the following ways:
1. Standalone Command task. You can use a Command task anywhere in the workflow or worklet to run shell commands.
2. Pre- and post-session shell command. You can call a Command task as the pre- or post-session shell command for a Session task (see the documentation for more information about specifying pre-session and post-session shell commands).

12. What is the difference between connected and unconnected stored procedure?
-------------------------------------------------------------------------------
Unconnected: the unconnected Stored Procedure transformation is not connected directly to the flow of the mapping. It either runs before or after the session, or is called by an expression in another transformation in the mapping.
Connected: the flow of data through a mapping in connected mode also passes through the Stored Procedure transformation. All data entering the transformation through the input ports affects the stored procedure. You should use a connected Stored Procedure transformation when you need data from an input port sent as an input parameter to the stored procedure, or the results of a stored procedure sent as an output parameter to another transformation.

A "junk" dimension is a collection of random transactional codes, flags and/or text attributes that are unrelated to any particular dimension. The junk dimension is simply a structure that provides a convenient place to store the junk attributes. A good example would be a trade fact in a company that brokers equity trades. 14. Discuss the advantages & Disadvantages of star & snowflake schema? -------------------------------------------------------------------------------In a star schema every dimension will have a primary key. In a star schema, a dimension table will not have any parent table. Whereas in a snow flake schema, a dimension table will have one or more parent tables. Hierarchies for the dimensions are stored in the dimensional table itself in star schema. Whereas hierarchies are broken into separate tables in snow flake schema. These hierarchies help to drill down the data from topmost hierarchies to the lowermost hierarchies. 15. What is a time dimension? Give an example. ------------------------------------------------------------------------------- Time dimension is one of important in Data warehouse. Whenever u generated the report, that time u accesses all data from thro time dimension. E.g. employee time dimension Fields: Date key, full date, day of week, day, month, quarter, fiscal year. In a relational data model, for normalization purposes, year lookup, quarter lookup, month lookup, and week lookups are not merged as a single table. In a dimensional data modeling (star schema), these tables would be merged as a single table called TIME DIMENSION for performance and slicing data. This dimension helps to find the sales done on date, weekly, monthly and yearly basis. We can have a trend analysis by comparing this year sales with the previous year or this week sales with the previous week. A TIME DIMENSION is a table that contains the detail information of the time at which a particular 'transaction' or 'sale' (event) has taken place. The TIME DIMENSION has the details of DAY, WEEK, MONTH, QUARTER, and YEAR 16. How can U create or import flat file definition in to the warehouse designer?

16. How can you create or import a flat file definition into the Warehouse Designer?
-------------------------------------------------------------------------------
You can create a flat file definition in the Warehouse Designer: create a new target, select the type as flat file, save it, and enter the various columns for the created target by editing its properties. Once the target is created and saved, you can use it from the Mapping Designer.
You cannot, however, import a flat file definition into the Warehouse Designer directly. Instead, you must analyze the file in the Source Analyzer and then drag it into the Warehouse Designer. When you drag the flat file source definition into the Warehouse Designer workspace, the Warehouse Designer creates a relational target definition, not a file definition. If you want to load to a file, configure the session to write to a flat file; when the Informatica server runs the session, it creates and loads the flat file.

17. What are the 2 modes of data movement in the Informatica Server?
-------------------------------------------------------------------------------
The data movement mode determines whether the Informatica Server processes single-byte or multi-byte character data. The mode selection can affect the enforcement of code page relationships and code page validation in the Informatica Client and Server.
a) Unicode - the server allows 2 bytes for each character and uses an additional byte for each non-ASCII character (such as Japanese characters).
b) ASCII - the server holds all data in a single byte.
The data movement mode can be changed in the Informatica Server configuration parameters; the change takes effect once you restart the Informatica Server.

18. What is Load Manager? --------------------------------------------------------------------------------

The answer below is taken from the Informatica 7.1.1 manual. While running a workflow, the PowerCenter Server uses the Load Manager process and the Data Transformation Manager (DTM) process to run the workflow and carry out workflow tasks. When the PowerCenter Server runs a workflow, the Load Manager performs the following tasks:
1. Locks the workflow and reads workflow properties.
2. Reads the parameter file and expands workflow variables.
3. Creates the workflow log file.
4. Runs workflow tasks.
5. Distributes sessions to worker servers.
6. Starts the DTM to run sessions.
7. Runs sessions from master servers.
8. Sends post-session email if the DTM terminates abnormally.
When the PowerCenter Server runs a session, the DTM performs the following tasks:
1. Fetches session and mapping metadata from the repository.
2. Creates and expands session variables.
3. Creates the session log file.
4. Validates session code pages if data code page validation is enabled; checks query conversions if data code page validation is disabled.
5. Verifies connection object permissions.
6. Runs pre-session shell commands.
7. Runs pre-session stored procedures and SQL.
8. Creates and runs mapping, reader, writer and transformation threads to extract, transform and load data.
9. Runs post-session stored procedures and SQL.
10. Runs post-session shell commands.
11. Sends post-session email.

19. What is data cleansing?
-------------------------------------------------------------------------------
Data cleansing is essentially the polishing of data. For example, one subsystem may store gender as M and F, while another stores it as MALE and FEMALE; we need to polish (clean) this data before it is added to the data warehouse. Another typical example is addresses: the customer addresses maintained by the various subsystems can all differ, so we might need an address-cleansing tool to get the customers' addresses into a clean and consistent form.
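A minimal SQL sketch of the gender-cleansing example in question 19, assuming two hypothetical source tables (src_customer_a, src_customer_b) that encode gender differently:

-- Standardize both encodings to M/F while merging the sources.
INSERT INTO dw_customer (cust_id, gender)
SELECT cust_id,
       CASE UPPER(TRIM(gender))
            WHEN 'M'      THEN 'M'
            WHEN 'MALE'   THEN 'M'
            WHEN 'F'      THEN 'F'
            WHEN 'FEMALE' THEN 'F'
            ELSE 'U'      -- unknown: flag for review rather than guess
       END
FROM (
    SELECT cust_id, gender FROM src_customer_a
    UNION ALL
    SELECT cust_id, gender FROM src_customer_b
) src;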

20. What are reusable transformations?
-------------------------------------------------------------------------------
Reusable transformations can be used in multiple mappings. When you need to incorporate such a transformation into a mapping, you add an instance of it to the mapping. Later, if you change the definition of the transformation, all instances of it inherit the changes. Since each instance of a reusable transformation is a pointer to that transformation, you can change the transformation in the Transformation Developer and its instances automatically reflect these changes. This feature can save you a great deal of work.

21. What are the methods for creating reusable transformations?
-------------------------------------------------------------------------------
Two methods:
1. Design it in the Transformation Developer. By default it is then a reusable transformation.
2. Promote a standard transformation from the Mapping Designer. After you add a transformation to a mapping, you can promote it to the status of reusable transformation. Once you promote a standard transformation to reusable status, you cannot demote it back to a standard transformation. If you change the properties of a reusable transformation in a mapping, you can revert to the original reusable transformation properties by clicking the Revert button.

22. What are the unsupported repository objects for a mapplet?
-------------------------------------------------------------------------------
COBOL source definitions
Joiner transformations
Normalizer transformations
Non-reusable Sequence Generator transformations
Pre- or post-session stored procedures
Target definitions
PowerMart 3.5-style LOOKUP functions
XML source definitions
IBM MQ source definitions

23. Can you use the mapping parameters or variables created in one mapping in any other reusable transformation?
-------------------------------------------------------------------------------
Yes, because a reusable transformation is not contained within any mapplet or mapping.

24. What is aggregate cache in aggregator transformation? -------------------------------------------------------------------------------------

The aggregator stores data in the aggregate cache until it completes the aggregate calculations. When you run a session that uses an Aggregator transformation, the Informatica server creates index and data caches in memory to process the transformation. If the Informatica server requires more space, it stores overflow values in cache files.

25. What are the differences between the Joiner transformation and the Source Qualifier transformation?
-------------------------------------------------------------------------------
You can join heterogeneous data sources with a Joiner transformation, which you cannot achieve with a Source Qualifier transformation. You need matching keys to join two relational sources in a Source Qualifier transformation, whereas you do not need matching keys to join two sources with a Joiner. With a Source Qualifier, the two relational sources must come from the same data source; with a Joiner, you can join relational sources that come from different sources.

26. In which conditions can we not use a Joiner transformation (limitations of the Joiner transformation)?
-------------------------------------------------------------------------------
You cannot use a Joiner transformation when:
Both pipelines begin with the same original data source.
Both input pipelines originate from the same Source Qualifier transformation.
Both input pipelines originate from the same Normalizer transformation.
Both input pipelines originate from the same Joiner transformation.
Either input pipeline contains an Update Strategy transformation.
Either input pipeline contains a connected or unconnected Sequence Generator transformation.

27. What are the Joiner caches?
-------------------------------------------------------------------------------
When a Joiner transformation runs in a session, the Informatica Server reads all the records from the master source and builds index and data caches based on the master rows. After building the caches, the Joiner transformation reads records from the detail source and performs the joins.

28. What is the difference between a static cache and a dynamic cache?
-------------------------------------------------------------------------------
Static cache: you cannot insert rows into or update the cache. The Informatica server returns a value from the lookup table or cache when the lookup condition is true; when the condition is not true, it returns the default value for connected transformations and NULL for unconnected transformations.
Dynamic cache: you can insert rows into the cache as you pass rows to the target. The Informatica server inserts a row into the cache when the lookup condition is false, which indicates that the row is not yet in the cache or the target table; you can then pass these rows on to the target table.
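The dynamic cache in question 28 behaves much like an upsert. Purely as a hedged analogy (Informatica manages its cache internally, and these table names are hypothetical), the net effect resembles a SQL MERGE:

MERGE INTO dw_customer t
USING stg_customer s
   ON t.cust_id = s.cust_id            -- the "lookup condition"
WHEN MATCHED THEN                      -- condition true: row already exists
    UPDATE SET t.cust_name = s.cust_name
WHEN NOT MATCHED THEN                  -- condition false: insert the new row
    INSERT (cust_id, cust_name) VALUES (s.cust_id, s.cust_name);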

29. How does the Informatica server sort string values in a Rank transformation?
-------------------------------------------------------------------------------
When the Informatica server runs in the ASCII data movement mode, it sorts session data using a binary sort order. If you configure the session to use a binary sort order, the Informatica server calculates the binary value of each string and returns the specified number of rows with the highest binary values for the string.

30. What is the rank index in a Rank transformation?
-------------------------------------------------------------------------------
The Designer automatically creates a RANKINDEX port for each Rank transformation. The Informatica Server uses the rank index port to store the ranking position for each record in a group. For example, if you create a Rank transformation that ranks the top 5 salespersons for each quarter, the rank index numbers the salespeople from 1 to 5 (see the SQL analogy after question 31).

31. What are the types of data that pass between the Informatica server and a stored procedure?
-------------------------------------------------------------------------------
Three types of data:
Input/output parameters
Return values
Status codes
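The rank index in question 30 is analogous to a SQL window-function rank. A hedged sketch with hypothetical table and column names:

-- Top 5 salespersons per quarter; rank_index numbers them 1 to 5.
SELECT quarter, salesperson, sales_amount, rank_index
FROM (
    SELECT quarter,
           salesperson,
           sales_amount,
           RANK() OVER (PARTITION BY quarter
                        ORDER BY sales_amount DESC) AS rank_index
    FROM sales_fact
) ranked
WHERE rank_index <= 5;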

32. What is the status code?
-------------------------------------------------------------------------------
The status code provides error handling for the Informatica server during the session. The stored procedure issues a status code that notifies the server whether or not the stored procedure completed successfully. This value cannot be seen by the user; it is used only by the Informatica server to determine whether to continue running the session or to stop.

33. What are the tasks that the source qualifier performs?
-------------------------------------------------------------------------------
Join data originating from the same source database.
Filter records when the Informatica server reads the source data.
Specify an outer join rather than the default inner join.
Specify sorted records.
Select only distinct values from the source.
Create a custom query to issue a special SELECT statement for the Informatica server to read the source data. (A sample custom query follows question 36.)

34. What is the target load order?
-------------------------------------------------------------------------------
You specify the target load order based on the source qualifiers in a mapping. If you have multiple source qualifiers connected to multiple targets, you can designate the order in which the Informatica server loads data into the targets.

35. What is the Update Strategy transformation?
-------------------------------------------------------------------------------
This transformation is used to maintain either the history data or just the most recent changes in the target table.

36. What is data driven?
-------------------------------------------------------------------------------
With the data-driven option, the Informatica server follows the instructions coded into the Update Strategy transformations within the session's mapping to determine how to flag records for insert, update, delete or reject. If you do not choose the data-driven option, the Informatica server ignores all Update Strategy transformations in the mapping.
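Referring back to question 33, a source qualifier's custom query is ordinary SQL issued against the source database. A hedged sketch combining several of the listed tasks (the table names are hypothetical):

-- Join, filter, outer join, sort and de-duplicate at read time.
SELECT DISTINCT o.order_id, o.order_date, c.cust_name
FROM orders o
LEFT OUTER JOIN customer c ON c.cust_id = o.cust_id   -- outer join instead of the default inner join
WHERE o.order_date >= DATE '2004-01-01'               -- filter while reading source data
ORDER BY o.order_id;                                  -- sorted records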

37. What are the types of mapping wizards provided in Informatica?
-------------------------------------------------------------------------------
The Designer provides two mapping wizards to help you create mappings quickly and easily. Both wizards are designed to create mappings for loading and maintaining star schemas, a series of dimensions related to a central fact table.
Getting Started Wizard: creates mappings to load static fact and dimension tables, as well as slowly growing dimension tables.
Slowly Changing Dimensions Wizard: creates mappings to load slowly changing dimension tables, based on the amount of historical dimension data you want to keep and the method you choose to handle that historical data.

38. What are the types of mapping in the Getting Started Wizard?
-------------------------------------------------------------------------------
Simple pass-through mapping: loads a static fact or dimension table by inserting all rows. Use this mapping when you want to drop all existing data from your table before loading new data.
Slowly growing target: loads a slowly growing fact or dimension table by inserting new rows. Use this mapping to load new data when the existing data does not require updates.

39. What are the mappings that we use for slowly changing dimension tables?
-------------------------------------------------------------------------------
Type 1: rows containing changes to existing dimensions are updated in the target by overwriting the existing dimension. In the Type 1 Dimension mapping, all rows contain current dimension data. Use the Type 1 Dimension mapping to update a slowly changing dimension table when you do not need to keep any previous versions of dimensions in the table.
Type 2: the Type 2 Dimension mapping inserts both new and changed dimensions into the target. Changes are tracked in the target table by versioning the primary key and creating a version number for each dimension in the table. Use the Type 2 Dimension/Version Data mapping to update a slowly changing dimension table when you want to keep a full history of dimension data in the table; version numbers and versioned primary keys track the order of changes to each dimension.
Type 3: the Type 3 Dimension mapping filters source rows based on user-defined comparisons and inserts only those found to be new dimensions into the target. Rows containing changes to existing dimensions are updated in the target: when updating an existing dimension, the Informatica Server saves the existing data in different columns of the same row and replaces the existing data with the updates.
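As a rough SQL sketch of the Type 2 behaviour described in question 39 (the customer_dim names are hypothetical, and the toy versioned-key scheme is for illustration only; the wizard generates the equivalent Informatica logic):

-- Insert a new version when the incoming name differs from the current
-- (highest-version) row; the older versions remain in place as history.
INSERT INTO customer_dim (cust_key, cust_id, cust_name, version)
SELECT s.cust_id * 1000 + cur.version + 1,   -- toy versioned surrogate key
       s.cust_id,
       s.cust_name,
       cur.version + 1
FROM stg_customer s
JOIN (SELECT cust_id, MAX(version) AS version
      FROM customer_dim
      GROUP BY cust_id) cur
  ON cur.cust_id = s.cust_id
JOIN customer_dim c
  ON c.cust_id = cur.cust_id AND c.version = cur.version
WHERE c.cust_name <> s.cust_name;            -- change detected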

40. What are the different types of Type 2 dimension mapping?
-------------------------------------------------------------------------------
Type 2 Dimension/Version Data mapping: an updated dimension in the source is inserted into the target along with a new version number, and a newly added dimension in the source is inserted into the target with a primary key.
Type 2 Dimension/Flag Current mapping: this mapping is also used for slowly changing dimensions; in addition, it creates a flag value for each changed or new dimension. The flag indicates whether the dimension is new or newly updated: recent dimensions are saved with a current flag value of 1, and updated dimensions are saved with the value 0.
Type 2 Dimension/Effective Date Range mapping: this is another flavour of Type 2 mapping used for slowly changing dimensions. It also inserts both new and changed dimensions into the target, and changes are tracked by the effective date range for each version of each dimension.

41. How can you recognize whether or not newly added rows in the source get inserted into the target?
-------------------------------------------------------------------------------
In the Type 2 mapping we have three options to recognize the newly added rows:
Version number
Flag value
Effective date range

42. What are the two types of processes that Informatica runs for a session?
-------------------------------------------------------------------------------
Load Manager process: starts the session, creates the DTM process, and sends post-session email when the session completes.
DTM process: creates threads to initialize the session, read, write and transform data, and handle pre- and post-session operations.

43. What is the Metadata Reporter?
-------------------------------------------------------------------------------
It is a web-based application that enables you to run reports against repository metadata. With the Metadata Reporter, you can access information about your repository without any knowledge of SQL, the transformation language, or the underlying tables in the repository.

44. How does the Informatica server increase session performance through partitioning the source?
-------------------------------------------------------------------------------
For relational sources, the Informatica server creates multiple connections, one for each partition of a single source, and extracts a separate range of data through each connection. The Informatica server reads multiple partitions of a single source concurrently. Similarly, for loading, the Informatica server creates multiple connections to the target and loads partitions of data concurrently. For XML and file sources, the Informatica server reads multiple files concurrently; for loading the data, it creates a separate file for each partition of a source file, and you can choose to merge the targets.

45. Why do you use repository connectivity?
-------------------------------------------------------------------------------
Each time you edit or schedule a session, the Informatica server communicates directly with the repository to check whether or not the session and the users are valid. All the metadata of sessions and mappings is stored in the repository.

46. What is the DTM process?
-------------------------------------------------------------------------------
After the Load Manager performs validations for the session, it creates the DTM process. The DTM's job is to create and manage the threads that carry out the session tasks. It creates the master thread, and the master thread creates and manages all the other threads.

47. What are the different threads in the DTM process?
-------------------------------------------------------------------------------
Master thread: creates and manages all other threads.
Mapping thread: one mapping thread is created for each session; it fetches session and mapping information.
Pre- and post-session threads: created to perform pre- and post-session operations.
Reader thread: one thread is created for each partition of a source; it reads data from the source.
Writer thread: created to load data into the target.
Transformation thread: created to transform data.

48. What are the output files that the Informatica server creates while a session is running?
-------------------------------------------------------------------------------
Informatica server log: the Informatica server (on UNIX) creates a log for all status and error messages (default name: pm.server.log). It also creates an error log for error messages. These files are created in the Informatica home directory.
Session log file: the Informatica server creates a session log file for each session. It writes information about the session into the log file, such as the initialization process, the creation of SQL commands for the reader and writer threads, errors encountered, and the load summary. The amount of detail in the session log file depends on the tracing level that you set.
Session detail file: this file contains load statistics for each target in the mapping, such as the table name and the number of rows written or rejected. You can view this file by double-clicking the session in the Monitor window.
Performance detail file: this file contains session performance details that help you see where performance can be improved. To generate this file, select the performance detail option in the session property sheet.
Reject file: this file contains the rows of data that the writer does not write to the targets.
Control file: the Informatica server creates a control file and a target file when you run a session that uses the external loader. The control file contains information about the target flat file, such as the data format and loading instructions for the external loader.
Post-session email: post-session email allows you to automatically communicate information about a session run to designated recipients. You can create two different messages: one if the session completes successfully, the other if the session fails.
Indicator file: if you use a flat file as a target, you can configure the Informatica server to create an indicator file. For each target row, the indicator file contains a number to indicate whether the row was marked for insert, update, delete or reject.
Output file: if a session writes to a target file, the Informatica server creates the target file based on the file properties entered in the session property sheet.
Cache files: when the Informatica server creates a memory cache, it also creates cache files. The Informatica server creates index and data cache files for the following transformations:
Aggregator transformation
Joiner transformation
Rank transformation
Lookup transformation

49. What is polling?
-------------------------------------------------------------------------------
Polling displays updated information about the session in the Monitor window. The Monitor window displays the status of each session when you poll the Informatica server.

50. Can you copy a session to a different folder or repository?
-------------------------------------------------------------------------------
Yes. By using the Copy Session wizard, you can copy a session into a different folder or repository, but that target folder or repository must contain the mapping of the session. If the target folder or repository does not have the mapping of the session being copied, you have to copy that mapping first, before you copy the session.

51. What is a batch, and what are the types of batches?
-------------------------------------------------------------------------------
A grouping of sessions is known as a batch. Batches are of two types:
Sequential: runs sessions one after the other.
Concurrent: runs sessions at the same time.
If you have sessions with source-target dependencies, you have to use a sequential batch to start the sessions one after another. If you have several independent sessions, you can use concurrent batches, which run all the sessions at the same time.

52. What is the command used to run a batch?
-------------------------------------------------------------------------------
pmcmd is used to start a batch.
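The exact pmcmd syntax varies by Informatica version, so treat the following only as an indicative sketch (the service, folder and workflow names are hypothetical). In later PowerCenter versions, a workflow is started along these lines:

pmcmd startworkflow -sv MyIntegrationService -d MyDomain -u username -p password -f MyFolder wf_daily_load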

53. What are the session parameters?
-------------------------------------------------------------------------------

Session parameters are like mapping parameters: they represent values you might want to change between sessions, such as database connections or source files. The Server Manager also allows you to create user-defined session parameters. The following are user-defined session parameters:
Database connections
Source file names: use this parameter when you want to change the name or location of a session's source file between session runs.
Target file names: use this parameter when you want to change the name or location of a session's target file between session runs.
Reject file names: use this parameter when you want to change the name or location of a session's reject file between session runs.
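A hedged sketch of how such session parameters might appear in a parameter file, in the same spirit as the example under question 4; the folder, session and connection names are hypothetical, and the exact built-in parameter prefixes vary by version:

[MyFolder.s_daily_load]
$DBConnectionSource=oracle_src_conn
$InputFile1=/data/in/orders_20040131.dat
$OutputFile1=/data/out/orders_clean.dat
$BadFile1=/data/bad/orders.bad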
