
Informatica Interview Questions & Answers

How can we eliminate duplicate rows from a flat file?
Use a Sorter transformation. When you configure the Sorter transformation to treat output rows as distinct, it makes all ports part of the sort key and discards any rows found to be duplicates during the sort operation.

Which is better among incremental load, normal load, and bulk load?
If the database supports it, bulk load is recommended for the initial load of the tables. Between normal and incremental loading, choose based on the requirement: incremental load moves only new or changed rows, while normal load reprocesses the full source. In general, normal load is the safer choice because it writes database logs and therefore supports session recovery, which bulk load does not.

Why are dimension tables denormalized in nature?
A data warehouse must maintain historical data. For example, an employee's previous department and current department must both be kept. If the natural primary key (the employee id) were enforced, the table could not hold more than one row per employee, so surrogate keys (for example, generated from an Oracle sequence) are used instead, allowing multiple rows with the same employee number. Because dimensions carry these repeated, historical attributes, they are denormalized by design: the rows are not exact duplicates, but several versions of the same business entity.

How do you retrieve the records from a rejected file? Explain with syntax or an example.
Each time you run a session, a reject file is created containing the rows that failed to load. You can correct the records in the reject file and then load them into the target directly from that file using the Reject Loader utility.

How do you get the first 100 rows from a flat file into the target?
1. Use the Test Load option (specifying the number of rows) if this is only for testing.
2. Otherwise, put a counter (Sequence Generator or an expression variable) in the mapping and filter out rows once the counter exceeds 100.

What is data cleansing?
Data cleansing is the polishing of data before it is added to the data warehouse. For example, one source system may store gender as M and F while another stores it as MALE and FEMALE; this data must be standardized to a single representation. Addresses are another typical example: each source system may hold a customer's address in a different form, so an address-cleansing tool is often needed to bring the addresses into a clean, consistent format.

What is a transformation?
A transformation is a repository object that passes data to the next stage (the next transformation or the target), with or without modifying the data.

How do you create a single Lookup transformation using multiple tables?
If you want a single set of lookup values to be used in multiple target tables, use an unconnected Lookup: its result can be collected from the source table and applied in any target table, depending on the business rule.

Can we eliminate duplicate rows by using Filter and Router transformations?
For a relational source, duplicates are normally removed by checking the Select Distinct option in the Source Qualifier properties, or with a SQL override that enforces uniqueness. If the source is a flat file, use a Sorter (with the distinct option) or an Aggregator transformation instead.
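The flat-file case above (a Sorter or Aggregator with all ports as the key) boils down to keeping the first occurrence of each complete row. A minimal Python sketch of that logic, assuming a simple comma-delimited file with no quoting:

```python
def distinct_rows(lines):
    """Keep the first occurrence of each row, like a Sorter
    configured as 'distinct' with all ports in the sort key."""
    seen = set()
    out = []
    for line in lines:
        key = tuple(line.rstrip("\n").split(","))  # all columns form the key
        if key not in seen:
            seen.add(key)
            out.append(line.rstrip("\n"))
    return out

rows = ["10,SMITH", "20,JONES", "10,SMITH", "30,BLAKE"]
print(distinct_rows(rows))  # ['10,SMITH', '20,JONES', '30,BLAKE']
```

A real flat-file source would also need the delimiter and quoting rules from the file definition; the set-based check here plays the role of the sort-and-compare the Sorter performs.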

If a session fails after loading 10,000 records into the target, how can you load from the 10,001st record when you run the session the next time (Informatica 6.1)?
Run the session in recovery mode. This works only when the target load type is Normal; with Bulk load, recovery does not work as expected. Alternatively, set the number of initial source rows to skip in the session properties so that the next run starts at row 10,001.

At most, how many transformations can be used in a mapping?
There is no fixed limit; a mapping can use any number of transformations, depending on the project and the transformations required by the particular logic.

How many types of dimensions are available in Informatica?
One major classification used in real-time modeling is the Slowly Changing Dimension (SCD):
Type 1 SCD: an updated row replaces the previously existing row, so historical data is lost.
Type 2 SCD: a new row is added for the updated data, so both current and past records are kept, which agrees with the data warehousing goal of maintaining history. This is the most commonly used type.
Type 3 SCD: new columns are added to hold the previous values.
There is one more kind of dimension, the conformed dimension: a dimension that carries the same meaning across different star schemas. A Time dimension is the classic example: wherever it is used, it means the same thing.

What are variable ports, and list two situations where they can be used?
A transformation has three kinds of ports: input, output, and variable. An input port represents data flowing into the transformation; an output port passes data to the next transformation; a variable port holds intermediate values. Variable ports are used, for example, when a mathematical calculation needs intermediate steps, or when the same complex expression is reused by several output ports.

What is the Lookup transformation?
Use a Lookup transformation in a mapping to look up data in a relational table, view, or synonym. The Informatica server queries the lookup table based on the lookup ports in the transformation and compares the lookup port values to the lookup table column values according to the lookup condition.

How can you improve the performance of an Aggregator transformation?
1. Send sorted input.
2. Increase the aggregator cache sizes (index cache and data cache).
3. Connect only the input/output ports you actually need, i.e. reduce the number of ports.

What is the difference between PowerCenter 6 and PowerCenter 7?
1) You can look up flat files in 7.x, which you cannot do in 6.x.
2) The External Stored Procedure transformation of 6.x is no longer available in 7.x; in its place, 7.x adds the Custom transformation, which is not available in 6.x.
The main other differences are that 7.x adds version control, session-level error handling, and XML enhancements for data integration.
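The Type 2 SCD behaviour described earlier (expire the current row, add a new one under a fresh surrogate key) can be sketched outside Informatica. A minimal Python sketch, assuming the dimension is a list of dicts carrying a surrogate key and a current-row flag (the column names are invented for the demo):

```python
def scd_type2_upsert(dim, emp_id, dept, next_key):
    """Type 2 SCD: expire the current row (if the attribute changed)
    and append a new version, keeping full history."""
    current = next((r for r in dim if r["emp_id"] == emp_id and r["current"]), None)
    if current and current["dept"] == dept:
        return next_key  # no change, nothing to do
    if current:
        current["current"] = False  # expire the old version
    dim.append({"sk": next_key, "emp_id": emp_id, "dept": dept, "current": True})
    return next_key + 1

dim = []
k = scd_type2_upsert(dim, 100, "SALES", 1)
k = scd_type2_upsert(dim, 100, "HR", k)  # employee moved: history kept
print([(r["sk"], r["dept"], r["current"]) for r in dim])
# [(1, 'SALES', False), (2, 'HR', True)]
```

A production SCD 2 row would usually also carry effective-from/effective-to dates; the flag alone keeps the sketch short.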

My source table has 1,000 records and I want to load records 501 to 1,000 into my target table. How can I do this?
In DB2 you would write a statement such as FETCH FIRST n ROWS ONLY; in Informatica you can use a Sequence Generator and a Filter that drops rows until the counter exceeds 500. Alternatively, override the SQL query in the Workflow Manager, for example (Oracle):
select * from tab_name where rownum <= 1000
minus
select * from tab_name where rownum <= 500;

How is the Union transformation an active transformation?
An active transformation changes the number of rows that reach the target: Source (100 rows) -> Active transformation -> Target (fewer or more than 100 rows). A passive transformation does not: Source (100 rows) -> Passive transformation -> Target (100 rows). A Union transformation combines the data from two or more sources. If Table-1 contains 10 rows and Table-2 contains 20 rows, combining them gives 30 rows in the target, so Union is definitely an active transformation.

Why do we use Lookup transformations?
Use a Lookup transformation in your mapping to look up data in a relational table, view, or synonym. You can import a lookup definition from any relational database to which both the Informatica client and server can connect, and you can use multiple Lookup transformations in a mapping.

What are UTPs?
A UTP (unit test plan) is written by the developer and executed to check that the mappings are built according to the given business rules.
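The rownum-minus technique above can be exercised in any database. A sqlite3 sketch (sqlite has no rownum, so LIMIT/OFFSET plays the same role; the table and column names are invented for the demo):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE src (id INTEGER)")
con.executemany("INSERT INTO src VALUES (?)", [(i,) for i in range(1, 1001)])

# Oracle: rows 501..1000 via "rownum <= 1000 MINUS rownum <= 500".
# sqlite equivalent: skip the first 500 rows, take the next 500.
rows = con.execute(
    "SELECT id FROM src ORDER BY id LIMIT 500 OFFSET 500"
).fetchall()
print(rows[0][0], rows[-1][0], len(rows))  # 501 1000 500
```

Note the ORDER BY: without a deterministic order, "rows 501 to 1000" is not well defined in either database.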



What are the transformations that restrict the partitioning of sessions?
- Advanced External Procedure and External Procedure transformations: these have a check box on the Properties tab to allow partitioning.
- Aggregator transformation: if you use sorted ports, you cannot partition the associated source.
- Joiner transformation: you cannot partition the master source.
- Normalizer transformation.
- XML targets.

How do you import VSAM files from source to target?
In the Mapping Designer there is a direct option to import such files: Sources => Import from File => file from COBOL.

How do you get two targets, T1 containing distinct values and T2 containing duplicate values, from one source S1?
Load T1 through a transformation that removes duplicates (for example a Sorter with the distinct option), and load T2 directly from the source.

If you want to create indexes after the load process, which transformation do you choose?
This is usually not done at the mapping (transformation) level but at the session level. Create a Command task that executes a script (a shell script on Unix, or any other script) containing the CREATE INDEX commands, and place it in the workflow after the session; alternatively, run it as a post-session command.

Differences between Informatica 6.2 and Informatica 7.0?
Features in 7.x include:
1. Union and Custom transformations
2. Lookup on flat files
3. Grid support: servers working on different operating systems can coexist on the same grid
4. Command-line repository administration with pmrep
5. Export of independent and dependent repository objects
6. Metadata reports that can be generated from a web browser
7. Version control
8. Data profiling

What are the main advantages and purpose of using a Normalizer transformation in Informatica?
The Normalizer transformation is used mainly with COBOL sources, where data is most often stored in denormalized format. It can also be used to create multiple rows from a single row of data.
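The "multiple rows from a single row" behaviour of the Normalizer can be sketched in Python. Here a row with repeating quarterly-sales columns is pivoted into one row per quarter (the column names are invented for the demo):

```python
def normalize(row, repeat_cols):
    """Pivot repeating columns into separate rows, one per occurrence,
    like a Normalizer handling an OCCURS clause in a COBOL source."""
    fixed = {k: v for k, v in row.items() if k not in repeat_cols}
    out = []
    for i, col in enumerate(repeat_cols, start=1):
        r = dict(fixed)
        r["occurrence"] = i  # like the generated occurrence index
        r["value"] = row[col]
        out.append(r)
    return out

src = {"store": "S1", "q1_sales": 100, "q2_sales": 150, "q3_sales": 90}
for r in normalize(src, ["q1_sales", "q2_sales", "q3_sales"]):
    print(r)
# one output row per quarter, each repeating the fixed 'store' column
```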
Differences between connected and unconnected Lookup?

Connected Lookup:
- Receives input values directly from the pipeline.
- Can use a dynamic or a static cache. The cache includes all lookup columns used in the mapping (lookup source columns in the lookup condition and lookup source columns linked as output ports to other transformations).
- Can return multiple columns from the same row, or insert into the dynamic lookup cache.
- If there is no match for the lookup condition, the PowerCenter Server returns the default value for all output ports; with dynamic caching, it inserts the row into the cache or leaves the cache unchanged.
- If there is a match, the PowerCenter Server returns the result of the lookup condition for all lookup/output ports; with dynamic caching, it updates the row in the cache or leaves it unchanged.
- Passes multiple output values to another transformation by linking lookup/output ports to it.
- Supports user-defined default values.

Unconnected Lookup:
- Receives input values from the result of a :LKP expression in another transformation.
- Can use only a static cache. The cache includes all lookup/output ports in the lookup condition plus the lookup/return port.
- You designate one return port (R); it returns one column from each row.
- If there is no match for the lookup condition, the PowerCenter Server returns NULL.
- If there is a match, the PowerCenter Server returns the result of the lookup condition into the return port.
- Passes one output value to the transformation calling the :LKP expression, through the lookup/return port.
- Does not support user-defined default values.

What are connected and unconnected transformations?
An unconnected transformation is not connected to other transformations in the mapping; it is invoked from an expression. A connected transformation is linked to other transformations in the mapping's data flow.

How do we validate all the mappings in the repository at once?
You cannot validate the entire repository in one go, but you can validate all the mappings in a folder at once and repeat the process for each folder. Log on to the Repository Manager, open the folder and its Mappings node, select some or all mappings (using Shift or Ctrl; Ctrl+A does not work), then right-click and choose Validate.

Why did you use Update Strategy in your application?
The Update Strategy is one of the most important Informatica transformations: it flags each row for insert, update, delete, or reject, so the session applies the correct operation against the target.

What is partitioning? Where can we use partitions, what are the advantages, and is it necessary?
In Informatica, performance can be tuned at five levels: source, target, mapping, session, and network. Partitioning is a session-level tuning technique. There are four partition types: pass-through (the default), hash, round-robin, and key range. Hash partitioning has two flavours, user-defined keys and auto keys. Round-robin cannot be applied at the source level, only at some transformation levels; key range can be applied at both the source and the target level.

What is the exact meaning of domain?
In Informatica, a domain is a central global repository (GDR) together with the local repositories (LDRs) registered to it. This is possible only in PowerCenter, not in PowerMart.

What is the difference between the Informatica Repository Server and the Informatica Server?
The Repository Server manages connections to the repository from client applications. The Informatica Server extracts the source data, performs the data transformations, and loads the transformed data into the target.

What is the "new lookup port" in the Lookup transformation?
This button creates a port in which you can enter the name and datatype of a port. It is mainly used with unconnected lookups, where it reflects the datatype of the input port.

What is the use of incremental aggregation? Explain briefly with an example.
Incremental aggregation is a session property. Suppose the source had 500 records on the first run and 300 new records arrive before the next run. Without incremental aggregation, the calculations are redone over all 800 records; with it, the saved aggregate results are reused and the calculation is performed only on the 300 new records, which improves performance.

What is the Source Qualifier transformation?
When you add a relational or flat file source definition to a mapping, you must connect it to a Source Qualifier transformation. The Source Qualifier represents the rows that the Informatica Server reads when it executes a session.

What are the types of groups in a Router transformation?
A Router transformation has input and output groups.

Input group: the Designer copies property information from the input ports of the input group to create a set of output ports for each output group.

Output groups: there are two types, user-defined groups and the default group. You cannot modify or delete output ports or their properties.

What is the difference between the Joiner transformation and the Source Qualifier transformation?
You can join heterogeneous data sources in a Joiner transformation, which you cannot achieve in a Source Qualifier. A Source Qualifier join needs matching keys between the two relational sources, and both sources must come from the same data source; a Joiner does not require a key relationship and can join relational sources coming from different data sources.

How can you improve session performance in an Aggregator transformation?
Use the following guidelines:
- Use sorted input to decrease the use of aggregate caches. Sorted input reduces the amount of data cached during the session and improves session performance; use this option together with a Sorter transformation that passes sorted data to the Aggregator.
- Limit connected input/output or output ports, to reduce the amount of data the Aggregator stores in the data cache.
- Filter before aggregating: if the mapping uses a Filter transformation, place it before the Aggregator to avoid unnecessary aggregation.

How many types of tasks do we have in the Workflow Manager? What are they?
1) Session: runs mappings. 2) Command: runs OS commands/scripts. 3) Email. 4) Event-Wait and 5) Event-Raise: raise user-defined or predefined events and wait for the event to be raised. 6) Assignment: assigns values to workflow variables. 7) Control. 8) Decision. 9) Timer. 10) Worklet: runs worklets.

How do we load from a PL/SQL script into an Informatica mapping?
Use a Stored Procedure transformation and specify the PL/SQL procedure name there. When the session containing this transformation runs, the PL/SQL procedure is executed.
How do you move a mapping from one repository to another?
There are two ways of doing this. 1. Open the mapping you want to migrate, go to the File menu, select Export Objects, and give a name; an XML file is generated. Connect to the repository you want to migrate to, then select File > Import Objects and choose that XML file. 2. Connect to both repositories, go to the source folder, select the mapping in the object navigator, and choose Copy from the Edit menu; then go to the target folder and choose Paste from the Edit menu. Be sure the target folder is open.

How do you join two tables without using the Joiner transformation?



It is possible to join two or more tables using the Source Qualifier, provided the tables have a relationship. When you drag and drop the tables, you get one Source Qualifier per table; delete all of them, add a single common Source Qualifier for all the sources, right-click it and choose Edit, go to the Properties tab, and write your join in the SQL Query attribute.

Compare the data warehousing top-down approach with the bottom-up approach.
Top-down: first build the data warehouse, then the data marts. This needs more cross-functional skills, is time-consuming, and is also costly. Bottom-up: first build the data marts, then the data warehouse. The first data mart built serves as a proof of concept for the others, taking less time and cost than the top-down approach.

How can you say that the Union transformation is an active transformation?
By definition, an active transformation changes the number of rows that pass through it. In a Union transformation, the number of rows resulting from the union can differ from the number of rows in any one input, so it is active.

How do you look up data on multiple tables?
When you create a Lookup transformation, Informatica asks for a table name and offers Source, Target, Import, and Skip. Click Skip, then use the SQL override property on the Properties tab to join the tables for the lookup.

Can we modify the data in a flat file?
Yes. Just open the text file with Notepad and change whatever you want, but the datatypes must stay the same.

Is a fact table normalized or denormalized?
Dimensional models combine normalized and denormalized table structures. The dimension tables of descriptive information are highly denormalized, with detailed and hierarchical roll-up attributes in the same table, while the fact tables of performance metrics are typically normalized. A fully normalized design with snowflaked dimension attributes in separate tables is advised against (it creates blizzard-like conditions for the business user), but a single denormalized big wide table containing both metrics and descriptions in the same table is also ill-advised.

What is the Router transformation?
Router is an active, connected transformation, similar to the Filter transformation. But where a Filter eliminates the rows that do not meet the condition, a Router has the option to capture them as well. It is useful for testing multiple conditions: it has input, output, and default groups. For example, rows with deptno=10, deptno=20, deptno=30, and all other deptnos can easily be routed to different tables.

In dimensional modeling, is the fact table normalized or denormalized?
In a star schema, a single fact table is surrounded by a group of dimension tables comprising denormalized data; in a snowflake schema, the surrounding dimension tables comprise normalized data.

What is the difference between a view and a materialized view?
A view is just a stored query and has no physical part; once a view is instantiated, performance can be quite good, until it is aged out of the cache. A materialized view has a physical table associated with it, so it does not have to resolve the query each time it is queried. Depending on the size of the result set and the complexity of the query, a materialized view should perform better.

How do you transfer the data from a data warehouse to a flat file?



You can write a mapping with the flat file as a target, using a dummy connection. A flat file target is built by pulling a source into the target space using the Warehouse Designer tool.

What is a hash table in Informatica?
In hash partitioning, the Informatica Server uses a hash function to group rows of data among partitions, based on a partition key. Use hash partitioning when you want the server to distribute rows to the partitions by group: for example, when you need to sort items by item ID but do not know how many items have a particular ID number.
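The hash-partitioning behaviour above is easy to sketch: rows sharing a partition key always land in the same partition, while the overall load spreads across partitions. A small Python sketch (the item IDs and partition count are invented):

```python
import zlib

def hash_partition(rows, key, n_parts):
    """Distribute rows among n partitions so that all rows sharing
    a partition-key value end up in the same partition."""
    parts = [[] for _ in range(n_parts)]
    for row in rows:
        # crc32 gives a stable hash across runs (unlike Python's salted hash())
        h = zlib.crc32(str(row[key]).encode()) % n_parts
        parts[h].append(row)
    return parts

rows = [{"item_id": i % 5, "qty": i} for i in range(20)]
parts = hash_partition(rows, "item_id", 3)
# every group stays whole: no item_id is split across two partitions
for p in parts:
    print(sorted({r["item_id"] for r in p}))
```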

How do you check the source for the latest records that are to be loaded into the target?
a) Create a lookup on the target table from the Source Qualifier, based on the primary key. b) Use an Expression to evaluate the primary key returned by the target lookup: for a new source record, the lookup's primary-key port returns NULL. Trap this with DECODE and proceed accordingly.

What logic will you implement to load the data into one fact table from n dimension tables?
First create the fact table and the dimension tables. Load data into the individual dimensions using sources and transformations (Aggregator, Sequence Generator, Lookup) in the Mapping Designer. Then, for the fact table, connect each dimension's surrogate key to the corresponding foreign key, along with the required columns from the dimensions.

Without using the Update Strategy and session options, how can we update our target table?
If the database is Teradata, it can be done with the TPump or MultiLoad external loaders. Otherwise, the update override in the target properties is used, basically for updating the target based on a non-key column (for example, update by ENAME, which is not a key column in the EMP table). An Update Strategy or session-level update property, by contrast, necessarily requires a primary key.

What are the differences between Informatica PowerCenter versions 6.2 and 7.1, and between versions 6.2 and 5.1?
The main difference between 5.1 and 6.x is that 6.x introduced the Repository Server, and in place of the Server Manager (5.1) it introduced the Workflow Manager and Workflow Monitor. In 7.x you additionally have the option of looking up flat files, and you can write to XML targets.

How do you enter the same record twice in a target table?
Declare the target table twice in the mapping and route the output to both instances, or use SQL such as: insert into table1 select * from table1; (table1 being the name of the table).

Informatica live interview questions: explain grouped cross tab; explain REF CURSOR; what are parallel queries and query hints.

How do you select duplicate rows using Informatica?



You can write a SQL override in the Source Qualifier to deal with duplicates. For example, consider a table dept_test(deptno, deptname) containing duplicate records. To eliminate duplicates, write one of the following in the Source Qualifier SQL override:
1) select distinct deptno, deptname from dept_test;
2) select avg(deptno), deptname from dept_test group by deptname;
If you want only the duplicate records, write the following in the Source Qualifier SQL override:
select distinct deptno, deptname from dept_test a
where deptno in (select deptno from dept_test b group by deptno having count(1) > 1);

When we create a target as a flat file and the source as Oracle, how can we make the first row of the flat file contain the column names?
One option is a pre-session statement that writes a header row, but this is a hard-coded method: if you change the column names or add extra columns to the flat file, you have to change the statement as well.

What is the procedure to write the query that lists the highest salaries of three employees?
In Oracle (using the EMP table):
select * from emp e where 3 > (select count(*) from emp where e.sal > emp.sal) order by sal desc;
In SQL Server:
select top 3 sal from emp order by sal desc;

Which is better between connected and unconnected Lookup transformations, in Informatica or any other ETL tool?
It is not an easy question to answer; it depends on experience and on the requirement. Basically, a connected lookup can return multiple values, while an unconnected lookup returns one value. A connected lookup is in the same pipeline as the source and accepts dynamic caching; an unconnected lookup does not have that facility, but it is favourable in some special cases, for example when the output of one lookup feeds another lookup as input.

Which objects are required by the Debugger to create a valid debug session?
A valid debug session can be created even without a single breakpoint, but you must give valid database connection details for the sources, targets, and lookups used in the mapping, and the mapping must contain valid mapplets (if there are any).

We are using an Update Strategy transformation in a mapping. How can we know whether the insert, update, reject, or delete option was selected while the session runs?
In the Designer, while creating the Update Strategy transformation, uncheck the option to forward rejected rows; any rejected rows are then automatically written to the session log file. Updates and inserts are known only by checking the target file or table.

What do the Expression and Filter transformations do in the Informatica Slowly Growing Target wizard?
The Expression checks whether the primary key already exists and calculates a new flag



Based on that new flag, the Filter transformation filters the data.

What is a mapplet?
A mapplet is a set of transformations that you build in the Mapplet Designer and can reuse in multiple mappings. A mapplet should have a Mapplet Input transformation that receives input values and a Mapplet Output transformation that passes the final modified data back to the mapping. When the mapplet is displayed within a mapping, only its input and output ports are shown, so the internal logic is hidden from the end user's point of view.

How many Joiner transformations are needed to join 10 different sources?
Nine joiners are required to join 10 different sources (n sources need n-1 joiners).

How do you get 25 of a table's 100 fields? Is there any transformation available in Informatica, such as Router/Filter or any other?
You can use a SQL override to select just those 25 columns, or import the table and, in the source definition, cut out all but the 25 columns you want.

What is the DTM process?
DTM stands for Data Transformation Manager. The DTM modifies the data according to the instructions given by the session's mapping. Its output files are: the Informatica server log (on Unix, the server creates a log for all status and error messages, by default pm.server.log, plus an error log, in the Informatica home directory), the session log file, the session detail file, the performance detail file, the control file, the post-session e-mail, the indicator file, the output file, and the cache files.

How do you load a time dimension?
For loading data into the other dimensions we have corresponding tables in the OLTP systems, but for the time dimension there is only a base date in the OLTP database. The time dimension is loaded using ETL procedures that call a procedure or function created in the database; if the time dimension has many columns, it can also be created manually, for example using an Excel sheet.
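The duplicate-row queries given earlier (DISTINCT to eliminate duplicates, GROUP BY ... HAVING COUNT > 1 to isolate them) can be exercised with sqlite3; the dept_test table mirrors the example in the text:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE dept_test (deptno INTEGER, deptname TEXT)")
con.executemany("INSERT INTO dept_test VALUES (?, ?)",
                [(10, "ACCT"), (20, "SALES"), (10, "ACCT"), (30, "OPS")])

# eliminate duplicates
distinct = con.execute(
    "SELECT DISTINCT deptno, deptname FROM dept_test ORDER BY deptno"
).fetchall()

# keep only the rows whose deptno occurs more than once
dups = con.execute(
    "SELECT DISTINCT deptno, deptname FROM dept_test "
    "WHERE deptno IN (SELECT deptno FROM dept_test "
    "GROUP BY deptno HAVING COUNT(1) > 1)"
).fetchall()

print(distinct)  # [(10, 'ACCT'), (20, 'SALES'), (30, 'OPS')]
print(dups)      # [(10, 'ACCT')]
```

Either statement can go into a Source Qualifier SQL override unchanged (modulo the database's own SQL dialect).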
When do you use an unconnected lookup and a connected lookup?
With a static lookup cache, you cache all the lookup data at the start of the session; with an uncached lookup, you query the database for each record that needs the lookup. Building the static cache adds to the session run time, but it saves time overall because Informatica does not need to connect to the database every time it needs a lookup value. Decide based on how many rows in your mapping need the lookup; also remember that a static cache eats up space, so select only the columns that are needed.

Why and where do we use a factless fact table?
Factless fact tables are fact tables with no facts or measures (numerical data); they contain only the foreign keys of the corresponding dimensions. For example, a fact table may record temperature as Moderate, Low, or High; this kind of thing is called a non-additive measure.

What transformation can you use in place of a Lookup?
A Lookup returns either the first or the last matching value. If the lookup would have more than one matching record and we need all matching records, we can use a master or detail outer join (a Joiner transformation) instead of the lookup, according to the logic.

What is the difference between constraint-based load ordering and target load plan?



Target load order is a Designer property: click the Mappings tab in the Designer, then Target Load Plan. It shows all the target load groups in the particular mapping, and you specify the order there; the server then loads the targets accordingly. A target load group is a set of source, source qualifier, transformations, and target. Constraint-based loading, by contrast, is a session property. Here the multiple targets must be generated from one source qualifier and must possess primary/foreign-key relationships, so the server loads according to the key relationships, irrespective of the target load order plan.

How do you generate the metadata reports in Informatica?
You can run the PowerCenter Metadata Reporter from a browser on any workstation, even a workstation that does not have the PowerCenter tools installed.

How can we join tables that have no primary/foreign-key relationship and no matching port to join on?
Two sources without a common column or a common datatype can be joined using dummy ports: 1. add one dummy port to each source; 2. in an Expression transformation, assign 1 to each dummy port; 3. use a Joiner transformation to join the sources on the dummy ports in the join condition.

What is the Rank transformation? Where can we use it?
The Rank transformation is used to find top or bottom ranges. For example, if a sales table has many employees selling the same product and we need to find the top 5 or 10 employees selling the most products, we can use a Rank transformation.

What is a transaction?
A transaction is any event that indicates some action; in database terms, any committed change that occurs in the database is said to be a transaction.

What is the difference between stop and abort?
The PowerCenter Server handles the abort command for a Session task like the stop command, except that abort has a timeout period of 60 seconds. If the PowerCenter Server cannot finish processing and committing data within the timeout period, it kills the DTM process and terminates the session.

How do you handle decimal places while importing a flat file into Informatica?
While importing the flat file, the flat file wizard helps configure the properties of the file: select the numeric column and enter the precision and the scale. Precision includes the scale; for example, for the number 98888.654, enter precision 8, scale 3, and width 10 for a fixed-width flat file.

What is the procedure to load the fact table?
Usually, source records are looked up against the records in the dimension tables. Dimension tables act as lookup or reference tables, holding all the possible values; for a product, for example, every existing prod_id is in the DIM table. When data from the source is looked up against the dimension table, the corresponding surrogate keys are sent to the fact table. This is not a fixed rule to be followed; it may vary with your requirements and methods. Sometimes only an existence check is done and the prod_id itself is sent to the fact.

What is the difference between the Informatica PowerCenter Server, the Repository Server, and the repository?

The repository is a database in which all Informatica components are stored in the form of tables. The Repository Server controls the repository and maintains data integrity and consistency across the repository when multiple users use Informatica. The PowerCenter Server (Informatica Server) is responsible for executing the components (sessions) stored in the repository.

What is a time dimension? Give an example.
The time dimension is one of the most important dimensions in a data warehouse. Whenever you generate a report over time, you access the data through the time dimension. For example, a date dimension might have the fields: date key, full date, day of week, day, month, quarter, and fiscal year.

Why do we use the Stored Procedure transformation?
You might use stored procedures to do the following tasks: check the status of a target database before loading data into it; determine whether enough space exists in a database; perform a specialized calculation; drop and re-create indexes.
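The date-dimension fields listed above can be generated programmatically before loading. The sketch below is illustrative only (the start date, number of rows, and column names are invented; fiscal year is omitted since its rules vary by company):

```python
# Sketch: generate rows for a date/time dimension with the fields
# mentioned above: date key, full date, day of week, month, quarter.
from datetime import date, timedelta

def time_dimension(start, days):
    rows = []
    for i in range(days):
        d = start + timedelta(days=i)
        rows.append({
            "date_key": int(d.strftime("%Y%m%d")),  # smart key, e.g. 20240101
            "full_date": d.isoformat(),
            "day_of_week": d.strftime("%A"),
            "month": d.month,
            "quarter": (d.month - 1) // 3 + 1,
        })
    return rows

dim = time_dimension(date(2024, 1, 1), 3)
```

In a real warehouse such a table is loaded once, well into the future, and the fact tables carry only the integer `date_key`.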

What are the types of lookup caches?
Persistent cache: you can save the lookup cache files and reuse them the next time the Informatica Server processes a lookup transformation configured to use the cache.
Recache from database: if the persistent cache is not synchronized with the lookup table, you can configure the lookup transformation to rebuild the lookup cache.
Static cache: you can configure a static, or read-only, cache for any lookup table. By default the Informatica Server creates a static cache. It caches the lookup table and looks up values in the cache for each row that comes into the transformation. When the lookup condition is true, the Informatica Server does not update the cache while it processes the lookup transformation.
Dynamic cache: if you want to cache the target table and insert new rows into both the cache and the target, you can create a lookup transformation that uses a dynamic cache. The Informatica Server dynamically inserts data into the target table.
Shared cache: you can share the lookup cache between multiple transformations. You can share an unnamed cache between transformations in the same mapping.

What are semi-additive and fully additive measures?
There are three types of facts: additive, semi-additive, and non-additive. Additive means that when a measure is queried from the fact table, the result is meaningful across all the dimension tables linked to the fact. Semi-additive means the result is meaningful across only some of the dimension tables. Non-additive means the measure does not relate to any of the dimensions; the result comes directly from the measures of the fact table itself. For example, to calculate a percentage of a loan we take the value from the fact measure (loan) and divide it by 100, without involving any dimension.
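The additive vs. semi-additive distinction above can be made concrete with a toy fact table. Here sales amount is fully additive (it sums across every dimension), while an account balance is the classic semi-additive case: it sums across accounts but not across time, so over time you take a snapshot such as the latest day. All data below is invented:

```python
# Toy fact rows: (day, account, balance, sales)
facts = [
    (1, "A", 100, 10),
    (1, "B", 200, 5),
    (2, "A", 120, 7),
    (2, "B", 180, 8),
]

# Additive measure: sales can simply be summed over all rows.
total_sales = sum(f[3] for f in facts)

# Semi-additive measure: balances must NOT be summed over time.
# Instead, sum across accounts for the latest day only (a snapshot).
latest = max(f[0] for f in facts)
eod_balance = sum(f[2] for f in facts if f[0] == latest)
```

Summing the balance column over all four rows would give 600, which is meaningless; the snapshot gives the true end-of-day position of 300.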

In the Source Qualifier and in the target I have rows 1, 2, 3, and I want to insert records 4 and 5 into the source and also have them in the target. What is the procedure?
If you want to load records into the source and the target at the same time, consider the source also as a target. Inserting records 4 and 5 into the target is straightforward, but you cannot load data into the source itself; the source simply contains those records. If you want to check the target before inserting 4 and 5, use the target table as a lookup transformation. If the target has a primary key, you can use an Update Strategy with the update-else-insert option instead of a lookup.

What are mapping specifications? How are repository objects versioned?
A mapping specification is a metadata document describing a mapping.

What is the Repository Agent?
The Repository Agent is a multi-threaded process that fetches, inserts, and updates metadata in the repository database tables. It uses object locking to ensure the consistency of metadata in the repository.

What about rapidly changing dimensions? Can you analyze with an example?
Rapidly changing dimensions are those whose values change continuously, making them difficult to maintain. A good real-world example: consider a retailing case with an SKU dimension of around 150,000 unique products, which is already an SCD Type 2 for some attributes. In addition, you want to track changes to the sales and purchase prices. However, these prices change almost daily for many of the products, leading to a huge dimension table that requires continuous updates. A better option is to move those attributes into the fact table as facts, which solves the problem.

Why did you use stored procedures in your ETL application?
Stored procedures play an important role. Suppose you are using an Oracle database and making ETL changes with Informatica: every row of the table has to pass through Informatica and undergo the ETL changes specified in the transformations. If you use a stored procedure (an Oracle PL/SQL package) instead, it runs on the Oracle database itself, which is where the changes are needed, and it will be faster than Informatica because it runs inside the database. There are also things that cannot be done in the tool but can be done in packages. Some jobs may take hours to run; to save time and database load, we can go for stored procedures.

What is the main difference between PowerCenter and PowerMart?
PowerCenter has multiple repositories, whereas PowerMart has a single (desktop) repository. PowerCenter can also link to a global repository to share objects between users.

What is a worklet? What is it used for, and in which situations?
A worklet is a set of tasks. If a certain set of tasks has to be reused in many workflows, we use a worklet. To execute a worklet, it has to be placed inside a workflow. The use of a worklet in a workflow is similar to the use of a mapplet in a mapping.

What are mapping parameters and variables? In which situations can we use them?

If we need to change certain attributes of a mapping every time the session runs, it would be tedious to edit the mapping each time. So we use mapping parameters and variables and define their values in a parameter file; then we only edit the parameter file to change the attribute values, which keeps the process simple. Mapping parameter values remain constant; to change a parameter value we edit the parameter file. The value of a mapping variable, however, can be changed using variable functions. If we need to increment a value by 1 after every session run, we use a mapping variable; with a mapping parameter we would have to manually edit the value in the parameter file after every run.

What is a mystery dimension?
Also known as a junk dimension: a way of making sense of the rogue fields in your fact table.

What are the steps required for a Type 2 dimension/version data mapping, and how can we implement it?
1. Determine whether the incoming row is (1) a new record, (2) an updated record, or (3) a record that already exists in the table, using two lookup transformations, and split the mapping into three separate flows with a Router transformation. 2. For case (1), create a pipeline that inserts the rows into the table. 3. For case (2), create two pipelines from the same source: one updating the old record, one inserting the new version.

Can anyone explain error handling in Informatica with examples, so that it is easy to explain in an interview?
You can create generalized transformations to handle errors and reuse them in your mappings. For example, create one generalized transformation for data-type errors and include it in your mapping; then you will know where the errors occur.

Can we use an aggregator or another active transformation after an Update Strategy transformation?
You can use an aggregator after an Update Strategy. The problem is that once you perform the update strategy — say you flagged some rows for deletion — and then aggregate over all rows with, say, a SUM function, the deleted rows will be subtracted in the aggregation.

What is the limit to the number of sources and targets you can have in a mapping? (Monday, November 27th, 2006)
There is no such restriction on the number of sources or targets in a mapping. The real question is what happens to your database if N tables participate in processing at the same time. From an organizational point of view, using a large number of tables at once is discouraged, as it degrades database and Informatica Server performance.

Briefly explain the versioning concept in PowerCenter 7.1.
PowerCenter 7.x has 22 transformations compared with 17 in 6.x (five new ones were added, including a lookup on flat files). When you create a version of a folder referenced by shortcuts, all shortcuts continue to reference their original object in the original version; they do not automatically update to the current folder version. For example, if you have a shortcut to a source definition in the Marketing folder, version 1.0.0, and you then create folder version 1.5.0, the shortcut continues to point to the source definition in version 1.0.0.
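The Type 2 flow-splitting steps described earlier (lookup, then route each row to an insert flow, a changed flow, or an unchanged flow) can be sketched outside Informatica as plain routing logic. The key name, tracked attribute, and versioning rule below are all invented for illustration:

```python
# Sketch of SCD Type 2 routing: compare each incoming row against the
# current dimension rows, then send it to one of three flows:
#   inserts  - brand-new dimension member (version 1)
#   updates  - changed member (insert a new version of the row)
#   ignored  - already exists, unchanged
dimension = {  # natural key -> current row (hypothetical data)
    "E1": {"version": 1, "city": "Pune"},
}

def route(incoming):
    inserts, updates, ignored = [], [], []
    for row in incoming:
        current = dimension.get(row["emp_id"])
        if current is None:
            inserts.append(dict(row, version=1))                       # flow 1
        elif current["city"] != row["city"]:
            updates.append(dict(row, version=current["version"] + 1))  # flow 2
        else:
            ignored.append(row)                                        # flow 3
    return inserts, updates, ignored

ins, upd, ign = route([
    {"emp_id": "E2", "city": "Delhi"},   # new member
    {"emp_id": "E1", "city": "Mumbai"},  # changed member
])
```

In the real mapping the "updates" flow also expires the old row (end-dating it or clearing a current-flag) before the new version is inserted.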

Maintaining versions of shared folders can result in shortcuts pointing to different versions of the folder. Though shortcuts to different versions do not affect the server, they can be difficult to maintain. To avoid this, you can re-create shortcuts pointing to earlier versions, but this is not practical for much-used objects. Therefore, when possible, do not version folders referenced by shortcuts.

How do you create the staging area in your database?
A staging area in a data warehouse is used as a temporary space to hold all the records from the source systems. So it should be more or less an exact replica of the source systems, except for the load strategy, where we use truncate-and-reload options. Create the staging tables using the same layout as the source tables, or use the Generate SQL option in the Warehouse Designer tab.

What is the basic language of Informatica?
The basic language of Informatica is SQL (SQL*Plus); that is how it understands the database.

What is ODS? What data is loaded from it? What is the DW architecture?
ODS stands for Operational Data Store, normally in 3NF form, with data stored with the least redundancy. The general architecture of a DWH is: OLTP system -> ODS -> DWH (denormalized star or snowflake, varying case by case). As an example, assume a 24/7 company whose peak hours are 9 to 9, adding or modifying around 40,000 records per day. At 9 o'clock a backup is taken. From 9 p.m. to 9 a.m., instead of storing new data on the same server, it is stored separately; assume 10,000 records are added in that window. The next morning, instead of reloading 40,000 + 10,000 records, which would perform poorly, only the 10,000 records are taken. This concept is the ODS. The architecture can go two ways: ODS -> WH, or ODS -> staging area -> WH.

What is incremental loading? What is versioning in 7.1?
Incremental loading in a DWH means loading only the changed and new records, i.e. not reloading the as-is records that already exist. Versioning in Informatica 7.1 is like a configuration management system where you have every version of a mapping you ever worked on. When you have checked out a mapping and created a lock, no one else can work on the same mapping version. This is very helpful in an environment where several users work on a single feature.

What is the difference between a materialized view and a data mart? Are they the same?
A materialized view provides indirect access to table data by storing the results of a query in a separate schema object, unlike an ordinary view, which does not take up storage space or contain data. Materialized views are schema objects that can be used to summarize, precompute, replicate, and distribute data, e.g. to construct a data warehouse. The definition of a materialized view is close to the concept of cubes, where we keep summarized data, but cubes occupy space. A data mart is a completely different concept: a data warehouse contains an overall view of the organization, while a data mart is specific to a subject area such as Finance.

We can combine the different data marts of a company to form a data warehouse, or split a data warehouse into different data marts.

With an update strategy, which target gives more performance: a table or a flat file? Why?
Flat file pros: loading, sorting, and merging operations are faster because there is no index concept and the data is in ASCII mode. Cons: there is no concept of updating existing records in a flat file, and since there are no indexes, lookups against it are slower.

How can you get distinct values while mapping in Informatica during insertion?
There are two methods to get distinct values. If the sources are databases, use a SQL override in the Source Qualifier, either by changing the default SQL query or by selecting the Select Distinct check box. If the sources are heterogeneous (from different file systems), use a Sorter transformation and enable the Distinct check box in its properties; this gives distinct values just as in the Source Qualifier.

What is change data capture?
Change data capture (CDC) is a set of software design patterns used to determine the data that has changed in a database so that action can be taken using the changed data. Basically, CDC helps identify the data in the source system that has changed since the last extraction. With CDC, data extraction takes place at the same time the insert, update, or delete operations occur in the source tables, and the change data is stored inside the database in change tables. The change data thus captured is then made available to the target systems in a controlled manner.
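The CDC idea above can be illustrated with a toy snapshot diff. A real CDC implementation reads change tables or transaction logs as the operations happen; this sketch merely compares the current source state with the last extract to find what changed (all data invented):

```python
# Toy change-data-capture: diff the current source snapshot against
# the previously extracted state and emit only inserts and updates.
previous = {1: "Acme", 2: "Globex"}                       # last extract
current = {1: "Acme", 2: "Globex Corp", 3: "Initech"}     # source now

changes = []
for key, value in current.items():
    if key not in previous:
        changes.append(("insert", key, value))   # new row since last run
    elif previous[key] != value:
        changes.append(("update", key, value))   # changed row
```

Rows present in `previous` but missing from `current` would be emitted as deletes by the same pattern; they are omitted here for brevity.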

How will you create a header and footer in the target using Informatica?
You can create a header and a trailer in the target file using an Aggregator transformation: take the record count in the aggregator, and create three separate files in a single pipeline — one for the header, one for the trailer coming from the aggregator, and a third for the main file. Concatenate the header, main file, and trailer in a post-session command using a shell script.

What is meant by EDW?
An enterprise data warehouse: a single, centralized data warehouse (the classic style of warehouse) with no associated data marts or operational data store (ODS) systems.

What will happen if you are using an Update Strategy transformation and your session is configured for insert?

If you are using an Update Strategy in a mapping, then in the session properties you have to set Treat Source Rows As to Data Driven. If you select Insert, Update, or Delete instead, the server will not consider the Update Strategy when performing database operations; in that case you can use the session-level options instead. Rather than using an Update Strategy in the mapping, you can simply select Update in Treat Source Rows As along with the update-else-insert option; this does the same job, but be sure to have a primary key on the target table. For bulk loading, Oracle has SQL*Loader; Teradata has TPump and MultiLoad. And if you pass only 5 rows to a Rank transformation, it ranks only those 5 records based on the rank port.

What would be the size of your warehouse project?
You can say 600-900 GB including the data marts; it varies depending on the project structure and how many data marts and EDWs there are.

How do you work with pmcmd on the Windows platform?
In the Workflow Manager, create a Command task with the pmcmd command and establish a link between the session and the command task; if the session executes successfully, the pmcmd command executes.

Suppose I have one source linked to three targets. When the workflow runs the first time, only the first target should be populated; the second time, only the second target; the third time, only the third target. How is this done?
One possible solution: the target load order works only within a single run, but here we have to control the data flow across runs. For that we can capture the iteration number in a flat file and use it to route the rows.

What are PowerMart and PowerCenter?
PowerCenter supports global and local repositories and also supports ERP packages; PowerMart supports local repositories only and does not support ERP packages. PowerCenter has a higher cost, PowerMart a lower one. PowerCenter is normally used for enterprise data warehouses, whereas PowerMart is used for low/mid-range data warehouses.

What is TOAD and what is it used for?
Toad is an application development tool built around an advanced SQL and PL/SQL editor. Using Toad, you can build and test PL/SQL packages, procedures, triggers, and functions.

Will a SQL override work in a flat file? What is the extension of a flat file?
A SQL override will not work on a flat file; flat files have their own set of properties to configure. A flat file source can have any extension, such as .dat or .doc. A flat file target has the extension .out by default, which can be altered in the target properties.

What is a SQL override? Where do we use it, and in which transformations?

The default SQL generated by the Source Qualifier can be overwritten in the Source Qualifier transformation itself. Besides that, in the session properties, on the Transformations tab, the Source Qualifier's SQL query defaults to the query from the mapping; overwriting it there is also called a SQL override.

What are the types of target loads?
There are two target load types: bulk and normal.

What are bulk and normal load? Where do we use each?
When we load data in bulk mode there are no entries in the database log files, so it is difficult to recover data if the session fails midway. In normal mode, every record is logged in the database log file and in the Informatica repository, so if the session fails it is easy to restart from the last commit point. Bulk mode is very fast compared with normal mode. We use bulk mode to load data into databases; it does not work with text file targets, whereas normal mode works fine with all types of targets.

For homogeneous sources we use the Source Qualifier to join the sources. Why not the Joiner transformation?
If you are using tables or flat files as sources you need a Source Qualifier; for COBOL sources you use a Normalizer. The Informatica Server does not recognize source files without source qualifiers. If you are using flat files as sources, you have to use a Joiner after the source qualifiers to join the two sources.

Can you use mapping parameters or variables created in one mapping in another mapping?
No. Mapping parameters and variables can be used only in transformations of the same mapping or mapplet in which they were created.

What are mapping parameters and mapping variables?
A mapping parameter represents a constant value that you can define before running a session; it retains the same value throughout the entire session. When you use a mapping parameter, you declare and use the parameter in a mapping or mapplet, then define its value in a parameter file for the session. Unlike a mapping parameter, a mapping variable represents a value that can change throughout the session. The Informatica Server saves the value of a mapping variable to the repository at the end of the session run and uses that value the next time you run the session.

What are the unsupported repository objects for a mapplet?
You cannot include the following objects in a mapplet: Normalizer transformations, COBOL sources, XML Source Qualifier transformations, XML sources, target definitions, pre- and post-session stored procedures, and other mapplets.
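The parameter-vs-variable behaviour described above can be simulated in a few lines: a parameter is read fresh from the parameter file each run and never changes during the session, while a variable's final value is saved back to the repository and carried into the next run. The names `$$LOAD_DATE` and `$$RUN_COUNT` are invented examples:

```python
# Sketch: mapping parameter (constant per run, from a parameter file)
# vs. mapping variable (persisted to the repository between runs).
param_file = {"$$LOAD_DATE": "2024-01-01"}   # stands in for the parameter file
repository = {"$$RUN_COUNT": 0}              # stands in for saved variable state

def run_session():
    load_date = param_file["$$LOAD_DATE"]        # parameter: same every run
    run_count = repository["$$RUN_COUNT"] + 1    # variable: incremented
    repository["$$RUN_COUNT"] = run_count        # saved at end of session
    return load_date, run_count

first = run_session()
second = run_session()
```

To change `$$LOAD_DATE` you would edit the parameter file; `$$RUN_COUNT` advances on its own, which is exactly the increment-by-one-per-run use case mentioned earlier.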

What are the methods for creating reusable transformations?

There are two methods: 1. Design it in the Transformation Developer; by default it is then a reusable transformation. 2. Promote a standard transformation from the Mapping Designer: after you add a transformation to a mapping, you can promote it to reusable status. Once you promote a standard transformation to reusable status, you cannot demote it to a standard transformation. If you change the properties of a reusable transformation in a mapping, you can revert to the original reusable transformation properties by clicking the Revert button.

How can you create or import a flat file definition into the Warehouse Designer?
You cannot create or import a flat file definition into the Warehouse Designer directly. Instead, analyze the file in the Source Analyzer, then drag it into the Warehouse Designer. When you drag the flat file source definition into the Warehouse Designer workspace, the Warehouse Designer creates a relational target definition, not a file definition. If you want to load to a file, configure the session to write to a flat file; when the Informatica Server runs the session, it creates and loads the flat file.

What is a Source Qualifier?
When you add a relational or flat file source definition to a mapping, you need to connect it to a Source Qualifier transformation. The Source Qualifier represents the rows that the Informatica Server reads when it executes a session.

What are the two modes of data movement in the Informatica Server?
The data movement mode depends on whether the Informatica Server should process single-byte or multi-byte character data. This mode selection can affect the enforcement of code page relationships and code page validation in the Informatica client and server. a) Unicode: the server allows 2 bytes for each character and uses an additional byte for each non-ASCII character (such as Japanese characters). b) ASCII: the server holds all data in a single byte. The data movement mode can be changed in the Informatica Server configuration parameters and takes effect once you restart the server.

What is the Aggregator transformation?
The Aggregator transformation is much like the GROUP BY clause in traditional SQL. It is a connected, active transformation that takes incoming data from the mapping pipeline, groups it based on the specified group-by ports, and calculates aggregate functions (AVG, SUM, COUNT, STDDEV, etc.) for each group. From a performance perspective, if your mapping has an Aggregator, use filters and sorters as early in the pipeline as possible.

What are the target types on the server?
Target types are File, Relational, and ERP.

What is a dynamic lookup?
A dynamic lookup cache is generally used with a connected lookup transformation: when the incoming data has changed, the cache is updated or a new row is inserted; otherwise the row is left unchanged.

Explain the Informatica architecture in brief.
The Informatica Server connects to source and target data using native

ODBC drivers; it also connects to the repository for running sessions and retrieving metadata information (source -> Informatica Server -> target, with the server connected to the repository).

How can we partition a session in Informatica?
The Informatica PowerCenter Partitioning option optimizes parallel processing on multiprocessor hardware by providing a thread-based architecture and built-in data partitioning. GUI-based tools reduce the development effort necessary to create data partitions and streamline ongoing troubleshooting and performance tuning tasks, while ensuring data integrity throughout the execution process. As the amount of data within an organization expands and real-time demand for information grows, the Partitioning option enables hardware and applications to provide outstanding performance and jointly scale to handle large volumes of data and users.

After dragging the ports of three sources (SQL Server, Oracle, Informix) to a single Source Qualifier, can you map these ports directly to a target?
If you drag three heterogeneous sources into one source qualifier and populate a target without any join, you are producing a Cartesian product. Without a join, not only heterogeneous but even homogeneous sources will show the same error. If you do not want to join at the source qualifier level, you can add the joins separately.

What is the difference between partitioning of relational targets and partitioning of file targets?
Partitions can be done on both relational and flat file targets. Informatica supports the following partition types: 1. database partitioning, 2. round-robin, 3. pass-through, 4. hash-key partitioning, 5. key-range partitioning. All of these are applicable to relational targets; for flat files, all except database partitioning apply. Informatica supports N-way partitioning: you just specify the name of the target file and create the partitions, and the session takes care of the rest.

Can you generate reports in Informatica?
Yes. By using the Metadata Reporter we can generate reports in Informatica.
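Of the partition types listed above, round-robin is the simplest to illustrate: rows are dealt to the partitions in turn so each ends up with roughly the same number. A toy sketch (partition count and data invented):

```python
# Sketch of round-robin partitioning: distribute incoming rows
# evenly across N partitions by dealing them out in turn.
def round_robin(rows, n_partitions):
    partitions = [[] for _ in range(n_partitions)]
    for i, row in enumerate(rows):
        partitions[i % n_partitions].append(row)
    return partitions

parts = round_robin(list(range(10)), 3)
```

A hash-key partitioner would instead place each row by `hash(key) % n`, which keeps all rows with the same key in the same partition — important when a downstream Aggregator groups on that key.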
What are the mappings that we use for slowly changing dimension tables?

Type 1: Rows containing changes to existing dimensions are updated in the target by overwriting the existing dimension. In the Type 1 Dimension mapping, all rows contain current dimension data. Use the Type 1 Dimension mapping to update a slowly changing dimension table when you do not need to keep any previous versions of dimensions in the table.
Type 2: The Type 2 Dimension/Version Data mapping inserts both new and changed dimensions into the target. Changes are tracked in the target table by versioning the primary key and creating a version number for each dimension in the table. Use this mapping when you want to keep a full history of dimension data in the table; version numbers and versioned primary keys track the order of changes to each dimension.
Type 3: The Type 3 Dimension mapping filters source rows based on user-defined comparisons and inserts only those found to be new dimensions into the target. Rows containing changes to existing dimensions are updated in the target: when updating an existing dimension, the Informatica Server saves the existing data in different columns of the same row and replaces the existing data with the updates.

What is Data Driven?
The Informatica Server follows instructions coded into Update Strategy transformations within the session mapping to determine how to flag rows for insert, delete, update, or reject. If the mapping for the session contains an Update Strategy transformation, this field is marked Data Driven by default.

What are the basic needs to join two sources in a Source Qualifier?
The two sources should have a primary/foreign key relationship, and the joined columns should have matching data types.

What are the joiner caches?
When a Joiner transformation occurs in a session, the Informatica Server reads all the records from the master source and builds index and data caches based on the master rows. After building the caches, the Joiner transformation reads records from the detail source and performs the joins.

What are the join types in the Joiner transformation?
Normal (default): only matching rows from both master and detail. Master outer: all detail rows and only matching rows from master. Detail outer: all master rows and only matching rows from detail. Full outer: all rows from both master and detail, matching or not.

If you have four lookup tables in the workflow, how do you troubleshoot to improve performance?
There are many ways to improve a mapping that has multiple lookups. 1. Create an index on the lookup table, if you have permission (e.g. in a staging area). 2. Divide the lookup mapping into two: (a) dedicate one to inserts (rows in the source but not the target — only the new rows come into the mapping, making the process fast), and (b) dedicate the second to updates (rows in both source and target — only the rows that already exist come into the mapping). 3. Increase the cache size of the lookups.

If your workflow is running slowly in Informatica, where do you start troubleshooting and what steps do you follow?
When the workflow runs slowly, find the bottlenecks in this order: target,

source mapping session system In a sequential Batch how can we stop single session? we can stop it using PMCMD command or in the monitor right click on that perticular session and select stop.this will stop the current session and the sessions next to it. how to use mapping parameters and what is their use? Mapping parameters and variables make the use of mappings more flexible.and also it avoids creating of multiple mappings. it helps in adding incremental data.mapping parameters and variables has to create in the mapping designer by choosing the menu option as Mapping -> parameters and variables and the enter the name for the variable or parameter but it has to be preceded by $$. and choose type as parameter/variable, datatypeonce defined the variable/parameter is in the any expression for example in SQ transformation in the source filter prop[erties tab. just enter filter condition and finally create a parameter file to assgn the value for the variable / parameter and configigure the session properties. however the final step is optional. if ther parameter is npt present it uses the initial value which is assigned at the time of creating the variable. How to delete duplicate rows in flat files source is there any option in informatica? u can use a dynamic lookup or an aggregator or a sorter for doing this. i.e. use a sorter transformation , in that u will have a distinct option make use of it . Can Informatica be used as a Cleansing Tool? Yes, we can use Informatica for cleansing data. some time we use stages to cleansing the data. It depends upon performance again else we can use expression to cleasing data. For example an feild X have some values and other with Null values and assigned to target feild where target feild is notnull column, inside an expression we can assign space or some constant value to avoid session failure. The input data is in one format and target is in another format, we can change the format in expression. 
We can also assign default values so that the target holds a complete set of data.

What is the difference between connected and unconnected Stored Procedure transformations?
Unconnected: an unconnected Stored Procedure transformation is not connected directly to the flow of the mapping. It either runs before or after the session, or is called by an expression in another transformation in the mapping.
Connected: the flow of data through the mapping passes through the Stored Procedure transformation. All data entering the transformation through the input ports affects the stored procedure. Use a connected Stored Procedure transformation when you need data from an input port sent as an input parameter to the stored procedure, or the results of the stored procedure sent as an output parameter to another transformation.

What is the basic difference between a summary filter and a detail filter?
A summary filter is applied to groups of records that share common values.

A detail filter is applied to each and every record in the database.

Can we look up a table from a Source Qualifier transformation (unconnected lookup)?
No, we cannot, for two reasons: 1) unless you connect the output of the Source Qualifier to another transformation or to a target, the field will not be included in the generated query; 2) the Source Qualifier has no variable fields to use in an expression.

What is a junk dimension?
A junk dimension is a collection of random transactional codes, flags and/or text attributes that are unrelated to any particular dimension. The junk dimension is simply a structure that provides a convenient place to store the junk attributes. A good example would be a trade fact in a company that brokers equity trades.

What enhancements were made in Informatica 7.1.1 compared to 6.2.2?
1. Union and Custom transformations 2. Lookup on flat files 3. The pmcmd command 4. Export of independent and dependent repository objects 5. Version control 6. Data profiling 7. Support for 64-bit architecture 8. LDAP authentication

What are the Lookup transformation and the Update Strategy transformation?
A Lookup transformation is used to look up data in a relational table, view, synonym or flat file. The Informatica server queries the lookup source based on the lookup ports used in the transformation, and compares the lookup port values to the lookup column values based on the lookup condition. Using a lookup we can get a related value, perform a calculation, or update a slowly changing dimension. There are two types of lookups: connected and unconnected.
The Update Strategy transformation controls how rows are flagged for insert, update, delete or reject.

At the session level, the treatment of rows can be defined as Insert, Delete, Update or Data Driven. For Update there are three options: Update as Update, Update as Insert, and Update else Insert.
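The "Update else Insert" option above is essentially upsert logic, which can be sketched in plain Python (an illustration of the idea, not Informatica internals; the table and column names are made up):

```python
def update_else_insert(target, rows, key):
    # target: dict keyed on the primary key; rows: the incoming batch.
    updated, inserted = 0, 0
    for row in rows:
        if row[key] in target:      # row already exists: update it
            target[row[key]].update(row)
            updated += 1
        else:                       # row missing: fall back to insert
            target[row[key]] = dict(row)
            inserted += 1
    return updated, inserted

target = {10: {"emp_id": 10, "city": "Pune"}}
batch = [{"emp_id": 10, "city": "Delhi"}, {"emp_id": 11, "city": "Goa"}]
counts = update_else_insert(target, batch, "emp_id")
# counts == (1, 1); employee 10 is updated, employee 11 is inserted
```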

What is the difference between a dimension table and a fact table, and what are the different kinds of each?
A fact table contains measurable data, with fewer columns and many rows, and it contains a primary key. The different types of fact tables are additive, non-additive and semi-additive. A dimension table contains the textual description of the data, with many columns and fewer rows, and it also contains a primary key.

What is the use of the Update Strategy transformation?
To flag source records as INSERT, DELETE, UPDATE or REJECT for the target database. The default flag is Insert. This is a must for incremental data loading.

What is meant by a complex mapping?
A complex mapping is one involving more logic and more business rules. For example, in a bank data warehouse project, many customers relocate after taking loans; to maintain both the previous and the current address I used SCD Type 2. This is a simple example of a complex mapping.

What is the difference between Informatica 7.1 and Ab Initio?
There is a lot of difference between Informatica and Ab Initio. Ab Initio supports three kinds of parallelism while Informatica supports one. Ab Initio has no built-in scheduling option (we schedule manually or with PL/SQL scripts), whereas Informatica has several scheduling options. Ab Initio comes with its Co>Operating System

while Informatica does not. Ramp-up time is quicker in Ab Initio compared to Informatica, and Ab Initio is more user-friendly than Informatica.

What is the best way to show metadata (number of rows at source, target and each transformation level, error-related data) in a report format?
When the workflow has completed, go to the Workflow Monitor, right-click the session and open the transformation statistics; there we can see the number of rows at source and target. In the session properties and session log we can see the errors related to the data.

Two relational tables are connected to one Source Qualifier transformation. What are the possible errors it will throw?
The only two requirements are: 1. the tables should have a primary key/foreign key relationship; 2. the tables should be available in the same schema or database. If either does not hold, an error is thrown.

How to import an Oracle sequence into Informatica?
Create a stored procedure that returns the sequence's next value, and call that procedure from Informatica with a Stored Procedure transformation.

What is an IQD file?
IQD stands for Impromptu Query Definition. This file is mainly used in the Cognos Impromptu tool: after creating an IMR (report), we save the IMR as an IQD file, which is then used while creating a cube in PowerPlay Transformer (selecting Impromptu Query Definition as the data source type).

Can Informatica load heterogeneous targets from heterogeneous sources?
Yes, it can. For example, flat file and relational sources can be joined in a mapping, and flat file and relational targets loaded from it.

In real time, which is better: star schema or snowflake schema?
In real projects mostly the star schema is implemented, because queries take less time; a surrogate key exists in each dimension table, and this surrogate key is assigned as a foreign key in the fact table.

Where is the cache stored in Informatica?
The cache is stored in the Informatica server's memory, and overflow data is stored on disk in cache files, which are automatically deleted after successful completion of the session run. If you want to keep that data, you have to use a persistent cache.

In a Joiner transformation, you should specify the source with fewer rows as the master source. Why?
The Informatica server reads all the records from the master source and builds index and data caches based on the master table rows. After building the caches, the Joiner transformation reads records from the detail source and performs the joins. A smaller master therefore means smaller caches.

How do we remove the staging area?

This question is not logically correct: the staging area is just a set of intermediate tables. You can create or maintain these tables in the same database as your data warehouse or in a different one. These tables are used to store data from the source, which is then cleaned, transformed and run through business logic. Once the source data is done with that process, data from staging is populated into the final fact table through a simple one-to-one mapping.

Where do we use the MQ Series Source Qualifier and the application multi-group Source Qualifier?
We use an MQSeries Source Qualifier when we have an MQ messaging system as the source (a queue). When there is a need to extract data from a queue, which will basically carry messages in XML format, we use a JMS or an MQ Source Qualifier depending on the messaging system: if you have a TIBCO EMS queue, use a JMS source, a JMS Source Qualifier and an XML parser; if you have an MQ Series queue, use an MQ Source Qualifier associated with a flat file or COBOL file definition.

What are the test procedures used to check whether the data is loaded in the backend, the performance of the mapping, and the quality of the data loaded in Informatica?
Some of the steps could be: 1) check in the Workflow Monitor whether the number of records in the source and the number of records actually loaded are equal; 2) check the duration for the workflow to succeed; 3) check the session logs for the data loaded. If you want to know the performance of a mapping at transformation level, select the session property Collect Performance Data; at run time you can see it in the Monitor's performance tab, or you can get it from a file. The PowerCenter Server names the file session_name.perf and stores it in the same directory as the session log; if there is no session-specific directory for the session log, the PowerCenter Server saves the file in the default log files directory. The quality of the data loaded depends on the quality of the data in the source.
If cleansing is required, perform the data cleansing operations in Informatica; the final data will then always be clean.

What is meant by a junk attribute in Informatica?
A dimension is called a junk dimension if it contains attributes that are rarely changed or modified. For example, in the banking domain we can fetch four attributes belonging to a junk dimension from an Overall_Transaction_master table: tput flag, tcmp flag, del flag and advance flag. All these attributes can be part of a junk dimension.

Can anyone explain incremental aggregation with an example?
When you use an Aggregator transformation to aggregate, it creates index and data caches to store 1) the group-by columns and 2) the aggregate columns. Incremental aggregation is used when we have historical data already in place. It uses a cache that contains the historical data: for each incoming group-by value already present in the index cache, it adds the incoming data value to the corresponding data cache value and outputs the row; for an incoming value with no match in the index cache, new entries for the group-by and output ports are inserted into the cache.

How do you create a mapping using multiple Lookup transformations?
Use an unconnected lookup if the same lookup repeats multiple times.

How can we join three sources like a flat file, Oracle and DB2 in Informatica?
You have to use two Joiner transformations: the first one joins two of the sources, and the second one joins the third with the result of the first joiner.

What is tracing level?

The tracing level determines the amount of information the Informatica server writes to a session log.

How to go about SCDs and their types? Where do we use them mostly?
The Slowly Changing Dimension problem is a common one particular to data warehousing. In a nutshell, it applies to cases where an attribute of a record varies over time; for example, Christina is a customer of ABC Inc. who first lived at one address and later moved to another.

What is a view? How is it related to data independence?
By definition, a view is just a query that is parsed and stored, so whenever the view is referred to in a query it can be executed with no time lost to parsing. Through views we can hide the complexity and long names of tables. One bigger advantage: just by creating a view once, we can use it in many places. A materialized view, introduced in Oracle 8, actually stores data like a table.

In a certain mapping there are four targets: tg1 has a primary key; tg2 has a foreign key referencing tg1's primary key; tg3 has a primary key that tg2 and tg4 reference as foreign keys; and tg2 also has a foreign key referencing the primary key of tg4. In which order will Informatica load the targets?
tg1 and tg3, being master tables with no foreign key references to other tables, are loaded first. Then tg4 is loaded, since its master table tg3 has already been loaded. At the end tg2 is loaded, as all the master tables it refers to (tg1, tg3 and tg4) have already been loaded.

What is a surrogate key?
A surrogate key is a system-generated/artificial key or sequence number: a substitution for the natural primary key. It is just a unique identifier or number for each row that can be used as the primary key of the table. The only requirement for a surrogate primary key is that it is unique for each row in the table. It is useful because the natural primary key (e.g.
the customer number in a Customer table) can change, and this makes updates more difficult. In my project, I felt that the primary reason for surrogate keys was to record the changing context of the dimension attributes (particularly for SCDs); also, being integers, joins on them are faster.

What is the difference between a stored procedure (at the database level) and the Stored Procedure transformation (at the Informatica level)?
First of all, stored procedures at the database level are a series of SQL statements, stored and compiled on the server side. In Informatica, the Stored Procedure transformation calls those same stored procedures that are stored in the database. Stored procedures are used to automate time-consuming tasks that are too complicated for standard SQL statements; if you do not want to use a stored procedure, you have to create an Expression transformation and do all the coding in it.

What is the difference between STOP and ABORT at the Informatica session level?
Stop: we can restart the session. Abort: we cannot restart the session; we should truncate the targets in the pipeline and then start the session again.

If a workflow has 5 sessions running sequentially and the 3rd session has failed, how can we run again from only the 3rd to the 5th session?
If multiple sessions in a concurrent batch fail, you might want to truncate all targets and run the batch again. However, if one session in a concurrent batch fails and the rest of the sessions complete successfully, you can recover the failed session as a standalone session. To recover a session in a concurrent batch: 1. copy the failed session using Operations > Copy Session; 2. drag the copied session outside the batch so it becomes a standalone session; 3. follow the steps to recover a standalone session; 4. delete the standalone copy.

What properties should be noted when we connect a flat file source definition to a relational database target definition?
1. Whether the file is fixed-width or delimited.

2. The size of the file: if it can be processed without performance issues, a normal load will work; if it is huge (in GB), N-way partitions can be specified at the source side and the target side. 3. The file reader, source file name, etc.

While running a session, what are the two files it will create?
The session log file and the session detail file. Besides these, it also creates the following files where applicable: reject files, target output files, incremental aggregation files and cache files.

What is the use of a factless fact table?
A factless fact table is a fact table that does not have any measures. For example, you may want to store the attendance information of students. This table tells you, date-wise, whether a student attended class or not; there is no measure, because fees paid etc. are not recorded daily.

Where will the persistent cache be stored?
The Informatica server saves the cache files for the session and reuses them for the next session run; the query against the lookup table is thereby avoided, giving some performance improvement.

COMMITS: What is the use of source-based commits? Please explain with an example.
If the selected commit type is target, then once the writer buffer holds some 10000 records the server commits them; here the server does not care how many source records have been processed. If the selected commit type is source, then once 10000 records have been read from the source the server commits immediately; here the server does not care how many records have been inserted into the target.

How to write a filter condition to get all the records of employees hired between any two given dates?
DATE_DIFF(HIREDATE, DATE1, 'DD') >= 0 AND DATE_DIFF(DATE2, HIREDATE, 'DD') >= 0
(The format string can be 'DD', 'MM', etc. Note that a filter condition is already a boolean expression, so no IIF wrapper is needed.)

The target file has duplicate records even though the source tables have single records, while selecting data from 5 different SQL tables using a user-defined join. Generate the SQL in the Source Qualifier transformation and try to run the same in TOAD.
If the join condition between the tables is wrong, you will get a Cartesian product, and hence duplicates.

In a workflow, can we send multiple emails?
Yes, but only in the UNIX version of Workflow Manager, not the Windows-based version.

What is incremental loading in Informatica (that is, loading only updated information from the source)? How and where do you use it?
Incremental loading can be done in three ways, using these transformations: 1. Aggregator transformation,

2. Dynamic Lookup transformation, 3. Update Strategy transformation. In the mapping we can use either an Aggregator transformation or a Dynamic Lookup transformation together with an Update Strategy or Filter transformation to update or insert the newly captured records. At the session level, the Properties tab has the Incremental Aggregation option; if you enable this property, the session captures only new records from the source.

Can you explain one critical mapping?
It depends on your data and the type of operation you are doing. If you need to calculate a value for all the rows, or for most of the rows coming out of the source, go for a connected lookup; otherwise go for an unconnected lookup. Especially in a conditional case, say we have to get the value for a field "customer" from either the ORDER table or the CUSTOMER_DATA table, based on the rule: if customer_name is null then customer = customer_data.customer_id, otherwise customer = order.customer_name. In this case we would go for an unconnected lookup.

How can we store previous session logs?
Just run the session in timestamp mode; then the session log will not automatically overwrite the current session log.
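The incremental aggregation behaviour described above can be sketched in plain Python (an illustration of the idea, not Informatica internals): a cache keyed by the group-by value is carried across runs, and each new batch only adjusts the affected groups.

```python
def incremental_aggregate(cache, new_rows):
    """cache maps group-by value -> running SUM; new_rows is the
    incremental batch. Existing groups are updated in place and unseen
    groups are inserted, mirroring the index/data cache behaviour."""
    for group, value in new_rows:
        if group in cache:          # match found in the "index cache"
            cache[group] += value   # adjust the "data cache" value
        else:                       # no match: insert a new group
            cache[group] = value
    return cache

# The first run aggregates the historical data...
cache = incremental_aggregate({}, [("east", 10), ("west", 5)])
# ...the next run only processes the newly arrived rows.
cache = incremental_aggregate(cache, [("east", 3), ("north", 7)])
# cache is now {"east": 13, "west": 5, "north": 7}
```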

Is it possible to run one loading session with one particular target and multiple types of data sources?
Yes. Use a Joiner transformation to join the heterogeneous sources, develop the mapping, and in the session enter the respective locations or connection details of the sources.
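A Joiner, as used above, builds its cache from the master source and then streams the detail source, which is why the smaller input should be the master. A plain-Python hash-join sketch (illustrative table and column names, not Informatica code):

```python
def joiner(master_rows, detail_rows, key):
    # Build phase: cache every master row by its join key
    # (this is what fills the joiner's index/data caches,
    # so a smaller master means smaller caches).
    cache = {}
    for row in master_rows:
        cache.setdefault(row[key], []).append(row)
    # Probe phase: stream the (larger) detail source and join.
    joined = []
    for row in detail_rows:
        for m in cache.get(row[key], []):
            joined.append({**m, **row})
    return joined

master = [{"dept_id": 1, "dept": "Sales"}, {"dept_id": 2, "dept": "HR"}]
detail = [{"dept_id": 1, "emp": "Ann"}, {"dept_id": 1, "emp": "Bob"},
          {"dept_id": 2, "emp": "Eve"}]
rows = joiner(master, detail, "dept_id")
# rows holds three joined records: Ann and Bob in Sales, Eve in HR
```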

You transfer 100000 rows to the target but some records get discarded. How will you trace them, and where do they get loaded?
In the session's target properties, the last two attributes, reject file directory and reject file name, tell where the bad file (the file with the rejected records) is located.

How many types of flat files are available in Informatica? (Monday, November 27th, 2006)
There are two types of flat files: 1. delimited

2. fixed-width.

In which particular situation do we use a dynamic lookup?
If the number of records is in the hundreds, one doesn't see much difference between a static cache and a dynamic cache. With thousands of records a dynamic cache takes more time, because it updates the cache for each insert or update it makes, but it is the right choice when the lookup data itself changes within the run.

In which particular situation do we use an unconnected Lookup transformation?
We use an unconnected Lookup transformation when we need to return only one port. We could also use a connected lookup to return one port, but an unconnected Lookup transformation is not connected to the other transformations and is not part of the data flow, which is why performance increases.

What is a surrogate key?
A surrogate key is used as a replacement for the primary key. The data warehouse does not depend on the primary key; the surrogate key is used to identify the records internally. Each dimension should have at least one surrogate key.

What are the advantages and disadvantages of a star schema and a snowflake schema?
Schemas are of two types: star and snowflake. In a star schema the fact table is in normalized format and the dimension tables are in denormalized format. In a snowflake schema both the fact and dimension tables are in normalized format; a snowflake requires more dimension tables and more foreign keys, which reduces query performance but also reduces redundancy. The main advantages of a star schema: 1) it supports drill-up and drill-down; 2) fewer tables; 3) a smaller database. A snowflake schema: 1) degrades query performance because more joins are involved; 2) saves a small amount of storage space.

Can we use a lookup instead of a join?
If the relationship to the other table is a one-to-one or many-to-one join, we can use a lookup to get the required fields. If the relationship is an outer join, joining the tables gives correct results, as a lookup returns only one row when multiple rows satisfy the lookup condition.
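The last point, that a lookup returns only one row even when several rows match while a join emits one output row per match, can be sketched in plain Python (illustrative data):

```python
def join(rows, other, key):
    # A join emits one output row per matching pair.
    return [{**r, **o} for r in rows for o in other if r[key] == o[key]]

def lookup(rows, other, key):
    # A lookup returns at most one matching row per input row
    # (here: the first match, like "use first value" in Informatica).
    result = []
    for r in rows:
        match = next((o for o in other if o[key] == r[key]), {})
        result.append({**r, **match})
    return result

orders = [{"cust_id": 1, "amount": 100}]
contacts = [{"cust_id": 1, "phone": "111"}, {"cust_id": 1, "phone": "222"}]
# The join duplicates the order row; the lookup does not.
```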
Give a scenario where flat files are used.
Loading data from flat files into a database is faster. If you are receiving data from a remote location, the required data can be converted into a flat file there and used at the target location for loading. This minimizes the bandwidth requirement and gives faster transmission.

Is the Router transformation active or passive?

Active. An active transformation can change the number of rows that pass through it. In a Router you have the default group; if you want, you can drop all rows in the default group by not connecting it to a transformation or a target in the mapping, in which case the Router changes the number of rows passing through it.

How do you identify existing rows of data in the target table using a Lookup transformation?
You can identify existing rows of data using an unconnected Lookup transformation.

How many mappings have you done in your project (in banking)?
It depends on the dimensions. For a banking project the dimensions may include time, primary account holder, branch and so on, depending on the project, and we also create mappings for cleansing or scrubbing the data, so there is no exact count.

Why do you use a reusable Sequence Generator transformation in mapplets?
If you do not use a reusable Sequence Generator transformation, duplicate values can occur; to avoid this we use a reusable Sequence Generator transformation.

How many staging areas are there in your project?
There are 4 different staging areas in the project: 1. Extract 2. Cleansing 3. Transform 4. Load.

Where did you implement SCD Type 2, and what transformations did you use? What were the complex transformations you used?
If I want to maintain the full historical data I use SCD Type 2. SCD2 is categorized into three types: 1) current flag, 2) effective date range, 3) version number; we can use any of the three. In this we can use Lookup, Filter, Update Strategy and Expression transformations, depending on the requirement.

What are the differences between Informatica 6.1 and Informatica 7.1?
In Informatica 7.1: 1. we can take a flat file as a target; 2. a flat file as a lookup table; 3. data profiling and versioning; 4. the Union transformation, which works like UNION ALL.

What is the DTM process?
DTM means Data Transformation Manager. In Informatica this is the main background process; it runs after completion of the Load Manager. In this process the Informatica server looks up the source and target connections in the repository; if they are correct, the server fetches the data from the source and loads it into the target.

How to extract 10 records out of 100 records in a flat file?

One option (outside Informatica, in Oracle): 1. create a directory object; 2. store the file in that directory; 3. create an external table corresponding to the file; 4. query the external table to access records as you would a normal table.

What is target load order?
If a mapping has more than one target table, we need to specify the order in which the target tables should be loaded. For example, suppose a mapping has two target tables, 1. Customer and 2. Audit, and the Customer table should be populated before the Audit table; for that we use the target load order.

What can we do in mapplets that we cannot do in mappings?
In mapplets the set of transformations can be reused; in mappings the transformations cannot be reused.

How many ways can you create ports?
Two ways: 1. drag the port from another transformation; 2. click the Add button on the Ports tab.

What are active and passive transformations?
An active transformation can change the number of rows that pass through it; a passive transformation does not change the number of rows that pass through it.

Which transformation do you need when using COBOL sources as source definitions?
The Normalizer transformation, which is used to normalize the data, since COBOL sources often consist of denormalized data.

While importing a relational source definition from a database, what metadata do you import?
Source name, database location, column names, datatypes, and key constraints.
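The target load order described above (and the tg1-tg4 example earlier) is essentially a topological sort of the targets by their foreign-key dependencies; a minimal sketch in plain Python (illustrative names):

```python
def load_order(depends_on):
    """depends_on maps target -> set of targets it references via FK.
    Returns an order in which each target is loaded after its masters."""
    order, done = [], set()
    def visit(t):
        if t in done:
            return
        for master in depends_on.get(t, set()):
            visit(master)       # load all masters first
        done.add(t)
        order.append(t)
    for t in depends_on:
        visit(t)
    return order

# The four-target example: tg2 references tg1, tg3 and tg4;
# tg4 references tg3; tg1 and tg3 reference nothing.
deps = {"tg1": set(), "tg3": set(), "tg4": {"tg3"}, "tg2": {"tg1", "tg3", "tg4"}}
order = load_order(deps)
# order == ['tg1', 'tg3', 'tg4', 'tg2']
```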

What are sessions and batches?
A session is a set of instructions that tells the Informatica server how and when to move data from sources to targets. After creating the session, we can use either the Server Manager or the command-line program pmcmd to start or stop the session. Batches provide a way to group sessions for either serial or parallel execution by the Informatica server. There are two types of batches: sequential (runs sessions one after the other) and concurrent (runs sessions at the same time).

How many ways can you update a relational source definition, and what are they?

Two ways: 1. edit the definition; 2. reimport the definition.

Where should you place the flat file to import the flat file definition into the Designer?
There is no such restriction on where to place the source file. From a performance point of view it is better to place the file in the server's local source folder; if you need the path, check the server properties available in the Workflow Manager. It does not mean we should not place it in any other folder, but if we place it in the server source folder it is selected by default at session creation time.

What is the Load Manager?
While running a workflow, the PowerCenter Server uses the Load Manager process and the Data Transformation Manager (DTM) process to run the workflow and carry out workflow tasks. When the PowerCenter Server runs a workflow, the Load Manager performs tasks such as scheduling, locking and reading the session, reading the parameter file, verifying permissions and creating log files.

To provide support for mainframe source data, which files are used as source definitions?
COBOL copybook files.

What are the various types of aggregation?
The various types of aggregation are SUM, AVG, COUNT, MAX, MIN, FIRST, LAST, MEDIAN, PERCENTILE, STDDEV and VARIANCE.

What is the Router transformation?
A Router transformation is similar to a Filter transformation because both transformations allow you to use a condition to test data. However, a Filter transformation tests data for one condition and drops the rows that do not meet the condition, while a Router transformation tests data for one or more conditions and gives you the option to route rows that do not meet any of the conditions to a default output group. If you need to test the same input data against multiple conditions, use a Router transformation in a mapping instead of creating multiple Filter transformations to perform the same task.

What is the RANKINDEX in the Rank transformation?
The Designer automatically creates a RANKINDEX port for each Rank transformation. The Informatica server uses the rank index port to store the ranking position of each record in a group.
For example, if you create a Rank transformation that ranks the top 5 salespersons for each quarter, the rank index numbers the salespeople from 1 to 5.

How does the Informatica server sort string values in the Rank transformation?
When the Informatica server runs in ASCII data movement mode, it sorts session data using a binary sort order. If you configure the session to use a binary sort order, the Informatica server calculates the binary value of each string and returns the specified number of rows with the highest binary values for the string.

Which transformation should we use to normalize COBOL and relational sources?
The Normalizer transformation normalizes records from COBOL and relational sources, allowing you to organize the data according to your own needs. A Normalizer transformation can appear anywhere in a data flow when you normalize a relational source. Use a Normalizer transformation instead of the Source Qualifier transformation when you normalize a COBOL source. When you drag a COBOL source into the Mapping Designer workspace, the Normalizer transformation automatically appears, creating input and output ports for every column in the source.
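The RANKINDEX behaviour can be sketched in plain Python (illustrative data; top-2 instead of top-5 for brevity):

```python
from collections import defaultdict

def rank_top_n(rows, group_key, measure, n):
    # Group the rows, sort each group by the measure descending,
    # and attach a 1-based RANKINDEX to the top n rows of each group.
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)
    out = []
    for members in groups.values():
        ranked = sorted(members, key=lambda r: r[measure], reverse=True)
        for i, row in enumerate(ranked[:n], start=1):
            out.append({**row, "RANKINDEX": i})
    return out

sales = [
    {"quarter": "Q1", "rep": "Ann", "total": 90},
    {"quarter": "Q1", "rep": "Bob", "total": 120},
    {"quarter": "Q1", "rep": "Eve", "total": 70},
]
top2 = rank_top_n(sales, "quarter", "total", 2)
# Bob gets RANKINDEX 1, Ann gets RANKINDEX 2; Eve is dropped.
```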

What is the difference between a static cache and a dynamic cache?

Static cache: you cannot insert into or update the cache. The Informatica server returns a value from the lookup table or cache when the condition is true; when the condition is not true, it returns the default value for connected transformations and NULL for unconnected transformations.
Dynamic cache: you can insert rows into the cache as you pass them to the target. The Informatica server inserts a row into the cache when the condition is false, which indicates that the row is not in the cache or the target table; you can then pass these rows to the target table.

What is meant by lookup caches?
The Informatica server builds a cache in memory when it processes the first row of data in a cached Lookup transformation. It allocates memory for the cache based on the amount you configure in the transformation or session properties. The Informatica server stores condition values in the index cache and output values in the data cache.

What are the settings used to configure the Joiner transformation?
The master and detail sources, the type of join, and the join condition. The Joiner transformation supports the following join types, which you set on the Properties tab: Normal (default), Master Outer, Detail Outer and Full Outer.

In which conditions can we not use a Joiner transformation (limitations of the Joiner transformation)?
Both pipelines begin with the same original data source. Both input pipelines originate from the same Source Qualifier transformation. Both input pipelines originate from the same Normalizer transformation. Both input pipelines originate from the same Joiner transformation. Either input pipeline contains an Update Strategy transformation. Either input pipeline contains a connected or unconnected Sequence Generator transformation.

What is the aggregate cache in the Aggregator transformation?
The Aggregator stores data in the aggregate cache until it completes the aggregate calculations. When you run a session that uses an Aggregator transformation, the Informatica server creates index and data caches in memory to process the transformation; if the server requires more space, it stores overflow values in cache files.

What are reusable transformations?
Reusable transformations can be used in multiple mappings. When you need to incorporate such a transformation into a mapping, you add an instance of it to the mapping. Later, if you change the definition of the transformation, all its instances inherit the changes. Since each instance of a reusable transformation is a pointer to that transformation, you can change the transformation in the Transformation Developer and its instances automatically reflect the changes. This feature can save you a great deal of work.

What is the difference between a mapplet and a reusable transformation?
A mapplet consists of a set of transformations that is reusable; a reusable transformation is a single transformation that can be reused. Variables or parameters created in a mapplet cannot be used in another mapping or mapplet, whereas variables created in a reusable transformation can be used in any other mapping or mapplet.

We cannot include source definitions in reusable transformations, but we can add sources to a mapplet. The whole transformation logic is hidden in the case of a mapplet, but it is transparent in the case of a reusable transformation. We cannot use COBOL Source Qualifier, Joiner or Normalizer transformations in a mapplet, whereas we can make them reusable transformations.

What is a parameter file?
A parameter file defines the values for parameters and variables used in a session. It is a text file created with an editor such as WordPad or Notepad. You can define the following values in a parameter file: mapping parameters, mapping variables and session parameters.

Can you start a batch within a batch?
You cannot. If you want to start a batch that resides in a batch, create a new independent batch and copy the necessary sessions into the new batch.

What is a batch? Describe the types of batches.
A grouping of sessions is known as a batch. Batches are of two types. Sequential: runs sessions one after the other. Concurrent: runs sessions at the same time. If you have sessions with source-target dependencies, you have to go for a sequential batch to start the sessions one after another; if you have several independent sessions, you can use a concurrent batch, which runs all the sessions at the same time.

Can you copy a session to a different folder or repository?
Yes. Using the Copy Session wizard you can copy a session into a different folder or repository, but that target folder or repository should contain the mapping of that session. If it does not, you have to copy that mapping first, before you copy the session.

What are the different threads in the DTM process?
Master thread: creates and manages all other threads. Mapping thread: one mapping thread is created for each session; it fetches session and mapping information. Pre- and post-session threads: created to perform pre- and post-session operations.
Reader thread: one thread is created for each partition of a source; it reads data from the source.
Writer thread: created to load data into the target.
Transformation thread: created to transform data.

What tasks does the Load Manager process perform?
Managing session and batch scheduling: when you start the Informatica server, the Load Manager launches and queries the repository for a list of sessions configured to run on that server. When you configure a session, the Load Manager maintains a list of sessions and session start times. When you start a session, the Load Manager fetches the session information from the repository to perform validations and verifications prior to starting the DTM process.



Locking and reading the session: when the Informatica server starts a session, the Load Manager locks the session in the repository. Locking prevents you from starting the same session again while it is running.
Reading the parameter file: if the session uses a parameter file, the Load Manager reads it and verifies that the session-level parameters are declared in the file.
Verifying permissions and privileges: when the session starts, the Load Manager checks whether the user has the privileges to run the session.
Creating log files: the Load Manager creates a log file containing the status of the session.

How does the Informatica server increase session performance through partitioning the source?
For relational sources, the Informatica server creates multiple connections, one for each partition of a single source, and extracts a separate range of data through each connection, reading multiple partitions of a single source concurrently. Similarly, for loading, the server creates multiple connections to the target and loads partitions of data concurrently. For XML and file sources, the server reads multiple files concurrently; for loading, it creates a separate file for each partition of a source file, and you can choose to merge the targets.

Why do we partition a session in Informatica?
Performance can be improved by processing data in parallel in a single session by creating multiple partitions of the pipeline. The Informatica server achieves high performance by partitioning the pipeline and performing the extract, transformation, and load for each partition in parallel.

Which tool do you use to create and manage sessions and batches, and to monitor and stop the Informatica server?
The Informatica Server Manager.

How can you recognize whether or not newly added rows in the source get inserted in the target?
In a Type 2 mapping we have three options to recognize the newly added rows: version number, flag value, and effective date range.

What are the different types of Type 2 dimension mappings?
Type 2 Dimension/Version Data mapping: an updated dimension in the source is inserted into the target along with a new version number, and a newly added dimension in the source is inserted into the target with a primary key.
Type 2 Dimension/Flag Current mapping: also used for slowly changing dimensions; in addition, it creates a flag value for a changed or new dimension. The flag indicates whether the dimension is new or newly updated: recent dimensions are saved with a current flag value of 1, and updated dimensions are saved with a value of 0.
Type 2 Dimension/Effective Date Range mapping: another flavor of Type 2 mapping used for slowly changing dimensions. It also inserts both new and changed dimensions into the target, and changes are tracked by an effective date range for each version of each dimension.

What are the types of mappings in the Getting Started Wizard?
Simple Pass Through mapping: loads a static fact or dimension table by inserting all rows. Use this mapping when you want to drop all existing data from your table before loading new data.



Slowly Growing Target mapping: loads a slowly growing fact or dimension table by inserting new rows. Use this mapping to load new data when existing data does not require updates.

What types of mapping wizards are provided in Informatica?
The Designer provides two mapping wizards to help you create mappings quickly and easily. Both wizards are designed to create mappings for loading and maintaining star schemas, a series of dimensions related to a central fact table.
Getting Started Wizard: creates mappings to load static fact and dimension tables, as well as slowly growing dimension tables.
Slowly Changing Dimensions Wizard: creates mappings to load slowly changing dimension tables based on the amount of historical dimension data you want to keep and the method you choose to handle historical dimension data.

What are the options in the target session for the Update Strategy transformation?
Insert, Delete, Update (Update as Update, Update as Insert, Update else Insert), and Truncate Table.

What is the Update Strategy transformation?
The model you choose constitutes your update strategy: how to handle changes to existing rows. In PowerCenter and PowerMart, you set your update strategy at two different levels:
Within a session: when you configure a session, you can instruct the Informatica server either to treat all rows in the same way (for example, treat all rows as inserts) or to use instructions coded into the session mapping to flag rows for different database operations.
Within a mapping: within a mapping, you use the Update Strategy transformation to flag rows for insert, delete, update, or reject.
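The row-flagging behavior described above can be illustrated with a small sketch. This is a hypothetical stand-in for the Update Strategy logic, not Informatica's engine; the constant names mirror Informatica's DD_* constants, while the key column and comparison against existing target keys are illustrative assumptions:

```python
# Sketch of Update Strategy row flagging (illustrative only).
# The DD_* values mirror Informatica's constants for insert/update/delete/reject.
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3

def flag_rows(source_rows, target_keys):
    """Flag each source row for update if its key already exists in the
    target, otherwise for insert (hypothetical 'emp_id' key column)."""
    flagged = []
    for row in source_rows:
        op = DD_UPDATE if row["emp_id"] in target_keys else DD_INSERT
        flagged.append((op, row))
    return flagged

source = [{"emp_id": 1, "name": "A"}, {"emp_id": 2, "name": "B"}]
existing = {1}  # keys already present in the target
print(flag_rows(source, existing))
```

In a real mapping this decision is written as an expression such as IIF(condition, DD_UPDATE, DD_INSERT) inside the Update Strategy transformation.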

What is the default join that the Source Qualifier provides?
The Source Qualifier performs a default inner join (equijoin) based on the primary key-foreign key relationships between the sources. (The join types Normal (default), Master Outer, Detail Outer, and Full Outer, set on the Properties tab, belong to the Joiner transformation.)

What is the target load order?
You specify the target load order based on the source qualifiers in a mapping. If you have multiple source qualifiers connected to multiple targets, you can designate the order in which the Informatica server loads data into the targets.

What are the tasks that the Source Qualifier performs?
# Join data originating from the same source database. You can join two or more tables with primary-foreign key relationships by linking the sources to one Source Qualifier.
# Filter records when the Informatica server reads source data. If you include a filter condition, the Informatica server adds a WHERE clause to the default query.
# Specify an outer join rather than the default inner join. If you include a user-defined join, the Informatica server replaces the join information specified by the metadata in the SQL query.
# Specify sorted ports. If you specify a number of sorted ports, the Informatica server adds an ORDER BY clause to the default SQL query.
# Select only distinct values from the source. If you choose Select Distinct, the Informatica server adds a SELECT DISTINCT



statement to the default SQL query.
# Create a custom query to issue a special SELECT statement for the Informatica server to read source data. For example, you might use a custom query to perform aggregate calculations or execute a stored procedure.

What is the difference between an operational data store (ODS) and a data warehouse?
A data warehouse is a decision-support database for organizational needs: a subject-oriented, non-volatile, integrated, time-variant collection of data. An ODS (Operational Data Store) is an integrated collection of related information; it typically contains at most about 90 days of information.

Why do we use Lookup transformations?
Use a Lookup transformation in your mapping to look up data in a relational table, view, or synonym. Import a lookup definition from any relational database to which both the Informatica client and server can connect. You can use multiple Lookup transformations in a mapping.
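The lookup behavior, probing a reference table for each input row under a lookup condition, can be sketched as a simple keyed cache. This is a hypothetical illustration (the table, column names, and equality-only condition are assumptions), not the actual Lookup cache implementation:

```python
# Minimal sketch of a cached lookup: build an in-memory cache keyed on the
# lookup condition column, then probe it once per input row.
def build_lookup_cache(lookup_table, key_col):
    """Index the lookup table rows by the condition column."""
    return {row[key_col]: row for row in lookup_table}

def lookup(cache, key, return_col, default=None):
    """Return return_col from the matching row, or default when no match."""
    row = cache.get(key)
    return row[return_col] if row is not None else default

dept_table = [{"dept_id": 10, "dept_name": "SALES"},
              {"dept_id": 20, "dept_name": "HR"}]
cache = build_lookup_cache(dept_table, "dept_id")
print(lookup(cache, 10, "dept_name"))  # SALES
print(lookup(cache, 99, "dept_name"))  # None (no match)
```

The "no match" default corresponds to the NULL a connected Lookup returns when the lookup condition finds no row.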

What are UTPs?
A UTP (unit test plan) is prepared by the developer to check that the mappings are built according to the given business rules.

What are the transformations that restrict the partitioning of sessions?
* Advanced External Procedure and External Procedure transformations: these contain a check box on the Properties tab to allow partitioning.
* Aggregator transformation: if you use sorted ports, you cannot partition the associated source.
* Joiner transformation: you cannot partition the master source for a Joiner transformation.
* Normalizer transformation.
* XML targets.

How do you import VSAM files from source to target?
In the Mapping Designer there is a direct option to import VSAM files. Navigation: Sources => Import from File => File from COBOL.

What is the difference between the IIF and DECODE functions?
You can use nested IIF statements to test multiple conditions. The following example tests for various conditions and returns 0 if SALES is zero or negative:
IIF( SALES > 0, IIF( SALES < 50, SALARY1, IIF( SALES < 100, SALARY2, IIF( SALES < 200, SALARY3, BONUS))), 0 )
You can use DECODE instead of IIF in many cases; DECODE may improve readability. The following shows how you can use DECODE in place of the nested IIF above:
DECODE( TRUE,
SALES > 0 and SALES < 50, SALARY1,
SALES > 49 AND SALES < 100, SALARY2,
SALES > 99 AND SALES < 200, SALARY3,
SALES > 199, BONUS )

How do you decide whether to do aggregations at the database level or at the Informatica level?



It depends on your requirements. If you have a database with good processing power, you can create an aggregation table or view at the database level; otherwise it is better to use Informatica. Informatica is a third-party tool, so it generally takes more time to process an aggregation than the database does, but Informatica offers incremental aggregation, which updates the stored aggregate values with the new values instead of reprocessing the entire data set each time. This works as long as the cache files are not deleted; if they are, the full aggregation must be executed in Informatica again. Databases do not have an incremental aggregation facility.
Informatica is fundamentally an integration tool, so the choice also depends on your source. If your source is an EMS queue, a flat file, or anything other than an RDBMS, you need Informatica for any kind of aggregate function. If your source is an RDBMS, you are usually not doing only the aggregation in Informatica; there is business logic behind it, such as looking up against a table or joining the aggregate result with the actual source. If the question is whether to aggregate at the mapping level or at the database level, it is generally better to aggregate at the database level, using a SQL override in the Source Qualifier, when aggregation is the main purpose of the mapping; it definitely improves performance.

What are the types of lookups?
i) connected, ii) unconnected, iii) cached, iv) uncached.

What are dimensions and the various types of dimensions?
Dimensions are sets of level properties that describe a specific aspect of a business, used for analyzing the factual measures of one or more cubes that use that dimension. Examples: geography, time, customer, and product. Slowly changing dimensions are classified into three types:
1. SCD Type 1: keeps only current data (changes overwrite the old values).
2. SCD Type 2: keeps current data plus the complete historical data.
3. SCD Type 3: keeps current data plus limited history, typically the previous value in an additional column.

What is CDC?
Changed Data Capture (CDC) helps identify the data in the source system that has changed since the last extraction. With CDC, data extraction takes place at the same time the insert, update, or delete operations occur in the source tables, and the change data is stored inside the database in change tables. The change data, thus captured, is then made available to the target systems in a controlled manner.
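A simplified, timestamp-based variant of change capture can be sketched in a few lines. This is a hypothetical illustration (the `last_modified` column name is an assumption); production CDC, as described above, relies on database change tables or logs rather than comparing timestamps:

```python
from datetime import datetime

# Sketch of timestamp-based changed-data capture: extract only the rows
# modified since the last extraction time.
def extract_changes(source_rows, last_extract_time):
    """Return the rows whose last_modified is newer than the last extract."""
    return [r for r in source_rows if r["last_modified"] > last_extract_time]

rows = [
    {"id": 1, "last_modified": datetime(2024, 1, 1)},
    {"id": 2, "last_modified": datetime(2024, 3, 1)},
]
changed = extract_changes(rows, datetime(2024, 2, 1))
print([r["id"] for r in changed])  # [2]
```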



What is the default source option for the Update Strategy transformation?
Data driven.
Can I use the session bulk loading option and still recover the session?
If the session is configured to run in bulk mode, it does not write recovery information to the recovery tables, so bulk loading does not support session recovery.

If you had to split a source-level key into two separate tables, one as a surrogate key and the other as a primary key, and Informatica does not guarantee the order in which keys are loaded into those tables, how could you handle this?
Send the data into the two different tables, setting the primary key in the target tables; the surrogate key generation requires a separate mapping, which suits the SCD concept.

Can we revert a reusable transformation back to a normal transformation?
No, it is not reversible. When you open a transformation in edit mode, there is a check box named Reusable; if you tick it, a message warns you that making the transformation reusable is not reversible.

How many types of facts are there, and what are they?
Factless facts: facts without any measures.
Additive facts: fact data that can be added/aggregated.
Non-additive facts: facts that cannot be added.
Semi-additive facts: facts where only some columns can be added.
Periodic facts: store one row summarizing the transactions that happened over each period of time.
Accumulating facts: store a row for the entire lifetime of an event.

Why is the Sorter transformation an active transformation?
The Sorter transformation sorts the data in either ascending or descending order according to the key specified; the port on which the sorting takes place is called the sort key port. It is active because its Distinct property can eliminate duplicate rows, changing the number of output rows. Its properties include:
Distinct: eliminates duplicates.

43

Informatica Interview Questions & Answers


Case Sensitive: valid for strings; controls how string data is sorted.
Null Treated Low: null values are given the lowest priority in the sort order.
Note also that an active transformation can behave like a passive one.

What are variable ports, and in what situations are they used?
There are mainly three kinds of ports: input, output, and variable. An input port means data is flowing into the transformation; an output port is used when data is mapped to the next transformation; a variable port is used when intermediate (for example, mathematical) calculations are required.

What is the Lookup transformation?
Use a Lookup transformation in your mapping to look up data in a relational table, view, or synonym. The Informatica server queries the lookup table based on the lookup ports in the transformation, comparing the Lookup transformation port values to the lookup table column values based on the lookup condition.

How can you improve the performance of the Aggregator transformation?
1. Send sorted input.
2. Increase the aggregator cache size, i.e., the index cache and data cache.
3. Pass only the input/output ports you need in the transformation, i.e., reduce the number of input and output ports.

What is the difference between PowerCenter 6 and PowerCenter 7?
1. You can look up flat files in Informatica 7.x, but not in Informatica 6.x.
2. The External Stored Procedure transformation available in 6.x is not available in 7.x; instead, 7.x provides the Custom transformation, which is not available in 6.x.
The main differences beyond that: version control is available in 7.x, session-level error handling is available in 7.x, and 7.x adds XML enhancements for data integration.
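The "send sorted input" tip works because an aggregator that receives rows already grouped by key can emit each group as soon as the key changes, instead of caching every group until the end of the load. A sketch of the streaming case, using Python's `itertools.groupby` (which has the same sorted-input requirement); this is an illustration of the idea, not Informatica's cache implementation:

```python
from itertools import groupby

# With unsorted input an aggregator must hold every group in its cache until
# the last row arrives; with input sorted on the group key it can stream,
# closing each group as soon as the key changes (exactly how groupby works).
def aggregate_sorted(rows, key_col, val_col):
    """Sum val_col per key_col, assuming rows are already sorted on key_col."""
    return {k: sum(r[val_col] for r in grp)
            for k, grp in groupby(rows, key=lambda r: r[key_col])}

rows = [{"dept": 10, "sal": 100}, {"dept": 10, "sal": 200},
        {"dept": 20, "sal": 300}]  # already sorted on dept
print(aggregate_sorted(rows, "dept", "sal"))  # {10: 300, 20: 300}
```

If the input were not sorted on the key, this streaming approach would produce a wrong result, which is why the Aggregator's Sorted Input property must only be enabled when the upstream data really is sorted.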

My source table has 1000 records, and I want to load records 501 to 1000 into my target table. How can I do this?
In DB2 you can write a statement such as FETCH FIRST n ROWS ONLY. In Informatica you can use a Sequence Generator and filter out rows until the count exceeds 500, or you can override the SQL query in the Workflow Manager, for example (Oracle):
SELECT * FROM tab_name WHERE ROWNUM <= 1000
MINUS
SELECT * FROM tab_name WHERE ROWNUM <= 500;

How is the Union transformation an active transformation?
An active transformation is one that changes the number of rows that reach the target.



Source (100 rows) > active transformation > target (fewer or more than 100 rows).
A passive transformation does not change the number of rows in the target: source (100 rows) > passive transformation > target (100 rows).
In a Union transformation we combine the data from two or more sources. Assume Table-1 contains 10 rows and Table-2 contains 20 rows; if we combine the rows of Table-1 and Table-2, we get a total of 30 rows in the target. So it is definitely an active transformation.

How does the server recognize the source and target databases?
Through an ODBC connection if the source is relational, and an FTP connection if it is a flat file. You configure these connections in the session properties for both sources and targets.

Suppose a session is configured with a commit interval of 10,000 rows and the source has 50,000 rows.
Explain the commit points for source-based commit and target-based commit.
A source-based commit commits the data into the target based on the commit interval, so for every 10,000 source rows it commits into the target. A target-based commit commits the data into the target based on the buffer size of the target, i.e., it commits whenever the buffer fills; if the buffer size is 6,000, it commits the data roughly every 6,000 rows.

How do you identify bottlenecks in the various components of Informatica and resolve them?
A simple way to find a bottleneck is to write the output to a flat file and observe where the slowdown occurs, isolating reader, transformation, and writer performance.

What is the difference between a summary filter and a detail filter?
A summary filter can be applied on a group of rows that contain a common value, whereas a detail filter can be applied on each and every record of the database.
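The source-based commit arithmetic above can be checked with a tiny sketch. This is a simplification under the stated assumptions (commit exactly at each multiple of the interval, with a final commit for any remainder at end of load):

```python
# Sketch of source-based commit points: with a commit interval of 10,000
# and 50,000 source rows, commits fire at each multiple of the interval.
def commit_points(total_rows, commit_interval):
    """Return the cumulative row counts at which commits occur."""
    points = list(range(commit_interval, total_rows + 1, commit_interval))
    if not points or points[-1] != total_rows:
        points.append(total_rows)  # final commit for the remaining rows
    return points

print(commit_points(50_000, 10_000))  # [10000, 20000, 30000, 40000, 50000]
print(commit_points(25_000, 10_000))  # [10000, 20000, 25000]
```

A target-based commit cannot be predicted this way from the source row count alone, since it depends on when the target buffer fills.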



How do you read rejected (bad) data from the bad file and reload it to the target?
Correct the rejected data and send it to the target relational tables using the reject loader utility. Identify the rejected data by using the column indicator and row indicator in the bad file.

If you modify a table in the back end, does it reflect in the Informatica warehouse, Mapping Designer, or Source Analyzer?
No. Informatica is not directly aware of back-end database changes; it displays the information stored in the repository. If you want back-end changes reflected on the Informatica screens, you must re-import the definitions from the back end through a valid connection and replace the existing definitions with the imported ones.

How can you work with a remote database in Informatica? Did you work directly using remote connections?
To work with a remote data source, you connect to it with remote connections. However, it is not preferable to work with a remote source directly; instead, bring that source onto the local machine where the Informatica server resides. If you work directly with a remote source, session performance decreases because less data can pass across the network in a given time.

What is the PowerCenter repository?
The PowerCenter repository allows you to share metadata across repositories to create a data mart domain. In a data mart domain, you can create a single global repository to store metadata used across an enterprise, and a number of local repositories to share the global metadata as needed.

What are the types of metadata stored in the repository?
The repository stores the following types of metadata: database connections, global objects, mappings, mapplets, multidimensional metadata, reusable transformations, sessions and batches, shortcuts, source definitions, target definitions, and transformations.

Define the Informatica repository.
The Informatica repository is a relational database that stores information, or metadata, used by the Informatica server and client tools. Metadata can include information such as mappings describing how to transform source data, sessions indicating when you want the Informatica server to perform the transformations, and connect strings for sources and targets. The repository also stores administrative information such as usernames and passwords, permissions and privileges, and the product version. Use the Repository Manager to create the repository; it connects to the repository database and runs the code needed to create the repository tables, which store metadata in the specific format the Informatica server and client tools use.

46

Informatica Interview Questions & Answers


We can not include source definitions in reusable transformations.But we can add sources to a maplet. Whole transformation logic will be hided in case of maplet.But it is transparent in case of reusable transformation. We cant use COBOL source qualifier,joiner,normalizer transformations in maplet.Where as we can make them as a reusable transformations. What is parameter file? Parameter file is to define the values for parameters and variables used in a session.A parameter file is a file created by text editor such as word pad or notepad. U can define the following values in parameter file Maping parameters Maping variables session parameters Can u start a batches with in a batch? U can not. If u want to start batch that resides in a batch,create a new independent batch and copy the necessary sessions into the new batch. What is batch and describe about types of batches? Grouping of session is known as batch.Batches r two types Sequential: Runs sessions one after the other Concurrent: Runs session at same time. If u have sessions with source-target dependencies u have to go for sequential batch to start the sessions one after another.If u have several independent sessions u can use concurrent batches. Whch runs all the sessions at the same time. Can u copy the session to a different folder or repository? Yes. By using copy session wizard u can copy a session in a different folder or repository.But that target folder or repository should consists of mapping of that session. If target folder or repository is not having the maping of copying session , u should have to copy that maping first before u copy the session. What r the different threads in DTM process? Master thread: Creates and manages all other threads Maping thread: One maping thread will be creates for each session.Fectchs session and maping information. Pre and post session threads: This will be created to perform pre and post session operations. 
Reader thread: One thread will be created for each partition of a source.It reads data from source. Writer thread: It will be created to load data to the target. Transformation thread: It will be created to tranform data.

What r the tasks that Loadmanger process will do? Manages the session and batch scheduling: Whe u start the informatica server the load maneger launches and queries the repository for a list of sessions configured to run on the informatica server.When u configure the session the loadmanager

47

Informatica Interview Questions & Answers


maintains list of list of sessions and session start times.When u sart a session loadmanger fetches the session information from the repository to perform the validations and verifications prior to starting DTM process. Locking and reading the session: When the informatica server starts a session lodamaager locks the session from the repository.Locking prevents U starting the session again and again. Reading the parameter file: If the session uses a parameter files,loadmanager reads the parameter file and verifies that the session level parematers are declared in the file Verifies permission and privelleges: When the sesson starts load manger checks whether or not the user have privelleges to run the session. Creating log files: Loadmanger creates logfile contains the status of session. How the informatica server increases the session performance through partitioning the source? For a relational sources informatica server creates multiple connections for each parttion of a single source and extracts seperate range of data for each connection.Informatica server reads multiple partitions of a single source concurently.Similarly for loading also informatica server creates multiple connections to the target and loads partitions of data concurently. For XML and file sources,informatica server reads multiple files concurently.For loading the data informatica server creates a seperate file for each partition(of a source file).U can choose to merge the targets. Why we use partitioning the session in informatica? Performance can be improved by processing data in parallel in a single session by creating multiple partitions of the pipeline. Informatica server can achieve high performance by partitioning the pipleline and performing the extract , transformation, and load for each partition in parallel. Which tool U use to create and manage sessions and batches and to monitor and stop the informatica server? Informatica server manager. 
How can you recognise whether or not newly added rows in the source get inserted in the target?
In a Type 2 mapping we have three options to recognise newly added rows:
Version number
Flag value
Effective date range

What are the different types of Type 2 dimension mapping?
Type 2 Dimension/Version Data mapping: In this mapping, a dimension updated in the source gets inserted into the target along with a new version number, and a dimension newly added in the source is inserted into the target with a primary key.
Type 2 Dimension/Flag Current mapping: This mapping is also used for slowly changing dimensions. In addition, it creates a flag value for changed or new dimensions. The flag indicates whether the dimension is new or newly updated. Recent dimensions are saved with the current flag value 1, and updated dimensions are saved with the value 0.
Type 2 Dimension/Effective Date Range mapping: This is another flavour of Type 2 mapping used for slowly changing dimensions. This mapping also inserts both new and changed dimensions into the target, and changes are tracked by the effective date range for each version of each dimension.
What are the types of mapping in the Getting Started Wizard?



Simple Pass Through mapping: Loads a static fact or dimension table by inserting all rows. Use this mapping when you want to drop all existing data from your table before loading new data.
Slowly Growing Target mapping: Loads a slowly growing fact or dimension table by inserting new rows. Use this mapping to load new data when existing data does not require updates.

What are the types of mapping wizards provided in Informatica?
The Designer provides two mapping wizards to help you create mappings quickly and easily. Both wizards are designed to create mappings for loading and maintaining star schemas, a series of dimensions related to a central fact table.
Getting Started Wizard: Creates mappings to load static fact and dimension tables, as well as slowly growing dimension tables.
Slowly Changing Dimensions Wizard: Creates mappings to load slowly changing dimension tables based on the amount of historical dimension data you want to keep and the method you choose to handle historical dimension data.

What are the options in the target session of the Update Strategy transformation?
Insert
Delete
Update (Update as Update, Update as Insert, Update else Insert)
Truncate table

What is the Update Strategy transformation?
The model you choose constitutes your update strategy: how to handle changes to existing rows. In PowerCenter and PowerMart, you set your update strategy at two different levels:
Within a session: When you configure a session, you can instruct the Informatica Server to either treat all rows in the same way (for example, treat all rows as inserts), or use instructions coded into the session mapping to flag rows for different database operations.
Within a mapping: Within a mapping, you use the Update Strategy transformation to flag rows for insert, delete, update, or reject.
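The row-flagging idea behind the Update Strategy transformation (and how a Type 2 mapping uses it) can be sketched in a few lines of Python: each source row is compared against the current target row and tagged with a constant analogous to Informatica's DD_INSERT/DD_UPDATE/DD_REJECT. The employee rows and column names here are hypothetical, and a real mapping would also stamp the version number, current flag, or effective dates discussed above.

```python
# Constants analogous to Informatica's update-strategy flags.
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3

def flag_rows(source_rows, target_index):
    """Flag each source row, as an Update Strategy expression would.

    target_index maps a natural key to the current target row. In a
    Type 2 mapping a changed row produces DD_UPDATE for the old version
    (to close it out) plus DD_INSERT for the new version, so history is
    kept instead of overwritten.
    """
    flagged = []
    for row in source_rows:
        current = target_index.get(row["emp_id"])
        if current is None:
            flagged.append((DD_INSERT, row))       # brand-new dimension row
        elif current["dept"] != row["dept"]:
            flagged.append((DD_UPDATE, current))   # close out the old version
            flagged.append((DD_INSERT, row))       # insert the new version
        else:
            flagged.append((DD_REJECT, row))       # unchanged: skip the row
    return flagged

target = {101: {"emp_id": 101, "dept": "Sales"}}
source = [{"emp_id": 101, "dept": "Marketing"},    # changed dimension
          {"emp_id": 102, "dept": "Finance"}]      # new dimension
for flag, row in flag_rows(source, target):
    print(flag, row["emp_id"])
```

The session-level setting ("treat all rows as inserts") corresponds to skipping this comparison entirely and tagging every row with the same flag.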

What is the default join that the Source Qualifier provides? By default, the Source Qualifier joins sources with an inner join (an equijoin on the primary key-foreign key relationship). Note that the join types Normal, Master Outer, Detail Outer, and Full Outer, set on the Properties tab, belong to the Joiner transformation, not the Source Qualifier; the Joiner's default is Normal.

What is the target load order? You specify the target load order based on the source qualifiers in a mapping. If you have multiple source qualifiers connected to multiple targets, you can designate the order in which the Informatica server loads data into the targets.

What are the tasks that the Source Qualifier performs?
# Join data originating from the same source database. You can join two or more tables with primary key-foreign key relationships by linking the sources to one Source Qualifier.
# Filter records when the Informatica Server reads source data. If you include a filter condition, the Informatica Server adds a WHERE clause to the default query.
# Specify an outer join rather than the default inner join. If you include a user-defined join, the Informatica Server replaces the



join information specified by the metadata in the SQL query.
# Specify sorted ports. If you specify a number for sorted ports, the Informatica Server adds an ORDER BY clause to the default SQL query.
# Select only distinct values from the source. If you choose Select Distinct, the Informatica Server adds a SELECT DISTINCT statement to the default SQL query.
# Create a custom query to issue a special SELECT statement for the Informatica Server to read source data. For example, you might use a custom query to perform aggregate calculations or execute a stored procedure.
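The way these options alter the generated query can be sketched as a tiny query builder. This is only an illustration of the principle, not PowerCenter's actual SQL generation, and the table and column names are made up:

```python
def build_source_qualifier_query(table, columns, source_filter=None,
                                 sorted_ports=0, select_distinct=False):
    """Mimic how Source Qualifier options change the default SELECT."""
    select = "SELECT DISTINCT" if select_distinct else "SELECT"
    sql = f"{select} {', '.join(columns)} FROM {table}"
    if source_filter:            # filter condition -> WHERE clause
        sql += f" WHERE {source_filter}"
    if sorted_ports:             # sorted ports -> ORDER BY on the first N ports
        sql += " ORDER BY " + ", ".join(columns[:sorted_ports])
    return sql

q = build_source_qualifier_query(
    "EMPLOYEES", ["EMP_ID", "DEPT_ID", "SALARY"],
    source_filter="SALARY > 50000", sorted_ports=1)
print(q)
# SELECT EMP_ID, DEPT_ID, SALARY FROM EMPLOYEES WHERE SALARY > 50000 ORDER BY EMP_ID
```

A custom query, by contrast, replaces this generated statement entirely rather than amending it clause by clause.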
