
ETL Data Warehousing Interview

What are the various methods of getting incremental records or delta records from the source systems?
Incremental (delta) records can be captured from source systems in several ways: filtering on a timestamp or audit column (such as a last-modified date), using database triggers or change data capture, or comparing the current extract against the previous one when the source keeps no change history. Within Informatica, a mapping variable can hold the last successful run time to drive the filter, and incremental aggregation can be used so that only the new rows are processed.
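A timestamp or audit column on the source is the most common way to get delta records. A minimal sketch, assuming a hypothetical orders table with a last_modified column (all names here are illustrative, not from the original text):

```python
import sqlite3

# Hypothetical source table with a last_modified audit column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL, last_modified TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10.0, "2024-01-01"), (2, 20.0, "2024-02-15"), (3, 30.0, "2024-03-01")],
)

def extract_delta(conn, last_run):
    """Pull only the rows changed since the previous successful ETL run."""
    cur = conn.execute(
        "SELECT id, amount FROM orders WHERE last_modified > ?", (last_run,)
    )
    return cur.fetchall()

# Only rows modified after the last run date are extracted (ids 2 and 3).
delta = extract_delta(conn, "2024-02-01")
```

In a real mapping, the `last_run` value would come from a persisted mapping variable or a control table, updated after each successful load.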

How can we use mapping variables in Informatica? Where do we use them?
After creating a variable, we can use it in any expression in a mapping or mapplet. They can also be used in a Source Qualifier filter, in user-defined joins or extract overrides, and in the expression editor of reusable transformations.
Their values persist in the repository and can change automatically between session runs.

Do we need an ETL tool? When do we go for the tools in the market?
ETL tools are meant to extract, transform, and load data into the data warehouse for decision making. Before ETL tools evolved, this process was done manually with SQL code written by programmers. That task was tedious and cumbersome in many cases, since it involved many resources, complex coding, and long work hours; on top of that, maintaining the code was a great challenge.
ETL tools eliminate these difficulties. Compared with the manual method, they are powerful and offer many advantages in every stage of the ETL process, from extraction, data cleansing, and data profiling through transformation, debugging, and loading into the data warehouse.

How do we extract SAP data using Informatica? What is ABAP?
What are IDocs?
ABAP is SAP's proprietary programming language; IDocs (Intermediate Documents) are SAP's standard document format for exchanging data between systems.
To import an SAP source, go to the Source Analyzer and click Sources; you will get the option 'Import from SAP'. Click it, then give your SAP user, client, and password, and supply the table name as the filter criterion (so the import takes less time). After connecting, import the SAP source.
One important point: after finishing the mapping, save it and generate the ABAP code for the mapping. Only then will the workflow run correctly.

What are active transformations / passive transformations?
Active transformations: transformations that can change the number of records between input and output.
Passive transformations: transformations that do not change the number of records between input and output.
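The distinction can be sketched in a few lines (the function names are illustrative): a Filter-style transformation is active because rows can be dropped, while an Expression-style transformation is passive because every input row produces exactly one output row.

```python
# Active transformation: can change the number of rows (like a Filter).
def filter_rows(rows, predicate):
    return [r for r in rows if predicate(r)]

# Passive transformation: one output row per input row (like an Expression).
def uppercase_names(rows):
    return [{**r, "name": r["name"].upper()} for r in rows]

rows = [{"name": "alice", "amount": 5}, {"name": "bob", "amount": 50}]

active_out = filter_rows(rows, lambda r: r["amount"] > 10)  # drops a row
passive_out = uppercase_names(rows)                          # still 2 rows
```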

What is Informatica metadata and where is it stored?
Informatica metadata is nothing but the source definitions, target definitions, and the transformations that have been built. It is stored in the repository, accessed through the Repository Server, and managed with the Repository Manager client tool.

How do we call shell scripts from informatica?
You can use a Command task to call the shell scripts, in the following ways:
1. Standalone Command task. You can use a Command task anywhere in the workflow or worklet to run shell commands.
2. Pre- and post-session shell command. You can call a Command task as the pre- or post-session shell command for a Session task, configured in the session properties.

What is the latest version of PowerCenter / PowerMart?
The latest version of PowerCenter is 7.1. Informatica has stopped marketing the PowerMart product, so PowerMart is no longer available in the market.

Can we override a native SQL query within Informatica? Where do we do it? How do we do it?
Yes. We can override the generated SQL query in the SQL override (SQL Query) property of a Source Qualifier transformation.
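The idea behind a SQL override is simply replacing the tool-generated default query with hand-written SQL, for example to push a join and a filter down to the source database. A minimal sketch using SQLite (the tables and the override query are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp  (id INTEGER, name TEXT, dept_id INTEGER);
CREATE TABLE dept (id INTEGER, dept_name TEXT);
INSERT INTO emp VALUES (1, 'alice', 10), (2, 'bob', 20);
INSERT INTO dept VALUES (10, 'sales'), (20, 'hr');
""")

# Default query the tool would generate: a plain SELECT of the source columns.
default_query = "SELECT id, name, dept_id FROM emp"

# Override: hand-written SQL with a join and a filter, replacing the default.
override_query = """
SELECT e.id, e.name, d.dept_name
FROM emp e JOIN dept d ON e.dept_id = d.id
WHERE d.dept_name = 'sales'
"""

all_rows = conn.execute(default_query).fetchall()   # every source row
rows = conn.execute(override_query).fetchall()      # joined, filtered rows
```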

Can we use procedural logic inside Informatica? If yes, how? If not, how can we use external procedural logic in Informatica?
We can use the External Procedure transformation to call external procedures. Both COM and Informatica procedures are supported through the External Procedure transformation.

Techniques of error handling: ignore, reject bad records to a flat file, or load the records with default values and review them.
Records are rejected either at the database, due to constraint or key violations, or by the Informatica server when writing data into the target table. These rejected records can be found in the bad-files folder, where a reject file is created for each session, and there we can check why a record was rejected. In the bad file, the first column of each row is a row indicator, and each data value is preceded by a column indicator.
The column indicators are of four types:
D - valid data,
O - overflowed data,
N - null data,
T - truncated data.
Depending on these indicators, we can correct the data and load it successfully into the target.
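A simplified parser for such a reject file might look like this. The exact layout assumed here, a row indicator followed by alternating column-indicator/value pairs on a comma-delimited line, is an illustration, not the verbatim Informatica format:

```python
# Meanings of the column indicators described above.
COLUMN_INDICATORS = {
    "D": "valid data",
    "O": "overflowed data",
    "N": "null data",
    "T": "truncated data",
}

def summarize_bad_row(line):
    """Return the row indicator and the list of column problems for one line."""
    fields = line.strip().split(",")
    row_indicator = fields[0]
    problems = [
        COLUMN_INDICATORS[ind]
        for ind in fields[1::2]          # every second field is an indicator
        if ind in COLUMN_INDICATORS and ind != "D"
    ]
    return row_indicator, problems

# Example reject line: row indicator, then indicator/value pairs.
row_ind, problems = summarize_bad_row("0,D,1001,N,,T,LONG_NAM")
```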


ETL Data Warehousing Interview Questions

Explain why and where exactly we use the Lookup transformation.
You can use the Lookup transformation to perform many tasks, including:
1. Get a related value. For example, your source includes an employee ID, but you want to include the employee name in your target table to make your summary data easier to read.
2. Perform a calculation. Many normalized tables include values used in a calculation, such as gross sales per invoice or sales tax, but not the calculated value (such as net sales).
3. Update slowly changing dimension tables. You can use a Lookup transformation to determine whether rows already exist in the target.
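The first use, getting a related value, amounts to enriching each source row from a lookup cache. A minimal sketch with hypothetical names:

```python
# Lookup cache built from the dimension: employee id -> employee name.
employees = {101: "Alice", 102: "Bob"}

source_rows = [
    {"emp_id": 101, "sales": 500},
    {"emp_id": 102, "sales": 300},
    {"emp_id": 999, "sales": 100},   # no match in the lookup
]

# Enrich each row with the related value; unmatched keys yield None,
# analogous to a lookup returning NULL when no row is found.
enriched = [
    {**row, "emp_name": employees.get(row["emp_id"])}
    for row in source_rows
]
```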

How do you tell the Aggregator that input data is already sorted?
By enabling the Sorted Input property in the Aggregator transformation's properties.

What are push and pull ETL strategies?
Push and pull strategies determine how data moves from the source system to the ETL server.
Push: the source system pushes (sends) data to the ETL server.
Pull: the ETL server pulls (gets) data from the source system.

What is the difference between an ODS and a staging area?
Staging area: the temporary work area where extracted data first lands. It holds the data while complex business rules and calculations are applied, and it carries the associated metadata. In other words, it is a temp work area.
ODS: an Operational Data Store. It comes after the staging area and contains cleansed data only, typically held at a finer granularity than the warehouse for a short period, about 30 to 90 days.
For example, suppose we have day-level granularity in the OLTP system and year-level granularity in the data warehouse. If the business (a manager) asks for week-level granularity, we would have to go back to the OLTP system and summarize the day-level data up to week level, which would be painstaking. So what we do is maintain week-level granularity in the ODS for about 30 to 90 days.
  
How do you capture changes in data if the source system has no date/time field in the source table from which you need to extract the data?
The DW database can be Oracle or Teradata. The requirement is to pull data from the source system, and the ETL must devise a mechanism to identify changed or new records. The source system can be a legacy system such as an AS/400 or mainframe application, and the ETL can be Informatica, DataStage, or custom ETL code. Common methods of data capture in this situation: compare the current full extract against the previous one (matching on the key and a checksum of the remaining columns), use database triggers or log-based change data capture where the platform supports them, or have the source application emit change events.

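When the source keeps no change history at all, one widely used method is full-extract comparison: match today's extract against yesterday's on the key, and hash the non-key columns to detect updates. A minimal sketch (the table layout is illustrative):

```python
import hashlib

def row_hash(row):
    """Checksum of the non-key columns, used to detect changed rows."""
    return hashlib.md5("|".join(str(v) for v in row[1:]).encode()).hexdigest()

def detect_changes(previous, current):
    """Compare the current full extract with the previous one by key and hash."""
    prev = {r[0]: row_hash(r) for r in previous}
    inserts = [r for r in current if r[0] not in prev]
    updates = [r for r in current if r[0] in prev and row_hash(r) != prev[r[0]]]
    return inserts, updates

prev_extract = [(1, "alice", 100), (2, "bob", 200)]
curr_extract = [(1, "alice", 100), (2, "bob", 250), (3, "carol", 300)]

inserts, updates = detect_changes(prev_extract, curr_extract)
```

On large tables the previous extract's hashes would be kept in a persistent comparison table rather than rebuilt each run.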
If a lookup on the target table is taken, can we update the rows without an Update Strategy transformation?
Yes, by using a dynamic lookup cache, which flags each row as a new or changed row as it passes through the lookup.

In what scenarios is ETL coding preferred over database-level SQL / PL/SQL coding?
When the data scrubbing process is difficult to express in the database. For example, a source file contains a date column such as 20070823, but the data warehouse requires the date as 08/23/2007; reformatting and validating such values is easier in the ETL tool.
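That kind of date scrubbing is a one-liner in most ETL or scripting environments; for example, in Python:

```python
from datetime import datetime

def scrub_date(raw):
    """Convert a source date like '20070823' to the target format '08/23/2007'."""
    return datetime.strptime(raw, "%Y%m%d").strftime("%m/%d/%Y")

result = scrub_date("20070823")  # "08/23/2007"
```

Using strptime also validates the value: an impossible date such as 20071345 raises an error instead of loading silently.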