
ETL Data Warehousing Interview Questions

What are the various methods of getting incremental records or delta records from the source systems?
Incremental or delta records can be extracted from the source systems in several ways: by filtering on a last-modified timestamp or date column, by using change data capture (CDC) on the source, or by comparing the source against the previous load. Within Informatica, incremental aggregation can also be enabled at the session level so that only the newly captured source rows are processed.
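A minimal sketch of the timestamp-based method, using SQLite and invented table and column names as a stand-in for a real source system:

```python
import sqlite3

# Toy stand-in for a source system: the table and column names here are
# assumptions for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL, last_modified TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10.0, "2024-01-05"), (2, 20.0, "2024-02-10"), (3, 30.0, "2024-03-15")],
)

# Watermark captured at the end of the previous load.
last_run_ts = "2024-02-01"

# Delta extraction: pull only the rows modified after the previous run.
delta = conn.execute(
    "SELECT * FROM orders WHERE last_modified > ?", (last_run_ts,)
).fetchall()
print(delta)  # only the rows changed since 2024-02-01
```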

How can we use mapping variables in Informatica? Where do we use them?
After creating a variable, we can use it in any expression in a mapping or a mapplet. They can also be used in Source Qualifier filters, user-defined joins or SQL overrides, and in the expression editor of reusable transformations.
Their values persist in the repository and can change automatically between session runs.
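A rough file-based Python analogue of that persistence behaviour, with a plain file standing in for the repository (the file name and default value are assumptions):

```python
import os

STATE_FILE = "last_run_value.txt"   # stand-in for the repository store
DEFAULT = "1970-01-01"              # initial value, like a variable's default

def read_variable():
    """Read the persisted value, falling back to the default on the first run."""
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return f.read().strip()
    return DEFAULT

def save_variable(value):
    """Persist the new value for the next run (like the repository saving a variable's final value at session end)."""
    with open(STATE_FILE, "w") as f:
        f.write(value)

current = read_variable()
print("Filter condition for this run: last_modified >", current)
save_variable("2024-03-15")          # value carried forward to the next run
```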

Do we need an ETL tool? When do we go for the tools in the market?
ETL tools are meant to extract, transform and load data into the data warehouse for decision making. Before the evolution of ETL tools, this ETL process was done manually with SQL code written by programmers. That task was tedious and cumbersome in many cases, since it involved many resources, complex coding and more work hours. On top of that, maintaining the code posed a great challenge to the programmers.
ETL tools eliminate these difficulties: they are very powerful and offer many advantages in all stages of the ETL process, from extraction, data cleansing, data profiling and transformation through debugging and loading into the data warehouse, compared to the old method.

How do we extract SAP data Using Informatica? What is ABAP?
What are IDOCS?
In the Source Analyzer, click on Sources and choose the 'Import from SAP' option. Provide your SAP user, client and password, and give the table name as the filter criterion (so the import takes less time). After connecting, import the SAP source.
One important point: after finishing the mapping, save it and generate the ABAP code for the mapping; only then will the workflow run correctly.
ABAP is SAP's proprietary programming language; the generated ABAP program is what extracts the data on the SAP side. IDocs (Intermediate Documents) are SAP's standard document format for exchanging data between SAP and external systems.

What are active transformations / passive transformations?
Active transformations: transformations that can change the number of records between the input and the output (for example, Filter, Router and Aggregator).
Passive transformations: transformations that do not change the number of records between the input and the output (for example, Expression and Sequence Generator).
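A toy illustration of the difference, with made-up rows: a filter-style step is active because the output row count can differ from the input, while an expression-style step is passive because every input row yields exactly one output row:

```python
rows = [
    {"id": 1, "amount": 120},
    {"id": 2, "amount": 45},
    {"id": 3, "amount": 300},
]

# Active-style transformation: the number of output rows can differ
# from the number of input rows (here, a filter keeps amounts >= 100).
filtered = [r for r in rows if r["amount"] >= 100]
print(len(rows), "->", len(filtered))        # 3 -> 2

# Passive-style transformation: one output row per input row
# (here, an expression that derives a new column).
with_tax = [{**r, "amount_with_tax": r["amount"] * 1.1} for r in rows]
print(len(rows), "->", len(with_tax))        # 3 -> 3
```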

What is Informatica Metadata and where is it stored?
Informatica metadata is nothing but the source definitions, target definitions and the transformations that have been built. It is stored in the repository and is managed through the Repository Manager of the Informatica client tools.

How do we call shell scripts from informatica?
You can use a Command task to call the shell scripts, in the following ways:
1. Standalone Command task. You can use a Command task anywhere in the workflow or worklet to run shell commands.
2. Pre- and post-session shell command. You can call a Command task as the pre- or post-session shell command for a Session task (a small example script is sketched below).
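A minimal sketch, assuming a hypothetical source-file path, of the kind of Python script a pre-session Command task might invoke to verify that the source file has arrived before the session runs:

```python
import sys
from pathlib import Path

# Hypothetical path: the source file checked here is an example only.
SOURCE_FILE = Path("/data/incoming/orders.csv")

if not SOURCE_FILE.exists():
    print(f"Source file {SOURCE_FILE} not found; aborting load.")
    sys.exit(1)   # non-zero exit code signals failure to the calling task
print(f"Source file {SOURCE_FILE} found; session can proceed.")
sys.exit(0)
```

Whether a non-zero exit code actually stops the session depends on how the task's failure options are configured in the workflow.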

What is latest version of Power Center / Power Mart?
The latest version of PowerCenter is 7.1. Informatica has stopped marketing the PowerMart product, so PowerMart is no longer available in the market.

Can we override a native sql query within Informatica? Where do we do it? How do we do it?
Yes, we can override the generated SQL query in the SQL Query (SQL override) property of the Source Qualifier transformation.
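To make this concrete, the sketch below contrasts a default generated query with an override that adds a join and a filter; the table and column names are invented, and in practice the override text is pasted into the Source Qualifier's SQL Query property rather than executed from Python:

```python
# Default query the Source Qualifier would generate from the connected ports
# (table and column names are hypothetical).
default_query = """
SELECT customers.id, customers.name, customers.region_id
FROM customers
"""

# Override query: same output columns, but with an extra join and filter
# pushed down to the source database instead of being done inside the mapping.
override_query = """
SELECT customers.id, customers.name, customers.region_id
FROM customers
JOIN regions ON regions.id = customers.region_id
WHERE regions.active = 1
"""

print(override_query.strip())
```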

Can we use procedural logic inside Informatica? If yes, how? If not, how can we use external procedural logic in Informatica?
We can use the External Procedure transformation to call external procedures. Both COM and Informatica external procedures are supported by the External Procedure transformation.

Techniques of Error Handling – Ignore, Rejecting bad records to a flat file, Loading the records and reviewing them (default values)
Records are rejected either by the database, due to constraint or key violations, or by the Informatica server while writing data into the target table. These rejected records can be found in the bad files folder, where a reject file is created for each session, and there we can check why a record was rejected. In this bad file the first column is a row indicator and the second column is a column indicator.
The column indicators are of four types:
D - valid data
O - overflowed data
N - null data
T - truncated data
Depending on these indicators we can make changes and load the data successfully into the target.
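A small sketch of scanning such a reject file, assuming a comma-delimited .bad file in which the first field of each line is the row indicator and the D/O/N/T column indicators appear alongside the column values (the file name and exact layout are assumptions for illustration):

```python
from collections import Counter

INDICATOR_MEANING = {
    "D": "valid data",
    "O": "overflowed data",
    "N": "null data",
    "T": "truncated data",
}

def summarize_bad_file(path):
    """Count how often each column indicator appears in a session's reject file."""
    counts = Counter()
    with open(path) as f:
        for line in f:
            fields = line.rstrip("\n").split(",")
            # fields[0] is the row indicator; column indicators follow,
            # interleaved with the column values (assumed layout).
            for field in fields[1:]:
                if field in INDICATOR_MEANING:
                    counts[field] += 1
    return counts

# Example (hypothetical file name):
# print(summarize_bad_file("s_m_load_orders.bad"))
```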


ETL Data Warehousing Interview Questions 2

What is a staging area? Do we need it? What is the purpose of a staging area?
A staging area is a place where temporary tables are held on the data warehouse server. Staging tables are connected to the work area or to the fact tables. We basically need a staging area to hold the data and to perform data cleansing and merging before loading the data into the warehouse.

What are the modules in PowerMart?
1. PowerMart Designer
2. Server
3. Server Manager
4. Repository
5. Repository Manager

What is a three-tier data warehouse?
The 3 tiers are:
1. Data tier – bottom tier – consists of the database
2. Application tier – middle tier – consists of the analytical server
3. Presentation tier – top tier – the tier that interacts with the end user

What are the various transformations available?
The various types of transformations in Informatica are:
Source Qualifier
Aggregator
Sequence Generator
Sorter
Router
Filter
Lookup
Update Strategy
Joiner
Normalizer
Expression
How to determine what records to extract?
When addressing a table, some dimension key must reflect the need for a record to get extracted. Mostly it will be driven by the time dimension (e.g. date >= the first of the current month) or a transaction status flag (e.g. order invoiced status). A foolproof approach would be adding an archive flag to the record, which gets reset whenever the record changes.
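A small sketch of the time-dimension criterion mentioned above: compute the first day of the current month and use it as the extraction cutoff (the table and column names are invented):

```python
from datetime import date

# First day of the current month, used as the extraction cutoff.
today = date.today()
cutoff = today.replace(day=1)

# Hypothetical extraction query built from the cutoff.
query = f"SELECT * FROM orders WHERE order_date >= '{cutoff.isoformat()}'"
print(query)
```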

Data Warehousing Interview Questions and Answers

What is Data warehousing?
A data warehouse can be considered as a storage area where interest-specific or relevant data is stored, irrespective of the source. The work actually required to create and maintain such a data warehouse is what we call data warehousing. Data warehousing merges data from multiple sources into an easy and complete form.
What are fact tables and dimension tables?
As mentioned, data in a warehouse comes from the transactions. Fact table in a data warehouse consists of facts and/or measures. The nature of data in a fact table is usually numerical.
On the other hand, dimension table in a data warehouse contains fields used to describe the data in fact tables. A dimension table can provide additional and descriptive information (dimension) of the field of a fact table.
e.g. If I want to know the number of resources used for a task, my fact table will store the actual measure (of resources) while my Dimension table will store the task and resource details.
Hence, the relation between a dimension table and a fact table is one to many: one dimension record can describe many fact records.
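Continuing the task/resource example, a tiny in-memory illustration of that relationship; the keys and values are made up:

```python
# Dimension table: descriptive attributes, one row per task (hypothetical data).
dim_task = {
    101: {"task_name": "Build ETL mapping", "department": "BI"},
    102: {"task_name": "Load warehouse", "department": "Ops"},
}

# Fact table: numeric measures, each row pointing at a dimension key.
fact_resource_usage = [
    {"task_key": 101, "resources_used": 3},
    {"task_key": 101, "resources_used": 5},   # many facts per dimension row
    {"task_key": 102, "resources_used": 2},
]

# "Join" the facts to their dimension to report resources per task.
for fact in fact_resource_usage:
    task = dim_task[fact["task_key"]]
    print(task["task_name"], "->", fact["resources_used"], "resources")
```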
What is ETL process in data warehousing?
ETL stands for Extraction, Transformation and Loading. It means extracting data from different sources such as flat files, databases or XML data, transforming this data depending on the application's needs, and loading it into the data warehouse.
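A compact sketch of the three steps using only the standard library: extract rows from a flat-file source, transform them (cleanse a missing value and derive a tax amount), and load them into a warehouse table, with SQLite standing in for the warehouse; the column names and figures are invented:

```python
import csv
import io
import sqlite3

# Extract: read rows from a flat-file source (in-memory CSV stands in for a real file).
source = io.StringIO("id,amount\n1,100\n2,\n3,250\n")
rows = list(csv.DictReader(source))

# Transform: cleanse missing amounts and derive a tax column.
transformed = []
for r in rows:
    amount = float(r["amount"]) if r["amount"] else 0.0   # default missing values
    transformed.append((int(r["id"]), amount, round(amount * 0.1, 2)))

# Load: insert the transformed rows into the warehouse table.
dw = sqlite3.connect(":memory:")
dw.execute("CREATE TABLE sales_fact (id INTEGER, amount REAL, tax REAL)")
dw.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", transformed)
print(dw.execute("SELECT * FROM sales_fact").fetchall())
```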
Explain the difference between data mining and data warehousing?
Data mining is a method of analysing large amounts of data for the purpose of finding patterns. Data mining is normally used for models and forecasting. It is the process of discovering correlations and patterns by sifting through large data repositories using pattern-recognition techniques.
Data warehousing is the central repository for the data of the several business systems in an enterprise. Data from various sources is extracted and organized in the data warehouse selectively for analysis and accessibility.
What is an OLTP system and OLAP system?
OLTP stands for OnLine Transaction Processing. Applications that support and manage transactions involving high volumes of data are supported by an OLTP system. OLTP is based on client-server architecture and supports transactions across networks.
OLAP stands for OnLine Analytical Processing. Business data analysis and complex calculations on low volumes of data are performed by OLAP. With the support of OLAP, a user can gain an insight into data coming from various sources.
What are cubes?
Multi-dimensional data is logically represented by cubes in data warehousing. The dimensions and the data are represented by the edges and the body of the cube respectively. OLAP environments view the data in the form of a hierarchical cube. A cube typically includes the aggregations that are needed for business intelligence queries.
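A toy illustration of the pre-computed aggregations a cube can hold: sales facts rolled up along a region dimension and a month dimension (all values invented):

```python
from collections import defaultdict

# Fact rows: (region, month, sales) with invented values.
facts = [
    ("North", "2024-01", 100),
    ("North", "2024-02", 150),
    ("South", "2024-01", 80),
    ("South", "2024-02", 120),
]

# Aggregate along the (region, month) edges of the cube.
cube = defaultdict(float)
for region, month, sales in facts:
    cube[(region, month)] += sales      # individual cell of the cube
    cube[(region, "ALL")] += sales      # roll-up across months
    cube[("ALL", month)] += sales       # roll-up across regions
    cube[("ALL", "ALL")] += sales       # grand total

print(cube[("North", "ALL")])   # 250.0, answered from the pre-aggregated cube
```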