Data Warehousing Interview Questions and Answers
A. Data warehousing means querying, reporting and analyzing the information from the database.
2. What is the benefits of data warehouse?
A. A data warehouse helps to integrate data (see Data integration) and store them historically so that we can analyze different aspects of business including, performance analysis, trend, prediction etc. over a given time frame and use the result of our analysis to improve the efficiency of business processes.
3. What is ODS?
A. ODS means Operational Data Store. A collection of operation or bases data that is extracted from operation databases and standardized, cleansed, consolidated, transformed, and loaded into an enterprise data architecture. An ODS is used to support data mining of operational data, or as the store for base data that is summarized for a data warehouse.
4. What is a lookup table?
A. A lookUp table is the one which is used when updating a warehouse. When the lookup is placed on the target table (fact table / warehouse) based upon the primary key of the target, it just updates the table by allowing only new records or updated records based on the lookup condition.
5. What is a dimension table?
A. A dimensional table is a collection of hierarchies and categories along which the user can drill down and drill up. it contains only the textual attributes.
6. Why should you put your data warehouse on a different system than your OLTP system?
A. A OLTP system is basically “data oriented” (ER model) and not “Subject oriented “(Dimensional Model) .That is why we design a separate system that will have a subject oriented OLAP system…
Moreover if a complex query is fired on a OLTP system will cause a heavy overhead on the OLTP server that will affect the day to day business directly.
7. Why is Data Modeling Important?
A. The goal of the data model is to make sure that the all data objects required by the database are completely and accurately represented. Because the data model uses easily understood notations and natural language, it can be reviewed and verified as correct by the end-users.
8. What is data mining?
A. Data mining is a process of extracting hidden trends within a data warehouse. For example an insurance dataware house can be used to mine data for the most high risk people to insure in a certain geographical area.
9. What is ETL?
A. ETL stands for extraction, transformation and loading. ETL provide developers with an interface for designing source-to-target mappings, transformation and job control parameter.
10. What is the Difference between OLTP and OLAP?
A. Main Differences between OLTP and OLAP are:-
1. User and System Orientation
OLTP: customer-oriented, used for data analysis and querying by clerks, clients and IT professionals.
OLAP: market-oriented, used for data analysis by knowledge workers( managers, executives, analysis).
2. Data Contents
OLTP: manages current data, very detail-oriented.
OLAP: manages large amounts of historical data, provides facilities for summarization and aggregation, stores information at different levels of granularity to support decision making process.
11. What are Data Marts?
A. Data Marts are designed to help manager make strategic decisions about their business. Data Marts are subset of the corporate-wide data that is of value to a specific group of users.
12. Types of Data Marts?
A. 1. Independent data marts – sources from data captured form OLTP system, external providers or from data generated locally within a particular department or geographic area.
2. Dependent data mart – sources directly form enterprise data warehouses.
13. What is a CUBE in Data Warehousing concept?
A. Cubes are logical representation of multidimensional data. The edge of the cube contains dimension members and the body of the cube contains data values.
14. What is the Datatype of Surrgate key?
A. Datatype of the surrgate key is either integer or numeric or number.
15. What is Bus Schema?
A.BUS Schema is composed of a master suite of confirmed dimension and standardized definition if facts. In a BUS schema we would eventually have conformed dimensions and facts defined to be shared across all enterprise data marts. This way all Data Marts can use the conformed dimensions and facts without having them locally.
16. What is the difference between star and snowflake schemas?
A. Star schema: A single fact table with N number of Dimension. Star schema is a logical structure that can be arranged with fact and dimension tables in a star formation. It looks like a star with fact table at the core of the star and the dimension tables along the spikes of the star. The dimension model is hence called a star schema.
SNOWFLAKING is a method of normalizing the dimension tables in a star schema. Snowflake schema: Any dimensions with extended dimensions are know as snowflake schema.
17. What is data cleaning? How can we do that?
Data Cleaning is a process of avoiding the unnecessary information in the process of data maintenance. Data Cleaning can be done by using clustering Data cleaning, technically called “Data Cleansing” is a group of methods for making data more reliable and accurate.
For Data Warehousing Online training please log on to http://www.revanthtechnologies.com