Data warehouses have been defined in many ways, making it difficult to formulate a rigorous definition. Loosely speaking, a data warehouse refers to a database that is maintained separately from an organization’s operational databases. Data warehouse systems allow for the integration of a variety of application systems. They support information processing by providing a solid platform of consolidated historical data for analysis.
According to William H. Inmon, a leading architect in the construction of data warehouse systems, “A data warehouse is a subject-oriented, integrated, time-variant and nonvolatile collection of data in support of management’s decision-making process”. Let’s take a closer look at each of the four keywords that distinguish data warehouses from other data repository systems.
Subject Oriented—
A data warehouse is organized around major subjects such as customer, supplier, product, and sales. It focuses on the modeling and analysis of data for decision-makers. Hence, data warehouses typically provide a simple and concise view of particular subject issues by excluding data that are not useful in the decision support process.
Integrated —
A data warehouse is usually constructed by integrating multiple heterogeneous sources, such as relational databases, flat files, and online transaction records. Data cleaning and data integration techniques are applied to ensure consistency in naming conventions, encoding structures, attribute measures, and so on.
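As a rough illustration of what consistency in naming conventions, encoding structures, and attribute measures can mean in practice, the sketch below maps two sources onto one target layout. The source names (`crm_rows`, `billing_rows`), the field names, and the gender and height attributes are all hypothetical and chosen only for the example.

```python
# Hypothetical records from two operational sources that describe customers differently.
crm_rows = [{"cust_id": 1, "gender": "M", "height_cm": 180.0}]
billing_rows = [{"CustomerID": 2, "sex": "female", "height_in": 65.0}]

# One agreed encoding for gender, whatever the source used.
GENDER_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}


def from_crm(row):
    """Map a CRM row onto the warehouse's naming convention."""
    return {
        "customer_id": row["cust_id"],
        "gender": GENDER_CODES[row["gender"].lower()],
        "height_cm": row["height_cm"],
    }


def from_billing(row):
    """Map a billing row: rename fields, recode gender, convert inches to centimetres."""
    return {
        "customer_id": row["CustomerID"],
        "gender": GENDER_CODES[row["sex"].lower()],
        "height_cm": round(row["height_in"] * 2.54, 1),
    }


def integrate(crm, billing):
    """Combine both sources into one list with a single, consistent layout."""
    return [from_crm(r) for r in crm] + [from_billing(r) for r in billing]


customers = integrate(crm_rows, billing_rows)
```

After integration, every row uses the same field names, the same gender codes, and the same unit of measure, regardless of which source it came from.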
Time-variant —
Data is stored to provide information from a historical perspective. Every key structure in the data warehouse contains, either implicitly or explicitly, an element of time.
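One way to picture a key that carries an explicit element of time is a store keyed by a (product, date) pair. The following is a minimal sketch under that assumption; the function names and the sales figures are invented for illustration.

```python
from datetime import date

# Hypothetical sales store: the key pairs a product with a date, so each
# day's figure sits alongside the earlier ones instead of replacing them.
sales_by_product_and_day = {}


def record_daily_sales(product, day, amount):
    """Store one day's sales total under a key that includes the date."""
    sales_by_product_and_day[(product, day)] = amount


def sales_history(product):
    """Return (day, amount) pairs for one product, oldest first."""
    return sorted(
        (day, amount)
        for (p, day), amount in sales_by_product_and_day.items()
        if p == product
    )


record_daily_sales("widget", date(2023, 1, 2), 120.0)
record_daily_sales("widget", date(2023, 1, 1), 100.0)
record_daily_sales("gadget", date(2023, 1, 1), 40.0)
history = sales_history("widget")
```

Because the date is part of the key, asking for a product's history returns one entry per day rather than only the latest value.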
Nonvolatile —
A data warehouse is always a physically separate store of data transformed from the application data found in an operational environment. Due to this separation, a data warehouse does not require transaction processing, recovery, and concurrency control mechanisms.
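Nonvolatility is commonly read as meaning that warehouse data is loaded and then read, rather than updated in place. The sketch below assumes that reading; the `WarehouseTable` class and its `load` and `query` methods are hypothetical names, not part of any particular product.

```python
class WarehouseTable:
    """Hypothetical table that can be loaded and read, but not updated or deleted."""

    def __init__(self):
        self._rows = []

    def load(self, rows):
        # Copy each row so later changes in the operational source do not leak in.
        self._rows.extend(dict(row) for row in rows)

    def query(self, predicate):
        # Return copies so callers cannot modify the stored rows.
        return [dict(row) for row in self._rows if predicate(row)]


operational_rows = [{"order_id": 1, "status": "shipped"}]
table = WarehouseTable()
table.load(operational_rows)

# The operational system keeps changing its own copy of the data.
operational_rows[0]["status"] = "returned"

shipped = table.query(lambda row: row["status"] == "shipped")
```

The warehouse copy still reports the order as shipped, because it is physically separate from the operational data and exposes no update operation.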
In sum, a data warehouse is a semantically consistent data store that serves as a physical implementation of a decision support data model and stores the information an enterprise needs to make strategic decisions. A data warehouse is also often viewed as an architecture, constructed by integrating data from multiple heterogeneous sources to support structured and/or ad hoc queries, analytical reporting, and decision making.
Based on this information, we view data warehousing as the process of constructing and using data warehouses. The construction of a data warehouse requires data cleaning, data integration, and data consolidation. The utilization of a data warehouse often necessitates a collection of decision support technologies. This allows “knowledge workers” (e.g., managers, analysts, and executives) to use the warehouse to quickly and conveniently obtain an overview of the data, and to make sound decisions based on the information in the warehouse. Some authors use the term “data warehousing” to refer only to the process of data warehouse construction.