Abstract
Data warehousing is the technology trend most often associated with enterprise computing in today's business environment. The data warehouse, in fact, is a culmination of new developments in database technology, including entity-relationship modeling, heuristic searches, mass data storage, neural networks, multiprocessing, and natural-language interfaces. The data warehouse is a centralized, integrated repository of information, one that can provide a vital competitive edge for corporate decision-making and product development. Types of data warehouses include the operational data store (ODS); the data mart, which is of great value in analyzing sales information; and the enterprise data warehouse, which can take either a centralized or a distributed approach (PC Week, Feb 8, 1999).
Data Warehouses
The type of data warehouse an organization adopts should depend on the way the business operates and the types of decision support it needs.
One of the simplest types of data warehouse, an operational data store (ODS), is a replicated production database that has been adjusted for errors. An ODS is used primarily to generate standard operations reports and to provide transaction detail for summary-level analysis. Since an ODS replicates an OLTP system, some experts do not consider it a true data warehouse type. However, an ODS fits the broad definition, and many data warehouses contain one. Depending on an organization's reporting needs, an ODS may be updated monthly, weekly, or more frequently, sometimes almost in real time (PC Week, Feb 8, 1999). The main advantage of an ODS is that it improves production system performance, since reporting and query functions are off-loaded from the OLTP system to the ODS (PC Week, Feb 8, 1999).
Another type of data warehouse is the data mart. Data marts are limited in scope, usually taking their information from a single department or business process. For example, a data mart may be used to analyze sales information in a specific region or for a particular product line. Data marts usually contain only summary data, but they can be linked to operational data stores for drilling down to transaction details if necessary. Data marts can be managed by IT departments, but they are just as often managed directly by users in a department or work group (PC Week, Feb 8, 1999).
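The relationship between a data mart's summary data and the transaction detail held in an ODS can be sketched in a few lines of Python. This is only an illustrative sketch: the record layout, the sample values, and the function names `build_sales_mart` and `drill_down` are assumptions made for the example, not part of any product described in the sources.

```python
from collections import defaultdict

# Hypothetical ODS contents: one record per sales transaction.
ods_transactions = [
    {"txn_id": 1, "region": "East", "product": "widget", "amount": 120.0},
    {"txn_id": 2, "region": "East", "product": "gadget", "amount": 80.0},
    {"txn_id": 3, "region": "West", "product": "widget", "amount": 200.0},
    {"txn_id": 4, "region": "East", "product": "widget", "amount": 30.0},
]


def build_sales_mart(transactions):
    """Summarize transactions into total sales per (region, product)."""
    summary = defaultdict(float)
    for txn in transactions:
        summary[(txn["region"], txn["product"])] += txn["amount"]
    return dict(summary)


def drill_down(transactions, region, product):
    """Return the transaction detail behind one summary cell of the mart."""
    return [
        txn
        for txn in transactions
        if txn["region"] == region and txn["product"] == product
    ]


# The mart holds only summary data; detail stays in the ODS.
sales_mart = build_sales_mart(ods_transactions)
east_widget_detail = drill_down(ods_transactions, "East", "widget")
```

Here the mart stores one total per region and product line, while a drill-down goes back to the ODS records for the transactions that make up a given total.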
While many OLAP applications can be performed on data marts, cross-departmental analysis, executive information systems, and data-mining applications need information gathered from the entire enterprise to be most effective. The enterprise data warehouse is used for this type of extensive data collection and analysis. Because of its scope and complexity, the enterprise data warehouse is usually managed by the central IT group. As its name implies, an enterprise data warehouse contains information taken from throughout an organization. This is the most complex type of warehouse to build and maintain, since data must be merged from multiple systems into common subject areas (PC Week, Feb 8, 1999). Data-mining tools work with various statistical techniques for modeling data and for estimating and predicting outcomes based on what they have learned.
Data-mining tools work best with large data sets (PC Week, Feb 8, 1999).
Data Warehouse Components
Although a data warehouse sounds like a single entity, it is really a multi-tiered, multi-application conglomerate that comprises several components. Each component may be handled by one or more pieces of hardware or software. No vendor has a complete data warehouse package (PC Week, Feb 8, 1999). Functionally, a data warehouse extracts data from operational systems and loads it into a holding area where it is "scrubbed," which means it is made to conform with warehouse standards. The data is then merged, time-stamped and dated in the right order, and loaded into databases for use by data access tools.
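The extract, scrub, merge, time-stamp, and load sequence described above can be sketched as a small Python pipeline. Everything specific here is an assumption for illustration: the source names `orders_db` and `billing_db`, the field names, and the "warehouse standards" applied in `scrub` (upper-case region codes, two-decimal amounts) are hypothetical, and a real warehouse would load into a database rather than return a list.

```python
from datetime import datetime, timezone


def scrub(record):
    """Make one raw record conform to the (assumed) warehouse standards."""
    return {
        "customer_id": int(record["customer_id"]),
        "region": record["region"].strip().upper(),
        "amount": round(float(record["amount"]), 2),
        "sold_on": record["sold_on"],  # ISO date string such as "1999-02-08"
    }


def run_pipeline(sources, load_time=None):
    """Extract rows from each source, scrub, merge, time-stamp, and order them."""
    load_time = load_time or datetime.now(timezone.utc)
    staged = []  # the holding area where scrubbed rows are merged
    for source_name, rows in sources.items():
        for raw in rows:
            clean = scrub(raw)
            clean["source"] = source_name  # remember where the row came from
            clean["loaded_at"] = load_time.isoformat()  # time stamp
            staged.append(clean)
    # ISO date strings sort chronologically, so this puts rows in date order.
    staged.sort(key=lambda row: (row["sold_on"], row["source"]))
    return staged


# Two hypothetical operational systems with inconsistent formatting.
sources = {
    "orders_db": [
        {"customer_id": "7", "region": " east ", "amount": "19.999", "sold_on": "1999-02-08"},
    ],
    "billing_db": [
        {"customer_id": "3", "region": "West", "amount": "5", "sold_on": "1999-01-15"},
    ],
}
warehouse_rows = run_pipeline(sources)
```

Keeping the scrubbing rules in one function makes the warehouse standards explicit, and recording the source name on each row is what later allows the changes to be mapped back to the originating system.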
Since the data goes through a number of transformations and is ultimately placed in data structures different from the ones it came from, those changes are mapped in catalogs or dictionaries. Such catalogs are managed with metadata tools. Metadata is data that defines or describes the data in the warehouse, and there are typically two kinds. Information that users need to know, such as table and column names and definitions, is called frontend metadata. Everything else, such as how a particular data element maps to its original database, is backend metadata.
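One way to picture the two kinds of metadata is as a catalog entry that carries both. The Python sketch below is a hypothetical illustration only: the class `ColumnMetadata`, its fields, and the sample entries are assumptions for the example and do not describe any particular metadata tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMetadata:
    # Frontend metadata: what warehouse users need to know.
    table: str
    column: str
    definition: str
    # Backend metadata: how the element maps to its original database.
    source_system: str
    source_field: str
    transformation: str


# Hypothetical catalog entries.
catalog = [
    ColumnMetadata(
        table="sales_summary",
        column="region",
        definition="Sales region code, upper case",
        source_system="orders_db",
        source_field="ORD.REGION_CD",
        transformation="trim whitespace, convert to upper case",
    ),
    ColumnMetadata(
        table="sales_summary",
        column="amount",
        definition="Sale amount, two decimal places",
        source_system="billing_db",
        source_field="BILL.AMT",
        transformation="convert to decimal, round to 2 places",
    ),
]


def frontend_view(entries):
    """Return only the user-facing part of each catalog entry."""
    return [
        {"table": e.table, "column": e.column, "definition": e.definition}
        for e in entries
    ]


def trace_to_source(entries, table, column):
    """Look up where a warehouse column originally came from, or None."""
    for e in entries:
        if e.table == table and e.column == column:
            return (e.source_system, e.source_field)
    return None


user_catalog = frontend_view(catalog)
amount_origin = trace_to_source(catalog, "sales_summary", "amount")
```

Users would browse something like `user_catalog`, while the warehouse administrators rely on the backend fields to trace a column back to its source system.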
Security
Security considerations for data warehouses are different from those for OLTP systems. For a data warehouse to pay for itself, many users have to be able to benefit from it, and therefore more users will need access to data than are traditionally authorized by OLTP security (Computerworld, Feb 15, 1999).
Conclusion
As more corporations come to appreciate that the information they gather each day is an asset, they will rely more and more on data warehousing. But while a data warehouse can provide managers with the means to ask questions of their data and get back meaningful answers, it cannot automatically make a company more profitable. "Good technology can not substitute for good management."
References
1. Computerworld, Feb 15, 1999, p 14, "Human side key to data warehousing." By: Stewart Deck.
2. PC Week, Feb 8, 1999, v 16 i 6 p 56, "Data Warehouses in Need of Renovation." By: John Taschen.
3. PC Week, Feb 8, 1999, v 16 i 6 p 71, "100 Top Data Warehousing Leaders Step Up." By: Jeff Mad.