Data warehouse management is a set of business processes, techniques, and technology used to create, support, and utilize a data warehouse. A data warehouse stores all of an organization's electronic data and is a resource dedicated to reporting and data analysis. There are three primary concepts in data warehouse management: data retrieval and analysis, data extraction, and transformation.
The interest in data warehouses has grown substantially in the past few years and is forecast to experience above average growth for the next decade. As businesses and organizations expanded their use of technology, they began to collect huge amounts of transactional data. This information provides a unique opportunity to identify trends and patterns. For many firms, there is a wealth of information and analysis that can be completed using this information. However, there is a significant investment required before the firm can yield any benefits from this type of information.
The staff responsible for data warehouse management must have completed training programs in database architecture, relational databases, data integrity, and transformations. Additional background in statistics, advanced mathematics, or information systems is extremely beneficial. The skills in this area are constantly evolving, and staff will need to complete ongoing training programs to successfully manage the complex tasks required.
Data retrieval is the entire process of identifying the type of data that should populate the data warehouse. It requires significant effort and a deep understanding of the transaction data available, as well as the type of reports that will be required by the organization. Typically, business analysts work to identify the necessary data and define the data queries users will require. All business analysts will be required to learn data warehouse management techniques as part of their core job requirement.
In order to retrieve the appropriate data, each table must be reviewed and the appropriate fields included in the data extract tool. Typically, this type of work is completed by a data architect, together with a system developer or programmer. The extraction tool is used to select the information properly and to populate the data warehouse tables while maintaining the same relationships. This is necessary to create accurate data within the warehouse; without these inter-relationships, any analysis will be meaningless.
Transforming the data often involves a process of running several programs or queries against the data. The primary purpose is to clean or scrub the data to eliminate mismatched records and improve the overall quality of the information. The data must have consistent formatting to properly match records together.