Four primary factors determine how well you can maintain data warehouse quality: data integrity, the data input source and import methodology, the frequency of data import, and the audience. A data warehouse is an electronic repository for large quantities of data. Businesses and other large organizations increasingly use one to store data in a form that supports reporting and data output requirements. The usefulness of a data warehouse is driven primarily by the quality of its data and by how well it responds to user requirements.
Data integrity is central to data warehouse quality. It concerns the relationships among the data, dates, definitions and business rules that shape the relevance of the data to the organization, and the rules that govern those relationships. Keeping the data consistent and reconcilable is the foundation of data integrity. Steps to maintain data warehouse quality must include a cohesive data architecture plan, regular inspection of the data, and rules and processes that keep the data consistent wherever possible.
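As one illustration of a regular inspection, the sketch below looks for invoices that point at a purchase order the warehouse does not contain, which is a simple reconciliation check. The table names (purchase_orders, invoices), the column names and the sample rows are hypothetical and chosen only for this example; a real warehouse would apply the same idea to its own schema.

```python
import sqlite3


def build_demo_warehouse():
    """Create a tiny in-memory warehouse (hypothetical schema and sample rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE purchase_orders (po_id INTEGER PRIMARY KEY, amount REAL);
        CREATE TABLE invoices (invoice_id INTEGER PRIMARY KEY, po_id INTEGER, amount REAL);
        INSERT INTO purchase_orders VALUES (1, 100.0), (2, 250.0);
        INSERT INTO invoices VALUES (10, 1, 100.0), (11, 3, 75.0);
        """
    )
    return conn


def find_orphan_invoices(conn):
    """Return the ids of invoices whose purchase order is missing."""
    query = """
        SELECT i.invoice_id
        FROM invoices AS i
        LEFT JOIN purchase_orders AS p ON p.po_id = i.po_id
        WHERE p.po_id IS NULL
        ORDER BY i.invoice_id
    """
    return [row[0] for row in conn.execute(query)]


if __name__ == "__main__":
    # Invoice 11 refers to purchase order 3, which was never loaded.
    print(find_orphan_invoices(build_demo_warehouse()))
```

A check like this could run on a schedule, with any ids it returns passed to whoever reconciles the data.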
The data input source for a data warehouse is typically an import tool or program. The easiest way to maintain data warehouse quality is to implement rules and checkpoints in the import program itself. Data that does not follow the expected pattern is not added to the data warehouse; instead, a user must intervene to correct or reconcile the data, or to change the program. In many organizations, only the data warehouse architect can make these types of changes, which limits ad hoc modifications and helps protect data warehouse quality.
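The following is a minimal sketch of rules and checkpoints inside an import program, not a description of any particular import tool. The field names (po_number, order_date, amount) and the three rules are assumptions made for illustration. Rows that break a rule are set aside with the reasons attached so that a person can correct or reconcile them.

```python
import re
from datetime import datetime

# Hypothetical rule: purchase order numbers look like PO-000123.
PO_PATTERN = re.compile(r"PO-\d{6}")


def validate_row(row):
    """Return a list of rule violations for one incoming row (empty if clean)."""
    problems = []
    if not PO_PATTERN.fullmatch(str(row.get("po_number", ""))):
        problems.append("po_number must look like PO-000123")
    try:
        datetime.strptime(str(row.get("order_date", "")), "%Y-%m-%d")
    except ValueError:
        problems.append("order_date must be YYYY-MM-DD")
    amount = row.get("amount")
    if isinstance(amount, bool) or not isinstance(amount, (int, float)) or amount < 0:
        problems.append("amount must be a non-negative number")
    return problems


def import_rows(rows):
    """Split rows into those safe to load and those needing user intervention."""
    loaded, rejected = [], []
    for row in rows:
        problems = validate_row(row)
        if problems:
            rejected.append((row, problems))
        else:
            loaded.append(row)
    return loaded, rejected
```

Keeping the rules in one function means a change to them is a single, reviewable change, which fits the practice of routing such changes through the data warehouse architect.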
The accuracy and relevance of the data are essential to maintaining data warehouse quality. The timing and frequency of imports have a large impact on the overall usefulness of the tool, as well as on its quality. For example, if purchase order information is entered into the warehouse but invoices are updated only intermittently, the ability to report accurately on purchase-related activity is compromised.
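One way to detect the purchase order and invoice mismatch described above is to record when each feed was last loaded and flag any feed that has fallen too far behind. The sketch below assumes such timestamps are available; the feed names and the one-day threshold are hypothetical.

```python
from datetime import datetime, timedelta


def stale_feeds(last_loaded, now, max_age):
    """Return the names of feeds whose last load is older than max_age, sorted."""
    return sorted(name for name, ts in last_loaded.items() if now - ts > max_age)


if __name__ == "__main__":
    # Hypothetical load log: purchase orders are current, invoices are a week old.
    now = datetime(2024, 1, 10, 12, 0)
    last_loaded = {
        "purchase_orders": datetime(2024, 1, 10, 6, 0),
        "invoices": datetime(2024, 1, 3, 6, 0),
    }
    print(stale_feeds(last_loaded, now, timedelta(days=1)))
```

Reports that combine several feeds could show a warning whenever this list is not empty, so that readers know the figures may be incomplete.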
Data warehouse quality is easiest to maintain and support when the users are knowledgeable and have a solid understanding of the business processes. Training users not only to build queries but also to understand the underlying data warehouse structure enables them to identify inconsistencies much faster and to highlight potential issues early in the process. Any changes to the data tables, structure or linkages, and any addition of new data fields, must be reviewed with the entire team of users and support staff so that everyone shares a consistent understanding of the risks and challenges that might occur.