When it comes to the benefits of having a Data Warehouse (DW from here on), most of our clients are already cognizant of its critical role as part of a business system. What many still don’t realize is that there are good DWs and not so good ones. Of course everyone is aware of their existence (good vs not so useful), but not everyone has the time thinking of the ways to distinguish between those two.
A Good Data Warehouse Goes (Far) Beyond a Database
The number one question that I get when I talked to anyone about the services we provide to our customers is the difference between a Database and a Data Warehouse. A lot of people associate a data warehouse with storing data (just like a physical warehouse) — and that’s about it.
Truth is, a good DW serves the business in at least five areas in addition to hosting business data:
- Automatically apply business rules
- Automatically generate reports (both on-demand and periodical)
- Automatically notify parties of data events of interest
- Allows data owners to look into their data in multiple dimensions
- Allows for data error detection, investigation, and corrections
When you are building a DW, the above functions must be present in the planning, implementations, and testing phases. Otherwise, you’d be left with basically just a database.
A Good Data Warehouse Grows into Maturity
In the next blog entry, I’ll be discussing in specifics, the different levels of DW maturity. A lot of misconception out there regarding how to assess the reliability and usefulness of a DW. Most people expect a DW to become mature at the end of the initial implementation project. This is not realistic given the typical complexity of meta-data, data, and business rules surrounding it. The correct way to proceed with a DW-building project is to realize that there are certain phases that a DW goes through which brings it along the path into maturity.
A Good Data Warehouse Is Redundant
An experienced BI consultant would build one or more copies of the DW that can be switched at any given time. Not only will this provide a good development and testing environment for analytic outputs (reports, dashboards, ETLs), but it would also
A Good Data Warehouse Highlights Data Problems
Due to its proximity with people who did the accounting, and forecasting in the organization, a good DW is usually the first place where these people notice data inconsistencies. A good way to take advantage of this characteristics, is to put in place monitoring ETLs that runs periodically, executing queries that is designed to check the consistencies of data.
A Good Data Warehouse Reduces the Load on the Transactional System
A lot of the systems that maintains the transactional data input from users are inundated with processes that shouldn’t belong there in the first place. These processes typically does checking, applying business rules, and monitor the validity of the data, which takes away resources from what Transactional Systems should be mainly about: Collecting data and preserving its integrity.
Now the follow up question is: How do we know which part of the Transactional System should be offload to the DW?
A good boundary rule-of-thumb is whenever a process can be executed against any data in batch-mode. This is an indication that it should be taken off the transaction system and moved to the DW. And by the way, it is not a taboo for a DW to pump data back into the Transactional System when a valid use-case can be made.
There are many other ways to assess a DW to see if it’s “good” or “not good enough,” but all of them revolves around the relationship between the DW and the existing Transaction Systems and the other flip side of the coin, the decision maker team. A good DW should always bring clarity and bridge the communication barrier caused by incongruent view of data held by different parts of the business. Hopefully this can get you started in the correct line of thinking as you embark on your DW projects.