In my last post (Leadership Lessons in Data Quality - Part 1) I described some useful techniques I have learned while delivering improved data quality in a number of Australian organisations. In this post I want to talk about how to help business people care more about DQ.
This is vital because it is the key to making good data quality sustainable in your organisation.
The short answer - like a lot of analytic challenges faced in the real world - is to measure the problem.
Once you can measure data quality, you can act on a recommendation I made in my earlier post: make improvement of the business owners' data part of their performance incentive program. In simpler language: link it to their bonus.
Data quality has to be measured like any other KPI so that non-experts can understand two things:
- What we all mean by poor data quality - define it
- The cost of poor data quality - quantify it.
This is easiest to see in an example:
_______________
Don't Set as a KPI:
Currently 6,753 customer records, or 2.0834% of all main customer records in EDW2, contain errors. Our objective is to reduce this by 50% within 6 months.
Do Set as a KPI:
Currently almost 7,000 clients have incorrect data that results in an average of 2,800 direct mailings being returned by Australia Post. This adds $26,000, directly (printing and posting) and indirectly (administration), to each marketing campaign. Our objective is to reduce this average cost by 50% within 6 months. (A rough costing sketch follows the example.)
_______________
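To make the "Do" version concrete, here is a minimal sketch of how the returned-mail figure turns into a dollar KPI you can track per campaign. The per-item print/post and admin costs below are illustrative assumptions, not figures from the post; they are chosen only so the arithmetic lands near the $26,000 quoted above.

```python
# Rough sketch of turning returned-mail counts into a dollar KPI.
# The unit costs are illustrative assumptions, not values from the post.

RETURNED_PER_CAMPAIGN = 2_800   # average mailings returned by Australia Post
PRINT_AND_POST_COST = 1.10      # assumed direct cost per mailing ($)
ADMIN_COST = 8.18               # assumed indirect handling cost per returned item ($)

def campaign_waste(returned: int,
                   direct_unit_cost: float = PRINT_AND_POST_COST,
                   indirect_unit_cost: float = ADMIN_COST) -> float:
    """Dollar cost of poor data quality for one marketing campaign."""
    return returned * (direct_unit_cost + indirect_unit_cost)

baseline = campaign_waste(RETURNED_PER_CAMPAIGN)   # roughly $26,000 per campaign
target = baseline * 0.5                            # the 6-month KPI target
print(f"Baseline waste: ${baseline:,.0f}, target: ${target:,.0f}")
```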
A useful technique I have developed to help me understand data and its quality is to create a taxonomy that classifies data quality issues in the following three ways (a small sketch follows the list):
- 'Classic' data quality criteria. Things like accuracy, relevance, availability, etc.
- The impact data quality has on the business. Things like dollars lost if data is wrong, the value of increased sales if data is correct, etc.
- What action(s) the organisation initiates when the data is wrong. Things like intervening at the source of data entry, remedial actions after data entry, etc.
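As a minimal sketch of that three-way taxonomy, the snippet below tags a single issue along each axis. The specific enum members and the example issue are illustrative assumptions, not a prescribed classification.

```python
# A minimal sketch of the three-way taxonomy described above.
# The enum members and the sample issue are illustrative assumptions.

from dataclasses import dataclass
from enum import Enum

class Criterion(Enum):      # 'classic' data quality criteria
    ACCURACY = "accuracy"
    COMPLETENESS = "completeness"
    TIMELINESS = "timeliness"

class Impact(Enum):         # business impact when the data is wrong
    DOLLARS_LOST = "dollars lost"
    SALES_FOREGONE = "sales foregone"

class Action(Enum):         # what the organisation does about it
    FIX_AT_SOURCE = "intervene at data entry"
    REMEDIATE_AFTER = "remediate after data entry"

@dataclass
class DataQualityIssue:
    description: str
    criterion: Criterion
    impact: Impact
    action: Action

issue = DataQualityIssue(
    description="Customer postal address out of date",
    criterion=Criterion.ACCURACY,
    impact=Impact.DOLLARS_LOST,
    action=Action.FIX_AT_SOURCE,
)
```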
So what metrics can you associate with data quality in your organisation? Here are six I use (a small scoring sketch follows the list):
- Accuracy: the degree of confidence that data is free of error/defect
- Completeness: the extent to which data is not missing and is of sufficient breadth and depth for the task at hand
- Consistency: the degree to which common data across different sources follows the same definitions, codes and formats
- Timeliness: the degree to which data is up to date
- Security: the degree to which data confidentiality, integrity and availability have been maintained
- Fit for Purpose: the degree to which data is relevant, appropriate and meets business specifications.
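To show how two of these metrics might be scored in practice, here is a minimal sketch covering completeness and timeliness over a handful of customer records. The field names, freshness threshold and sample data are illustrative assumptions; your own definitions will differ.

```python
# A minimal sketch of scoring completeness and timeliness over customer records.
# Field names, the freshness threshold and the sample data are assumptions.

from datetime import date

customers = [
    {"name": "A. Smith", "postcode": "2000", "last_verified": date(2009, 6, 1)},
    {"name": "B. Jones", "postcode": None,   "last_verified": date(2007, 1, 15)},
]

REQUIRED_FIELDS = ("name", "postcode")
MAX_AGE_DAYS = 365   # assumed threshold for 'up to date'

def completeness(records, required=REQUIRED_FIELDS) -> float:
    """Share of records with every required field populated."""
    ok = sum(all(r.get(f) not in (None, "") for f in required) for r in records)
    return ok / len(records)

def timeliness(records, as_of=date(2009, 8, 20), max_age=MAX_AGE_DAYS) -> float:
    """Share of records verified within the freshness threshold."""
    ok = sum((as_of - r["last_verified"]).days <= max_age for r in records)
    return ok / len(records)

print(f"Completeness: {completeness(customers):.0%}")   # 50%
print(f"Timeliness:   {timeliness(customers):.0%}")     # 50%
```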
Thanks for the input, James.
Business people are just as likely to be oblivious to the technical implications of poor data quality as technical people are to the business implications.
Isn't it nice to have a problem where blame can be so evenly apportioned?!
Posted by: OzAnalytics | Thursday, August 20, 2009 at 02:49 PM
Steve, appreciate this series - as a DW body, poor Data Quality is the bane of my existence - especially on first-cut projects where people simply don't believe they have a problem.
I appreciate you framing it in business terms - a mistake us techy folks often make is to see it as a barrier to *us* doing our jobs properly - and thus fail to see how to 'sell' it to the business as their problem as well.
Posted by: James Beresford | Thursday, August 20, 2009 at 10:28 AM