
Tuesday, August 18, 2009



bum's software gear

Yes, data quality is a very important factor, which is why data quality software is so badly needed.


Sanjay, thanks for the comment. I agree that information is almost always imperfect and you almost always have to accept a certain level of error.

As to fuzzy logic - it's an interesting idea that I have only ever applied to data scrubbing.

For example, I frequently need to scrub customer address details in the data integration layer of a data warehouse. Usually it is cheaper and easier to have the addresses scrubbed by third-party software or a SaaS provider. They use fuzzy logic to improve the quality of the scrubbed addresses they return. It works well and is cheaper than reinventing the wheel with custom software.
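To give a feel for what fuzzy matching of addresses looks like, here is a minimal sketch using Python's standard `difflib` module. It is not how any particular vendor does it; the function name `best_match`, the sample addresses, and the 0.8 cutoff are all assumptions made for illustration.

```python
import difflib


def normalize(address):
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(address.lower().split())


def best_match(raw_address, reference_addresses, cutoff=0.8):
    """Return the reference address most similar to raw_address, or None.

    Similarity is difflib's ratio (0.0 to 1.0); the cutoff is an
    illustrative value, not a recommended setting.
    """
    # Map normalized forms back to the original reference strings.
    lookup = {normalize(ref): ref for ref in reference_addresses}
    candidates = difflib.get_close_matches(
        normalize(raw_address), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[candidates[0]] if candidates else None


# Hypothetical reference list of known-good addresses.
reference = ["12 Main Street", "12 Maine Road", "45 High Street"]
match = best_match("12  main stret", reference)
```

A real scrubbing service would also validate against postal reference data, which is a large part of why buying it tends to be cheaper than building it.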

More broadly I use 'hard' measures - such as the concept of banding to trigger alerts.

For example:

Set a dashboard alert so that if the frequency of rejected customer records in data integration process X is:

- less than 1%: show status as green (OK)
- 1% or more but less than 1.5%: show status as yellow (Warning)
- 1.5% or more: show status as red (Alert) and send an email requesting action to the process owner.

Nothing fuzzy about that.
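The banding rule above can be sketched in a few lines of Python. This is only an illustration of the thresholds described; the function names and the `notify` callback (standing in for whatever sends the email to the process owner) are hypothetical.

```python
def rejection_status(rejected, total, warn_at=0.01, alert_at=0.015):
    """Map a rejection rate onto the green / yellow / red bands."""
    if total <= 0:
        raise ValueError("total must be positive")
    rate = rejected / total
    if rate < warn_at:
        return "green"   # OK
    if rate < alert_at:
        return "yellow"  # Warning
    return "red"         # Alert


def check_and_alert(rejected, total, notify):
    """Compute the status and call notify(message) only on a red status.

    notify is a hypothetical callback, e.g. a function that emails
    the process owner.
    """
    status = rejection_status(rejected, total)
    if status == "red":
        rate = rejected / total
        notify(f"Rejected customer records at {rate:.2%}: action required")
    return status


# Example: collect alerts in a list instead of sending real email.
sent = []
status = check_and_alert(20, 1000, sent.append)
```

Passing the notifier in as a callback keeps the banding logic testable without a mail server.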

I can see that fuzzy logic is very useful in situations where complex conditions arise or where the volume of realtime transactions is massive - like credit card transactions in Visa or Mastercard.

My warehouses to date are mostly running data integration as a batch process so the need doesn't arise.

Check out NAFIPS (North American Fuzzy Information Processing Society!) at http://nafips.ece.ualberta.ca/ if you want to dig deeper.

I also remember that Business Objects purchased FUZZY! Informatik a couple of years ago. FUZZY! Informatik sold EU friendly data scrubbing software.

Do you have an example in DQ or IM?

Sanjay M Kabe

Very interesting and illuminating post, Steve! I feel that organizations need a lot of discipline and management to get their data quality right. I was wondering whether there is a concept of fuzzy data or fuzzy logic that can be applied to an organization's data, since in reality that data is always imperfect.

