Garbage In ...

By Elliot King

We all know that dirty data is not really dirty; it is just incorrect. Data cleansing consists of correcting mistakes in the data.

Mistakes make their way into contact data in several ways. The data may simply be wrong or incomplete; it may be out of date; or it may be duplicated when small variations are entered into the contact information each time a customer gets in touch with your organization. I know that my name shows up in at least half a dozen variations in more than one company database.
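To make the duplication problem concrete, here is a minimal Python sketch of how small variations in a name turn into separate records, and how far simple normalization goes toward collapsing them. The sample names, the `normalize_name` and `group_variants` helpers, and the list of honorifics are all illustrative assumptions, not part of any particular product or database.

```python
import re
from collections import defaultdict

# Hypothetical honorifics to ignore when comparing names.
HONORIFICS = {"mr", "mrs", "ms", "dr"}

def normalize_name(name: str) -> str:
    """Reduce a name to a comparison key: lowercase, no punctuation,
    no honorifics, tokens sorted so 'Last, First' matches 'First Last'."""
    cleaned = re.sub(r"[^a-z\s]", "", name.lower())
    tokens = [t for t in cleaned.split() if t not in HONORIFICS]
    return " ".join(sorted(tokens))

def group_variants(names):
    """Group raw name strings by their normalized key."""
    groups = defaultdict(list)
    for raw in names:
        groups[normalize_name(raw)].append(raw)
    return dict(groups)

# Made-up variations of one customer's name, as they might be typed
# on different occasions.
variants = [
    "Jane Smith",
    "jane smith",
    "Dr. Jane Smith",
    "Jane  Smith ",
    "Smith, Jane",
    "Jayne Smith",  # a misspelling, not just a formatting difference
]

groups = group_variants(variants)
for key, raw_names in groups.items():
    print(key, "<-", raw_names)
```

Normalization of this kind merges the formatting variants, but the misspelled entry still ends up as its own record, which is one reason cleaning up after the fact is harder than capturing the data correctly in the first place.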

There are several strategies for ensuring that a high percentage of your customer contact data is correct (some errors will inevitably creep in), but one of the most important steps comes at the very beginning. Before you even start collecting data, ask yourself two questions: how much information do you actually need to capture about each customer, and what field or fields define a unique record?

Do you really need to capture somebody's fax number? Do you need honorifics like Mr. or Dr.? (Honorifics were on a form I recently had to complete to buy an airline ticket. In fact, they were a required field.) Are there other pieces of data that can be eliminated from your contact record?

And while it is obvious that the name field should not be used to determine a unique record, what should be? With Web-based forms, for example, many people deliberately enter incorrect email addresses to avoid getting spam, so an email address is not a reliable unique key either.

The fact is that the more information a contact form requires, the more mistakes it will collect. It is much more efficient to collect data correctly at the beginning of the process than to locate and fix incorrect data later.


