Centricity and Connections: Clearing the Air

By David Loshin

There are opportunities for adjusting your strategy for customer centricity based on understanding the grouping relationships that bind individuals together (either tightly or loosely). In the last post, we looked at some examples in which linking customer records into groups was straightforward because the values being compared and weighted for similarity were exact matches. When the values are not exact matches, some doubt enters the decision about whether to include a record in a group.

Let's revisit our example from my last post by adding in a new record for evaluation:

John Hansen, 1824 Polk Ave., Memphis TN 38177
Emily S. Hansen, 1824 Polk Ave., Memphis, TN 38177
Emily Stoddard, 1824 Polk Avenue, Memphis, TN
We had already decided that John and Emily shared a household, but all of a sudden we have a third record with a name that shares some similarity with one of the existing names, and an almost exact street address match (note that the third record is missing a ZIP code).
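To make that doubt concrete, here is a minimal sketch of weighted similarity scoring over the records above. The field names, the weights, and the use of Python's `difflib.SequenceMatcher` as the comparison function are all assumptions made for illustration, not the method of any particular linkage product.

```python
from difflib import SequenceMatcher

# Hypothetical field weights, chosen only for illustration.
WEIGHTS = {"name": 0.4, "street": 0.4, "zip": 0.2}

def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def match_score(rec_a: dict, rec_b: dict) -> float:
    """Weighted sum of per-field similarities.

    A field that is missing in either record contributes nothing,
    so a missing ZIP code lowers the maximum achievable score.
    """
    total = 0.0
    for field, weight in WEIGHTS.items():
        a, b = rec_a.get(field, ""), rec_b.get(field, "")
        if a and b:
            total += weight * similarity(a, b)
    return total

john = {"name": "John Hansen", "street": "1824 Polk Ave.", "zip": "38177"}
emily_hansen = {"name": "Emily S. Hansen", "street": "1824 Polk Ave.", "zip": "38177"}
emily_stoddard = {"name": "Emily Stoddard", "street": "1824 Polk Avenue", "zip": ""}

household_score = match_score(john, emily_hansen)
uncertain_score = match_score(emily_hansen, emily_stoddard)
print(f"John / Emily S. Hansen:           {household_score:.2f}")
print(f"Emily S. Hansen / Emily Stoddard: {uncertain_score:.2f}")
```

Under these assumed weights, the third record scores lower than the exact address match, even though it may well describe the same person. Where to set the cutoff for including it in the group is a business decision, not something the score settles by itself.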

We could speculate that "Emily Stoddard" changed her name after she got married to "John Hansen," or that she changed an address somewhere as she moved from her bachelorette pad to their newlywed home. But without exact knowledge of the facts, it is only speculation, and one must exercise some care when relying on speculation for business decisions.

If a few small differences pose a challenge to linkage, what would you think of dozens, or even hundreds, of variations for names, locations, or other data values?

Just as a case in point: in a hallway conversation at the recent Data Governance Conference, a colleague mentioned that one of his customers' databases had over one hundred variations for a certain big-box retailer's name! The conclusion to draw is that a key part of the record linkage process involves some traditional data quality tactics, namely appending a standardized version of the data so that your linkage algorithms can score record similarity as a prelude to establishing connectivity.
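As a rough sketch of what "appending a standardized version" might look like, the snippet below adds a normalized street field alongside the original value instead of overwriting it. The suffix table, the function names, and the `street_std` field are hypothetical; a production standardizer would use a much fuller reference table and handle ambiguous tokens (for example, "St" as "Street" versus "Saint").

```python
import re

# Hypothetical, deliberately tiny lookup table of street-suffix variants.
SUFFIXES = {"ave": "avenue", "st": "street", "rd": "road", "blvd": "boulevard"}

def standardize_street(street: str) -> str:
    """Lowercase, strip punctuation, and expand known suffix abbreviations."""
    tokens = re.sub(r"[^\w\s]", "", street.lower()).split()
    return " ".join(SUFFIXES.get(token, token) for token in tokens)

def append_standardized(record: dict) -> dict:
    """Return a copy of the record with a standardized street value appended.

    The original value is kept so the source data remains auditable.
    """
    out = dict(record)
    out["street_std"] = standardize_street(record["street"])
    return out

a = append_standardized({"name": "Emily S. Hansen", "street": "1824 Polk Ave."})
b = append_standardized({"name": "Emily Stoddard", "street": "1824 Polk Avenue"})
print(a["street_std"] == b["street_std"])
```

Once both records carry the same standardized value, the address comparison becomes an exact match again, and the remaining uncertainty is confined to the name.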
