Inferred Knowledge and Customer Intelligence through Matching and Linkage

By David Loshin

The most interesting byproduct of record linkage, I have found, is the ability to infer facts about individuals that no single record states explicitly, because the relevant data is distributed across several data sets. As an example, consider these records, taken from different data sets:

A:
David
Loshin
301-754-6350
1163 Kersey Rd
Silver Spring
MD
20902

B:
Knowledge Integrity, Inc
1163 Kersey Rd
Silver Spring
MD
20902

C:
H David
Lotion
1163 Kersey Rd
Silver Spring
MD
20902

D:
Knowledge Integrity, Inc.
301
7546350
7546351
MD
20902

We could establish a relationship between record A and records B and C because they share the same street address. We could establish a relationship between record B and record D because the company names are the same (apart from the trailing period in record D).

Therefore, by transitivity, we can infer a relationship between "David Loshin" and the company "Knowledge Integrity, Inc" (A links to B, B links to D, therefore A links to D). However, none of these records alone explicitly shows the relationship between "David Loshin" and "Knowledge Integrity, Inc" - that is inferred knowledge.
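The transitive step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a production matcher: the field names, the `normalize` helper, and the two matching rules (same street address and ZIP, or same company name) are hypothetical choices made for this example, and record D is reduced to the fields used for matching. Direct links are found pairwise, and a simple union-find structure then groups records that are connected through a chain of links.

```python
import re
from itertools import combinations

# Records transcribed from the example above; field names are illustrative.
records = {
    "A": {"name": "David Loshin", "street": "1163 Kersey Rd", "zip": "20902"},
    "B": {"company": "Knowledge Integrity, Inc", "street": "1163 Kersey Rd", "zip": "20902"},
    "C": {"name": "H David Lotion", "street": "1163 Kersey Rd", "zip": "20902"},
    "D": {"company": "Knowledge Integrity, Inc.", "zip": "20902"},
}

def normalize(value):
    """Lowercase and drop punctuation so 'Inc' and 'Inc.' compare equal."""
    return re.sub(r"[^a-z0-9 ]", "", value.lower()).strip()

def linked(r1, r2):
    """Assumed rules: same street address and ZIP, or same company name."""
    for keys in (("street", "zip"), ("company",)):
        if all(k in r1 and k in r2 and normalize(r1[k]) == normalize(r2[k])
               for k in keys):
            return True
    return False

# Union-find: each record starts in its own cluster.
parent = {rid: rid for rid in records}

def find(rid):
    """Return the cluster representative, shortening paths along the way."""
    while parent[rid] != rid:
        parent[rid] = parent[parent[rid]]
        rid = parent[rid]
    return rid

def union(a, b):
    """Merge the clusters containing records a and b."""
    parent[find(a)] = find(b)

# Direct (pairwise) links only.
direct_links = [(a, b) for a, b in combinations(sorted(records), 2)
                if linked(records[a], records[b])]

# Merging clusters along direct links yields the transitive closure.
for a, b in direct_links:
    union(a, b)

same_cluster = find("A") == find("D")
print(direct_links)
print("A and D directly linked:", ("A", "D") in direct_links)
print("A and D in the same cluster:", same_cluster)
```

Under these rules A and D never match directly, yet they land in the same cluster because both connect to B. A real implementation would use approximate matching and scoring in place of the exact comparisons assumed here.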

You can probably see the opportunity here - basically, by merging a number of data sets together, you can enrich all the records as a byproduct of exposed transitive relationships.
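As a rough sketch of what that enrichment might look like, suppose linkage has already grouped records A, B, and D into one cluster. The `enrich` function below is a hypothetical helper: it pools the fields found anywhere in the cluster and fills in whatever each record is missing. The field names and the "first value seen wins" rule are assumptions made for illustration; choosing among conflicting values (for example, name variants) would need proper survivorship rules.

```python
# A cluster of records assumed to have been linked already.
cluster = [
    {"id": "A", "name": "David Loshin", "phone": "301-754-6350",
     "street": "1163 Kersey Rd", "zip": "20902"},
    {"id": "B", "company": "Knowledge Integrity, Inc",
     "street": "1163 Kersey Rd", "zip": "20902"},
    {"id": "D", "company": "Knowledge Integrity, Inc.", "zip": "20902"},
]

def enrich(cluster):
    """Fill each record's missing fields from the other records in its cluster."""
    pooled = {}
    for record in cluster:
        for field, value in record.items():
            if field != "id":
                # Assumption: the first value seen for a field wins.
                pooled.setdefault(field, value)
    # A record's own values always take precedence over pooled values.
    return [{**pooled, **record} for record in cluster]

enriched = enrich(cluster)
for record in enriched:
    print(record)
```

After enrichment, record A carries a company name and record D carries a street address, neither of which appeared in the original records.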

This is one more valuable type of enhancement that record linkage provides. It is particularly valuable because the embedded knowledge it exposes can in turn feed our other enhancement techniques for cleansing, enrichment, and merge/purge.
