Modeling Issues and Entity Inheritance

By David Loshin

In our last set of posts, we looked at matching and record linkage and how approximate matching could be used to improve the organization's view of "customer centricity." Data quality techniques such as parsing, standardization, business-rule-based record linkage, and similarity scoring can help in assessing the similarity between two records. The result of the similarity analysis is a score that indicates how likely it is that the two records refer to the same real-life individual or organization.
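To make the idea of a similarity score concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the method any particular tool uses: the `standardize` helper, the attribute names, and the weights in `WEIGHTS` are all hypothetical, and the standard library's `difflib.SequenceMatcher` stands in for a real approximate-matching algorithm.

```python
from difflib import SequenceMatcher

# Hypothetical attribute weights, chosen only for illustration.
WEIGHTS = {"name": 0.5, "address": 0.3, "phone": 0.2}


def standardize(value: str) -> str:
    """Lowercase and collapse whitespace (a stand-in for real parsing and standardization)."""
    return " ".join(value.lower().split())


def similarity(a: dict, b: dict) -> float:
    """Return a weighted average of per-attribute string similarity, between 0 and 1."""
    total = 0.0
    for attribute, weight in WEIGHTS.items():
        left = standardize(a.get(attribute, ""))
        right = standardize(b.get(attribute, ""))
        # An attribute missing from either record contributes nothing to the score.
        if left and right:
            total += weight * SequenceMatcher(None, left, right).ratio()
    return total
```

Calling `similarity(record_a, record_b)` on two dictionaries returns a value between 0 and 1. A business rule would then compare that value against a threshold to decide whether to treat the pair as a match, a non-match, or a candidate for manual review.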

One last thought: this approach is largely a "data-centric" activity. What I mean is that it looks at and compares two records regardless of where those records came from. They might have come from the same data set (as part of a duplicate analysis) or from different data sets (for consolidation or general linkage).

But it does not take into consideration whether one data set models "customer" data and another models "employee" data. While you may link a customer record with an employee record based on a similarity analysis of a set of corresponding data attributes, the contexts are slightly different.

A match across the two data sets is a bit of a hybrid: we have matched a single individual, but one who plays a different role in each data set. That introduces a different kind of question: are the identifying attributes associated with the "customer" or with the individual acting in the role of "customer"? The same question applies to the individual vs. the employee.

And finally, are there attributes of the roles that each individual plays that can be used for unique identification within the role context? The answers to these questions become important when matching and linkage are integrated as part and parcel of a business application (such as the consolidation of data being imported into a business intelligence framework).
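One way to picture the distinction is to model the individual and the roles as separate entities. The sketch below is only an illustration of that idea: the class names, the field names, and the choice of name and birth date as the individual's identifying attributes are all assumptions, not a prescribed model.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    """A role an individual plays, with an identifier unique only within that role."""
    kind: str       # hypothetical role label, e.g. "customer" or "employee"
    role_id: str    # e.g. a customer number or an employee number
    attributes: dict = field(default_factory=dict)  # role-specific attributes


@dataclass
class Individual:
    """The underlying person, holding attributes that identify them in any role."""
    name: str
    birth_date: str
    roles: list = field(default_factory=list)

    def role(self, kind: str):
        """Return the first role of the given kind, or None if the person has none."""
        return next((r for r in self.roles if r.kind == kind), None)
```

With this split, a match between a customer record and an employee record attaches two `Role` objects to one `Individual` rather than merging the two records. Identifying attributes that belong to the person stay on `Individual`, while attributes such as a customer number identify the person only within the role context.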

