The Need and the Mechanics of Address Standardization

By David Loshin

In my previous post, I introduced the topic of address standards, and we conjectured that the existence of a standard for addresses not only simplifies the delivery process but also helps to ensure delivery accuracy. Ultimately, delivery accuracy saves money: it reduces the effort needed to find the location, and it eliminates the rework and extra costs of failed delivery.

This is all well and good as long as you use the standard. The problem occurs when, for some reason, the address does not conform to the standard. If the address is slightly malformed (e.g., it is missing a postal code), chances are still good that the location can be identified. If the address has serious problems (e.g., the street number is missing, there is no street, the postal code is inconsistent with the city and state, or other components are missing), resolving the location becomes much more difficult and, therefore, more costly.

There are two ways to try to deal with this problem. The first is to bite the bullet and treat each non-standard address as an exception, forcing the delivery agent to deal with it. The other approach attempts to fix the problem earlier in the process by trying to transform a non-standard address into one that conforms to the standard.

Address standardization is actually not that difficult, especially when you have access to a good standard. At the highest level, the process is first to determine where the address does not conform to the standard, and then to standardize the parts that do not conform.
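
As a rough sketch of the first half of that process, the Python below reports where an address departs from a standard and leaves the fixing to a later step. The field names, the list of required fields, and the US-style postal code pattern are all illustrative assumptions, not rules taken from any official postal standard.

```python
import re

# Hypothetical rules for illustration only; a real standard defines many more.
REQUIRED_FIELDS = ("street_number", "street_name", "city", "state", "postal_code")
POSTAL_CODE_PATTERN = re.compile(r"^\d{5}(-\d{4})?$")  # assumes US ZIP or ZIP+4


def find_nonconformities(address: dict) -> list[str]:
    """Return a list of human-readable problems; empty means the address conforms."""
    problems = []
    # Rule 1: every required component must be present and non-blank.
    for field in REQUIRED_FIELDS:
        if not address.get(field, "").strip():
            problems.append(f"missing {field}")
    # Rule 2: a postal code that is present must have the expected shape.
    postal_code = address.get("postal_code", "").strip()
    if postal_code and not POSTAL_CODE_PATTERN.match(postal_code):
        problems.append("malformed postal_code")
    return problems
```

Keeping detection separate from repair means the list of problems can drive either path: an automated fix where one exists, or an exception queue where it does not.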

If you recall from my previous post, an address captures the incremental knowledge needed to resolve a location, and we can use this fact, plus the information provided in the standard, to consider ways to fix non-standard addresses. Each component has its specific place inside the address, and there are standards for abbreviations (such as ST for "street," or AVE for "avenue") as well as for common terms (such as ATTN for "attention").
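
The abbreviation standards mentioned above lend themselves to a simple table-driven mapping. The following is a minimal sketch in Python; the table holds only the three examples from this post, and the function name `abbreviate` is a made-up helper for illustration.

```python
# Illustrative subset only; an assumption, not the full official abbreviation list.
STANDARD_ABBREVIATIONS = {
    "STREET": "ST",
    "AVENUE": "AVE",
    "ATTENTION": "ATTN",
}


def abbreviate(line: str) -> str:
    """Uppercase an address line and replace known words with standard abbreviations."""
    # Drop periods and commas so that "Street." and "Street" are treated alike.
    words = line.upper().replace(".", "").replace(",", "").split()
    # Words without an entry in the table pass through unchanged.
    return " ".join(STANDARD_ABBREVIATIONS.get(word, word) for word in words)
```

For example, `abbreviate("123 Main Street")` yields `"123 MAIN ST"`. A word-by-word replacement like this ignores position, so a production tool would also check that a term appears where the standard expects it before abbreviating it.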

You can define a set of rules to check whether the address has all the right pieces, whether they are in the right place, and whether they use the officially sanctioned abbreviations. You can also use rules to move parts around and to map commonly used terms to the standard ones, and you can use lookup tables to fill in the blanks when data is missing. So in many cases, it is straightforward to rely on tools and methods to automatically transform non-standard addresses into standardized ones.
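
To illustrate the lookup-table idea, here is a small Python sketch that fills in a missing postal code from the city and state. The table contents are placeholder values rather than real postal data, and the field names are assumptions made for the example; a real table would come from a postal reference file and would need to handle cities that span several postal codes.

```python
# Hypothetical lookup table: (city, state) -> postal code. Placeholder values,
# not real postal data; a real table would come from a postal reference file.
POSTAL_CODE_LOOKUP = {
    ("EXAMPLEVILLE", "XX"): "00000",
}


def fill_missing_postal_code(address: dict) -> dict:
    """Return a copy of the address with postal_code filled in when it can be looked up."""
    fixed = dict(address)  # copy, so the caller's record is left untouched
    if not fixed.get("postal_code", "").strip():
        key = (
            fixed.get("city", "").strip().upper(),
            fixed.get("state", "").strip().upper(),
        )
        # Only fill the blank when the table has an entry for this city and state.
        if key in POSTAL_CODE_LOOKUP:
            fixed["postal_code"] = POSTAL_CODE_LOOKUP[key]
    return fixed
```

When the lookup fails, the address comes back unchanged, which leaves it available for the exception-handling path rather than guessing a value.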
