April 2010 Archives

This video provides an overview of MatchUp, a powerful merge/purge application to eliminate duplicate records from multiple source databases. Available as a standalone software application, 32/64-bit programmable API or an SSIS transform, MatchUp can identify matching records, consolidate different databases, and create master lists.

Request your free trial of MatchUp here: http://www.melissadata.com/products/matchup-free-trial.htm

Watch the video for Matchcode Editor here: http://www.melissadata.com/multimedia/matchupdemo.html

 

Data Quality Tips from Our Experts

By Admound Chou

DQT Assistant Manager

 

Increasing the Speed of Melissa Data Libraries

 

Several architectural, language, and optimization changes can speed up processing. They are listed here in order, with the suggestions most likely to increase speed at the top. These are not the only measures you can take, but in our experience they are the most effective.

1. Make sure you are using one object per batch process.
You only need to initialize one instance of an object to process a batch list. Avoid creating and re-initializing a new instance for every record unless you absolutely have to.
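
As a rough illustration of this pattern, the Python sketch below uses a hypothetical `AddressChecker` class as a stand-in for a real library object. Its name, constructor, and `verify` method are assumptions for illustration only, not the Melissa Data API. The counter shows how many times the costly setup runs when the object is created once, outside the record loop.

```python
class AddressChecker:
    """Hypothetical stand-in for a library object with costly setup."""

    init_count = 0  # how many times the expensive initialization has run

    def __init__(self):
        # A real object would typically load its data files here.
        AddressChecker.init_count += 1

    def verify(self, address):
        # Placeholder logic only: collapse whitespace and upper-case.
        return " ".join(address.split()).upper()


def process_batch(addresses):
    """Create one checker for the whole batch and reuse it for every record."""
    checker = AddressChecker()  # initialized once, outside the loop
    return [checker.verify(address) for address in addresses]


cleaned = process_batch(["1  main st", "22 elm   rd", "5 oak ave"])
print(cleaned)
print("initializations:", AddressChecker.init_count)
```

Moving the `AddressChecker()` call inside the loop would run the setup once per record, which is the cost this tip is meant to avoid.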

2. Move to a more optimized programming language.

The fastest language to use is C++ because our components are written in C++. However, any modern object-oriented language will provide fairly similar speeds. The main exception is T-SQL. SQL Server was not designed to call third-party components efficiently, and it will process contact data anywhere from three to 10 times slower. If your data is stored in SQL Server, we recommend that you call our components from another programming language (such as C#, VB.NET, C++, or Java) and open a connection to the SQL Server to retrieve and store the data. Also, the time taken by selects, inserts, and updates is sometimes mistaken for the processing time of our components, even though those database operations alone can account for a significant portion of the overall time.
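
One way to see where the time goes is to time the database steps and the processing step separately. The Python sketch below is a minimal illustration under stated assumptions: an in-memory SQLite database stands in for SQL Server, the `contacts` table is made up, and `verify` is a hypothetical placeholder for a call into a verification component.

```python
import sqlite3
import time


def verify(address):
    # Hypothetical placeholder for a call into a verification component.
    return " ".join(address.split()).upper()


conn = sqlite3.connect(":memory:")  # stand-in for a SQL Server connection
conn.execute("CREATE TABLE contacts (id INTEGER PRIMARY KEY, address TEXT)")
conn.executemany(
    "INSERT INTO contacts (address) VALUES (?)",
    [("1  main st",), ("2 elm   rd",), ("3 oak ave",)],
)

t0 = time.perf_counter()
rows = conn.execute("SELECT id, address FROM contacts").fetchall()
t1 = time.perf_counter()

# Processing happens in the application, not inside the database engine.
cleaned = [(verify(address), row_id) for row_id, address in rows]
t2 = time.perf_counter()

conn.executemany("UPDATE contacts SET address = ? WHERE id = ?", cleaned)
conn.commit()
t3 = time.perf_counter()

print(f"select:  {t1 - t0:.6f} s")
print(f"process: {t2 - t1:.6f} s")
print(f"update:  {t3 - t2:.6f} s")
```

With three rows the numbers are meaningless, but the same split on a real batch shows whether the database round trips or the processing step dominates.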

3. Order the data by ZIP Code™.
The source data we use to verify and look up address information is stored in ZIP Code order. So, if your data is also in ZIP Code order, less data is moved in and out of cache, which speeds up processing.
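
If the data cannot be sorted at the source (for example, with an ORDER BY clause on the ZIP column), it can be sorted in the application before processing. This Python sketch assumes made-up records with a `zip` field. It keeps ZIP Codes as strings so leading zeros survive, and sorts on the five-digit prefix so ZIP+4 values group with their base ZIP Code.

```python
records = [
    {"name": "Ann", "zip": "92688"},
    {"name": "Bob", "zip": "02134"},
    {"name": "Cho", "zip": "60601-1234"},
    {"name": "Dee", "zip": "02134"},
]


def zip5(record):
    """Sort key: the first five characters of the ZIP Code string."""
    return record["zip"][:5]


# sorted() is stable, so records that share a ZIP keep their original order.
ordered = sorted(records, key=zip5)

for record in ordered:
    print(record["zip"], record["name"])
```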

4. Increase memory (RAM) or improve other hardware.
If you have 1 GB of memory or less, increasing to 2 or 4 GB can significantly increase processing speed. This is the easiest and most effective hardware change for increasing speed. Another hardware upgrade option is to use a faster hard drive (SCSI or solid state).

5. Use multi-threading.
Our components are thread safe and can have multiple instances running in multiple threads. Additional threads let you take full advantage of CPU time as well as multiple cores. Adding threads will increase processing speed up to a certain point, but each additional thread provides diminishing returns. We recommend 2-3 threads per core.

Disclaimer: Do not use multi-threading until you are comfortable and experienced with it. Data access within our components is thread safe, but you must maintain thread safety when accessing our libraries. Rule of thumb: one thread per instance.
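
The "one thread per instance" rule can be sketched in Python with thread-local storage, so each worker thread lazily creates and then reuses its own object. `AddressChecker` is again a hypothetical stand-in, not the Melissa Data API, and the worker count of two per core is taken from the recommendation above. In standard CPython builds, threads only run in parallel while native code releases the global interpreter lock, so this shows the ownership pattern rather than a guaranteed speedup.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor


class AddressChecker:
    """Hypothetical stand-in for a library object; not a real API."""

    def verify(self, address):
        # Placeholder logic only: collapse whitespace and upper-case.
        return " ".join(address.split()).upper()


_local = threading.local()  # each thread sees its own attributes


def get_checker():
    """Return the calling thread's own checker, creating it on first use."""
    if not hasattr(_local, "checker"):
        _local.checker = AddressChecker()
    return _local.checker


def verify_one(address):
    return get_checker().verify(address)


addresses = [f"{n}  main st" for n in range(1, 9)]
workers = (os.cpu_count() or 1) * 2  # assumed: two threads per core

with ThreadPoolExecutor(max_workers=workers) as pool:
    # map() returns results in the same order as the input list.
    results = list(pool.map(verify_one, addresses))

print(results)
```

Because no instance is ever shared between threads, no locking is needed around `verify`.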

6. Cut out COM Interop.
Our Windows components are available in two flavors, COM and standard DLL. Both have the same core verification engine, but the COM version has a COM Interop layer to facilitate communication with many popular programming languages. If you are experienced with both COM and non-COM development, you may look into using the standard DLL to remove the COM Interop layer and reduce the amount of data marshalling. For .NET users, sample code is available for both the COM version (samples directory) and the standard DLL (interfaces/NET directory).

4 Critical Principles of Data Governance Success

How effectively you manage the quality, consistency, usability, security, and availability of your organization's data will play a large part in how successful your business ultimately is. Data is the lifeblood of any business, and if the data isn't healthy ... well, you know the rest.

Jane Griffin of Deloitte Consulting LLP describes four critical principles for a successful data governance implementation.

Read her article here: http://www.melissadata.com/enews/articles/03112010/2.htm

 

Data Appending: Dive Deeper

Appending third-party data to a prospect or customer data file is a necessity for most intelligence-driven marketers. Missing data, stale or inactive files, data kept in silos within enterprises, inaccurate data entry: all of these "sins" can creep into the database and wear down the success of prospect and customer engagement initiatives. Depending on the application, such data mishaps can even undermine a brand.

Is your data performing up to par? Read the full article by DMNews here: http://www.dmnews.com/data-appending--dive-deeper/printarticle/166444/

 
