How Intelligent is BizTalk 2010’s Intelligent Mapper?

One of the interesting new features of the BizTalk Server 2010 Mapper (and the corresponding Windows Workflow shape) is “suggestive matching,” which helps the XSLT map author figure out which source (or destination) nodes are most likely related.  The MSDN page for suggestive matching has some background material on the feature.  I thought I’d run a couple of quick tests to see just how smart this new mapper is.

Before the suggestive match feature was introduced, we could do bulk mapping through the “link by” feature.  With that feature, you could connect two parent nodes and choose to map the child nodes based on structure (order), on exact names, or through the mass copy function.  However, this is a fairly coarse way to map that doesn’t take into account the real semantic differences in a map.  It also doesn’t help you find better destination candidates that may sit in a different section of the schema.

2010.08.15mapper01

Through Suggestive Matching, I should have an easier time finding matching nodes with similar, but non-exact naming.  What I wasn’t sure about, and the point of this post, is whether the Mapper does just a simple text comparison or something more.

Simple Name Matching

In this scenario, we are simply checking to see if the Mapper looks for the same textual value from the source in the destination.  In my source schema I have a field called “ID.”  In my destination schema I have a field called “ItemID.”  As you’d expect, the suggestive match points this relationship out.

2010.08.15mapper02

In that case, the name of the source node is a substring of the destination.  What if the destination node is a substring of the source?  To demonstrate that, I have a source field named “PhoneNumber” and the destination node is named “Phone.”  Sure enough, a match is still made.

2010.08.15mapper03

Also, it doesn’t matter where in the node name the matching value is found.  If I have a “Code” field in the source tree and both a “ZipCode” and “OrderCodeIdentifier” in the destination, both nodes are considered possible matches.  The word “code” in the latter field, although between other text, is still identified as a match.  Not revolutionary of course, but nice.

2010.08.15mapper04
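The three results above are all consistent with a plain substring check in either direction.  Here is a minimal Python sketch of that idea.  To be clear, this is my guess at the observable behavior, not the Mapper’s actual algorithm: the function name `substring_candidates` is made up, and the case-insensitive comparison is an assumption.

```python
def substring_candidates(source_name, destination_names):
    """Return destination names that contain, or are contained in, the source name.

    Hypothetical helper: it mimics the behavior seen in the tests above and is
    not taken from the BizTalk Mapper itself.
    """
    src = source_name.lower()  # assumption: matching ignores case
    matches = []
    for dest in destination_names:
        d = dest.lower()
        # Match in either direction: "ID" in "ItemID", or "Phone" in "PhoneNumber".
        if src in d or d in src:
            matches.append(dest)
    return matches


print(substring_candidates("ID", ["ItemID", "Quantity"]))
print(substring_candidates("PhoneNumber", ["Phone", "Fax"]))
print(substring_candidates("Code", ["ZipCode", "OrderCodeIdentifier", "City"]))
```

The non-matching names (“Quantity”, “Fax”, “City”) are invented here just to show that unrelated nodes drop out.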

Complex Name Matching

In this scenario, I was looking to see if the Mapper detected any differences based on more than just the substrings.  That is, could it figure out that “FirstName” and “FName” are the same?  Unfortunately, the “FirstName” field below resulted in a match to all name fields in the destination.

2010.08.15mapper05

The highlighted link is considered the best match, and I noticed that as I added more characters to the “FName” node, I got a different “best match.”

2010.08.15mapper06

You see that “FirName” is considered a close match to “FirstName.”  Has anyone else found any cases where similarly but inexactly worded names are still marked as a match?
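One way to picture why the “best match” shifts as characters are added is to score each candidate by how much text it shares with the source name.  The sketch below uses Python’s standard `difflib.SequenceMatcher` for that.  I have no idea whether the Mapper does anything like this internally; the scoring method, the function name `rank_by_similarity`, and the “LName” destination node are all assumptions for illustration.

```python
from difflib import SequenceMatcher


def rank_by_similarity(source_name, destination_names):
    """Rank destination names by textual similarity to the source name.

    Illustrative only: SequenceMatcher is a stand-in scoring method,
    not the algorithm the BizTalk Mapper uses.
    """
    src = source_name.lower()
    scored = [
        (SequenceMatcher(None, src, dest.lower()).ratio(), dest)
        for dest in destination_names
    ]
    # Highest similarity first; that candidate would be the "best match".
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# "LName" is a hypothetical sibling node used only for comparison.
for score, name in rank_by_similarity("FirstName", ["FName", "LName", "FirName"]):
    print(f"{name}: {score:.2f}")
```

With a score like this, “FirName” ranks ahead of “FName” simply because it shares more characters with “FirstName,” which lines up with what I saw in the designer.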

Node Positioning

I was hoping that intelligent mapping could match up an address with a similar structure on both sides.  That is, if identically named nodes sit before and after an unmatched node in both schemas, the Mapper might guess that the middle ones match.  For instance, if I have “City” between “Street” and “State” in the source and “Town” between “Street” and “State” in the destination, maybe it would detect the pattern.  But alas, that is apparently a dream.

2010.08.15mapper07
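For what it’s worth, the positional guess I was hoping for is not hard to express.  Here is a rough Python sketch of the heuristic, treating each record as a flat list of sibling node names.  This is purely the behavior I wished for, not something the Mapper does, and `infer_by_position` is a made-up name.

```python
def infer_by_position(source_siblings, destination_siblings):
    """Pair differently named nodes that sit between identically named neighbors.

    Hypothetical heuristic: the BizTalk Mapper did not do this in my tests.
    """
    inferred = []
    for i in range(1, len(source_siblings) - 1):
        src = source_siblings[i]
        if src in destination_siblings:
            continue  # an exact name match exists, so nothing to infer
        before, after = source_siblings[i - 1], source_siblings[i + 1]
        for j in range(1, len(destination_siblings) - 1):
            dest = destination_siblings[j]
            same_neighbors = (
                destination_siblings[j - 1] == before
                and destination_siblings[j + 1] == after
            )
            # Only pair nodes that have no exact-name counterpart on the other side.
            if same_neighbors and dest not in source_siblings:
                inferred.append((src, dest))
    return inferred


print(infer_by_position(["Street", "City", "State"], ["Street", "Town", "State"]))
# prints [('City', 'Town')]
```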

Summary

It looks like our new intelligent mapper, with the help of Suggestive Match, does a decent job of textual matching between a source and destination schema.  I have yet to see any examples of anything more advanced than that.  Still, even if all we get is textual matching, that gives developers a bit of help when traversing monstrous schemas with multiple destination candidates for a source node.

If you have any additional experiences with this, I’d love to hear about them.

Author: Richard Seroter

Richard Seroter is Director of Developer Relations and Outbound Product Management at Google Cloud. He’s also an instructor at Pluralsight, a frequent public speaker, the author of multiple books on software design and development, and a former InfoQ.com editor plus former 12-time Microsoft MVP for cloud. As Director of Developer Relations and Outbound Product Management, Richard leads an organization of Google Cloud developer advocates, engineers, platform builders, and outbound product managers that help customers find success in their cloud journey. Richard maintains a regularly updated blog on topics of architecture and solution design and can be found on Twitter as @rseroter.

2 thoughts

  1. Hey Richard, good post. I would have expected FirstName to find FName as the suggested match too…

    When I was researching the subject for my blog post on the new mapper I found the two original papers from Microsoft Research on the subject by Eddie Churchill et al. Here are the links if anyone is interested, makes for an interesting read.

    “Visualization of Mappings Between Schemas”
    http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.87.2767&rep=rep1&type=pdf

    “Incremental Schema Matching”
    http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.61.4221&rep=rep1&type=pdf

    Cheers,
    Thiago
