One of the interesting new features of the BizTalk Server 2010 Mapper (and corresponding Windows Workflow shape) is “suggestive matching,” which helps the XSLT map author figure out which source (or destination) nodes are most likely related. The MSDN page for suggestive matching has some background material on the feature. I thought I’d run a couple of quick tests to see just how smart this new mapper is.
Before the suggestive match feature was introduced, we could do bulk mapping through the “link by” feature. With that feature, you could connect two parent nodes and choose to map the child nodes based on structure (order), exact names, or the mass copy function. However, this is a fairly coarse way to map, and it doesn’t take into account the real semantic differences between the two schemas. It also doesn’t help you find better destination candidates that may sit in a different section of the schema.
Through Suggestive Matching, I should have an easier time finding matching nodes with similar, but not identical, names. What I wasn’t sure of, and the point of this post, is whether the Mapper does just a simple text comparison or something smarter.
Simple Name Matching
In this scenario, we are simply checking to see if the Mapper looks for the same textual value from the source in the destination. In my source schema I have a field called “ID.” In my destination schema I have a field called “ItemID.” As you’d expect, the suggestive match points this relationship out.
In that case, the name of the source node is a substring of the destination. What if the destination node is a substring of the source? To demonstrate that, I have a source field named “PhoneNumber” and the destination node is named “Phone.” Sure enough, a match is still made.
Also, it doesn’t matter where in the node name the matching text is found. If I have a “Code” field in the source tree and both a “ZipCode” and an “OrderCodeIdentifier” in the destination, both nodes are considered possible matches. The word “Code” in the latter field, although surrounded by other text, is still identified as a match. Not revolutionary of course, but nice.
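To make the behavior observed above concrete, here is a minimal Python sketch of a two-way, case-insensitive substring check. This is only my approximation of what the tests suggest, not the Mapper's actual algorithm; the function name `substring_candidates` and the extra destination names (“Quantity,” “Fax,” “City”) are made up for illustration.

```python
def substring_candidates(source_name, destination_names):
    """Return destination names that contain the source name, or are
    contained in it, ignoring case. Hypothetical helper, not Mapper code."""
    src = source_name.lower()
    matches = []
    for dest in destination_names:
        d = dest.lower()
        # Match in either direction: source inside destination or vice versa.
        if src in d or d in src:
            matches.append(dest)
    return matches


# The three cases tried in this post:
print(substring_candidates("ID", ["ItemID", "Quantity"]))
print(substring_candidates("PhoneNumber", ["Phone", "Fax"]))
print(substring_candidates("Code", ["ZipCode", "OrderCodeIdentifier", "City"]))
```

A check like this covers all three cases in this section, which is why they don't tell us whether anything smarter is going on.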
Complex Name Matching
In this scenario, I was looking to see if the Mapper detected any differences based on more than just the substrings. That is, could it figure out that “FirstName” and “FName” are the same? Unfortunately, the “FirstName” field below resulted in a match to all name fields in the destination.
The highlighted link is considered the best match, and I noticed that as I added more characters to the “FName” node, I got a different “best match.”
You can see that “FirName” is considered a close match to “FirstName.” Has anyone else found cases where similar but inexactly worded names are still marked as a match?
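The way the “best match” shifts as characters are added looks like some kind of similarity score is in play. I don't know what scoring the Mapper really uses, so the sketch below simply swaps in Python's standard `difflib.SequenceMatcher` as a stand-in to show the general idea of ranking candidates; `rank_by_similarity` and the “LName” and “MiddleName” nodes are hypothetical.

```python
from difflib import SequenceMatcher


def rank_by_similarity(source_name, destination_names):
    """Sort destination names from most to least similar to the source.
    Uses difflib's ratio as a stand-in score (an assumption, not the
    Mapper's real scoring)."""
    src = source_name.lower()
    scored = [
        (dest, SequenceMatcher(None, src, dest.lower()).ratio())
        for dest in destination_names
    ]
    # Highest similarity first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


for name, score in rank_by_similarity("FirstName", ["LName", "FName", "MiddleName"]):
    print(f"{name}: {score:.2f}")
```

With this kind of scoring, “FirName” rates closer to “FirstName” than “FName” does, which is at least consistent with what I saw in the Mapper.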
I was also hoping that intelligent mapping could match an address with a similar structure. That is, if a node sits between identically named nodes in both schemas, the Mapper might guess that the middle nodes match. For instance, if I have “City” between “Street” and “State” in the source and “Town” between “Street” and “State” in the destination, maybe it would detect the pattern. But alas, that is apparently a dream.
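For what it's worth, the kind of structural guess I had in mind could be sketched roughly like this in Python. To be clear, the Mapper did not do this in my tests; the function `neighbor_matches` and the flat sibling lists are simplifying assumptions of mine.

```python
def neighbor_matches(source_siblings, destination_siblings):
    """Suggest a pairing for nodes whose names differ but whose previous
    and next siblings have identical names in both schemas.
    Hypothetical heuristic, not something the Mapper is known to do."""
    suggestions = {}
    for i in range(1, len(source_siblings) - 1):
        name = source_siblings[i]
        if name in destination_siblings:
            continue  # an exact name match already exists
        before, after = source_siblings[i - 1], source_siblings[i + 1]
        for j in range(1, len(destination_siblings) - 1):
            candidate = destination_siblings[j]
            same_context = (
                destination_siblings[j - 1] == before
                and destination_siblings[j + 1] == after
            )
            if same_context and candidate not in source_siblings:
                suggestions[name] = candidate
    return suggestions


print(neighbor_matches(["Street", "City", "State"], ["Street", "Town", "State"]))
```

A real implementation would need to walk the full schema tree and weigh this signal against name similarity, but even a simple check like this would have caught the City/Town case.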
It looks like our new intelligent mapper, with the help of Suggestive Match, does a decent job of textual matching between a source and destination schema. I have yet to see any examples of advanced conditions outside of that. Still, if all we get is textual matching, that still provides developers a bit of help when traversing monstrous schemas with multiple destination candidates for a source node.
If you have any additional experiences with this, I’d love to hear about them.
Hey Richard, good post. I would have expected FirstName to find FName as the suggested match too…
When I was researching the subject for my blog post on the new mapper I found the two original papers from Microsoft Research on the subject by Eddie Churchill et al. Here are the links if anyone is interested, makes for an interesting read.
“Visualization of Mappings Between Schemas”
“Incremental Schema Matching”