Author: Richard Seroter

  • Interview Series: Four Questions With … Johan Hedberg

    Hi there and welcome to the 24th interview with someone who doesn’t have the good sense to ignore my email.  This month we are chatting with Johan Hedberg who is an architect, Microsoft MVP, blogger, and passable ship captain.  Let’s jump in.

    Q: In the near future you are switching companies and tasked with building up a BizTalk practice.  What are the most important initial activities for establishing such expertise from scratch?  How do you prioritize the tasks?

    A: There are a couple of things that come to mind. Some of them are catch-22s. What comes first, the task or the consultant to perform the task? Generating business and balancing that with attracting and educating resources is core. Equally important will be helping to adjust the baseline the company has today for the BizTalk platform – how we go about marketing, architecting and building our solutions – and converting that from theory to BizTalk practice. The company I’m switching to (Enfo Zystems) already has a reputation as expert integrators, but they are new to the Microsoft platform. So gaining visibility and credibility in that area is also high on the agenda. If I need to pick a first task, I’d say that the internal goals are my top priority. Likely that will happen during a time when I will also have one or more customers (getting work is seldom the problem), which is why it must be prioritized to happen at all. As a consultant, customer assignments have a tendency to take over from internal tasks if you don’t stand fast.

    Q: I recently participated in the European BizTalk Summit that you hosted and I am always impressed by the deep BizTalk expertise and awareness in your area of the world.  Why do you think that BizTalk has such a strong presence in Sweden and the surrounding countries? Does it have to do with the types of companies there,  Microsoft marketing/sales people who articulate the value proposition well, or something else?

    A: I believe that we (Swedes) in general are a technology-friendly and interested bunch and generally adopt new technology trends quite rapidly. Back in the day we were early in adopting things like mass availability of broadband connections and the web. At that time much of it was consumer targeted. I don’t think we adopted integration platforms in a broad sense very early. And those that did didn’t have BizTalk as an obvious first choice. Even though I wasn’t really in the business of integration five years ago I can’t remember it being a hot topic. That has picked up a lot lately. Sweden has also come out of the economic downturn reasonably well, and finances still hold the possibility of investment within IT – especially for things that in themselves might add to cost savings. And there is huge potential for that in companies all around Sweden, where many still have the “spaghetti integration” scenario as their reality. Also, in the last couple of years, there has been an increased movement from other (more expensive) platforms to BizTalk as a first choice and even a replacement for existing technology. The technology interest is very much still there, and now to a much larger extent includes integration. And now the business is on it as well; a recent study among Swedish CIOs shows that integration today is considered a key enabler for both business and IT.

    Q: In a pair of your recent blog posts, you mention the “it depends” aspect of BizTalk infrastructure sizing, as well as learning and leveraging the Azure cloud.  What are things in BizTalk (e.g. design, development, management) that you consider absolute “must do” and “it depends” doesn’t apply?

    A: The last couple of years at Logica we’ve been delivering integration as a service, and the experience from that is that there are two points of interaction that are crucial to get right if you want to minimize trouble during development and subsequent release and support. They are both about communication, and to a lesser extent about documentation. It starts with requirements: asking the right questions, interpreting the answers, documenting and agreeing upon what needs to be done. To have a contract. You still need to be aware of and flexible enough to handle change, but it needs to be clear that it is a change. It makes the customer/supplier relationship easier and more structured. The next checkpoint is the handoff from integration development to the operations group that will subsequently support the solution in production. It’s equally important to give them what they need so that they can do a good job. In the end it’s the full lifecycle of the solution that decides whether the implementation was successful, not just the two days where actual development took place. I guess the message is that the processes around the development work are just as important, if not more so.

    With development it’s easier to state don’ts than must-dos. Don’t do orchestrations if you don’t need them. Don’t tick all tracking checkboxes just because you might need them someday. Don’t do XmlDocument or intensive stream seek operations. Don’t say ok to handling 500 MB XML messages in BizTalk without putting up a fight. If BizTalk serves as an integration platform – don’t implement business logic in BizTalk that belongs to the adjoining systems; don’t create an operations and management nightmare. Don’t reinvent solutions that already have samples available. Don’t be too smart, be simple. And it can go on and on… But it is what it is (right? 😉 )

    Q [stupid question]: Google (or Bing) auto-complete gives an interesting (and frightening) look into the popular questions being asked of our search engines.  It’s amusing to ask the beginning of questions and see what comes back.  Give us a few fun auto-complete searches that worry or amuse you.

    2010.10.04interview01

    2010.10.04interview02

    A: Since you mention Sweden as being a place you recognize as having a strong technical community let’s see what people in general want to know about what it’s like to be Swedish …

    2010.10.04interview03

    Food, medical aid, pirates and massage seem to be on top.

    Also, since we both have sons, let’s see what we can find out about sons…

    2010.10.04interview04

    A fanatic bullying executioner who hates me. Not good.

    But let’s move the focus back to me…

    2010.10.04interview05

    That pretty much sums it up I guess. No need to go any further.

    Thanks Johan, and good luck with the new job.

  • Comparing AWS SimpleDB and Windows Azure Table Storage – Part II

    In my last post, I took an initial look at the Amazon Web Services (AWS) SimpleDB product and compared it to the Microsoft Windows Azure Table storage.  I showed that both solutions are relatively similar in that they embrace a loosely typed, flexible storage strategy and both provide a bit of developer tooling.  In that post, I walked through a demonstration of SimpleDB using the AWS SDK for .NET.

    In this post, I’ll perform a quick demonstration of the Windows Azure Table storage product and then conclude with a few thoughts on the two solution offerings.  Let’s get started.

    Windows Azure Table Storage

    First, I’m going to define a .NET object that represents the entity being stored in the Azure Table storage.  Remember that, as pointed out in the previous post, the Azure Table storage is schema-less, so this new .NET object is just a representation used for creating and querying the Azure Table.   It has no bearing on the underlying Azure Table structure. However, accessing the Table through a typed object differs from AWS SimpleDB, which has a fully type-less .NET API model.

    I’ve built a new WinForm .NET project that will interact with the Azure Table.  My Azure Table will hold details about different conferences that are available for attendance.  My “conference record” object inherits from TableServiceEntity.

    public class ConferenceRecord : TableServiceEntity
    {
        public ConferenceRecord()
        {
            PartitionKey = "SeroterPartition1";
            RowKey = System.Guid.NewGuid().ToString();
        }

        public string ConferenceName { get; set; }
        public DateTime ConferenceStartDate { get; set; }
        public string ConferenceCategory { get; set; }
    }
    

    Notice that I have both a partition key and row key value.  The PartitionKey attribute is used to identify and organize data entities.  Entities with the same PartitionKey are physically co-located, which in turn helps performance.  The RowKey attribute uniquely defines a row within a given partition.  The PartitionKey + RowKey must be a unique combination.

    Next up, I built a table context class which is used to perform operations on the Azure Table.  This class inherits from TableServiceContext and has operations to get, add and update ConferenceRecord objects from the Azure Table.

    public class ConferenceRecordDataContext : TableServiceContext
    {
        public ConferenceRecordDataContext(string baseAddress, StorageCredentials credentials)
            : base(baseAddress, credentials)
        { }

        public IQueryable<ConferenceRecord> ConferenceRecords
        {
            get
            {
                return this.CreateQuery<ConferenceRecord>("ConferenceRecords");
            }
        }

        public void AddConferenceRecord(ConferenceRecord confRecord)
        {
            this.AddObject("ConferenceRecords", confRecord);
            this.SaveChanges();
        }

        public void UpdateConferenceRecord(ConferenceRecord confRecord)
        {
            this.UpdateObject(confRecord);
            this.SaveChanges();
        }
    }
    

    In my WinForm code, I have a class variable of type CloudStorageAccount which is used to interact with the Azure account.  When the “connect” button is clicked on my WinForm, I establish a connection to the Azure cloud.  This is where Microsoft’s tooling is pretty cool.  I have a local “fabric” that represents the various Azure storage options (table, blob, queue) and can leverage this fabric without ever provisioning a live cloud account.

    2010.10.04storage01

    Connecting to my development storage through the CloudStorageAccount looks like this:

    string connString = "UseDevelopmentStorage=true";
    
    storageAcct = CloudStorageAccount.Parse(connString);
    

    After connecting to the local (or cloud) storage, I can create a new table from the ConferenceRecordDataContext type definition, the URI of the table endpoint, and my cloud credentials.

    CloudTableClient.CreateTablesFromModel(
        typeof(ConferenceRecordDataContext),
        storageAcct.TableEndpoint.AbsoluteUri,
        storageAcct.Credentials);
    

    Now I instantiate my table context object which will add new entities to my table.

    string confName = txtConfName.Text;
    string confType = cbConfType.Text;
    DateTime confDate = dtStartDate.Value;
    
    var context = new ConferenceRecordDataContext(
          storageAcct.TableEndpoint.ToString(),
          storageAcct.Credentials);
    
    ConferenceRecord rec = new ConferenceRecord
    {
        ConferenceName = confName,
        ConferenceCategory = confType,
        ConferenceStartDate = confDate
    };
    
    context.AddConferenceRecord(rec);
    
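    The context class also exposes UpdateConferenceRecord, though the form code above only adds records. Updating follows the same pattern; here is a sketch of my own (the conference name and date change are made up for illustration, and the query-then-modify flow assumes the table client's LINQ support shown later in this post):

    ```csharp
    // Hypothetical update flow: pull an existing entity back through the
    // context, change a property, and let UpdateConferenceRecord push the
    // change to the table.
    var existing = (from c in context.ConferenceRecords
                    where c.ConferenceName == "PDC 2010"
                    select c).AsEnumerable().FirstOrDefault();

    if (existing != null)
    {
        // Push the start date out one week and persist the change.
        existing.ConferenceStartDate = existing.ConferenceStartDate.AddDays(7);
        context.UpdateConferenceRecord(existing);
    }
    ```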

    Another nice tool built into Visual Studio 2010 (with the Azure extensions) is the Azure viewer in the Server Explorer window.  Here I can connect to either the local fabric or the cloud account.  Before I run my application for the first time, we can see that my Table list is empty.

    2010.10.04storage02

    If I start up my application and add a few rows, I can see my new Table.

    2010.10.04storage03

    I can do more than just see that my table exists.  I can right-click that table and choose to View Table, which pulls up all the entities within the table.

    2010.10.04storage04

    Performing a lookup from my Azure Table via code is fairly simple and I can either loop through all the entities via a “foreach” and a conditional, or I can use LINQ.  Here I grab all conference records whose ConferenceCategory is equal to “Technology”.

    var val = from c in context.ConferenceRecords
                where c.ConferenceCategory == "Technology"
                select c;
    

    Now, let’s prove that the underlying storage is indeed schema-less.  I’ll go ahead and add a new attribute to the ConferenceRecord object type and populate its value in the WinForm UI.  A ConferenceAttendeeLimit of type int was added to the class and then assigned a random value in the UI.  Sure enough, my underlying table was updated with the new “column” and data value.

    2010.10.04storage05

    I can also update my LINQ query to look for all conferences where the attendee limit is greater than 100, and only my latest entity, the one carrying the new column, is returned.
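
    That updated query is a one-line change. A sketch, assuming the ConferenceAttendeeLimit property has been added to ConferenceRecord as just described:

    ```csharp
    // Same query shape as before, now filtering on the newly added property.
    var val = from c in context.ConferenceRecords
              where c.ConferenceAttendeeLimit > 100
              select c;
    ```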

    Summary of Part II

    In this second post of the series, we’ve seen that the Windows Azure Table storage product is relatively straightforward to work with.  I find the AWS SimpleDB documentation to be better (and more current) than the Windows Azure storage documentation, but the Visual Studio-integrated tooling for Azure storage is really handy.  AWS has a lower cost of entry as many AWS products don’t charge you a dime until you reach certain usage thresholds.  This differs from Windows Azure where you pretty much pay from day one for any type of usage.

    All in all, both of these products are useful for high-performing, flexible data repositories.  I’d definitely recommend getting more familiar with both solutions.

  • Comparing AWS SimpleDB and Windows Azure Table Storage – Part I

    We have a multitude of options for storing data in the cloud.  If you are looking for a storage mechanism for fast access to non-relational data, then both the Amazon Web Service (AWS) SimpleDB product and the Microsoft Windows Azure Table storage are viable choices.  In this post, I’m going to do a quick comparison of these two products, including how to leverage the .NET API provided by both.

    First, let’s do a comparison of these two.

    | Feature | Amazon SimpleDB | Windows Azure Table |
    | --- | --- | --- |
    | Storage metaphor | Domains are like worksheets; items are rows, attributes are column headers, and attribute values are the cells | Tables; entities are rows and properties are columns |
    | Schema | None enforced | None enforced |
    | “Table” size | Domain up to 10GB, 256 attributes per item, 1 billion attributes per domain | 255 properties per entity, 1MB per entity, 100TB per table |
    | Cost (excluding transfer) | Free up to 1GB and 25 machine hours (time used for interactions); $0.15 GB/month up to 10TB, $0.14 per machine hour | $0.15 GB/month |
    | Transactions | Conditional put/delete for attributes on a single item | Batch transactions in the same table and partition group |
    | Interface mechanism | REST, SOAP | REST |
    | Development tooling | AWS SDK for .NET | Visual Studio.NET, Development Fabric |

    These platforms are relatively similar in features and functions, with each platform also leveraging aspects of its sister products (e.g. AWS EC2 for SimpleDB), so that could sway your choice as well.

    Both products provide a toolkit for .NET developers and here is a brief demonstration of each.

    Amazon SimpleDB using the AWS SDK for .NET

    You can download the AWS SDK for .NET from the AWS website.  You get some assemblies in the GAC, and also some Visual Studio.NET project templates.

    2010.09.29storage01

    In my case, I just built a simple Windows Forms application that creates a domain, adds attributes and items and then adds new attributes and new items.

    After adding a reference to the AWSSDK.dll in my .NET project, I added the following “using” statements in my code:

    using Amazon;
    using Amazon.SimpleDB;
    using Amazon.SimpleDB.Model;
    

    Then I defined a few variables which will hold my SimpleDB domain name, AWS credentials and SimpleDB web service container object.

    NameValueCollection appConfig;
    AmazonSimpleDB simpleDb = null;
    string domainName = "ConferenceDomain";
    

    I next read my AWS credentials from a configuration file and pass them into the AmazonSimpleDB object.

    appConfig = ConfigurationManager.AppSettings;
    simpleDb = AWSClientFactory.CreateAmazonSimpleDBClient(appConfig["AWSAccessKey"],
                    appConfig["AWSSecretKey"]);
    

    Now I can create a SimpleDB domain (table) with a simple command.

    CreateDomainRequest domainreq = (new CreateDomainRequest()).WithDomainName(domainName);
    simpleDb.CreateDomain(domainreq);
    

    Deleting domains looks like this:

    DeleteDomainRequest deletereq = new DeleteDomainRequest().WithDomainName(domainName);
    simpleDb.DeleteDomain(deletereq);
    

    And listing all the domains under an account can be done like this:

    string results = string.Empty;
    ListDomainsResponse sdbListDomainsResponse = simpleDb.ListDomains(new ListDomainsRequest());
    if (sdbListDomainsResponse.IsSetListDomainsResult())
    {
        ListDomainsResult listDomainsResult = sdbListDomainsResponse.ListDomainsResult;

        foreach (string domain in listDomainsResult.DomainName)
        {
            results += domain + "\n";
        }
    }
    

    To create attributes and items, we use a PutAttributesRequest object.  Here, I’m creating two items, adding attributes to them, and setting the values of the attributes.  Notice that we use a very loosely typed process and don’t work with typed objects representing the underlying items.

    //first item
    string itemName1 = "Conference_PDC2010";
    PutAttributesRequest putreq1 = 
         new PutAttributesRequest().WithDomainName(domainName).WithItemName(itemName1);
    List<ReplaceableAttribute> item1Attributes = putreq1.Attribute;
    item1Attributes.Add(new ReplaceableAttribute().WithName("ConferenceName").WithValue("PDC 2010"));
    item1Attributes.Add(new ReplaceableAttribute().WithName("ConferenceType").WithValue("Technology"));
    item1Attributes.Add(new ReplaceableAttribute().WithName("ConferenceDates").WithValue("09/25/2010"));
    simpleDb.PutAttributes(putreq1);
    
    //second item
    string itemName2 = "Conference_PandP";
    PutAttributesRequest putreq2 = 
        new PutAttributesRequest().WithDomainName(domainName).WithItemName(itemName2);
    List<ReplaceableAttribute> item2Attributes = putreq2.Attribute;
    item2Attributes.Add(new ReplaceableAttribute().WithName("ConferenceName").
         WithValue("Patterns and Practices Conference"));
    item2Attributes.Add(new ReplaceableAttribute().WithName("ConferenceType").WithValue("Technology"));
    item2Attributes.Add(new ReplaceableAttribute().WithName("ConferenceDates").WithValue("11/10/2010"));
    simpleDb.PutAttributes(putreq2);
    

    If we want to update an item in the domain, we can do another PutAttributesRequest and specify which item we wish to update, and with which new attribute/value.

    //replace conference date in item 2
    ReplaceableAttribute repAttr = 
        new ReplaceableAttribute().WithName("ConferenceDates").WithValue("11/11/2010").WithReplace(true);
    PutAttributesRequest putReq = 
        new PutAttributesRequest().WithDomainName(domainName).WithItemName("Conference_PandP").
        WithAttribute(repAttr);
    simpleDb.PutAttributes(putReq);
    

    Querying the domain is done with familiar SQL-like syntax.  In this case, I’m asking for all items in the domain where the ConferenceType attribute equals ‘Technology’.

    string query = "SELECT * FROM ConferenceDomain WHERE ConferenceType='Technology'";
    SelectRequest selreq = new SelectRequest().WithSelectExpression(query);
    SelectResponse selresp = simpleDb.Select(selreq);
    
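    Deleting data follows the same request/response pattern.  This sketch is my addition, not from the original walkthrough, and assumes the v1 SDK’s fluent With* methods behave here as they do in the samples above; per the SimpleDB model, deleting an item with no attributes specified removes the entire item.

    ```csharp
    // Remove a single attribute from an item...
    DeleteAttributesRequest delAttrReq = new DeleteAttributesRequest()
        .WithDomainName(domainName)
        .WithItemName("Conference_PandP")
        .WithAttribute(new Amazon.SimpleDB.Model.Attribute().WithName("ConferenceDates"));
    simpleDb.DeleteAttributes(delAttrReq);

    // ...or remove the item entirely by specifying no attributes at all.
    DeleteAttributesRequest delItemReq = new DeleteAttributesRequest()
        .WithDomainName(domainName)
        .WithItemName("Conference_PandP");
    simpleDb.DeleteAttributes(delItemReq);
    ```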

    Summary of Part I

    Easy stuff, eh?  Because of the non-existent domain schema, I can add a new attribute to an existing item (or new one) with no impact on the rest of the data in the domain.  If you’re looking for fast, highly flexible data storage with high redundancy and no need for the rigor of a relational database, then AWS SimpleDB is a nice choice.  In part two of this post, we’ll do a similar investigation of the Windows Azure Table storage option.

  • Lesson Learned: WCF Routing Service and the BasicHttpBinding

    Two weeks before submitting the final copy of my latest book, I reran all my chapter demonstrations that had been built during the year.  Since many demos were written with beta versions of products, I figured I should be smart and verify that everything was fine with the most recent releases.  But alas, everything was not fine.

    The demo for my Chapter 8 on content based routing (which can mostly be read online at the Packt site) all of a sudden wouldn’t run.  This demo uses the WCF Routing Service to call Workflow Services which sit in front of LOB system services.  When I ran my demo using the final version of Windows Server AppFabric as the host, I got this dumpster-fire of an error from the Routing Service:

    An unexpected failure occurred. Applications should not attempt to handle this error. For diagnostic purposes, this English message is associated with the failure: ‘Shouldn’t allocate SessionChannels if session-less and impersonating’.

    Now, anytime that a framework error returns zero results from various search engines, you KNOW you are in faaaantastic shape.  After spending hours fiddling with directory permissions, IIS/AppFabric settings and consulting a shaman, I decided to switch WCF bindings and see if that helped.  Workflow Services don’t make binding changes particularly easy (from what I’ve seen; your mileage may vary), so I used a protocol mapping section to flip the default Workflow Service binding from BasicHttpBinding to WsHttpBinding and then also switched the Routing Service to use WsHttpBinding.

    <protocolMapping>
          <add scheme="http" binding="wsHttpBinding"/>
    </protocolMapping>
    

    Voila! It worked.  So, I confidently (and erroneously) added a small block of text in the book chapter telling you that problems with the Routing Service and BasicHttp can be avoided by doing the protocol mapping and using WsHttp in the Routing Service.  I was wrong.

    Once the book went to press, I had some time to rebuild a similar solution from scratch using the BasicHttpBinding.  Naturally, it worked perfectly fine.  So, I went line by line through both solutions and discovered that the Routing Service in my book demo had the following line in the web.config file:

    <serviceHostingEnvironment aspNetCompatibilityEnabled="true" multipleSiteBindingsEnabled="true" />
    

    If I removed the aspNetCompatibilityEnabled attribute, my solution worked fine.  You can read more about this particular setting here.  What is interesting is that if I purposely ADD this element to the configuration files of the Workflow Services, I don’t get any errors.  It only seems to cause grief for the Routing Service.  I’m not sure how it got into my configuration file in the first place, but I’m reviewing security footage to see if the dog is to blame.  Still not sure why this worked with the beta of Server AppFabric though.

    So, you’d never hit the above error if you used the WsHttpBinding in your Workflow Services and upstream Routing Service, but if you do choose to use the BasicHttpBinding for your Routing Service, for all that is holy, please remove that configuration setting.
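
    Put differently, a Routing Service web.config that works with the BasicHttpBinding keeps the multiple-site-bindings behavior but drops the compatibility attribute.  A sketch of the relevant element:

    ```xml
    <!-- Works with a BasicHttpBinding Routing Service: keep multiple site
         bindings, omit aspNetCompatibilityEnabled (it defaults to false). -->
    <serviceHostingEnvironment multipleSiteBindingsEnabled="true" />
    ```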

  • Book’s Sample Chapter, Articles and Press Release

    The book is now widely available and our publisher is starting up the promotion machine.  At the bottom of this post is the publisher’s press release.  Also, we now have one sample chapter online (Mike Sexton’s Debatching Bulk Data) as well as two articles representing some of the material from my Content Based Routing chapter (Part 1 – Content Based Routing on the Microsoft Platform, Part II – Building the Content Based Routing Solution on the Microsoft Platform).  This hopefully provides a good sneak peek into the book’s style.

    ## PRESS RELEASE ##

    Solve business problems on the Microsoft application platform using Packt’s new book

     Applied Architecture Patterns on the Microsoft Platform is a new book from Packt that offers an architectural methodology for choosing Microsoft application platform technologies. Written by a team of specialists in the Microsoft space, this book examines new technologies such as Windows Server AppFabric, StreamInsight, and Windows Azure Platform, and their application in real-world solutions.

     Filled with live examples on how to use the latest Microsoft technologies, this book guides developers through thirteen architectural patterns utilizing code samples for a wide variety of technologies including Windows Server AppFabric, Windows Azure Platform AppFabric, SQL Server (including Integration Services, Service Broker, and StreamInsight), BizTalk Server, Windows Communication Foundation (WCF), and Windows Workflow Foundation (WF).

     This book is broken down into four sections. Part 1 starts with getting readers up to speed with various Microsoft technologies. Part 2 concentrates on messaging patterns and the inclusion of use cases highlighting content-based routing. Part 3 digs into bulk data processing and multi-master synchronization. Finally, the last part covers performance-related patterns including low latency, failover to the cloud, and reference data caching.

     Developers can learn about the core components of BizTalk Server 2010, with an emphasis on BizTalk Server versus Windows Workflow and BizTalk Server versus SQL Server. They will not only be in a position to develop their first Windows Azure Platform AppFabric, and SQL Azure applications but will also learn to master data management and data governance of SQL Server Integration Services, Microsoft Sync Framework, and SQL Server Service Broker.

     Architects, developers, and managers wanting to get up to speed on selecting the most appropriate platform for a particular problem will find this book to be a useful and beneficial read. This book is out now and is available from Packt. For more information, please visit the site.

    [Cross posted on Book’s dedicated website]

  • RSS, RSS Readers and Finding Information

    Last week my long-suffering blog reader, Bloglines, pulled the plug.  I’ve since moved over to Google Reader and like it more than the last time I tried it.  Coinciding with Bloglines’ death, a few blog posts cropped up talking about the state of RSS (readers) and the evolved information gathering habits of the consumer.  I can’t say I totally get this new perspective that information should come to me, and active subscriptions are a thing of the past.

    I read Don Dodge’s post last week where he first repeated Robert Scoble’s statement that “if the information is important, it will find me” and then went on to say that he doesn’t really use RSS readers anymore and relies on real-time channels and content aggregators.  I don’t see how I’d be satisfied consuming my information only through the recommendations of others.  Unless I was (a) on Twitter 25 hours a day or (b) expected every thoughtful technical article to get snapped up by an aggregator, I don’t see how I could replace my own RSS reader.  In an RSS reader, I subscribe to the people who write things that interest me.  Why would I want to rely on others telling me that so-and-so just wrote something profound?

    Today Dave Winer wrote an overall good piece on rebooting RSS where he mentions:

    I keep saying the same thing over and over, the Google Reader approach is wrong, it isn’t giving you what’s new — and that’s all that matters in news.

    Again, I just don’t see it.  Assuming that my RSS reader doesn’t ONLY subscribe to traditional news sources, I do want “unread counts” and the backlog of things to peruse.  Sure, I don’t want or need a 12-day backlog of sports news from ESPN when I return from vacation, but when I have an RSS subscription to some of my favorite bloggers (e.g. Lori MacVittie of F5), I expect to queue up the interesting articles that aren’t time-sensitive “news”, but rather smart opinion pieces.  I don’t use an RSS reader for traditional “news” as much as I use it to actively listen in on the long-form thoughts of insightful people.  I might be strange in that my RSS reader isn’t for news as much as for following individual bloggers, where this increasing obsession with information speed is less relevant.

    So, maybe I’m clinging to old information consumption models, but I like RSS readers and not relying on browser bookmarks or the whims of my Twitter “friends” to identify smart content.  I notice that my blog gets a high level of traffic from syndicated readers (not site visits), so many of you all seem to be using RSS readers as well.

    What say you?  Is traditional RSS consumption dead?  Do you instead use a mix of bookmarks, aggregators and social-sharing to find new information?

    [Update: Great post from GigaOm that came in after mine and makes the same points, and a few new ones.  Recommended reading.]

  • And … The New Book is Released

    Nearly 16 months after a book idea was born, the journey is now complete.  Today, you can find our book, Applied Architecture Patterns on the Microsoft Platform, in stock at Amazon.com and for purchase and download at the Packt Publishing site.

    I am currently in Stockholm along with co-authors Stephen Thomas and Ewan Fairweather delivering a 2 day workshop for the BizTalk User Group Sweden.  We’re providing overviews of the core Microsoft application platform technologies and then excerpting the book to show how we analyzed a particular use case, chose a technology and then implemented it.  It’s our first chance to see if this book was a crazy idea, or actually useful.  So far, the reaction has been positive.  Of course, the Swedes are such a nice bunch that they may just be humoring me.

    I have absolutely no idea how this book will be received by you all.  I hope you find it to be a unique tool for evaluating architecture and building solutions on Microsoft technology.  If you DON’T like it, then I’ll blame this book idea on Ewan.

  • Interview Series: Four Questions With … Mark Simms

    Happy September and welcome to the 23rd interview with a thought leader in the “connected technology” space.  This month I grabbed Mark Simms who is a member of Microsoft’s AppFabric Customer Advisory team, blogger, author and willing recipient of my random emails.

    Mark is an expert on Microsoft StreamInsight and has a lot of practical customer experience with the product.  Let’s see what he has to say.

    Q: While event-driven architecture (EDA) and complex event processing (CEP) are hardly new concepts, there does seem to be momentum in these areas.  While typically a model for financial services, EDA and CEP have gained a following in other arenas as well.  To what might you attribute this increased attention in event processing and which other industries do you see taking advantage of this paradigm?

    A: I tend to think about technology in terms of tipping points, driven by need. The financial sector, driven by the flood of market data, risks and trades, was the first to hit the challenge of needing timely analytics (and by need, we mean worth the money to get), spawning the development of a number of complex event processing engines. As with all specialized engines, they do an amazing job within their design sphere, but run into limitations when you try to take them outside of their comfort zone. At the same time, technology drivers such as (truly) distributed computing, scale-out architectures and “managed by somebody” elastic computing fabrics (ok, ok, I’ll call it the “Cloud”) have led to an environment wherein the volume of data being created is staggering – but the volume of information that can be processed (and stored, etc.) hasn’t kept pace.

    While I spend most of my time lately working in two sectors (process control – oil & gas, smart grids, utilities – and web analytics), the incoming freight train of cloud computing is going to bring the challenge of correlating nuggets of information spread across both space and time into some semblance of coherence.  In essence, finding the proverbial needle in the stack of needles tumbling down an escalator is coming soon to a project near you.

    Q: It’s one thing to bake the publication and consumption of events directly into a new system.  But what are some strategies and patterns for event-enabling existing packaged or custom applications?

    A: This depends both on the type of events that are of interest, and the overall architecture of the system.  Message-based architectures leveraging a rich subscription infrastructure are an ideal candidate for ease of event-enabling.  CEP engines can attach to key endpoints and observe messages and metadata, inferring events, patterns, etc.  For more monolithic systems there are still a range of options.  Since very little of interest happens on a single machine (other than StarCraft 2’s single player campaign), there’s almost always a network interface that can be tapped into.  As an example on our platform, one might leverage WCF interceptors to extract events from the metadata of a given service call and transfer the event to a centralized StreamInsight instance for processing.  Another approach that can be leveraged with most applications on the Microsoft platform is to extract messages from ETW logs and infer events for processing – between StreamInsight’s ability to handle real-time and historical data, this opens up some very compelling approaches to optimization, performance tuning, etc, for Windows applications.
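    To make the “observe a feed and infer events” idea concrete, here is a minimal, language-neutral sketch (Python, with a made-up log line format standing in for a real ETW or WCF trace feed — the format and field names are illustrative, not any actual Microsoft schema):

    ```python
    import re
    from datetime import datetime

    # Hypothetical log format, standing in for an ETW or WCF trace feed:
    #   2010-09-01 12:00:03 SERVICE_CALL OrderService.Submit 187ms
    LOG_PATTERN = re.compile(
        r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
        r"(?P<kind>\w+) (?P<operation>\S+) (?P<latency>\d+)ms"
    )

    def infer_events(lines):
        """Turn an observable feed of log lines into typed events."""
        for line in lines:
            match = LOG_PATTERN.match(line)
            if match:
                yield {
                    "timestamp": datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S"),
                    "kind": match.group("kind"),
                    "operation": match.group("operation"),
                    "latency_ms": int(match.group("latency")),
                }

    # Example: flag slow service calls -- the kind of simple pattern a
    # CEP engine would then match over temporal windows.
    feed = [
        "2010-09-01 12:00:03 SERVICE_CALL OrderService.Submit 187ms",
        "2010-09-01 12:00:04 SERVICE_CALL OrderService.Submit 2450ms",
    ]
    slow = [e for e in infer_events(feed) if e["latency_ms"] > 1000]
    ```

    A real deployment would replace the regex with the actual trace provider and push the inferred events into the CEP engine rather than a list.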

    Ultimately, it comes down to finding some observable feed of data from the existing system and converting that feed into some usable stream of events.  If the data simply doesn’t exist in an accessible form, alas, StreamInsight does not ship with magic event pixie dust.

    Q: Microsoft StreamInsight leverages a few foundational Microsoft technologies like .NET and LINQ.  What are other parts of the Microsoft stack (applications or platforms) that you see complementing StreamInsight, and how?

    A: StreamInsight is about taking in a stream of data, and extracting relevant information from that data by way of pattern matching, temporal windows, exception detection and the like.  This implies two things – data comes from somewhere, and information goes somewhere else.  This opens up a world wherein pretty much every technology under the fluorescent lamps is a candidate for complementing StreamInsight.  Rather than get into a meandering and potentially dreadfully boring bulleted list of doom, here are some (but not the only :)) top-of-mind technologies I think about:

    • SQL Server.  I’ve been a SQL Server guy for the better part of a decade now (after a somewhat interminable sojourn in the land of Oracle and MySQL), and for pretty much every project I’m involved with that’s where some portion of the data lives.  Either as the repository for reference data, destination for filtered and aggregate results, or the warehouse of historical data to mine for temporal patterns (think ETL into StreamInsight), the rest of the SQL Server suite of technologies is never far away.  In a somewhat ironic sense, as I write up my answers, I’m working on a SQL output adapter in the background leveraging SQL Service Broker for handling rate conversion and bursty data.
    • AppFabric Cache. Filling a similar complementary role in terms of a data repository as SQL Server (in a less transactional & durable sense), I look to AppFabric Cache to provide a distributed store for reference data, and a “holding pond” of sorts to handle architectural patterns such as holding on to 30 minutes worth of aggregated results to “feed” newly connecting clients.
    • SharePoint and Silverlight.  Ultimately, every bit of the technology is at some point trying to improve the lot of its users – the fingers and eyeballs factor.  Great alignment with SharePoint, combined with Silverlight for delivering rich client experiences (a necessity for visualizing fast-moving data – the vast majority of all visualization tools and frameworks assume that the data is relatively stationary), will be a crucial element in putting a face on the value that StreamInsight delivers.
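    The “holding pond” pattern mentioned in the cache bullet above can be sketched as a time-bounded buffer of results: recent aggregates are retained so a newly connecting client can be back-filled before going live.  This is a minimal illustration of the pattern only (in practice AppFabric Cache, not a local deque, would play the role of the store; all names here are invented):

    ```python
    from collections import deque
    from datetime import datetime, timedelta

    class HoldingPond:
        """Keep a rolling window of aggregated results so that newly
        connecting clients can be fed recent history before going live."""

        def __init__(self, window=timedelta(minutes=30)):
            self.window = window
            self._results = deque()  # (timestamp, result) pairs, oldest first

        def publish(self, timestamp, result):
            self._results.append((timestamp, result))
            self._evict(timestamp)

        def _evict(self, now):
            # Drop anything older than the retention window.
            while self._results and now - self._results[0][0] > self.window:
                self._results.popleft()

        def feed_new_client(self, now):
            """Return the retained results to back-fill a new subscriber."""
            self._evict(now)
            return [result for _, result in self._results]

    pond = HoldingPond()
    t0 = datetime(2010, 9, 1, 12, 0)
    pond.publish(t0, {"avg_load": 41.2})
    pond.publish(t0 + timedelta(minutes=45), {"avg_load": 43.7})
    # The first result is now 45 minutes old, outside the 30-minute pond.
    recent = pond.feed_new_client(t0 + timedelta(minutes=45))
    ```

    The design choice worth noting is that eviction happens on both write and read, so the pond never hands stale data to a client even if publishing pauses.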

    Q [stupid question]: They say you can’t teach old dogs new tricks.  I think that in some cases that’s a good thing.  I recently saw a television commercial for shaving cream and noticed that the face-actor shaved slightly differently than I do.  I wondered if I’ve been doing it wrong for 20 years and tried out the new way.  After stopping the bleeding and regaining consciousness, I decided there was absolutely no reason to change my shaving strategy.  Give us an example or two of things that you’re too old or too indifferent to change.

    A: One of the interesting things about being stuck in a rut is that it’s often a very comfortable rut.  If I wasn’t on the road, I’d ask my wife who would no doubt have a (completely accurate) laundry list of these sorts of habits. 

    One of the best aspects of my job on the AFCAT team is our relentless inquisitive drive to charge out into unknown technical territory.  I’m never happier than when I’m learning something new, whether it be figuring out how to apply a new technology or trying to master a new recipe or style of cuisine.  Coupled with a recent international relocation that broke a few of my more self-obvious long-standing habits (Tim Hortons coffee, ketchup chips, a 10-year D&D campaign), this is probably the hardest question to answer.

    With the aforementioned lack of a neutral opinion to fall back on, I’m going to have to pull a +1 on your shaving example – I’ve been using the same shaving cream for almost two decades now, and the last time I tried switching up, I reconfirmed that I am indeed rather violently allergic to every single other shaving balm on the planet 😉

    Thanks Mark.  Keep an eye on his blog and the AppFabric CAT team blog for more in-depth details on the Microsoft platform technologies.


  • Do you know the Microsoft Customer Advisory Teams? You should.

    For those who live and work with Microsoft application platform technologies, the Microsoft Customer Advisory Teams (CAT) are a great source of real-world info about products and technology.  These are the small, expert-level teams whose sole job is to make sure customers are successful with Microsoft technology.  Last month I had the pleasure of presenting to both the SQL CAT and Server AppFabric CAT teams about blogging and best practices and thought I’d throw a quick plug out for these groups here.

    First off, the SQL CAT team (dedicated website here) has a regular blog of best practices, and links to the best whitepapers for SQL admins, architects, and developers.  I’m not remotely a great SQL Server guy, but I love following this team’s work and picking up tidbits that make me slightly more dangerous at work.  If you actually need to engage these guys on a project, contact your Microsoft rep.

    As for the Windows Server AppFabric CAT team, they also have a team blog with great expert content.  This team, which contains the artists-formerly-known-as-BizTalk-Rangers, provides deep expertise on BizTalk Server, Windows Server AppFabric, WCF, WF, AppFabric Caching and StreamInsight.  You’ll find a great bunch of architects on this team including Tim Wieman, Mark Simms, Rama Ramani, Paolo Salvatori and more, all led by Suren Machiraju and the delightfully frantic Curt Peterson. They’ve recently produced posts about using BizTalk with the AppFabric Service Bus, material on the Entity Framework,  and a ridiculously big and meaty post from Mark Simms about building StreamInsight apps.

    I highly recommend subscribing to both these team blogs and following SQL CAT on twitter (@sqlcat).


  • How Intelligent is BizTalk 2010’s Intelligent Mapper?

    One of the interesting new features of the BizTalk Server 2010 Mapper (and corresponding Windows Workflow shape) is the “suggestive matching” which helps the XSLT map author figure out which source (or destination) nodes are most likely related.  The MSDN page for suggestive matching has some background material on the feature.  I thought I’d run a couple quick tests to see just how smart this new mapper is.

    Before the suggestive match feature was introduced, we could do bulk mapping through the “link by” feature.  With that feature, you could connect two parent nodes and choose to map the child nodes based on the structure (order), exact names or through the mass copy function.  However, this is a fairly coarse way to map that doesn’t take into account the real semantic differences in a map.  It also doesn’t help you find any better destination candidates that may be in a different section of the schema.

    2010.08.15mapper01

    Through Suggestive Matching, I should have an easier time finding matching nodes with similar, but non-exact naming.  However, per the point of this post, I wasn’t sure if the Mapper just did a simple comparison or anything further.

    Simple Name Matching

    In this scenario, we are simply checking to see if the Mapper looks for the same textual value from the source in the destination.  In my source schema I have a field called “ID.”  In my destination schema I have a field called “ItemID.”  As you’d expect, the suggestive match points this relationship out.

    2010.08.15mapper02

    In that case, the name of the source node is a substring of the destination.  What if the destination node is a substring of the source?  To demonstrate that, I have a source field named “PhoneNumber” and the destination node is named “Phone.”  Sure enough, a match is still made.

    2010.08.15mapper03

    Also, it doesn’t matter where in the node name a matching value is found.  If I have a “Code” field in the source tree and both a “ZipCode” and “OrderCodeIdentifier” in the destination, both nodes are considered possible matches.  The word “code” in the latter field, although between other text, is still identified as a match.  Not revolutionary of course, but nice.

    2010.08.15mapper04
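    Taken together, these examples behave like a case-insensitive, bidirectional substring test.  Here is a minimal sketch of that rule — my guess at the observed behavior, not the Mapper’s actual implementation:

    ```python
    def names_suggest_match(source, destination):
        """Return True if either node name contains the other, ignoring
        case -- the rule the simple-matching examples above suggest."""
        s, d = source.lower(), destination.lower()
        return s in d or d in s

    # The three cases demonstrated above:
    pairs = [
        ("ID", "ItemID"),                # source is a substring of destination
        ("PhoneNumber", "Phone"),        # destination is a substring of source
        ("Code", "OrderCodeIdentifier"), # match found mid-name
    ]
    ```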

    Complex Name Matching

    In this scenario, I was looking to see if the Mapper detected any differences based on more than just the substrings.  That is, could it figure out that “FirstName” and “FName” are the same?  Unfortunately, the “FirstName” field below resulted in a match to all name fields in the destination.

    2010.08.15mapper05

    The highlighted link is considered the best match, and I noticed that as I added more characters to the “FName” node, I got a different “best match.”

    2010.08.15mapper06

    You see that “FirName” is considered a close match to “FirstName.”  Has anyone else found any cases where similar but inexact wording is still marked as a match?
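    The fact that the “best match” shifts as characters are added suggests some similarity score beyond a boolean substring test.  One way to reproduce the observed ranking — purely illustrative, since the Mapper’s actual scoring isn’t documented here — is a standard sequence-similarity ratio:

    ```python
    from difflib import SequenceMatcher

    def similarity(source, destination):
        """Score two node names between 0.0 and 1.0 by shared character runs."""
        return SequenceMatcher(None, source.lower(), destination.lower()).ratio()

    # "FirName" shares both "Fir" and "Name" with "FirstName", so it
    # outranks the other name fields in the destination schema.
    candidates = ["FirstName", "MiddleName", "LastName"]
    best = max(candidates, key=lambda name: similarity("FirName", name))
    ```

    Under this kind of metric, every extra shared character nudges the score up, which would explain why lengthening “FName” toward “FirName” changes which destination node is highlighted as the best match.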

    Node Positioning

    I was hoping that via intelligent mapping an address with a similar structure could be matched across.  That is, if identically named nodes appeared before and after a given node, it might guess that the nodes in between matched.  For instance, if I have “City” between “Street” and “State” in the source and “Town” between “Street” and “State” in the destination, maybe it would detect a pattern.  But alas, that is apparently a dream.

    2010.08.15mapper07

    Summary

    It looks like our new intelligent mapper, with the help of Suggestive Match, does a decent job of textual matching between a source and destination schema.  I have yet to see any examples of advanced conditions outside of that.  Still, if all we get is textual matching, that still provides developers a bit of help when traversing monstrous schemas with multiple destination candidates for a source node.

    If you have any additional experiences with this, I’d love to hear about them.
