Author: Richard Seroter

  • Using SnapLogic to Link (Cloud) Apps

    I like being exposed to new technologies, so I reached out to the folks at SnapLogic and asked to take their platform for a spin. SnapLogic is part of this new class of “integration platform as a service” providers that take a modern approach to application/data integration. In this first blog post (of a few where I poke around the platform), I’ll give you a sense of what SnapLogic is, how it works, and show a simple solution.

    What Is It?

    With more and more SaaS applications in use, companies need to rethink how they integrate their application portfolios. SnapLogic offers a scalable, AWS-hosted platform that streams data between endpoints that exist in the cloud or on-premises. Integration jobs can be invoked programmatically, via the web interface, or on a schedule. The platform supports more than traditional ETL operations: I can use SnapLogic to do BOTH batch and real-time integration. It runs as a multi-tenant cloud service and has tools for building, managing, and monitoring integration flows.

    The platform offers a modern, mobile-friendly interface, and offers many of the capabilities you expect from a traditional integration stack: audit trails, bulk data support, guaranteed delivery, and security controls. However, it differs from classic stacks in that it offers geo-redundancy, self-updating software, support for hierarchical/relational data, and elastic scale. That’s pretty compelling stuff if you’re faced with trying to integrate with new cloud apps from legacy integration tools.

    How Does It Work?

    The agent that runs SnapLogic workflows is called a Snaplex. While the SnapLogic cloud itself is multi-tenant, each customer gets their own elastic Snaplex. What if you have data behind the corporate firewall that a cloud-hosted Snaplex can’t access? Fortunately, SnapLogic lets you deploy an on-premises Snaplex that can talk to local systems. This helps you design integration solutions that securely span environments.

    SnapLogic workflows are called pipelines and the tasks within a pipeline are called snaps. With more than 160 snaps available (and an SDK to add more), integration developers can put together a pipeline pretty quickly. Pipelines are built in a web-based design surface where snaps are connected to form simple or complex workflows.

    2014.03.23snaplogic03

    It’s easy to drag snaps to the pipeline designer, set properties, and connect snaps together.

    2014.03.23snaplogic04

     

    The platform offers a dashboard view where you can see the health of your environment, pipeline run history, and details about what’s running in the Snaplex.

    2014.03.23snaplogic01

    The “manager” screens let you do things like create users, add groups, browse pipelines, and more.

    2014.03.23snaplogic02

     

    Show Me An Example!

    Ok, let’s try something out. In this basic scenario, I’m going to do a file transfer/translation process. I want to take in a JSON file, and output a CSV file. The source JSON contains some sample device reads:

    2014.03.23snaplogic05

    I sketched out a flow that reads a JSON file that I uploaded to the SnapLogic file system, parses it, and then splits it into individual documents for processing. There are lots of nice usability touches, such as interpreting my JSON format and helping me choose a place to split the array up.

    2014.03.23snaplogic06

    Then I used a CSV Formatter snap to export each record to a CSV file. Finally, I wrote the results to a file. I could have also used the File Writer snap to publish the results to Amazon S3, FTP, S/FTP, FTP/S, HTTP, or HDFS.

    2014.03.23snaplogic07

    It’s easy to run a pipeline within this interface. That’s the most manual way of kicking off a pipeline, but it’s handy for debugging or irregular execution intervals.

    2014.03.23snaplogic08

    The result? A nicely formatted CSV file that some existing system can easily consume.

    2014.03.23snaplogic09

    Do you want to run this on a schedule? Imagine pulling data from a source every night and updating a related system. It’s pretty easy with SnapLogic, as all you have to do is define a task and point it at the pipeline to execute.

    2014.03.23snaplogic10

    Notice in that image above that you can also set the “Run With” value to “Triggered” which gives you a URL for external invocation. If I pulled the last snap off my pipeline, then the CSV results would get returned to the HTTP caller. If I pulled the first snap off my pipeline, then I could send an HTTP POST request and send a JSON message into the pipeline. Pretty cool!
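
    To make that triggered option concrete, here’s a minimal sketch of invoking such a URL from C#. The URL and the JSON field names are placeholders; use the actual invocation URL that SnapLogic generates for your task, along with any token it requires.

    using System;
    using System.Net.Http;
    using System.Text;

    class TriggerPipeline
    {
        static void Main()
        {
            //placeholder: copy the real invocation URL that SnapLogic generates for the triggered task
            string taskUrl = "https://example.snaplogic.com/your-triggered-task-url";

            //a sample device read in the JSON shape the pipeline expects (field names are made up)
            string json = "{ \"deviceId\": \"device-001\", \"reading\": 72.4 }";

            using (var client = new HttpClient())
            {
                var response = client.PostAsync(taskUrl, new StringContent(json, Encoding.UTF8, "application/json")).Result;

                //with the File Writer snap removed, the pipeline's CSV output comes back in the response body
                Console.WriteLine(response.Content.ReadAsStringAsync().Result);
            }
        }
    }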

    Summary

    It’s good to be aware of what technologies are out there, and SnapLogic is definitely one to keep an eye on. It provides a very cloud-native integration suite that can satisfy both ETL and ESB scenarios in an easy-to-use way. I’ll do another post or two that shows how to connect cloud endpoints together, so watch out for that.

    What do you think? Have you used SnapLogic before or think that this sort of integration platform is the future?

  • Windows Azure BizTalk Services Just Became 19% More Interesting

    Buried in the laundry list of new Windows Azure features outlined by Scott Guthrie was a mention of some pretty intriguing updates to Windows Azure BizTalk Services (WABS). Specifically, this cloud-based brokered messaging service can now accept messages from Windows Azure Service Bus Topics and Queues (there were some other updates to the service as well, and you can read about them on the BizTalk team blog). Why does this make the service more interesting to me? Because it makes this a more useful service for cloud integration scenarios. Instead of only offering REST or FTP input channels, WABS now lets you build complex scenarios that use the powerful pub-sub capabilities of Windows Azure Service Bus brokered messaging. This blog post will take a brief look at how to use these new features, and why they matter.

    First off, there’s a new set of developer components to use. Download the installer to get the new capabilities.

    2014.02.21biztalk01

    I’m digging this new style of Windows installer that lets you know which components need upgrading.

    2014.02.21biztalk02

    After finishing the upgrade, I fired up Visual Studio 2012 (as I didn’t see a template added for Visual Studio 2013 usage), and created a new WABS project. Sure enough, there are two new “sources” in the Toolbox.

    2014.02.21biztalk05

    What are the properties of each? When I added the Service Bus Queue Source to the bridge configuration, I saw that you add a connection string and queue name.

    2014.02.21biztalk06

    For Service Bus Topics, you use a Service Bus Subscription Source and specify the connection string and subscription name.

    2014.02.21biztalk07

    What was missing in the first release of WABS was the ability to do durable messaging as an input channel. In addition, the WABS bridge engine still doesn’t support a broadcast scenario, so if you want to send the same message to 10 different endpoints, you can’t. One solution was to use the Topic destination, but what if you wanted to add endpoint-specific transformations or lookup logic first? You’re out of luck. NOW, you could build a solution where you take in messages from a combination of queues, REST endpoints, and topic subscriptions, and route them accordingly. Need to send a message to 5 recipients? Now you send it to a topic, and then have bridges that respond to each topic subscription with endpoint-specific transformation and logic. MUCH better. You just have more options to build reliable integrations between endpoints now.
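
    To picture that fan-out, here’s a rough sketch (using the WindowsAzure.ServiceBus NuGet package of that era) of creating a topic with a few endpoint-specific subscriptions that individual bridges could then read from. The connection string and entity names are placeholders.

    using Microsoft.ServiceBus;
    using Microsoft.ServiceBus.Messaging;

    class FanOutSetup
    {
        static void Main()
        {
            //placeholder connection string copied from the Windows Azure Portal
            string connectionString = "Endpoint=sb://yournamespace.servicebus.windows.net/;SharedSecretIssuer=owner;SharedSecretValue=yourkey";

            var namespaceManager = NamespaceManager.CreateFromConnectionString(connectionString);

            //one topic takes in the message...
            if (!namespaceManager.TopicExists("orders"))
            {
                namespaceManager.CreateTopic("orders");
            }

            //...and each recipient gets its own subscription that a dedicated bridge can pull from
            foreach (string recipient in new[] { "billing", "fulfillment", "marketing" })
            {
                if (!namespaceManager.SubscriptionExists("orders", recipient))
                {
                    namespaceManager.CreateSubscription("orders", recipient);
                }
            }

            //publishing once to the topic reaches every subscription
            var topicClient = TopicClient.CreateFromConnectionString(connectionString, "orders");
            topicClient.Send(new BrokeredMessage("<Order><Id>1</Id></Order>"));
        }
    }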

    Let’s deploy an example. I used the Server Explorer in Visual Studio to create a new queue and a topic with a single subscription. I also added another queue (“marketing”) that will receive all the inbound messages.

    2014.02.21biztalk08

    I then built a bridge configuration that took in messages from multiple sources (queue and topic) and routed to a single queue.

    2014.02.21biztalk09

    Configuring the sources isn’t as easy as it should be. I still have to copy in a connection string (vs. look it up from somewhere), but it’s not too painful. The Windows Azure Portal does a nice job of showing you the connection string value you need.

    2014.02.21biztalk10

    After deploying the bridge successfully, I opened up the Service Bus Explorer and sent a message to the input queue.

    2014.02.21biztalk11

    I then sent another message to the input topic.

    2014.02.21biztalk12

    After a second or two, I queried the “marketing” queue which should have both messages routed through the WABS bridge. Hey, there it is! Both messages were instantly routed to the destination queue.

    2014.02.21biztalk13
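
    If you’d rather confirm the routing from code than from Service Bus Explorer, a quick sketch with the QueueClient class could look like this (the connection string is a placeholder):

    using System;
    using Microsoft.ServiceBus.Messaging;

    class CheckMarketingQueue
    {
        static void Main()
        {
            //placeholder connection string copied from the Windows Azure Portal
            string connectionString = "Endpoint=sb://yournamespace.servicebus.windows.net/;SharedSecretIssuer=owner;SharedSecretValue=yourkey";

            var client = QueueClient.CreateFromConnectionString(connectionString, "marketing");

            //drain whatever the WABS bridge routed into the queue
            BrokeredMessage message;
            while ((message = client.Receive(TimeSpan.FromSeconds(5))) != null)
            {
                //GetBody<string>() assumes a string-serialized body; a raw payload may need GetBody<Stream>() instead
                Console.WriteLine(message.GetBody<string>());
                message.Complete();
            }
        }
    }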

    WABS is a very new, but interesting tool in the integration-as-a-service space. This February update makes it more likely that I’d recommend it for legitimate cloud integration scenarios.

  • Richard’s Top 10 Rules for Meeting Organizers

    Meetings are a necessary evil. Sometimes, you simply have to get a bunch of people together at one time in order to resolve a problem or share information. While Tier 3 was a very collaborative environment where there were very few meetings and information flowed freely, our new parent company (CenturyLink) has 50,000 people spread around the globe who need information that I have. So in any given week, I now have 40+ meetings on my calendar. Good times. Fortunately my new colleagues are a great bunch, but they occasionally fall into the same bad meeting habits that I’ve experienced throughout my career.

    Very few people LIKE meetings, so how can organizers make them less painful? Here are my top 10 pieces of advice (in no particular order):

    1. Have a purpose to the meeting, and state it up front. It seems obvious, but I’ve been in plenty of meetings in my career where it seemed like the organizer just wanted to get people together and talk generally about things. I hate those. Stop doing that. Instead, have a clear, succinct purpose to what the meeting is about, and reiterate it when the meeting starts.
    2. Provide an agenda. Failing to send an agenda to meeting participants should result in meeting privileges being taken away from an organizer. What are we trying to accomplish? What are the expected outcomes? I’ve learned to reject meeting invites with an empty description, and it feels GLORIOUS. Conversely, I quickly accept a meeting invite with a clear purpose and agenda.
    3. Give at least 1 day of lead time. I find it amusing when I get meeting invites for a meeting that’s already under way, or that’s starting momentarily. That assumes that (potential) participants are sitting around waiting for spontaneous meeting requests that are immediately the most important thing in the world. In reality, that is only the case 1% of the time. Try and schedule meetings as far ahead as possible so that people can make adjustments to their calendar as needed.
    4. Always have a decision-maker present. Is there anything rougher than sitting through a 1 hour meeting and then realizing that no one in the room has the authority to do anything? Ugh. Unless the meeting is purely for information sharing, there should ALWAYS be people in the room who can take action.
    5. Only invite people who will actively contribute. It’s rare that a 10+ person meeting yields value for all those participants. Don’t invite the whole world and hope that a few key folks accept. Don’t invite more than one person who can answer the question. The only people who should be in the meeting are the ones who make decisions or those who have information that contributes to a decision.
    6. Do not cancel a meeting after it’s scheduled start time. Nothing’s more fun than sitting on hold waiting for a meeting to start, and then getting a cancellation notice. Plenty of people arrange parts of their days around certain meetings, so last-second cancellations are infuriating. If you’re a meeting organizer and you are pretty sure that a meeting will be moved/cancelled, send out a notice as soon as possible! Your participants will thank you.
    7. Start meetings on time. I’ll typically give an organizer 5-8 minutes to start a meeting. If I’m still waiting after that, then the meeting really isn’t that important.
    8. Schedule the least amount of time possible. It’s demoralizing to see 2-5 hour meetings on your calendar. Especially when they are for “working sessions” that may not need the whole time slot. We all know the rule that people fill up the time given, so as a general rule, provide as little time as possible. Be focused! When organizing any meeting, start with a 30 minute duration and ask yourself why it has to be longer than that.
    9. Prep the presenters with background material. I enjoy meetings where I hear “… and now Richard will walk through some important data points” and I’m caught by surprise. This is where an agenda can help, but make sure that presenters in your meeting know EXACTLY what you expect of them, who is attending, and any context about why the meeting is occurring.
    10. End every meeting with clear action items. Nothing deflates a good meeting like a vague finish. Summarize what needs to be done, and assign responsibility. Even in “information sharing” meetings you can have action items like “read more about it here!”

    Any other tips you have to help organizers conduct efficient, impactful meetings?

  • Upcoming Speaking Engagements in London and Seattle

    In a few weeks, I’ll kick off a run of conference presentations that I’m really looking forward to.

    First, I’ll be in London for the BizTalk Summit 2014 event put on by the BizTalk360 team. In my talk “When To Use What: A Look at Choosing Integration Technology”, I take a fresh look at the topic of my book from a few years ago. I’ll walk through each integration-related technology from Microsoft and use a “buy, hold, or sell” rating to indicate my opinion on its suitability for a project today. Then I’ll discuss a decision framework for choosing among this wide variety of technologies, before closing with an example solution. The speaker list for this event is fantastic, and apparently there are only a handful of tickets remaining.

    The month after this, I’ll be in Seattle speaking at the ALM Forum. This well-respected event for agile software practitioners is held annually and I’m very excited to be part of the program. I am clearly the least distinguished speaker in the group, and I’m totally ok with that. I’m speaking in the Practices of DevOps track and my topic is “How Any Organization Can Transition to DevOps – 10 Practical Strategies Gleaned from a Cloud Startup.” Here I’ll drill into a practical set of tips I learned by witnessing (and participating in) a DevOps transformation at Tier 3 (now CenturyLink Cloud). I’m amped for this event as it’s fun to do case studies and share advice that can help others.

    If you’re able to attend either of those events, look me up!

  • Using the New Salesforce Toolkit for .NET

    Wade Wegner, a friend of the blog and an evangelist for Salesforce.com, just built a new .NET Toolkit for Salesforce developers. This Toolkit is open source and available on GitHub. Basically, it makes it much simpler to securely interact with the Salesforce.com REST API from .NET code. It takes care of “just working” on multiple Windows platforms (Win 7/8, WinPhone), async processing, and wrapping up all the authentication and HTTP stuff needed to call Salesforce.com endpoints. In this post, I’ll do a basic walkthrough of adding the Toolkit to a project and working with a Salesforce.com resource.

    After creating a new .NET project (Console project, in my case) in Visual Studio, all you need to do is reference the NuGet packages that Wade created. Specifically, look for the DeveloperForce.Force package which pulls in the “common” package (that has baseline stuff) as well as the JSON.NET package.

    2014.01.16force01

    First up, add a handful of using statements to reference the libraries we need to use the Toolkit, grab configuration values, and work with dynamic objects.

    using System.Configuration;
    using Salesforce.Common;
    using Salesforce.Force;
    using System.Dynamic;
    using System.Threading.Tasks; //needed for the async Task pattern shown below
    

    The Toolkit is written using the async and await model for .NET, so calling this library requires some familiarity with that pattern. To make life simple for this demo, define an operation like this that can be called from the Main entry point.

    static void Main(string[] args)
    {
        //block on the async method so the console app doesn't exit before the work completes
        Do().Wait();
    }
    
    static async Task Do()
    {
        ...
    }
    

    Let’s fill out the “Do” operation that uses the Toolkit. First, we need to capture our Force.com credentials. The Toolkit supports a handful of viable authentication flows. Let’s use the “username-password” flow. This means we need OAuth/API credentials from Salesforce.com. In the Salesforce.com Setup screens, go to Create, then Apps and create a new application. For a full walkthrough of getting credentials for REST calls, see my article on the DeveloperForce site.

    2014.01.16force02

    With the consumer key and consumer secret in hand, we can now authenticate using the Toolkit. In the code below, I yanked the credentials from the app.config accompanying the application.

    //get credential values
    string consumerkey = ConfigurationManager.AppSettings["consumerkey"];
    string consumersecret = ConfigurationManager.AppSettings["consumersecret"];
    string username = ConfigurationManager.AppSettings["username"];
    string password = ConfigurationManager.AppSettings["password"];
    
    //create auth client to retrieve token
    var auth = new AuthenticationClient();
    
    //get back URL and token
    await auth.UsernamePassword(consumerkey, consumersecret, username, password);
    

    When you call this, you’ll see that the AuthenticationClient now has populated properties for the instance URL and access token. Pull those values out, as we’re going to use them when interacting with the REST API.

    var instanceUrl = auth.InstanceUrl;
    var accessToken = auth.AccessToken;
    var apiVersion = auth.ApiVersion;
    

    Now we’re ready to query Salesforce.com with the Toolkit. In this first instance, create a class that represents the object we’re querying.

    public class Contact
    {
        public string Id { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
    }
    

    Let’s instantiate the ForceClient object and issue a query. Notice that we pass in a SQL-like syntax when querying the Salesforce.com system. Also, see that the Toolkit handles all the serialization for us!

    var client = new ForceClient(instanceUrl, accessToken, apiVersion);
    
    //Toolkit handles all serialization
    var contacts = await client.Query<Contact>("SELECT Id, LastName From Contact");
    
    //loop through returned contacts
    foreach (Contact c in contacts)
    {
          Console.WriteLine("Contact - " + c.LastName);
    }
    

    My Salesforce.com app has the following three contacts in the system …

    2014.01.16force03

    Calling the Toolkit using the code above results in this …

    2014.01.16force04

    Easy! But does the Toolkit support dynamic objects too? Let’s assume you’re super lazy and don’t want to create classes that represent the Salesforce.com objects. No problem! I can use late binding through the dynamic keyword and get back an object that has whatever fields I requested. See here that I added the “FirstName” to the query and am not passing in a known class type.

    var client = new ForceClient(instanceUrl, accessToken, apiVersion);
    
    var contacts = await client.Query<dynamic>("SELECT Id, FirstName, LastName FROM Contact");
    
    foreach (dynamic c in contacts)
    {
          Console.WriteLine("Contact - " + c.FirstName + " " + c.LastName);
    }
    

    What happens when you run this? You should have all the queried values available as properties.

    2014.01.16force05

    The Toolkit supports more than just “query” scenarios. It works great for create/update/delete as well. Like before, these operations work with strongly typed objects or dynamic ones. First, add the code below to create a contact using our known “contact” type.

    Contact c = new Contact() { FirstName = "Golden", LastName = "Tate" };
    
    string recordId = await client.Create("Contact", c);
    
    Console.WriteLine(recordId);
    

    That’s a really simple way to create Salesforce.com records. Want to see another way? You can use the dynamic ExpandoObject to build up an object on the fly and send it in here.

    dynamic c = new ExpandoObject();
    c.FirstName = "Marshawn";
    c.LastName = "Lynch";
    c.Title = "Chief Beast Mode";
    
    string recordId = await client.Create("Contact", c);
    
    Console.WriteLine(recordId);
    

    After running this, we can see this record in our Salesforce.com database.

    2014.01.16force06
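
    Although I’ve only shown Create here, the Toolkit follows the same pattern for updates and deletes. A sketch based on the create/query calls above might look like the snippet below; I’m assuming the Update/Delete signatures from that pattern, so check the GitHub repo for the exact shape.

    //update the record we just created (Update/Delete signatures assumed from the Create pattern above)
    dynamic changes = new ExpandoObject();
    changes.Title = "Running Back";
    
    var updated = await client.Update("Contact", recordId, changes);
    Console.WriteLine("Updated: " + updated);
    
    //delete the record when we're done with it
    var deleted = await client.Delete("Contact", recordId);
    Console.WriteLine("Deleted: " + deleted);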

    Summary

    This is super useful and a fantastic way to easily interact with Salesforce.com from .NET code. Wade’s looking for feedback and contributions as he builds this out further. Add issues if you encounter bugs, and issue a pull request if you want to add features like error handling or support for other operations.

  • Data Stream Processing with Amazon Kinesis and .NET Applications

    Amazon Kinesis is a new data stream processing service from AWS that makes it possible to ingest and read high volumes of data in real-time. That description may sound vaguely familiar to those who followed Microsoft’s attempts to put their CEP engine StreamInsight into the Windows Azure cloud as part of “Project Austin.” Two major differences between the two: Kinesis doesn’t have the stream query aspects of StreamInsight, and Amazon actually SHIPPED their product.

    Kinesis looks pretty cool, and I wanted to try out a scenario where I have (1) a Windows Azure Web Site that generates data, (2) Amazon Kinesis processing data, and (3) an application in the CenturyLink Cloud which is reading the data stream.

    2014.01.08kinesis05

    What is Amazon Kinesis?

    Kinesis provides a managed service that handles the intake, storage, and transportation of real-time streams of data. Each stream can handle nearly unlimited data volumes. Users set up shards which are the means for scaling up (and down) the capacity of the stream. All the data that comes into a Kinesis stream is replicated across AWS availability zones within a region. This provides a great high availability story. Additionally, multiple sources can write to a stream, and a stream can be read by multiple applications.

    Data is available in the stream for up to 24 hours, meaning that applications (readers) can pull shard records based on multiple schemes: given sequence number, oldest record, latest record. Kinesis uses DynamoDB to store application state (like checkpoints). You can interact with Kinesis via the provided REST API or via platform SDKs.
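
    Those three schemes map to shard iterator types in the API. Here’s a small illustrative sketch with the AWS SDK for .NET; the stream name matches the example later in this post, and the shard ID and sequence number are placeholders.

    using Amazon.Kinesis;
    using Amazon.Kinesis.Model;

    class IteratorExamples
    {
        static void Main()
        {
            string shardId = "shardId-000000000000"; //placeholder; use a value returned by DescribeStream

            //oldest records still retained in the shard (the option I use later in this post)
            var fromOldest = new GetShardIteratorRequest
            {
                StreamName = "OrderStream",
                ShardId = shardId,
                ShardIteratorType = ShardIteratorType.TRIM_HORIZON
            };

            //only records written after the iterator is created
            var fromLatest = new GetShardIteratorRequest
            {
                StreamName = "OrderStream",
                ShardId = shardId,
                ShardIteratorType = ShardIteratorType.LATEST
            };

            //start at a specific sequence number returned by an earlier put or get
            var fromSequence = new GetShardIteratorRequest
            {
                StreamName = "OrderStream",
                ShardId = shardId,
                ShardIteratorType = ShardIteratorType.AT_SEQUENCE_NUMBER,
                StartingSequenceNumber = "placeholder-sequence-number"
            };
        }
    }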

    What DOESN’T Kinesis do? It doesn’t have any sort of adapter model, so it’s up to the developer to build producers (writers) and applications (readers). There is a nice client library for Java that has a lot of built-in logic for application load balancing and such. But for the most part, this is still a developer-oriented solution for building big data processing solutions.

    Setting up Amazon Kinesis

    First off, I logged into the AWS console and located Kinesis in the navigation menu.

    2014.01.08kinesis01

    I’m then given the choice to create a new stream.

    2014.01.08kinesis02

    Next, I need to choose the initial number of shards for the stream. I can either put in the number myself, or use a calculator that helps me estimate how many shards I’ll need based on my data volume.

    2014.01.08kinesis03

    After a few seconds, my managed Kinesis stream is ready to use. For a given stream, I can see available shards, and some CloudWatch metrics related to capacity, latency, and requests.

    2014.01.08kinesis04

    I now have an environment for use!

    Creating a data producer

    Now I was ready to build an ASP.NET web site that publishes data to the Kinesis endpoint. The AWS SDK for .NET already includes Kinesis objects, so there’s no reason to make this more complicated than it has to be. My ASP.NET site has NuGet packages that reference JSON.NET (for JSON serialization), AWS SDK, jQuery, and Bootstrap.

    2014.01.08kinesis06

    The web application is fairly basic. It’s for ordering pizza from a global chain. Imagine sending order info to Kinesis and seeing real-time reactions to marketing campaigns, weather trends, and more. Kinesis isn’t a messaging engine per se, but it’s for collecting and analyzing data. Here, I’m collecting some simplistic data in a form.

    2014.01.08kinesis07

    When clicking the “order” button, I build up the request and send it to a particular Kinesis stream. First, I added the following “using” statements:

    using Newtonsoft.Json;
    using Amazon.Kinesis;
    using Amazon.Kinesis.Model;
    using System.IO;
    using System.Text;
    

    The button click event has the following (documented) code.  Notice a few things. My AWS credentials are stored in the web.config file, and I pass in an AmazonKinesisConfig to the client constructor. Why? I need to tell the client library which AWS region my Kinesis stream is in so that it can build the proper request URL. See that I added a few properties to the actual put request object. First, I set the stream name. Second, I added a partition key which is used to place the record in a given shard. It’s a way of putting “like” records in a particular shard.
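
    One thing the snippet doesn’t include is the Order class itself. A minimal sketch of what it might look like, with property names inferred from the constructor call in the handler below, is:

    //simple POCO that gets serialized to JSON before being put on the stream
    //(property names inferred from the handler code below)
    public class Order
    {
        public string Id { get; set; }
        public string Source { get; set; }
        public string StoreId { get; set; }
        public string PizzaId { get; set; }
        public string Timestamp { get; set; }
    }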

    protected void btnOrder_Click(object sender, EventArgs e)
        {
            //generate unique order id
            string orderId = System.Guid.NewGuid().ToString();
    
            //build up the CLR order object
            Order o = new Order() { Id = orderId, Source = "web", StoreId = storeid.Text, PizzaId = pizzaid.Text, Timestamp = DateTime.Now.ToString() };
    
            //convert to byte array in prep for adding to stream
            byte[] oByte = Encoding.UTF8.GetBytes(JsonConvert.SerializeObject(o));
    
            //create stream object to add to Kinesis request
            using (MemoryStream ms = new MemoryStream(oByte))
            {
                //create config that points to AWS region
                AmazonKinesisConfig config = new AmazonKinesisConfig();
                config.RegionEndpoint = Amazon.RegionEndpoint.USEast1;
    
                //create client that pulls creds from web.config and takes in Kinesis config
                AmazonKinesisClient client = new AmazonKinesisClient(config);
    
                //create put request
                PutRecordRequest requestRecord = new PutRecordRequest();
                //list name of Kinesis stream
                requestRecord.StreamName = "OrderStream";
                //give partition key that is used to place record in particular shard
                requestRecord.PartitionKey = "weborder";
                //add record as memorystream
                requestRecord.Data = ms;
    
                //PUT the record to Kinesis
                PutRecordResponse responseRecord = client.PutRecord(requestRecord);
    
                //show shard ID and sequence number to user
                lblShardId.Text = "Shard ID: " + responseRecord.ShardId;
                lblSequence.Text = "Sequence #:" + responseRecord.SequenceNumber;
            }
        }
    

    With the web application done, I published it to a Windows Azure Web Site. This is super easy to do with Visual Studio 2013, and within a few seconds my application was there.

    2014.01.08kinesis08

    Finally, I submitted a bunch of records to Kinesis by adding pizza orders. Notice the shard ID and sequence number that Kinesis returns from each PUT request.

    2014.01.08kinesis09

    Creating a Kinesis application (record consumer)

    To realistically read data from a Kinesis stream, there are three steps. First, you need to describe the stream in order to find out the shards. If I want a fleet of servers to run this application and read the stream, I’d need a way for each application to claim a shard to work on. The second step is to retrieve a “shard iterator” for a given shard. The iterator points to a place in the shard where I want to start reading data. Recall from above that I can start with the latest unread records, oldest records, or at a specific point in the shard. The third and final step is to get the records from a particular iterator. Part of the result set of this operation is a “next iterator” value. In my code, if I find another iterator value, I once again call the “get records” operation to pull any records from that iterator position.

    Here’s the total code block, documented for your benefit.

    private static void ReadFromKinesis()
    {
        //create config that points to Kinesis region
        AmazonKinesisConfig config = new AmazonKinesisConfig();
        config.RegionEndpoint = Amazon.RegionEndpoint.USEast1;
    
       //create new client object
       AmazonKinesisClient client = new AmazonKinesisClient(config);
    
       //Step #1 - describe stream to find out the shards it contains
       DescribeStreamRequest describeRequest = new DescribeStreamRequest();
       describeRequest.StreamName = "OrderStream";
    
       DescribeStreamResponse describeResponse = client.DescribeStream(describeRequest);
       List<Shard> shards = describeResponse.StreamDescription.Shards;
       foreach(Shard s in shards)
       {
           Console.WriteLine("shard: " + s.ShardId);
       }
    
       //grab the only shard ID in this stream
       string primaryShardId = shards[0].ShardId;
    
       //Step #2 - get iterator for this shard
       GetShardIteratorRequest iteratorRequest = new GetShardIteratorRequest();
       iteratorRequest.StreamName = "OrderStream";
       iteratorRequest.ShardId = primaryShardId;
       iteratorRequest.ShardIteratorType = ShardIteratorType.TRIM_HORIZON;
    
       GetShardIteratorResponse iteratorResponse = client.GetShardIterator(iteratorRequest);
       string iterator = iteratorResponse.ShardIterator;
    
       Console.WriteLine("Iterator: " + iterator);
    
       //Step #3 - get records in this iterator
       GetShardRecords(client, iterator);
    
       Console.WriteLine("All records read.");
       Console.ReadLine();
    }
    
    private static void GetShardRecords(AmazonKinesisClient client, string iteratorId)
    {
       //create request
       GetRecordsRequest getRequest = new GetRecordsRequest();
       getRequest.Limit = 100;
       getRequest.ShardIterator = iteratorId;
    
       //call "get" operation and get everything in this shard range
       GetRecordsResponse getResponse = client.GetRecords(getRequest);
       //get reference to next iterator for this shard
       string nextIterator = getResponse.NextShardIterator;
       //retrieve records
       List<Record> records = getResponse.Records;
    
       //print out each record's data value
       foreach (Record r in records)
       {
           //pull out (JSON) data in this record
           string s = Encoding.UTF8.GetString(r.Data.ToArray());
           Console.WriteLine("Record: " + s);
           Console.WriteLine("Partition Key: " + r.PartitionKey);
       }
    
       if(null != nextIterator)
       {
           //if there's another iterator, call operation again
           GetShardRecords(client, nextIterator);
       }
    }
    

    Now I had a working Kinesis application that can run anywhere. Clearly it’s easy to run this on AWS EC2 servers (and the SDK does a nice job with retrieving temporary credentials for apps running within EC2), but there’s a good chance that cloud users have a diverse portfolio of providers. Let’s say I love the application services from AWS, but like the server performance and management capabilities from CenturyLink. In this case, I built a Windows Server to run my Kinesis application.

    2014.01.08kinesis10

    With my server ready, I ran the application and saw my shards, my iterators, and my data records.

    2014.01.08kinesis11

    Very cool and pretty simple. Don’t forget that each data consumer has some work to do to parse the stream, find the (partition) data they want, and perform queries on it. You can imagine loading this into an Observable and using LINQ queries on it to aggregate data. Regardless, it’s very nice to have a durable stream processing service that supports replays and multiple readers.
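
    As a rough sketch of that idea, you could deserialize the record payloads back into the Order shape from earlier and aggregate them with LINQ; everything here reuses types already shown above, so treat it as illustrative rather than production-ready.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using Amazon.Kinesis.Model;
    using Newtonsoft.Json;

    class OrderAggregator
    {
        //counts pizza orders per store from a batch of Kinesis records
        static void SummarizeOrders(List<Record> records)
        {
            var orders = records
                .Select(r => JsonConvert.DeserializeObject<Order>(Encoding.UTF8.GetString(r.Data.ToArray())))
                .ToList();

            //group by store to see which locations are busiest in this batch
            foreach (var group in orders.GroupBy(o => o.StoreId))
            {
                Console.WriteLine("Store " + group.Key + ": " + group.Count() + " orders");
            }
        }
    }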

    Summary

    The “internet of things” is here, and companies that can quickly gather and analyze data will have a major advantage. Amazon Kinesis is an important service to that end, but don’t think of it as something that ONLY works with other applications in the AWS cloud. We saw here that you could have all sorts of data producers running on devices, on-premises, or in other clouds. The Kinesis applications that consume data can also run virtually anywhere. The modern architect recognizes that composite applications are the way to go, and hopefully this helped you understand another service that’s available to you!

  • How Do I Get Things Done? New Pluralsight Course Reveals All My Secrets

    Back in August when I released a Pluralsight course on Optimizing Systems on AWS, the most frequent question I heard had nothing to do with AWS or the cloud. Instead, it was the familiar “how do you do so much stuff?” Various suggestions have been offered: robot twin, outsourcing to South America, zero sleep. Alas, the truth is less sensational. I’ve learned over the years (from experience, studying, and watching others) to be ruthless about efficiency and always take a strategic viewpoint on new opportunities. During the past few months I’ve jotted down all my techniques – as well as those used by a host of my fellow Pluralsight authors – and put them all into a single, 1 hour Pluralsight course that was released today.

    Productivity Tips for the Busy Tech Professional is a collection of 17 actionable tips for getting control over your busy schedule and spending more of your time on the things you enjoy. The course is broken up into three short modules:

    • Setting the Foundation. A handful of tips that will shape how you approach your activities.
    • Establishing a Frame of Mind. Tips for better decision making and prioritization.
    • Managing Time Effectively. Batch of tips for the day-to-day activities that will help you make better use of your available time.

    No theoretical stuff here. It’s all (hopefully) practical advice that you can use RIGHT NOW to get more done while keeping your work/life balance under control. Now, obviously everyone is different and for some, these tips won’t make a difference or you might completely disagree with them. That’s cool with me, as I’m just hoping to inspire change (large or small) that helps us all become happier and more fulfilled.

    So watch this course on your lunch break, and let me know what you think! Something you agree/disagree with? Have your own tips to share? I’d really love to hear it.

  • Favorite Books and Blog Posts of 2013

    The year is nearly up, and it’s been a good one. 30+ posts on the blog, job changes, 4 new Pluralsight courses, twice-monthly InfoQ.com contributions, graduation with a Master’s Degree in Engineering, and speaking engagements around the world (Amsterdam, Gothenburg, New Orleans, San Francisco, London, and Chicago). Here’s a quick recap of my favorite blog posts (from here, or other places that I write), and some of the best books that I read this year.

    Blog Posts

    While this was one of my quieter years on the blog, I probably wrote more than ever before thanks to the variety of places that I publish things. Here were some of my favorites:

    Books

    I read a few dozen books this year, and these were the ones that stood out for being informative, interesting, and inspirational.

    Once again, thanks to all of you for continuing to read my work and inspiring me to keep it up!

  • How Do BizTalk Services Work? I Asked the Product Team to Find Out

    Windows Azure BizTalk Services was recently released by Microsoft, and you can find a fair bit about this cloud service online. I wrote up a simple walkthrough, Sam Vanhoutte did a nice comparison of features between BizTalk Server and BizTalk Services,  the Neudesic folks have an extensive series of blog posts about it, and the product documentation isn’t half bad.

    However, I wanted to learn more about how the service itself works, so I reached out to Karthik Bharathy who is a senior PM on the BizTalk team and part of the team that shipped BizTalk Services. I threw a handful of technical questions at him, and I got back some fantastic answers. Hopefully you learn something new;  I sure did!

    Richard: Explain what happens after I deploy an app. Do you store the package in my storage account and add it to a new VM that’s in a BizTalk Unit?

    Karthik: Let’s start with some background information – the app that you are referring to today is the VS project with a combination of the bridge configuration and artifacts like maps, schemas and DLLs. When you deploy the project, each of the artifacts and the bridge configuration are uploaded one by one. The same notion also applies through the BizTalk Portal when you are deploying an agreement and uploading artifacts to the resources.

    The bridge configuration represents the flow of the message in the pipeline in XML format. Every time you build a BizTalk Services project in Visual Studio, an <entityName>.Pipeline.atom file is generated in the project’s bin folder. This atom file is the XML representation of the pipeline configuration. For example, under the <pipelines> section you can see the bridges configured along with the route information. You can also get a similar listing by issuing a GET operation on the bridge endpoint with the right ACS credentials.

    Now let’s say the bridge is configured with Runtime URL <deploymentURL>/myBridge1. After you click deploy, the pipeline configuration gets published in the repository of the <deploymentURL> for handling /myBridge1. For every message sent to the <deploymentURL>, the role looks at the complete path including /myBridge1 and loads the pipeline configuration from the repository. Once the pipeline configuration is loaded, the message is processed per the configuration. If the configuration does not exist, then an error is returned to the caller.

    Richard: What about scaling an integration application? How does that work?

    Karthik: Messages are processed by integration roles in the deployment. When the customer initiates a scale operation, we update the service configuration to add/remove instances based on the ask. The BizTalk Services deployment updates its state during the scaling operation and new messages to the deployment are handled by one of the instances of the role. This is similar to how Web or Worker roles are scaled up/down in Azure today.

    Richard: Is there any durability in the bridge? What if a downstream endpoint is offline?

    Karthik: The concept of a bridge is close to a messaging channel if we may borrow the phrase from Enterprise Integration Patterns. It helps in bridging the impedance between two messaging systems. As such the bridge is a stateless system and does not have persistence built into it. Therefore bridges need to report any processing errors back to the sender. In the case where the downstream endpoint is offline, the bridge propagates the error back to the sender – the semantics are slightly different a) based on the bridge and b) based on the source from where the message has been picked up.

    For EAI bridges with an HTTP source, the error code is sent back to the sender, while with the same bridge using an FTP head, the system tries to pick up and process the message again from the source at regular intervals (and errors out eventually). In both cases you can see the relevant tracking records in the portal.

    For B2B, our customers rarely intend to send an HTTP error back to their partners. When the message cannot be sent to a downstream (success) endpoint, the message is routed to the suspend endpoint. You might argue that the suspend endpoint could be down as well – while it is generally a bad idea to put a flaky target in success or suspend endpoints, we don’t rule out this possibility. In the worst case we deliver the error code back to the sender.

    Richard: Bridges resemble BizTalk Server ESB Toolkit itineraries. Did you ever consider re-using that model?

    Karthik: ESB is an architectural pattern and you should look at the concept of a bridge as being part of the ESB model for BizTalk Services on Azure. The sources and destinations are similar to the on-ramp, off-ramp model and the core processing is part of the bridge. Of course, additional capabilities like exception management, governance, and alerts will need to be added to bring it closer to the ESB Toolkit.

    Richard: How exactly does the high availability option work?

    Karthik: Let’s revisit the scaling flow we talked about earlier. If we had a scale >=2, you essentially have a system that can process messages even when one of the machines in your configuration goes down. If one of the machines is down, the load balancer in our system routes the message to the running instances. For example, this is taken care of during “refresh” when customers can restart their deployment after updating a user DLL. This ensures message processing is not impacted.

    Richard: It looks like backup and restore is for the BizTalk Services configuration, not tracking data. What’s the recommended way to save/store an audit trail for messages?

    Karthik: The purpose of backup and restore is for the deployment configuration including schemas, maps, bridges, and agreements. The tracking data comes from the Azure SQL database provided by the customer. The customer can add the standard backup/restore tools directly on that storage. To save/store an audit of messages, you have a couple of options at the moment – with B2B you can turn on archiving for either AS2 or X12 processing, and with EAI you can plug in an IMessageInspector extension that can read the IMessage data and save it to an external store.

    Richard: What part of the platform are you most excited about?

    Karthik: Various aspects of the platform are exciting – we started off building capabilities with ‘AppFabric Connect’ to enable server customers to leverage existing investments with the cloud. Today, we have built a richer set of functionality with BizTalk Adapter Services to connect popular LOBs with Bridges. In the case of B2B, BizTalk Server traditionally exposed functionality for IT Operators to manage trading partner relationships using the Admin Console. Today, we have rich TPM functionality in the BizTalk Portal and also have the OM API public for the developer community. In EAI we allow extending message processing using custom code. If I should call out one thing I like, it has to be custom code enablement. The dedicated deployment model managed by Microsoft makes this possible. It is always a challenge to enable a user DLL to execute without providing some sort of sandboxing. Then there are also requirements around performance guarantees. BizTalk Services dedicated deployment takes care of all these – if the code behaves in an unexpected way, only that deployment is affected. As the resources are isolated, there are also better guarantees about the performance. In a configuration driven experience this makes integration a whole lot simpler.

    Thanks Karthik for an informative chat!

  • Windows Azure BizTalk Services: How to Get Started and When to Use It

    The “integration as a service” space continues to heat up, and Microsoft officially entered the market with the General Availability of Windows Azure BizTalk Services (WABS) a few weeks ago.  I recently wrote up an InfoQ article that summarized the product and what it offered. I also figured that I should actually walk through the new installation and deployment process to see what’s changed. Finally, I’ll do a brief assessment of where I’d consider using this vs. the other cloud-based integration tools.

    Installation

    Why am I installing something if this is a cloud service? Well, the development of integration apps still occurs on premises, so I need an SDK for the necessary bits. Grab the Windows Azure BizTalk Services SDK Setup and kick off the process. I noticed what seems to be a much cleaner installation wizard.

    2013.12.10biztalkservices01

    After choosing the components that I wanted to install (including the runtime, developer tools, and developer SDK) …

    2013.12.10biztalkservices02

    … you are presented with a very nice view of the components that are needed, and which versions are already installed.

    2013.12.10biztalkservices03

    At this point, I have just about everything on my local developer machine that’s needed to deploy an integration application.

    Provisioning

    WABS applications run in the Windows Azure cloud. Developers provision and manage their WABS instances in the Windows Azure portal. To start with, choose App Services and then BizTalk Services before selecting Custom Create.

    2013.12.10biztalkservices04

    Next, I’m asked to pick an instance name, edition, geography, tracking database, and Windows Azure subscription. There are four editions: basic, developer, standard, and premium. As you move between editions (and pay more money), you have access to greater scalability, more integration applications, on-premises connectivity, archiving, and high availability.

    2013.12.10biztalkservices05

    I created a new database instance and storage account to ensure that there would be no conflicts with old (beta) WABS instances. Once the provisioning process was complete (maybe 15 minutes or so), I saw my new instance in the Windows Azure Portal.

    2013.12.10biztalkservices07

    Drilling into the WABS instance, I saw connection strings, IP addresses, certificate information, usage metrics, and more.

    2013.12.10biztalkservices08

    The Scale tab showed me the option to add more “scale units” to a particular instance. Basic, standard and premium edition instances have “high availability” built in where multiple VMs are beneath a single scale unit. Adding scale units to an instance requires standard or premium editions.

    2013.12.10biztalkservices09

    In the technical preview and beta releases of WABS, the developer was forced to create a self-signed certificate to use when securing a deployment. Now, I can download a dev certificate from the Portal and install it into my personal certificate store and trusted root authority. When I’m ready for production, I can also upload an official certificate to my WABS instance.

    Developing

    WABS projects are built in Visual Studio 2012/2013. There’s an entirely new project type for WABS, and I could see it when creating a new project.

    2013.12.10biztalkservices10

    Let’s look at what we have in Visual Studio to create WABS solutions. First, the Server Explorer includes a WABS component for creating cloud-accessible connections to line-of-business systems. I can create connections to Oracle/SAP/Siebel/SQL Server repositories on-premises and make them part of the “bridges” that define a cloud integration process.

    2013.12.10biztalkservices11

    Besides line-of-business endpoints, what else can you add to a Bridge? The final palette of activities in the Toolbox is shown below. There are a stack of destination endpoints that cover a wide range of choices. The source options are limited, but there are promises of new items to come. The Bridges themselves support one-way or request-response scenarios.

    2013.12.10biztalkservices12

    Bridges expect either XML or flat file messages. A message is defined by a schema. The schema editor is virtually identical to the visual tool that ships with the standard BizTalk Server product. Since source and destination message formats may differ, it’s often necessary to include a transformation component. Transformation is done via a very cool Mapper that includes a sophisticated set of canned operations for transforming the structure and content of a message. This isn’t the Mapper that comes with BizTalk Server, but a much better one.

    2013.12.10biztalkservices14

     

    A bridge configuration model includes a source, one or more bridges (which can be chained together), and one or more destinations. In the case below, I built a simple one-way model that routes to a Windows Azure Service Bus Queue.

    2013.12.10biztalkservices15

    Double-clicking a particular bridge shows me the bridge configuration. In this configuration, I specified an XML schema for the inbound message, and a transformation to a different format.

    2013.12.10biztalkservices16

     

    At this point, I have a ready-to-go integration solution.

    Deploying

    To deploy one of these solutions, you’ll need a certificate installed (see earlier note), and the Windows Azure Access Control Service credentials shown in the Windows Azure Portal. With that info handy, I right-clicked the project in Visual Studio and chose Deploy. Within a few seconds, I was told that everything was up and running. To confirm, I clicked the Manage button on the WABS instance page and visited the WABS-specific Portal. This Portal shows my deployed components, and offers tracking services. Ideally, this would be baked into the Windows Azure Portal itself, but at least it somewhat resembles the standard Portal.

    2013.12.10biztalkservices17

    So it looks like I have everything I need to build a test.

    Testing

    I finally built a simple Console application in Visual Studio to submit a message to my Bridge. The basic app retrieved a valid ACS token and sent it along with my XML message to the bridge endpoint. After running the app, I got back the tracking ID for my message.

    2013.12.10biztalkservices19
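
    For reference, here’s a rough sketch of what a console app like that might look like: request a token from ACS using the WRAP protocol, then POST the XML message to the bridge with the token in an Authorization header. The namespace, issuer credentials, and bridge URL below are all placeholders.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Net;
    using System.Net.Http;
    using System.Text;

    class BridgeTester
    {
        static void Main()
        {
            //placeholders: ACS namespace, issuer credentials, and the bridge's runtime URL
            string acsUrl = "https://yournamespace.accesscontrol.windows.net/WRAPv0.9/";
            string issuerName = "owner";
            string issuerKey = "your-issuer-key";
            string bridgeUrl = "https://yournamespace.biztalk.windows.net/default/OrderBridge";

            using (var client = new HttpClient())
            {
                //step 1: ask ACS for a WRAP token scoped to the bridge endpoint
                var tokenRequest = new FormUrlEncodedContent(new Dictionary<string, string>
                {
                    { "wrap_name", issuerName },
                    { "wrap_password", issuerKey },
                    { "wrap_scope", "http://yournamespace.biztalk.windows.net/default/" }
                });
                string tokenResponse = client.PostAsync(acsUrl, tokenRequest).Result.Content.ReadAsStringAsync().Result;

                //the response is form-encoded; pull out and decode the wrap_access_token value
                string token = WebUtility.UrlDecode(
                    tokenResponse.Split('&')
                                 .Single(pair => pair.StartsWith("wrap_access_token="))
                                 .Substring("wrap_access_token=".Length));

                //step 2: POST the XML message to the bridge, passing the token in the Authorization header
                client.DefaultRequestHeaders.Add("Authorization", "WRAP access_token=\"" + token + "\"");
                var body = new StringContent("<Order><Id>1</Id></Order>", Encoding.UTF8, "application/xml");

                var response = client.PostAsync(bridgeUrl, body).Result;

                //the response from the bridge carries the tracking ID for the submitted message
                Console.WriteLine(response.StatusCode);
            }
        }
    }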

    To actually see that this worked, I first looked in the WABS Portal to find the tracked instance. Sure enough, I saw a tracked message.

    2013.12.10biztalkservices18

    As a final confirmation, I opened up the powerful Service Bus Explorer tool and connected to my Windows Azure account. From here, I could actually peek into the Windows Azure Service Bus Queue that the WABS bridge published to. As you’d expect, I saw the transformed message in the queue.

    2013.12.10biztalkservices20

    Verdict

    There is a wide spectrum of products in the integration-as-a-service domain. There are batch-oriented tools like Informatica Cloud and Dell Boomi, and things like SnapLogic that handle both real-time and batch. Mulesoft sells the CloudHub which is a comprehensive platform for real-time integration. And of course, there are other integration solutions like AWS Simple Notification Services or the Windows Azure Service Bus.

    So where does WABS fit in? It’s clearly message-oriented, not batch-oriented. It’s more than a queuing solution, but less than an ESB. It seems like a good choice for simple connections between different organizations, or basic EDI processing. There are a decent number of source and destination endpoints available, and the line-of-business connectivity is handy. The built-in high availability and optional scalability mean that it can actually be used in production right now, so that’s a plus.

    There’s still lots of room for improvement, but I guess that’s to be expected with a v1 product. I’d like to see a better entry-level pricing structure, more endpoint choices, more comprehensive bridge options (like broadcasting to multiple endpoints), built-in JSON support, and SDKs that support non-.NET languages.

    What do you think? See any use cases where you’d use this today, or are there specific features that you’ll be waiting for before jumping in?