Author: Richard Seroter

  • Interview Series: Four Questions With … Dean Robertson

    I took a brief hiatus from my series of interviews with “connected systems” thought leaders, but we’re back with my 39th edition. This month, we’re chatting with Dean Robertson who is a longtime integration architect, BizTalk SME, organizer of the Azure User Group in Brisbane, and both the founder and Technology Director of Australian consulting firm Mexia. I’ll be hanging out in person with Dean and his team in a few weeks when I visit Australia to deliver some presentations on building hybrid cloud applications.

    Let’s see what Dean has to say.

    Q: In the past year, we’ve seen a number of well known BizTalk-oriented developers embrace the new Windows Azure integration services. How do you think BizTalk developers should view these cloud services from Microsoft? What should they look at first, assuming these developers want to explore further?

    A: I’ve heard on the grapevine that a number of local BizTalk guys down here in Australia are complaining that Azure is going to take away our jobs and force us all to re-train in the new technologies, but in my opinion nothing could be further from the truth.

    BizTalk as a product is extremely mature and very well understood by both the developer & customer communities, and the business problems that a BizTalk-based EAI/SOA/ESB solution solves are not going to be replaced by another Microsoft product anytime soon. Further, BizTalk integrates beautifully with the Azure Service Bus through the WCF netMessagingBinding, which makes creating hybrid integration solutions (that span on-premises & cloud) a piece of cake. Finally, the Azure Service Bus is conceptually one big cloud-scale BizTalk messaging engine anyway, with secure pub-sub capabilities, durable message persistence, message transformation, content-based routing and more! So once you see the new Azure integration capabilities for what they are, a whole new world of ‘federated bus’ integration architectures reveals itself to you. So I think ‘BizTalk guys’ should see the Azure Service Bus bits as simply more tools in their toolbox, and trust that their learning investments will pay off when the technology circles back to on-premises solutions in the future.

    As for learning these new technologies, Pluralsight has some terrific videos by Scott Seely and Richard Seroter that help get the Azure Service Bus concepts across quickly. I also think that nothing beats developers downloading the latest bits from MS, running the demos firsthand, and then building their own “Hello Cloud” integration demo that includes BizTalk. Finally, they should come along to industry events (<plug>like Mexia’s Integration Masterclass with Richard Seroter</plug> 🙂 ) and their local Azure user groups to meet like-minded people who love to talk about integration!

    Q: What integration problem do you think will get harder when hybrid clouds become the norm?

    A: I think Business Activity Monitoring (BAM) will be the hardest thing to consolidate because you’ll have integration processes running across on-premises BizTalk, Azure Service Bus queues & topics, Azure web & worker roles, and client devices.  Without a mechanism to automatically collect & aggregate those business activity data points & milestones, organisations will have no way to know whether their distributed business processes are executing completely and successfully.  So unless Microsoft bring out an Azure-based BAM capability of their own, I think there is a huge opportunity opening up in the ISV marketplace for a vendor to provide a consolidated BAM capture & reporting service.  I can assure you Mexia is working on our offering as we speak 🙂

    Q: Do you see any trends in the types of applications that you are integrating with? More off-premise systems? More partner systems? Web service-based applications?

    A: Whilst a lot of our day-to-day work is traditional on-premises SOA/EAI/ESB, Mexia has also become quite good at building hybrid integration platforms for retail clients by using a combination of BizTalk Server running on-premises at Head Office, Azure Service Bus queues and topics running in the cloud (secured via ACS), and Windows Service agents installed at store locations.  With these infrastructure pieces in place we can move lots of different types of business messages (such as sales, stock requests, online orders, shipping notifications etc) securely around the world with ease, and at an infinitesimally low cost per message.

    As the world embraces cloud computing and all of the benefits that it brings (such as elastic IT capacity & secure cloud-scale messaging) we believe there will be an ever-increasing demand for hybrid integration platforms that can provide the seamless ‘connective tissue’ between an organisation’s on-premises IT assets and its external suppliers, branch offices, trading partners and customers.

    Q [stupid question]: Here in the States, many suburbs have people on the street corners who swing big signs that advertise things like “homes for sale!” and “furniture – this way!” I really dislike this advertising model because these aren’t traditional impulse buys. Who drives down the street, sees one of these clowns and says “Screw it, I’m going to go pick up a new mattress right now.” Nobody. For you, what are your true impulse purchases where you won’t think twice before acting on an urge and plopping down some money?

    A: This is a completely boring answer, but I cannot help myself on www.amazon.com.  If I see something cool that I really want to read about, I’ll take full advantage of the ‘1-click ordering’ feature before my cognitive dissonance has had a chance to catch up.  However when the book arrives either in hard-copy or on my Kindle, I’ll invariably be time poor for a myriad of reasons (running Mexia, having three small kids, client commitments etc) so I’ll only have time to scan through it before I put it on my shelf with a promise to myself to come back and read it properly one day.  But at least I have an impressive bookshelf!

    Thanks Dean, and see you soon!

  • Windows Azure Service Bus EAI Doesn’t Support Multicast Messaging. Should It?

    Lately, I’ve been playing around a lot with the Windows Azure Service Bus EAI components (currently in CTP). During my upcoming Australia trip (register now!) I’m going to be walking through a series of use cases for this technology.

    There are plenty of cool things about this software, and one of them is that you can visually model the routing of messages through the bus. For instance, I can define a routing scenario (using “Bridges” and destination endpoints) that takes in an “order” message, and routes it to an (onsite) database, Service Bus Queue or a public web service.

    2012.5.3multicast01

    Super cool! However, the key word in the previous sentence was “or.” I cannot send a message to ALL of those endpoints because, currently, the Service Bus EAI engine doesn’t support the multi-cast scenario. You can only route a message to a single destination. So the flow above is valid, IF I have routing rules (e.g. “OrderAmount > 100”) that help the engine decide which of the endpoints to send the message to. I asked about this in the product forums and had that (non-)capability confirmed. If you need to do multi-cast messaging, then the suggestion is to use Service Bus Topics as an endpoint. Service Bus Topics (unlike Service Bus Queues) support multiple subscribers who can all receive a copy of a message. The end result would be this:

    2012.5.3multicast03
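
    If you want to see the Topic-based fan-out itself in code, here is a minimal sketch using the Windows Azure SDK for Node.js (the ‘azure’ npm module); the topic and subscription names are just placeholders, and error handling is kept to a minimum.

    //minimal sketch of Topic fan-out: every subscription gets its own copy of each message
    var azure = require('azure');

    //the SDK picks up Service Bus credentials from environment variables (or they can be passed in explicitly)
    var serviceBus = azure.createServiceBusService();

    serviceBus.createTopicIfNotExists('orders', function (err) {
        if (err) { return console.log('topic error: ' + err); }

        //two independent subscribers, standing in for the database writer and the web service caller
        serviceBus.createSubscription('orders', 'sqlwriter', function () {
            serviceBus.createSubscription('orders', 'servicecaller', function () {

                //publish one order message; both subscriptions can now receive a copy of it
                serviceBus.sendTopicMessage('orders', { body: '<Order><OrderAmount>150</OrderAmount></Order>' }, function (sendErr) {
                    if (sendErr) { console.log('send error: ' + sendErr); }
                });
            });
        });
    });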

    However, for me, one of the great things about the Bridges is the ability to use Mapping to transform a message (format/content) before it goes to an endpoint. In the image below, note that I have a Transform that takes the initial “Order” message and transforms it to the format expected by my SQL Server database endpoint (from my first diagram).

    2012.5.3multicast02

    If I had to use Topics to send messages to a database and web service (via the second diagram), then I’d have to push the transformation responsibility down to the application that polls the Topic and communicates with the database or service. I’d also lose the ability to send directly to my endpoint and would require a Service Bus Topic to act as an intermediary. That may work for some scenarios, but I’d love the option to use all the nice destination options (instead of JUST Topics), perform the mapping in the EAI Bridges, and multi-cast to all the endpoints.

    What do you think? Should the Azure Service Bus EAI support multi-cast messaging, or do you think that scenario is unusual for you?

  • Richard Going to Oz to Deliver an Integration Workshop? This is Happening.

    At the most recent MS MVP Summit, Dean Robertson, founder of IT consultancy Mexia, approached me about visiting Australia for a speaking tour. Since I like both speaking and koalas, this seemed like a good match.

    As a result, we’ve organized sessions for which you can now register to attend. I’ll be in Brisbane, Melbourne and Sydney talking about the overall Microsoft integration stack, with special attention paid to recent additions to the Windows Azure integration toolset. As usual, there should be lots of practical demonstrations that help to show the “why”, “when” and “how” of each technology.

    If you’re in Australia, New Zealand or just need an excuse to finally head down under, then come on over! It should be lots of fun.

  • Deploying Node.js Applications to Iron Foundry using the Cloud9 IDE

    This week, I attended the Cloud Foundry “one year anniversary” event where, among other things, Cloud9 announced support for deployment to Cloud Foundry from their innovative Cloud9 IDE. The Cloud9 IDE lets you write HTML5, JavaScript and Node.js applications in an entirely web-based environment. The IDE’s editor supports many other programming languages, but it provides the fullest support for HTML/JavaScript. Up until this week, you could deploy your applications to Joyent, Heroku and Windows Azure. Now, you can also target any Cloud Foundry environment. Since I’ve been meaning to build a Node.js application, this seemed like the perfect push to do so. In this blog post, I’ll show you how to author a Node.js application in the Cloud9 IDE and push it to Iron Foundry’s distribution of Cloud Foundry. Iron Foundry recently announced their support for many languages besides .NET, so here’s a chance to see if that’s really the case.

    Let’s get started. First, I signed up for a free Cloud9 IDE account. It was super easy. Once I got my account, I saw a simple dashboard that showed my projects and allowed me to connect my account to Github.

    2012.04.12node01

    From here, I can create a new project by clicking the “+” icon above My Projects.

    2012.04.12node02

    At this point, I was asked for the name of my project and the type of project (Git/Mercurial/FTP). Once my SeroterNodeTest project was provisioned, I jumped into the Cloud9 IDE editor interface. I didn’t have any files yet (except for some simple Git instructions in a README file), but I got my first look at the user interface.

    2012.04.12node03

    The Cloud9 IDE provides much more than just code authoring and syntax highlighting. The IDE lets me create files, pull in Github projects, run my app in their environment, deploy to a supported cloud environment, and perform testing/debugging of the app. Now I was ready to build the app!

    I didn’t want to JUST build a simple “hello world” app, so I thought I’d follow some recommended practices and let my app return either HTML or JSON based on the request path. To start with, I created my Node.js server by right-clicking my project and adding a new file named server.js.

    2012.04.12node04

    Before writing any code, I decided that I didn’t want to build an HTML string by hand and have my Node.js app return it. So, I decided to use Mustache and separate my data from my HTML. I couldn’t see an easy way to import this JavaScript library through the UI, until I noticed that the Cloud9 IDE exposes the Node Package Manager (npm) in its command window. From this command window, I could run a simple command (“npm install mustache”) and the necessary JavaScript libraries were added to my project.

    2012.04.12node05

    Great. Now I was ready to write my Node.js server code. First, I added a few references to required libraries.

    //create some variables that reference key libraries
    var http = require('http');
    var url = require('url');
    var Mustache = require('./node_modules/mustache/mustache.js');
    

    Next, I created a handler function that writes out HTML when it gets invoked. This function takes a “response” object, which represents the content being returned to the caller. Writing the response at this level helps prevent blocking calls in Node.js.

    //This function returns an HTML response when invoked
    function getweb(response)
    {
        console.log('getweb called');
        //create JSON object
        var data = {
            name: 'Richard',
            age: 35
        };
    
        //create template that formats the data
        var template = 'Hi there, <strong>{{ name }}</strong>';
    
        //use Mustache to apply the template and create HTML
        var result = Mustache.to_html(template, data);
    
        //write results back to caller
        response.writeHead(200, {'Content-Type': 'text/html'});
        response.write(result);
        response.end();
    }
    

    My second handler responds to a different URL path and returns a JSON object to the caller.

    //This function returns JSON to simulate a service call
    function callservice(response)
    {
        console.log('callservice called');
        //create JSON object
        var data = {
            name: 'Richard',
            age: 35
        };
    
        //write the results back to the caller as JSON
        response.writeHead(200, {'Content-Type': 'application/json'});
        //convert JSON to string
        response.write(JSON.stringify(data));
        response.end();
    }
    

    How do I choose which of these two handlers to call? I have a function that dynamically invokes one handler or the other, based on the request path.

    //function that routes the request to appropriate handlers
    function routeRequest(path, reqhandle, response)
    {
        //does the request map to one of my function handlers?
         if (typeof reqhandle[path] === 'function') {
           //yes, so call the function
           reqhandle[path](response);
         }
         else
         {
             console.log('no match');
             response.end();
         }
    }
    

    The last function in my server.js file is the most important. This “startup” function is the entry point of the module. It starts the Node.js server and defines the operation that is called on each request. That operation invokes the previously defined routeRequest function, which then explicitly handles the request.

    //startup function that creates the server and wires up request handling
    function startup(reqhandle)
    {
        //function that responds to client requests
        function onRequest(request, response)
        {
            //yank out the path from the URL the client hit
            var path = url.parse(request.url).pathname;
    
            //handle individual requests
            routeRequest(path, reqhandle, response);
        }
    
        //start up the Node.js server
        http.createServer(onRequest).listen(process.env.C9_PORT);
        console.log('Server running');
    }
    

    Finally, at the bottom of this module, I expose the functions that I want other modules to be able to call.

    //expose this module's operations so they can be called from main JS file
    exports.startup = startup;
    exports.getweb = getweb;
    exports.callservice = callservice;
    

    With my primary server done, I went and added a new file, index.js.

    2012.04.12node06

    This acts as my application entry point. Here I reference the server.js module and create a lookup object that maps each valid URL path to the server function that should handle it.

    //reference my server.js module
    var server = require('./server');
    
    //map each valid request path to the server function that handles it
    var reqhandle = {};
    reqhandle['/'] = server.getweb;
    reqhandle['/web'] = server.getweb;
    reqhandle['/service'] = server.callservice;
    
    //call the startup function to get the server going
    server.startup(reqhandle);
    

    And … we’re done. I switched to the Run tab, made sure I was starting with index.js, and clicked the Debug button. At the bottom of the screen, in the Console window, I could see whether the application was able to start up. If so, a URL is shown.

    2012.04.12node07

    Clicking that link took me to my application hosted by Cloud9.

    2012.04.12node08

    With just the root path (“/”), the web function was called. When I added “/service” to the URL, I saw a JSON result.

    2012.04.12node09

    Cool! Just to be thorough, I also threw the “/web” on the URL, and sure enough, my web function was called.

    2012.04.12node10

    I was now ready to deploy this bad boy to Iron Foundry. The Cloud9 IDE is going to look for a package.json file before allowing deployment, so I went ahead and added a very simple one.

    2012.04.12node11
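
    For those following along, a bare-bones package.json along these lines is enough to satisfy that check (the name, version and dependency values here are illustrative; yours may differ):

    {
      "name": "seroternodetest",
      "version": "0.0.1",
      "dependencies": {
        "mustache": "*"
      }
    }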

    Also, Cloud Foundry uses a different environment variable to allocate the server port that Node.js listens on. So, I switched this line:

    http.createServer(onRequest).listen(process.env.C9_PORT);

    to this …

    http.createServer(onRequest).listen(process.env.VCAP_APP_PORT);
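
    As an aside, if you don’t want to flip this line back and forth between environments, one option is to fall back across the port variables (a small tweak I didn’t apply here):

    //use whichever port variable the hosting environment provides, with a fixed fallback for local runs
    var port = process.env.VCAP_APP_PORT || process.env.C9_PORT || 8080;
    http.createServer(onRequest).listen(port);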

    I moved to the Deployment tab and clicked on the “+” sign at the top.

    2012.04.12node12

    What comes up is a wizard where I chose to deploy to Cloud Foundry (but could have also chosen Windows Azure, Joyent or Heroku).

    2012.04.12node13

    The key phrasing there is that you are signing into a Cloud Foundry API. So ANY Cloud Foundry provider (that is accessible by Cloud9 IDE) is a valid target. I plugged in the API endpoint of the newest Iron Foundry environment, and provided my credentials.

    2012.04.12node14
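
    As an aside, because this is the standard Cloud Foundry API, the same deployment could also be done outside the IDE with the vmc command-line client, along these lines (the API URL below is a placeholder for whichever Iron Foundry endpoint you target):

    vmc target http://api.your-iron-foundry-environment.com
    vmc login
    vmc push seroternodetest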

    Once I signed in, I saw that I had no apps in this environment yet. After giving the application a name, I clicked the Create New Cloud Foundry application button and was given the choice of Node.js runtime version, the number of instances to run this on, and how much RAM to allocate.

    2012.04.12node15

    That was the final step in the deployment target wizard, and all that was left to do was select this new deployment target and click Deploy.

    2012.04.12node16

    In seven seconds, the deployment was done and I was provided my Iron Foundry URL.

    2012.04.12node17

    Sure enough, hitting that URL (http://seroternodetest.ironfoundry.me/service) in the browser resulted in my Node.js application returning the expected response.

    2012.04.12node18
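
    As a final check from the command line, hitting the /service path with curl should return the same JSON that the callservice handler builds:

    curl http://seroternodetest.ironfoundry.me/service
    {"name":"Richard","age":35}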

    How cool is all that? I admit that while I find Node.js pretty interesting, I don’t have a whole lot of enterprise-type scenarios in mind yet. But, playing with Node.js gave me a great excuse to try out the handy Cloud9 IDE while flexing Iron Foundry’s newfound love for polyglot environments.

    What do you think? Have you tried web-only IDEs? Do you have any sure-thing usage scenarios for Node.js in enterprise environments?

  • Three Software Updates to be Aware Of

    In the past few days, there have been three sizable product announcements that should be of interest to the cloud/integration community. Specifically, there are noticeable improvements to Microsoft’s complex event processing (CEP) engine StreamInsight, Windows Azure’s integration services, and Tier 3’s Iron Foundry PaaS.

    First off, the Microsoft StreamInsight team recently outlined changes that are coming in their StreamInsight 2.1 release. This is actually a pretty major update with some fundamental modifications to the programmatic object model. I can attest to the fact that it can be a challenge to build up the host/query/adapter plumbing necessary to get a solution rolling, and the StreamInsight team has acknowledged this. The new object model will be a bit more straightforward. Also, we’ll see IEnumerable and IObservable become first-class citizens in the platform. Developers are going to be encouraged to use IEnumerable/IObservable in lieu of adapters in both embedded AND server-based deployment scenarios. In addition to changes to the object model, we’ll also see improved checkpointing (failure recovery) support. If you want to learn more about StreamInsight, and are a Pluralsight subscriber, you can watch my course on this product.

    Next up, Microsoft released the latest CTP for its Windows Azure Service Bus EAI and EDI components. As a refresher, these are “BizTalk in the cloud”-like services that improve connectivity, message processing and partner collaboration for hybrid situations. I summarized this product in an InfoQ article written in December 2011. So what’s new? Microsoft issued a description of the core changes, but in a nutshell, the components are maturing. The tooling is improving, the message processing engine can handle flat files or XML, the mapping and schema designers have enhanced functionality, and the EDI offering is more complete. You can download this release from the Microsoft site.

    Finally, those cats at Tier 3 have unleashed a substantial update to their open-source Iron Foundry (public or private) .NET PaaS offering. The big takeaway is that Iron Foundry is now feature-competitive with its parent project, the wildly popular Cloud Foundry. Iron Foundry now supports a full suite of languages (.NET as well as Ruby, Java, PHP, Python, Node.js), multiple backend databases (SQL Server, Postgres, MySQL, Redis, MongoDB), and queuing support through RabbitMQ. In addition, they’ve turned on the ability to tunnel into backend services (like SQL Server) so you don’t necessarily need to apply the monkey business that I employed a few months back. Tier 3 has also beefed up the hosting environment so that people who try out their hosted version of Iron Foundry can have a stable, reliable experience. A multi-language, private PaaS with nearly all the services that I need to build apps? Yes, please.

    Each of the above releases is interesting in its own way and to me, they have relationships with one another. The Azure services enable a whole new set of integration scenarios, Iron Foundry makes it simple to move web applications between environments, and StreamInsight helps me quickly make sense of the data being generated by my applications. It’s a fun time to be an architect or developer!

  • ETL in the Cloud with Informatica: Part 4 – Sending Salesforce.com Data to Local Database

    The Informatica Cloud is an integration-as-a-service platform for designing and executing Extract-Transform-Load (ETL) tasks. This is the fourth and final post in a blog series that looks at a few realistic usage scenarios for this platform. In this post, I’ll show you how you can send real-time data changes from Salesforce.com to a local SQL Server database.

    As a reminder, in this four-part blog series, I am walking through the following scenarios:

    Scenario Summary

    I originally tried to do this with a SQL Azure database, but the types of errors I was getting led me to believe that Informatica is not yet using a JDBC driver that supports Azure. So be it. Here’s what I built:

    2012.03.26informatica42

    In this solution, I (1) create the ETL task in the web-based designer, (2) set up Salesforce.com Outbound Messaging to send out an event whenever a new Account is added, (3) receive that event on an endpoint hosted in the Informatica Cloud and push the message to the on-premises agent, and (4) update the local database with the new account.

    Outbound Messaging is such a cool feature of Salesforce.com and a way to have a truly event-driven line of business application. Let’s see how it works.

    Building the ETL Package

    To start with, I  decided to reuse the same CrmAccount table that I created for the last post. This table holds some basic details for a given account.

    2012.03.26informatica30

    Next, I went to the Informatica Cloud task designer and created a new Data Synchronization task. I needed to create the task BEFORE I could set up Outbound Messaging in Salesforce.com. On the first page of the wizard, I defined my ETL task and set the operation to Insert.

    2012.03.26informatica43

    On the next wizard page, I reused the Salesforce.com connection that I created in the second post of this blog series. I set the Source Object to Account and saw the simple preview of the accounts currently in Salesforce.com.

    2012.03.26informatica44

    I then set up my target, using the same SQL Server connection that I created in the previous post. I then chose the CrmAccount table and saw that there were no rows in there.

    2012.03.26informatica45

    I didn’t apply any data filters and moved on to the Field Mapping section. Here, I filled each target field with a value from the source object.

    2012.03.26informatica46

    Finally, on the scheduling tab, I chose the “Run this task in real-time upon receiving an outbound message from Salesforce” option. When selected, this option reveals a URL that Salesforce.com can call from its Outbound Messaging activity.

    2012.03.26informatica47

    That’s it! Now, how about we go get Salesforce.com all set up for this solution?

    Setting up Salesforce.com Outbound Messaging

    In my Salesforce.com Setup console, I went to the Workflow Rules section.

    2012.03.26informatica48

    I then created a brand new Workflow Rule and selected the Account object. I named the rule, set it to run when records are created or edited, and gave it a simple evaluation criterion that checks whether the Account Name has a value.

    2012.03.26informatica49

    On the next page of this wizard, I was given the choice of what to do when that workflow condition is met. Notice that besides Outbound Messaging, there are also options for creating tasks and sending email messages.

    2012.03.26informatica50

    After choosing New Outbound Message, I needed to provide a name for this Outbound Message, the endpoint URL provided to me by the Informatica Cloud, and the data fields that my mapping expects. In my case, my mapping used five fields.

    2012.03.26informatica51

    After saving this configuration, I completed the Workflow Rule and activated it.

    Testing the ETL

    With my Informatica Cloud configuration ready, and Salesforce.com Workflow Rule activated, I went and created a brand new Account record.

    2012.03.26informatica52

    After saving the new record, I went and looked in the Outbound Messaging Delivery Status view and it was empty, meaning that it had already completed! Sure enough, I checked my database table and BOOM, there it was.

    2012.03.26informatica53

    That’s impressive!

    Summary

    One of the trickiest aspects of Salesforce.com Outbound Messaging is that you need a public-facing internet endpoint to push to, even if your receiving app is inside your firewall. By using the Informatica Cloud, you get one! This scenario demonstrated a way to do *instant* data transfer from Salesforce.com to a local database. I think that’s pretty killer.

    I hope you found this series useful. A modern enterprise architecture landscape will include traditional components like BizTalk Server and Informatica (or SSIS for that matter), but also start to contain cloud-based integration tools. Informatica Cloud should be high on your list of options for integrating both on-premises and cloud applications, especially if you want to stop installing and maintaining integration software!

  • ETL in the Cloud with Informatica: Part 3 – Sending Dynamics CRM Online Data to Local Database

    In Part 1 and Part 2 of this series, I’ve taken a look at doing Extract-Transform-Load (ETL) operations using the Informatica Cloud. This platform looks like a great choice for bulk movement of data between cloud or on-premises systems. So far we’ve seen how to move data from on-premises to the cloud, and then between clouds. In this post, I’ll show you how you can transfer data from a cloud application (Dynamics CRM Online) to a SQL Server database running onsite.

    As a reminder, in this four-part blog series, I am walking through the following scenarios:

    Scenario Summary

    For this demo, I’ll be building a solution that looks like this:

    2012.03.26informatica29

    For this case, (1) I build the ETL package using the Informatica Cloud’s web-based designer, (2) the Cloud Secure Agent retrieves the ETL details when the task is triggered, (3) the agent pulls the data from Dynamics CRM Online, and (4) the data is loaded into a SQL Server database.

    You can probably think of many scenarios where this situation will apply. For example, good practices for cloud applications often suggest keeping onsite backups of your data. This is one way to do that on a daily schedule. In another case, you may have very complex reporting needs that you cannot accomplish using Dynamics CRM Online’s built-in reporting capability, so a local, transformed replica makes sense.

    Let’s see how to make this happen.

    Setting up the Target Database

    First up, I created a database table in my SQL Server 2008 R2 instance. This table, called CrmAccount, holds a few of the attributes that reside in the Dynamics CRM Online “Account” entity.

    2012.03.26informatica30
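
    If you want to build a similar table yourself, its shape is roughly along these lines (the column names and types here are illustrative; the actual table simply mirrors the handful of Account attributes I chose to map):

    CREATE TABLE dbo.CrmAccount
    (
        AccountId     NVARCHAR(64)  NOT NULL PRIMARY KEY,
        AccountName   NVARCHAR(160) NULL,
        AccountNumber NVARCHAR(20)  NULL,
        Telephone     NVARCHAR(50)  NULL,
        WebSite       NVARCHAR(200) NULL
    );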

    Next, I added a new Login to my Instance and switched my server to accept both Windows Authentication *and* SQL Server authentication. Why? During some trial runs with this, I couldn’t seem to get integrated authentication to work in the Informatica Cloud designer. When I switched to a local DB account, the connection worked fine.

    After this, I confirmed that I had TCP/IP enabled, since the Cloud Secure Agent uses this protocol to connect to my server.

    2012.03.26informatica31

    Building the ETL Package

    With all that set up, now we can build our ETL task in the Informatica Cloud environment. The first step in the Data Synchronization wizard is to provide a name for my task and choose the type of operation (e.g. Insert, Update, Upsert, Delete).

    2012.03.26informatica32

    Next, I chose my Source. In this step, I reused the Dynamics CRM Online connection that I created in the first post of the series. After choosing that connection, I selected the Account entity as my Source Object. A preview of the data was then automatically shown.

    2012.03.26informatica33

    With my source in place, I moved on to define my target. In this case, my target is going to involve a new SQL Server connection. To create this connection, I supplied the name of my server, instance (if applicable), database, credentials (for the SQL Server login account) and port number.

    2012.03.26informatica34

    Once I defined the connection, the drop down list (Target Object) was auto-populated with the tables in my database. I selected CrmAccount and saw a preview of my (empty) table.

    2012.03.26informatica35

    On the next wizard page, I decided to not apply any filters on the Dynamics CRM Online data. So, ALL accounts should be copied over to my database table. I was now ready for the data mapping exercise. The following wizard page let me drag-and-drop fields from the source (Dynamics CRM Online) to the target (SQL Server 2008 R2).

    2012.03.26informatica36

    On the last page of the wizard, I chose NOT to run this task on a schedule. I could have set it to run every five minutes, or once a week; there’s lots of flexibility in this.

    Testing the ETL

    Let’s test this out. In my list of Data Synchronization Tasks, I can see the tasks from the last two posts, and a new task representing what we created above.

    2012.03.26informatica37

    By clicking the green Run Now button, I can kick off this ETL. As an aside, the Informatica Cloud exposes a REST API where among other things, you can make a web request that kicks off a task on demand. That’s a neat feature that can come in handy if you have an ETL that runs infrequently, but a need arises for it to run RIGHT NOW. In this case, I’m going with the Run Now button.

    To compare results, I have 14 account records in my Dynamics CRM Online organization.

    2012.03.26informatica38

    I can see in my Informatica Cloud Activity Log that the ETL task completed and 14 records moved over.

    2012.03.26informatica39

    To be sure, I jumped back to my SQL Server database and checked out my table.

    2012.03.26informatica40

    As I expected, I could see 14 new records in my table. Success!

    Summary

    Sending data from a cloud application to an on-premises database is a realistic use case and hopefully this demo showed how easily it can be accomplished with the Informatica Cloud. The database connection is relatively straightforward and the data mapping tool should satisfy most ETL needs.

    In the next post of this series, I’ll show you how to send data, in real-time, from Salesforce.com to a SQL Server database.

  • ETL in the Cloud with Informatica: Part 2 – Sending Salesforce.com Data to Dynamics CRM Online

    In my last post, we saw how the Informatica Cloud lets you create bulk data load (i.e. ETL) tasks using a web-based designer and uses a lightweight local machine agent to facilitate the data exchange. In this post, I’ll show you how to transfer data from Salesforce.com to Dynamics CRM Online using the Informatica Cloud.

    In this four-part blog series, I will walk through the following scenarios:

    Scenario Summary

    In this post, I’ll build the following solution.

    2012.03.26informatica17

    In this solution, (1) I leverage the web-based designer to craft the ETL between Salesforce.com and Dynamics CRM Online, (2) use a locally installed Cloud Secure Agent to retrieve the ETL details, (3) pull data from Salesforce.com, and finally (4) move that data into Dynamics CRM Online.

    What’s interesting is that even though this is a “cloud only” ETL, the Informatica Cloud solution still requires the use of the Cloud Secure Agent (installed on-premises) to facilitate the actual data transfer.

    To view some of the setup steps (such as signing up for services and installing required software), see the first post in this series.

    Building the ETL Package

    To start with, I logged into the Informatica Cloud and created a new Data Synchronization task.

    2012.03.26informatica18

    On the next wizard page, I created a new Salesforce.com connection and provided all the required credentials.

    2012.03.26informatica19

    With that in place, I could select that connection, the entity (“Contact”) to pull data from, and see a quick preview of that data in my Salesforce.com account.

    2012.03.26informatica20

    On the next wizard page, I configured a connection to my ETL target. I chose an existing Dynamics CRM Online connection, and selected the “Contact” entity.

    2012.03.26informatica21

    Instead of transferring all the data from my Salesforce.com organization to my Dynamics CRM Online organization, I  used the next wizard page to define a data filter. In my case, I’m only going to grab Salesforce.com contacts that have a title of “Architect”.

    2012.03.26informatica22

    For the data mapping exercise, it’s nice that the Informatica tooling automatically links fields through its Automatch capability. In this scenario, I didn’t do any manual mapping and relied solely on Automatch.

    2012.03.26informatica23

    As in my first post, I chose not to schedule this task, but notice that I *have* to select a Cloud Secure Agent. The agent is responsible for executing the ETL task after retrieving its details from the Informatica Cloud.

    2012.03.26informatica24

    This ETL is now complete.

    Testing the ETL

    In my list of Data Synchronization Tasks, I can see my new task. The green Run Now button will trigger the task.

    2012.03.26informatica25

    I have this record in my Salesforce.com application. Notice the “title” of Architect.

    2012.03.26informatica26

    After a few moments, the task ran and I could see in the Informatica Cloud’s Activity Log that it completed successfully.

    2012.03.26informatica27

    To be absolutely sure, I logged into my Dynamics CRM Online account, and sure enough, I now have that one record added.

    2012.03.26informatica28

    Summary

    There are lots of reasons to do ETL between cloud applications. While Salesforce.com and Dynamics CRM Online are competing products, many large organizations are likely to leverage both platforms for different reasons. Maybe you’ll have your sales personnel use Salesforce.com for traditional sales functions, and use Dynamics CRM Online for something like partner management. Either way, it’s great to have the option to easily move data between these environments without having to install and manage enterprise software on site.

    Next up, I’ll show you how to take Dynamics CRM Online data and push it to an on-premises database.

  • ETL in the Cloud with Informatica: Part 1 – Sending File Data to Dynamics CRM Online

    The more software systems that we deploy to cloud environments, the greater the need will be for an efficient integration strategy. Integration through messaging is possible with something like an on-premises integration server, or via a variety of cloud tools such as queues hosted in AWS or the Windows Azure Service Bus Relay. However, what if you want to do some bulk data movement with Extract-Transform-Load (ETL) tools that cater to cloud solutions? One of the leaders in the overall ETL market, Informatica, has also established a strong integration-as-a-service offering with its Informatica Cloud. They recently announced support for Dynamics CRM Online as a source/destination for ETL operations, so I got inspired to give their platform a whirl.

    Informatica Cloud supports a variety of sources/destinations for ETL operations and leverages a machine agent (“Cloud Secure Agent”) for securely connecting on-premises environments to cloud environments. Instead of installing any client development tools, I can design my ETL process entirely through their hosted web application. When the ETL process executes, the Cloud Secure Agent retrieves the ETL details from the cloud and runs the task. There is no need to install or maintain a full server product for hosting and running these tasks. The Informatica Cloud doesn’t actually store any transactional data itself; it simply hands the package to the Cloud Secure Agent, which executes it and moves the data around. All in all, neat stuff.

    In this four-part blog series, I will walk through the following scenarios:

    Scenario Summary

    So what are we building in this post?

    2012.03.26informatica01

    What’s going to happen is that (1) I’ll use the Informatica Cloud to define an ETL that takes a flat file from my local machine and copies the data to Dynamics CRM Online, (2) the Cloud Secure Agent will communicate with the Informatica Cloud to get the ETL details, (3) the Cloud Secure Agent retrieves the flat file from my local machine, and finally (4) the package runs and data is loaded into Dynamics CRM Online.

    Sound good? Let’s jump in.

    Setup

    In this first post of the blog series, I’ll outline a few of the setup steps that I followed to get everything up and running. In subsequent posts, I’ll skip over this. First, I used my existing, free, Salesforce.com Developer account. Next, I signed up for a 30-day free trial of Dynamics CRM Online. After that, I signed up for a 30-day free trial of the Informatica Cloud.

    Finally, I downloaded the Informatica agent to my local machine.

    2012.03.26informatica02

    Once the agent is installed, I can manage it through a simple console.

    2012.03.26informatica03

    Building the ETL Package

    To get started, I logged into my Informatica Cloud account and walked through their Data Synchronization wizard. In the first step, I named my Task and chose to do an Insert operation.

    2012.03.26informatica04

    Next, I chose to create a “flat file” connection type. This requires my Agent to have permissions on my file system, so I set the Agent’s Windows Service to run as a trusted account on my machine.

    2012.03.26informatica05

    With the connection defined, I then chose a comma-delimited format and picked the text file in the “temp” directory I had selected above. I could immediately see a preview that showed how my data was parsed.

    2012.03.26informatica06

    On the next wizard page, I chose to create a new target connection. Here I selected Dynamics CRM Online as my destination system, and filled out the required properties (e.g. user ID, password, CRM organization name).

    2012.03.26informatica07

    Note that the Organization Name above is NOT the Organization Unique Name that is part of the Dynamics CRM Online account and viewable from the Customizations -> Developer Resources page.

    2012.03.26informatica08

    Rather, this is the Organization Name that I set up when I signed up for my free trial. Note that this value is also case-sensitive. Once I set this connection, an automatic preview of the data in that Dynamics CRM entity was shown.

    2012.03.26informatica09

    On the next wizard page, I kept the default options and did NOT add any filters to the source data.

    2012.03.26informatica10

    Now we get to the fun part. The Field Mapping page is where I set which source fields go to which destination fields. The interface supports drag and drop between the two sides.

    2012.03.26informatica11

    Besides straight up one-to-one mapping, you can also leverage Expressions when conditional logic or field manipulation is needed. In the picture below, you can see that I added a concatenation function to combine the FirstName and LastName fields and put them into a FullName field.

    2012.03.26informatica12
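
    The expression itself is a one-liner in Informatica’s expression language, something along these lines (the field names are placeholders for whatever your source columns are called); you can use either the CONCAT function or the || operator if you want a space between the parts:

    FirstName || ' ' || LastName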

    In addition to Expressions, we also have the option of adding Lookups to the mapping. A lookup allows us to pull in one value (e.g. City) based on another (e.g. Zip) that may be in an entirely different source location. The final step of the wizard involves defining a schedule for running this task. I chose to have “no schedule” which means that this task is run manually.

    2012.03.26informatica13

    And that’s it! I now have an Informatica package that can be run whenever I want.

    Testing the ETL

    We’re ready to try this out. The Tasks page shows all my available tasks, and the green Run Now button will kick the ETL off. Remember that my Cloud Secure Agent must be up and running for this to work. After starting the job, I was told that it may take a few minutes to launch and run. Within a couple of minutes, I saw a “success” message in my Activity Log.

    2012.03.26informatica15

    But that doesn’t prove anything! Let’s look inside my Dynamics CRM Online application and locate one of those new records.

    2012.03.26informatica16

    Success! My three records came across, and in the record above, we can see that the first name, last name and phone number were transferred over.

    Summary

    That was pretty straightforward. As you can imagine, these ETLs can get much more complicated as you have related entities and such. However, this web-based ETL designer means that organizations will have a much simpler maintenance profile since they don’t have to host and run these ETLs using on-premises servers.

    Next up, I’ll show you how you can move data between two entirely cloud-based environments: Salesforce.com and Dynamics CRM Online.

  • Microsoft Dynamics CRM Online: By the Numbers

    I’ve enjoyed attending Microsoft’s 2012 Convergence Conference, and one action item for me is to take another look at Dynamics CRM Online. Now, one reason that I spend more time playing with Salesforce.com instead of Dynamics CRM Online is because Salesforce.com has a free tier, and Dynamics CRM Online only has a 30-day trial. They really need to change that. Regardless, I’ve also focused more on Salesforce.com because of their market-leading position and the perceived immaturity of Microsoft’s business solutions cloud. After attending a few different sessions here, I have to revisit that opinion.

    I sat through a really fascinating breakout session about how Microsoft operates its (Dynamics) cloud business. The speaker sprinkled various statistics throughout his presentation, so I gathered them all up and have included them here.

    30,000. Number of engineers at Microsoft doing cloud-related work.

    2,000. Number of people managing Microsoft online services.

    1,000. Number of servers that power Dynamics CRM Online.

    99.9%. Guaranteed uptime per month (44 minutes of downtime allowed). Worst case, there is 5-15 minutes worth of data loss (RPO).

    41. Number of global markets in which CRM Online is available for use.

    40+. Number of different cloud services managed by Microsoft Global Foundation Services (GFS). The GFS site says “200 online services and web portal”, but maybe they use different math.

    30. Number of days that the free trial lasts. Seriously, fix this.

    19. Number of servers in each rack that makes up a “pod.” Each “scale group” (which contains all the items needed for a CRM instance) is striped across server racks, and multiple scale groups are collected into pods. While CRM app/web servers may be multi-tenant, each customer’s database is uniquely provisioned and not shared.

    8. Number of months it took the CRM Online team to devise and deliver a site failover solution that requires a single command. Impressive. They make heavy use of SQL Server 2012 “always on” capabilities for their high availability and disaster recovery strategy.

    5. Copies of data that exist for a given customer. You have (1) your primary organization database, (2) a synchronous snapshot database (which is updated at the same time as the primary), (3) and (4) two asynchronous copies made in the alternate data center (for a given region), and finally, (5) a daily backup to an offsite location. Whew!

    6. Number of data centers that have CRM Online available (California, Virginia, Dublin, Amsterdam, Hong Kong and Singapore).

    0. Amount of downtime necessary to perform all the upgrades in the environment. These include daily RFCs, 0-3 out-of-band releases per month, monthly security patches, bi-monthly update rollups, password changes every 70 days, and twice-yearly service updates. It sounds pretty darn complicated to handle both backwards and forwards compatibility while keeping customers online during upgrades, but they seem to pull it off.

    Overall? That’s pretty hearty stuff. Recent releases are starting to bring CRM Online within shouting distance of its competitors and for some scenarios, it may even be a better choice than Salesforce.com. Either way, I have a newfound understanding about the robustness of the platform and will look to incorporate CRM Online into a few more of my upcoming demos.