Author: Richard Seroter

  • ETL in the Cloud with Informatica: Part 4 – Sending Salesforce.com Data to Local Database

    The Informatica Cloud is an integration-as-a-service platform for designing and executing Extract-Transform-Load (ETL) tasks. This is the fourth and final post in a blog series that looked at a few realistic usage scenarios for this platform. In this post, I’ll show you how you can send real-time data changes from Salesforce.com to a local SQL Server database.

    As a reminder, in this four-part blog series, I am walking through the following scenarios:

    Scenario Summary

    I originally tried to do this with a SQL Azure database, but the types of errors I was getting led me to believe that Informatica is not yet using a JDBC driver that supports Azure. So be it. Here’s what I built:

    2012.03.26informatica42

    In this solution, I (1) create the ETL task in the web-based designer, (2) set up Salesforce.com Outbound Messaging to send out an event whenever a new Account is added, (3) receive that event on an endpoint hosted in the Informatica Cloud and push the message to the on-premises agent, and (4) update the local database with the new account.

    Outbound Messaging is such a cool feature of Salesforce.com and a way to have a truly event-driven line of business application. Let’s see how it works.

    Building the ETL Package

    To start with, I  decided to reuse the same CrmAccount table that I created for the last post. This table holds some basic details for a given account.

    2012.03.26informatica30

    Next, I went to the Informatica Cloud task designer and created a new Data Synchronization task. I first need to create the task BEFORE I can set up Outbound Messaging in Salesforce.com. On the first page of the wizard, I defined my ETL and set the operation to Insert.

    2012.03.26informatica43

    On the next wizard page, I reused the Salesforce.com connection that I created in the second post of this blog series. I set the Source Object to Account and saw the simple preview of the accounts currently in Salesforce.com.

    2012.03.26informatica44

    I then set up my target, using the same SQL Server connection that I created in the previous post. I then chose the CrmAccount table and saw that there were no rows in there.

    2012.03.26informatica45

    I didn’t choose any filter of data and moved on to the Field Mapping section. Here, I filled each target field with a value from the source object.

    2012.03.26informatica46

    Finally, on the scheduling tab, I chose the “Run this task in real-time upon receiving an outbound message from Salesforce” option. When selected, this option reveals a URL that Salesforce.com can call from its Outbound Messaging activity.

    2012.03.26informatica47

    That’s it! Now, how about we go get Salesforce.com all set up for this solution?

    Setting up Salesforce.com Outbound Messaging

    In my Salesforce.com Setup console, I went to the Workflow Rules section.

    2012.03.26informatica48

    I then created a brand new Workflow Rule and selected the Account object. I then named the rule, set it to run when records are created or edited and gave it a simple evaluation rule that checks to see if the Account Name has a value.

    2012.03.26informatica49

    On the next page of this wizard, I was given the choice of what to do when that workflow condition is met. Notice that besides Outbound Messaging, there are also options for creating tasks and sending email messages.

    2012.03.26informatica50

    After choosing New Outbound Message, I needed to provide a name for this Outbound Message, the endpoint URL provided to me by the Informatica Cloud, and the data fields that my mapping will expect. In my case, there were five fields that were used in my mapping.

    2012.03.26informatica51

    After saving this configuration, I completed the Workflow Rule and activated it.

    Testing the ETL

    With my Informatica Cloud configuration ready, and Salesforce.com Workflow Rule activated, I went and created a brand new Account record.

    2012.03.26informatica52

    After saving the new record, I went and looked in the Outbound Messaging Delivery Status view and it was empty, meaning that it had already completed! Sure enough, I checked my database table and BOOM, there it was.

    2012.03.26informatica53

    That’s impressive!

    Summary

    One of the trickiest aspects of Salesforce.com Outbound Messaging is that you need a public-facing internet endpoint to push to, even if your receiving app is inside your firewall. By using the Informatica Cloud, you get one! This scenario demonstrated a way to do *instant* data transfer from Salesforce.com to a local database. I think that’s pretty killer.

    I hope you found this series useful. A modern enterprise architecture landscape will include traditional components like BizTalk Server and Informatica (or SSIS for that matter), but will also start to contain cloud-based integration tools. Informatica Cloud should be high on your list of options for integrating both on-premises and cloud applications, especially if you want to stop installing and maintaining integration software!

  • ETL in the Cloud with Informatica: Part 3 – Sending Dynamics CRM Online Data to Local Database

    In Part 1 and Part 2 of this series, I’ve taken a look at doing Extract-Transform-Load (ETL) operations using the Informatica Cloud. This platform looks like a great choice for bulk movement of data between cloud or on-premises systems. So far we’ve seen how to move data from on-premises to the cloud, and then between clouds. In this post, I’ll show you how you can transfer data from a cloud application (Dynamics CRM Online) to a SQL Server database running onsite.

    As a reminder, in this four-part blog series, I am walking through the following scenarios:

    Scenario Summary

    For this demo, I’ll be building a solution that looks like this:

    2012.03.26informatica29

    For this case, (1) I build the ETL package using the Informatica Cloud’s web-based designer, (2) the Cloud Secure Agent retrieves the ETL details when the task is triggered, (3) the data is retrieved from Dynamics CRM Online, and (4) the data is loaded into a SQL Server database.

    You can probably think of many scenarios where this situation will apply. For example, good practices for cloud applications often state that you keep onsite backups of your data. This is one way to do that on a daily schedule. In another case, you may have very complex reporting needs and cannot accomplish them using Dynamics CRM Online’s built-in reporting capability, so a local, transformed replica makes sense.

    Let’s see how to make this happen.

    Setting up the Target Database

    First up, I created a database table in my SQL Server 2008 R2 instance. This table, called CrmAccount, holds a few of the attributes that reside in the Dynamics CRM Online “Account” entity.

    2012.03.26informatica30
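
    The post doesn’t include the table definition itself, but a table along these lines would do the job. Here’s a sketch of the DDL, executed through plain ADO.NET since the rest of this series is .NET-centric; the column names are my own guesses rather than the actual schema:

    using System.Data.SqlClient;

    class CreateCrmAccountTable
    {
        static void Main()
        {
            //hypothetical column list -- the post only says the table holds a few Account attributes
            const string ddl = @"
                CREATE TABLE dbo.CrmAccount (
                    AccountId     NVARCHAR(64)  NOT NULL PRIMARY KEY,
                    AccountName   NVARCHAR(160) NULL,
                    AccountNumber NVARCHAR(40)  NULL,
                    Phone         NVARCHAR(40)  NULL,
                    City          NVARCHAR(80)  NULL
                );";

            using (var conn = new SqlConnection(@"Data Source=.\SQLEXPRESS;Initial Catalog=CrmDemo;Integrated Security=True"))
            using (var cmd = new SqlCommand(ddl, conn))
            {
                //create the target table that the Informatica task will insert into
                conn.Open();
                cmd.ExecuteNonQuery();
            }
        }
    }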

    Next, I added a new Login to my Instance and switched my server to accept both Windows Authentication *and* SQL Server authentication. Why? During some trial runs with this, I couldn’t seem to get integrated authentication to work in the Informatica Cloud designer. When I switched to a local DB account, the connection worked fine.

    After this, I confirmed that I had TCP/IP enabled, since the Cloud Secure Agent connects to my server over that protocol.

    2012.03.26informatica31
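
    To sanity-check that combination (SQL Server authentication plus TCP/IP on the default port), a quick console snippet like the one below is handy. The server name, database, and login are placeholders; the point is that the connection string uses a SQL login and an explicit TCP endpoint, which mirrors how the Secure Agent reaches the instance:

    using System;
    using System.Data.SqlClient;

    class ConnectionSmokeTest
    {
        static void Main()
        {
            //placeholder server, database, and login -- substitute your own values
            //"tcp:" plus an explicit port matches how the Secure Agent connects, and the
            //User ID/Password pair is the SQL Server login created above
            var connectionString =
                "Data Source=tcp:MYSERVER,1433;" +
                "Initial Catalog=CrmDemo;" +
                "User ID=informatica_user;" +
                "Password=<password>;";

            using (var conn = new SqlConnection(connectionString))
            {
                conn.Open();
                Console.WriteLine("Connected to " + conn.DataSource + ", database " + conn.Database);
            }
        }
    }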

    Building the ETL Package

    With all that set up, now we can build our ETL task in the Informatica Cloud environment. The first step in the Data Synchronization wizard is to provide a name for my task and choose the type of operation (e.g. Insert, Update, Upsert, Delete).

    2012.03.26informatica32

    Next, I chose my Source. In this step, I reused the Dynamics CRM Online connection that I created in the first post of the series. After choosing that connection, I selected the Account entity as my Source Object. A preview of the data was then automatically shown.

    2012.03.26informatica33

    With my source in place, I moved on to define my target. In this case, my target is going to involve a new SQL Server connection. To create this connection, I supplied the name of my server, instance (if applicable), database, credentials (for the SQL Server login account) and port number.

    2012.03.26informatica34

    Once I defined the connection, the drop down list (Target Object) was auto-populated with the tables in my database. I selected CrmAccount and saw a preview of my (empty) table.

    2012.03.26informatica35

    On the next wizard page, I decided to not apply any filters on the Dynamics CRM Online data. So, ALL accounts should be copied over to my database table. I was now ready for the data mapping exercise. The following wizard page let me drag-and-drop fields from the source (Dynamics CRM Online) to the target (SQL Server 2008 R2).

    2012.03.26informatica36

    On the last page of the wizard, I chose to NOT run this task on a schedule. I could set this to run every five minutes, or once a week. There’s lots of flexibility in this.

    Testing the ETL

    Let’s test this out. In my list of Data Synchronization Tasks, I can see the tasks from the last two posts, and a new task representing what we created above.

    2012.03.26informatica37

    By clicking the green Run Now button, I can kick off this ETL. As an aside, the Informatica Cloud exposes a REST API where, among other things, you can make a web request that kicks off a task on demand. That’s a neat feature that can come in handy if you have an ETL that runs infrequently, but a need arises for it to run RIGHT NOW. In this case, I’m going with the Run Now button.
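
    I didn’t dig into the REST API in this post, but the idea is simply an authenticated HTTP call that starts a named task. The sketch below shows the shape of such a call from .NET; the host, resource path, and request body are placeholders rather than the documented Informatica Cloud endpoints, so check the API documentation for the real details:

    using System;
    using System.Net;
    using System.Text;

    class RunTaskOnDemand
    {
        static void Main()
        {
            //placeholder host, path, and body -- see the Informatica Cloud REST API docs
            //for the real endpoint, login handshake, and request format
            var request = (HttpWebRequest)WebRequest.Create(
                "https://<informatica-cloud-host>/api/<run-task-resource>");
            request.Method = "POST";
            request.ContentType = "application/json";

            byte[] body = Encoding.UTF8.GetBytes("{ \"taskName\": \"CrmAccountsToSqlServer\" }");
            request.ContentLength = body.Length;
            using (var stream = request.GetRequestStream())
            {
                stream.Write(body, 0, body.Length);
            }

            using (var response = (HttpWebResponse)request.GetResponse())
            {
                Console.WriteLine("Task start request returned " + response.StatusCode);
            }
        }
    }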

    To compare results, I have 14 account records in my Dynamics CRM Online organization.

    2012.03.26informatica38

    I can see in my Informatica Cloud Activity Log that the ETL task completed and 14 records moved over.

    2012.03.26informatica39

    To be sure, I jumped back to my SQL Server database and checked out my table.

    2012.03.26informatica40

    As I expected, I could see 14 new records in my table. Success!

    Summary

    Sending data from a cloud application to an on-premises database is a realistic use case and hopefully this demo showed how easily it can be accomplished with the Informatica Cloud. The database connection is relatively straightforward and the data mapping tool should satisfy most ETL needs.

    In the next post of this series, I’ll show you how to send data, in real-time, from Salesforce.com to a SQL Server database.

  • ETL in the Cloud with Informatica: Part 2 – Sending Salesforce.com Data to Dynamics CRM Online

    In my last post, we saw how the Informatica Cloud lets you create bulk data load (i.e. ETL) tasks using a web-based designer and uses a lightweight local machine agent to facilitate the data exchange. In this post, I’ll show you how to transfer data from Salesforce.com to Dynamics CRM Online using the Informatica Cloud.

    In this four-part blog series, I will walk through the following scenarios:

    Scenario Summary

    In this post, I’ll build the following solution.

    2012.03.26informatica17

    In this solution, (1) I leverage the web-based designer to craft the ETL between Salesforce.com and Dynamics CRM Online, (2) use a locally installed Secure Cloud Agent to retrieve ETL details, (3) pull data from Salesforce.com, and finally (4) move that data into Dynamics CRM Online.

    What’s interesting is that even though this is a “cloud only” ETL, the Informatica Cloud solution still requires the use of the Cloud Secure Agent (installed on-premises) to facilitate the actual data transfer.

    To view some of the setup steps (such as signing up for services and installing required software), see the first post in this series.

    Building the ETL Package

    To start with, I logged into the Informatica Cloud and created a new Data Synchronization task.

    2012.03.26informatica18

    On the next wizard page, I created a new connection type for Salesforce.com and provided all the required credentials.

    2012.03.26informatica19

    With that in place, I could select that connection, the entity (“Contact”) to pull data from, and see a quick preview of that data in my Salesforce.com account.

    2012.03.26informatica20

    On the next wizard page, I configured a connection to my ETL target. I chose an existing Dynamics CRM Online connection, and selected the “Contact” entity.

    2012.03.26informatica21

    Instead of transferring all the data from my Salesforce.com organization to my Dynamics CRM Online organization, I  used the next wizard page to define a data filter. In my case, I’m only going to grab Salesforce.com contacts that have a title of “Architect”.

    2012.03.26informatica22

    For the data mapping exercise, it’s nice that the Informatica tooling automatically links fields through its Automatch capability. In this scenario, I didn’t do any manual mapping and relied solely on Automatch.

    2012.03.26informatica23

    As in my first post, I chose not to schedule this task, but notice that here I *have* to select a Secure Cloud Agent. The agent is responsible for executing the ETL task after retrieving the details of the task from the Informatica Cloud.

    2012.03.26informatica24

    This ETL is now complete.

    Testing the ETL

    In my Data Synchronization Tasks list, I can see my new task. The green Run Now button will trigger the task.

    2012.03.26informatica25

    I have this record in my Salesforce.com application. Notice the “title” of Architect.

    2012.03.26informatica26

    After a few moments, the task ran and I could see in the Informatica Cloud’s Activity Log that it completed successfully.

    2012.03.26informatica27

    To be absolutely sure, I logged into my Dynamics CRM Online account, and sure enough, I now have that one record added.

    2012.03.26informatica28

    Summary

    There are lots of reasons to do ETL between cloud applications. While Salesforce.com and Dynamics CRM Online are competing products, many large organizations are likely going to leverage both platforms for different reasons. Maybe you’ll have your sales personnel use Salesforce.com for traditional sales functions, and use Dynamics CRM Online for something like partner management. Either way, it’s great to have the option to easily move data between these environments without having to install and manage enterprise software on site.

    Next up, I’ll show you how to take Dynamics CRM Online data and push it to an on-premises database.

  • ETL in the Cloud with Informatica: Part 1 – Sending File Data to Dynamics CRM Online

    The more software systems that we deploy to cloud environments, the greater the need will be to have an efficient integration strategy. Integration through messaging is possible through something like an on-premises integration server, or via a variety of cloud tools such as queues hosted in AWS or something like the Windows Azure Service Bus Relay. However, what if you want to do some bulk data movement with Extract-Transform-Load (ETL) tools that cater to cloud solutions? One of the market leaders in the overall ETL market, Informatica, has also established a strong integration-as-a-service offering with its Informatica Cloud. They recently announced support for Dynamics CRM Online as a source/destination for ETL operations, so I got inspired to give their platform a whirl.

    Informatica Cloud supports a variety of sources/destinations for ETL operations and leverages a machine agent (“Cloud Secure Agent”) for securely connecting on-premises environments to cloud environments. Instead of installing any client development tools, I can design my ETL process entirely through their hosted web application. When the ETL process executes, the Cloud Secure Agent retrieves the ETL details from the cloud and runs the task. There is  no need to install or maintain a full server product for hosting and running these tasks. The Informatica Cloud doesn’t actually store any transactional data itself, and acts solely as a passthrough that executes the package (through the Cloud Secure Agent) and moves data around. All in all, neat stuff.

    In this four-part blog series, I will walk through the following scenarios:

    Scenario Summary

    So what are we building in this post?

    2012.03.26informatica01

    What’s going to happen is that (1) I’ll use the Informatica Cloud to define an ETL that takes a flat file from my local machine and copies the data to Dynamics CRM Online, (2) the Secure Cloud Agent will communicate with the Informatica Cloud to get the ETL details, (3) the Secure Cloud Agent retrieves the flat file from my local machine, and finally (4) the package runs and data is loaded into Dynamics CRM Online.

    Sound good? Let’s jump in.

    Setup

    In this first post of the blog series, I’ll outline a few of the setup steps that I followed to get everything up and running. In subsequent posts, I’ll skip over this. First, I used my existing, free, Salesforce.com Developer account. Next, I signed up for a 30-day free trial of Dynamics CRM Online. After that, I signed up for a 30-day free trial of the Informatica Cloud.

    Finally, I downloaded the Informatica agent to my local machine.

    2012.03.26informatica02

    Once the agent is installed, I can manage it through a simple console.

    2012.03.26informatica03

    Building the ETL Package

    To get started, I logged into my Informatica Cloud account and walked through their Data Synchronization wizard. In the first step, I named my Task and chose to do an Insert operation.

    2012.03.26informatica04

    Next, I chose to create a “flat file” connection type. This requires my Agent to have permissions on my file system, so I set the Agent’s Windows Service to run as a trusted account on my machine.

    2012.03.26informatica05

    With the connection defined, I chose to use a comma-delimited formatter and selected the text file in the “temp” directory I had specified above. I could immediately see a preview that showed how my data was parsed.

    2012.03.26informatica06

    On the next wizard page, I chose to create a new target connection. Here I selected Dynamics CRM Online as my destination system, and filled out the required properties (e.g. user ID, password, CRM organization name).

    2012.03.26informatica07

    Note that the Organization Name above is NOT the Organization Unique Name that is part of the Dynamics CRM Online account and viewable from the Customizations -> Developer Resources page.

    2012.03.26informatica08

    Rather, this is the Organization Name that I set up when I signed up for my free trial. Note that this value is also case sensitive. Once I set this connection, an automatic preview of the data in that Dynamics CRM entity was shown.

    2012.03.26informatica09

    On the next wizard page, I kept the default options and did NOT add any filters to the source data.

    2012.03.26informatica10

    Now we get to the fun part. The Field Mapping page is where I set which source fields go to which destination fields. The interface supports drag and drop between the two sides.

    2012.03.26informatica11

    Besides straight up one-to-one mapping, you can also leverage Expressions when conditional logic or field manipulation is needed. In the picture below, you can see that I added a concatenation function to combine the FirstName and LastName fields and put them into a FullName field.

    2012.03.26informatica12

    In addition to Expressions, we also have the option of adding Lookups to the mapping. A lookup allows us to pull in one value (e.g. City) based on another (e.g. Zip) that may be in an entirely different source location. The final step of the wizard involves defining a schedule for running this task. I chose to have “no schedule” which means that this task is run manually.

    2012.03.26informatica13

    And that’s it! I now have an Informatica package that can be run whenever I want.

    Testing the ETL

    We’re ready to try this out. The Tasks page shows all my available tasks, and the green Run Now button will kick the ETL off. Remember that my Cloud Secure Agent must be up and running for this to work. After starting up the job, I was told that it may take a few minutes to launch and run. Within a couple of minutes, I saw a “success” message in my Activity Log.

    2012.03.26informatica15

    But that doesn’t prove anything! Let’s look inside my Dynamics CRM Online application and locate one of those new records.

    2012.03.26informatica16

    Success! My three records came across, and in the record above, we can see that the first name, last name and phone number were transferred over.

    Summary

    That was pretty straightforward. As you can imagine, these ETLs can get much more complicated as you have related entities and such. However, this web-based ETL designer means that organizations will have a much simpler maintenance profile since they don’t have to host and run these ETLs using on-premises servers.

    Next up, I’ll show you how you can move data between two entirely cloud-based environments: Salesforce.com and Dynamics CRM Online.

  • Microsoft Dynamics CRM Online: By the Numbers

    I’ve enjoyed attending Microsoft’s 2012 Convergence Conference, and one action item for me is to take another look at Dynamics CRM Online. Now, one reason that I spend more time playing with Salesforce.com instead of Dynamics CRM Online is that Salesforce.com has a free tier, and Dynamics CRM Online only has a 30-day trial. They really need to change that. Regardless, I’ve also focused more on Salesforce.com because of their market-leading position and the perceived immaturity of Microsoft’s business solutions cloud. After attending a few different sessions here, I have to revisit that opinion.

    I sat through a really fascinating breakout session about how Microsoft operates its (Dynamics) cloud business. The speaker sprinkled various statistics throughout his presentation, so I gathered them all up and have included them here.

    30,000. Number of engineers at Microsoft doing cloud-related work.

    2,000. Number of people managing Microsoft online services.

    1,000. Number of servers that power Dynamics CRM Online.

    99.9%. Guaranteed uptime per month (44 minutes of downtime allowed). Worst case, there is 5-15 minutes worth of data loss (RPO).

    41. Number of global markets in which CRM Online is available for use.

    40+. Number of different cloud services managed by Microsoft Global Foundation Services (GFS). The GFS site says “200 online services and web portal”, but maybe they use different math.

    30. Number of days that the free trial lasts. Seriously, fix this.

    19. Number of servers in each rack that makes up a “pod.” Each “scale group” (which contains all the items needed for a CRM instance) is striped across server racks, and multiple scale groups are collected into pods. While CRM app/web servers may be multi-tenant, each customer’s database is uniquely provisioned and not shared.

    8. Number of months it took the CRM Online team to devise and deliver a site failover solution that requires a single command. Impressive. They make heavy use of SQL Server 2012 “always on” capabilities for their high availability and disaster recovery strategy.

    5. Copies of data that exist for a given customer. You have (1) your primary organization database, (2) a synchronous snapshot database (which is updated at the same time as the primary), (3)(4) asynchronous copies made in the alternate data center (for a given region), and finally, (5) a daily backup to an offsite location. Whew!

    6. Number of data centers that have CRM Online available (California, Virginia, Dublin, Amsterdam, Hong Kong and Singapore).

    0. Amount of downtime necessary to perform all the upgrades in the environment. These include daily RFCs, 0-3 out-of-band releases per month, monthly security patches, bi-monthly update rollups, password changes every 70 days, and twice-yearly service updates. It sounds pretty darn complicated to handle both backwards and forwards compatibility while keeping customers online during upgrades, but it sounds like they pull it off.

    Overall? That’s pretty hearty stuff. Recent releases are starting to bring CRM Online within shouting distance of its competitors and for some scenarios, it may even be a better choice than Salesforce.com. Either way, I have a newfound understanding about the robustness of the platform and will look to incorporate CRM Online into a few more of my upcoming demos.

  • I’m at the Microsoft Convergence conference this week

    From Monday through Wednesday of this week, I’ll be at Microsoft’s Convergence conference in Houston, Texas. This is Microsoft’s annual conference for the Dynamics product line, and this year I’ll be attending as a speaker.

    I’m co-delivering a session entitled Managing Complex Implementations of Microsoft Dynamics CRM. I now have a bit of experience with this because of my day job, so it should be fun to share some of the learnings. We’re going to cover all the things that make a CRM project (or any complex project, for that matter) complex, including “introducing new technology”, “multi-source data migration”, “industry regulations” and more. We’ll then cover some lessons learned from project scoping/planning/estimation exercises and conclude by looking at the ideal team makeup for complex projects.

    All in all, should be a good time. If you happen to be attending this year, stop on by!

  • Doing a Multi-Cloud Deployment of an ASP.NET Web Application

    The recent Azure outage once again highlighted the value in being able to run an application in multiple clouds so that a failure in one place doesn’t completely cripple you. While you may not run an application in multiple clouds simultaneously, it can be helpful to have a standby ready to go. That standby could already be deployed to a backup environment, or could be rapidly deployed from a build server out to a cloud environment.

    https://twitter.com/#!/jamesurquhart/status/174919593788309504

    So, I thought I’d take a quick look at how to take the same ASP.NET web application and deploy it to three different .NET-friendly public clouds: Amazon Web Services (AWS), Iron Foundry, and Windows Azure. Just for fun, I’m keeping my database (AWS SimpleDB) separate from the primary hosting environment (Windows Azure) so that my database could be available if my primary, or backup (Iron Foundry) environments were down.

    My application is very simple: it’s a Web Form that pulls data from AWS SimpleDB and displays the results in a grid. Ideally, this works as-is in any of the below three cloud environments. Let’s find out.
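
    For reference, the code-behind for that page is roughly the following. This is a sketch using the AWS SDK for .NET of that era; the domain name, credentials, and the ResultsGrid control are stand-ins for whatever your page actually uses:

    using System;
    using System.Data;
    using Amazon;
    using Amazon.SimpleDB;
    using Amazon.SimpleDB.Model;

    public partial class _Default : System.Web.UI.Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {
            //placeholder credentials and SimpleDB domain name
            AmazonSimpleDB client = AWSClientFactory.CreateAmazonSimpleDBClient("<accessKey>", "<secretKey>");

            SelectRequest request = new SelectRequest { SelectExpression = "select * from MyDemoDomain" };
            SelectResponse response = client.Select(request);

            //flatten the SimpleDB items into a DataTable the GridView can bind to
            DataTable table = new DataTable();
            table.Columns.Add("ItemName");
            table.Columns.Add("Attributes");

            foreach (Item item in response.SelectResult.Item)
            {
                string attributes = string.Empty;
                foreach (Amazon.SimpleDB.Model.Attribute attr in item.Attribute)
                {
                    attributes += attr.Name + "=" + attr.Value + "; ";
                }
                table.Rows.Add(item.Name, attributes);
            }

            //ResultsGrid is a stand-in for the GridView control on the page
            ResultsGrid.DataSource = table;
            ResultsGrid.DataBind();
        }
    }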

    Deploying the Application to Windows Azure

    Windows Azure is a reasonable destination for many .NET web applications that can run offsite. So, let’s see what it takes to push an existing web application into the Windows Azure application fabric.

    First, after confirming that I had installed the Azure SDK 1.6, I right-clicked my ASP.NET web application and added a new Azure Deployment project.

    2012.03.05cloud01

    After choosing this command, I ended up with a new project in this Visual Studio solution.

    2012.03.05cloud02

    While I can view configuration properties (how many web roles to provision, etc.), I jumped right into Publishing without changing any settings. There was a setting to add an Azure storage account (vs. using local storage), but I didn’t think I had a need for Azure storage.

    The first step in the Publishing process required me to supply authentication in the form of a certificate. I created a new certificate, uploaded it to the Windows Azure portal, took my Azure account’s subscription identifier, and gave this set of credentials a friendly name.

    2012.03.05cloud03

    I didn’t have any “hosted services” in this account, so I was prompted to create one.

    2012.03.05cloud04

    With a host created, I then left the other settings as they were, with the hope of deploying this app to production.

    2012.03.05cloud05

    After publishing, Visual Studio 2010 showed me the status of the deployment that took about 6-7 minutes.

    2012.03.05cloud06

    An Azure hosted service and single instance were provisioned. A storage account was also added automatically.

    2012.03.05cloud07

    I hit an error, so I updated my configuration file to display error details, and that update (which replaces the original deployment) took another 5 minutes. The error was that the app couldn’t load the AWS SDK component that was referenced. So, I switched the AWS SDK dll to “copy local” in the ASP.NET application project and once again redeployed my application. This time it worked fine, and I was able to see my SimpleDB data from my Azure-hosted ASP.NET website.

    2012.03.05cloud08

    Not too bad. Definitely a bit of upfront work to do, but subsequent projects can reuse the authentication-related activities that I completed earlier. The sluggish deployment times really stunt momentum, but realistically, you can do some decent testing locally so that what gets deployed is pretty solid.

    Deploying the Application to Iron Foundry

    Tier3’s Iron Foundry is the .NET-flavored version of VMware’s popular Cloud Foundry platform. Given that you can use Iron Foundry in your own data center, or in the cloud, it’s something that developers should keep a close eye on. I decided to use the Cloud Foundry Explorer that sits within Visual Studio 2010. You can download it from the Iron Foundry site. With that installed, I can right-click my ASP.NET application and choose to Push Cloud Foundry Application.

    2012.03.05cloud09

    Next, if I hadn’t previously configured access to the Iron Foundry cloud, I’d need to create a connection with the target API and my valid credentials. With the connection in place, I set the name of my cloud application and clicked Push.

    2012.03.05cloud18

    In under 60 seconds, my application was deployed and ready to look at.

    2012.03.05cloud19

    What if a change to the application is needed? I updated the HTML, right-clicked my project and chose to Update Cloud Foundry Application. Once again, in a few seconds, my application was updated and I could see the changes. Taking an existing ASP.NET application and moving it to Iron Foundry doesn’t require any modifications to the application itself.

    If you’re looking for a multi-language, on- or off-premises PaaS that is easy to work with, then I strongly encourage you to try Iron Foundry out.

    Deploying the Application to AWS via CloudFormation

    While AWS does not have a PaaS per se, they do make it easy to deploy apps in a PaaS-like way via CloudFormation. With CloudFormation, I can deploy a set of related resources and manage them as one deployment unit.

    From within Visual Studio 2010, I right-clicked my ASP.NET web application and chose Publish to AWS CloudFormation.

    2012.03.05cloud11

    When the wizard launches, I was asked to choose one of two deployment templates (single instance or multiple, load balanced instances).

    2012.03.05cloud12

    After selecting the single instance template, I kept the default values in the next wizard page. These settings include the size of the host machine, security group and name of this stack.

    2012.03.05cloud13

    On the next wizard pages, I kept the default settings (e.g. .NET version) and chose to deploy my application. Immediately, I saw a window in Visual Studio that showed the progress of my deployment.

    2012.03.05cloud14

    In about 7 minutes, I had a finished deployment and a URL to my application was provided. Sure enough, upon clicking that link, I was sent to my web application running successfully in AWS.

    2012.03.05cloud15

    Just to compare to previous scenarios, I went ahead and made a small change to the HTML of the web application and once again chose Publish to AWS CloudFormation from the right-click menu.

    2012.03.05cloud16

    As you can see, it saw my previous template, and as I walked through the wizard, it retrieved any existing settings and allowed me to make any changes where possible. When I clicked Deploy again, I saw that my package was being uploaded, and in less than a minute, I saw the changes in my hosted web application.

    2012.03.05cloud17

    So while I’m still leveraging the AWS infrastructure-as-a-service environment, the use of CloudFormation makes this seem a bit more like an application fabric. The deployments were very straightforward and smooth, arguably the smoothest of all three options shown in this post.

    Summary

    I was able to fairly easily take the same ASP.NET website and, from Visual Studio 2010, deploy it to three distinct clouds. Each cloud has its own steps and processes, but each is fairly straightforward. Because Iron Foundry doesn’t require new VMs to be spun up, it’s consistently the fastest deployment scenario. That can make a big difference during development and prototyping and should be something you factor into your cloud platform selection. Windows Azure has a nice set of additional services (like queuing, storage, integration), and Amazon gives you some best-of-breed hosting and monitoring. Tier 3’s Iron Foundry lets you use one of the most popular open source, multi-environment PaaS platforms for .NET apps. There are factors that would lead you to each of these clouds.

    This is hopefully a good bit of information to know when panic sets in over the downtime of a particular cloud. However, as you build your application with more and more services that are specific to a given environment, this multi-cloud strategy becomes less straightforward. For instance, if an ASP.NET application leverages SQL Azure for database storage, then you are still in pretty good shape when an application has to move to other environments. ASP.NET talks to SQL Server using the same ports and API, regardless of whether it’s using SQL Azure or a SQL instance deployed on an Amazon instance. But, if I’m using Azure Queues (or Amazon SQS for that matter), then it’s more difficult to instantly replace that component in another cloud environment.
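
    To make that point concrete, here’s the kind of thing I mean: the ADO.NET code stays identical and only the connection string changes between a SQL Azure database and a SQL Server instance running on an Amazon VM. The server names, credentials, and the Orders table below are made up for illustration:

    using System;
    using System.Data.SqlClient;

    class ConnectionStringSwap
    {
        static void Main()
        {
            //the same ADO.NET code works against either location; only the connection string differs
            string sqlAzure =
                "Server=tcp:mydemoserver.database.windows.net,1433;Database=AppDb;" +
                "User ID=appuser@mydemoserver;Password=<password>;Encrypt=True;";

            string sqlOnAmazonVm =
                "Server=tcp:ec2-184-73-10-99.compute-1.amazonaws.com,1433;Database=AppDb;" +
                "User ID=appuser;Password=<password>;";

            foreach (string cs in new[] { sqlAzure, sqlOnAmazonVm })
            {
                using (var conn = new SqlConnection(cs))
                using (var cmd = new SqlCommand("SELECT COUNT(*) FROM dbo.Orders", conn))
                {
                    conn.Open();
                    Console.WriteLine(cmd.ExecuteScalar());
                }
            }
        }
    }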

    Keep all these portability concerns in mind when building your cloud-friendly applications!

  • Using SignalR To Push StreamInsight Events to Client Browsers

    I’ve spent some time recently working with the asynchronous web event messaging engine SignalR. This framework uses JavaScript (with jQuery) on the client and ASP.NET on the server to enable very interactive communication patterns. The coolest part is being able to have the server-side application call a JavaScript function on each connected browser client. While many SignalR demos you see have focused on scenarios like chat applications, I was thinking  of how to use SignalR to notify business users of interesting events within an enterprise. Wouldn’t it be fascinating if business events (e.g. “Project X requirements document updated”, “Big customer added in US West region”, “Production Mail Server offline”, “FAQ web page visits up 78% today”) were published from source applications and pushed to a live dashboard-type web application available to users? If I want to process these fast moving events and perform rich aggregations over windows of events, then Microsoft StreamInsight is a great addition to a SignalR-based solution. In this blog post, I’m going to walk through a demonstration of using SignalR to push business events through StreamInsight and into a Tweetdeck-like browser client.

    Solution Overview

    So what are we building? To make sure that we keep an eye on the whole picture while building the individual components, I’ve summarized the solution here.

    2012.03.01signalr05

    Basically, the browser client will first (through jQuery) call a server operation that adds that client to a message group (e.g. “security events”). Events are then sent from source applications to StreamInsight where they are processed. StreamInsight then calls a WCF service that is part of the ASP.NET web application. Finally, the WCF Service uses the SignalR framework to invoke the “addEventMsg()” function on each connected browser client. Sound like fun? Good. Let’s jump in.

    Setting up the SignalR application

    I started out by creating a new ASP.NET web application. I then used the NuGet extension to locate the SignalR libraries that I wanted to use.

    2012.03.01signalr01

    Once the packages were chosen from NuGet, they got automatically added to my ASP.NET app.

    2012.03.01signalr02

    The next thing to do was add the appropriate JavaScript references at the top of the page. Note the last one. It is a virtual JavaScript location (you won’t find it in the design-time application) that is generated by the SignalR framework. This script, which you can view in the browser at runtime, holds all the JavaScript code that corresponds to the server/browser methods defined in my ASP.NET application.

    2012.03.01signalr04

    After this, I added the HTML and ASP.NET controls necessary to visualize my Tweetdeck-like event viewer. Besides a column where each event shows up, I also added a listbox that holds all the types of events that someone might subscribe to. Maybe one set of users just wants security-oriented events, while another wants events related to a given IT project.

    2012.03.01signalr03

    With my look-and-feel in place, I then moved on to adding some server-side components. I first created a new class (BizEventController.cs) that uses the SignalR “Hubs” connection model. This class holds a single operation that gets called by the JavaScript in the browser and adds the client to a given messaging group. Later, I can target a SignalR message to a given group.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Web;
    
    //added reference to SignalR
    using SignalR.Hubs;
    
    /// <summary>
    /// Summary description for BizEventController
    /// </summary>
    
    public class BizEventController : Hub
    {
        public void AddSubscription(string eventType)
        {
            AddToGroup(eventType);
        }
    }
    

    I then switched back to the ASP.NET page and added the JavaScript guts of my SignalR application. Specifically, the code below (1) defines an operation on my client-side hub (that gets called by the server) and (2) calls the server side controller that adds clients to a given message group.

    $(function () {
                //create arrays for use in showing formatted date string
                var days = ['Sun', 'Mon', 'Tues', 'Wed', 'Thur', 'Fri', 'Sat'];
                var months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'June', 'July', 'Aug', 'Sept', 'Oct', 'Nov', 'Dec'];
    
            // create a proxy to the hub defined in the dynamic signalr/hubs file
                var bizEDeck = $.connection.bizEventController;
    
            // Declare a function on the hub so the server can invoke it
                bizEDeck.addEventMsg = function (message) {
                    //format date
                    var receiptDate = new Date();
                    var formattedDt = days[receiptDate.getDay()] + ' ' + months[receiptDate.getMonth()] + ' ' + receiptDate.getDate() + ' ' + receiptDate.getHours() + ':' + receiptDate.getMinutes();
                    //add new "message" to deck column
                $('#deck').prepend('<div>' + message + '<br/>' + formattedDt + ' via StreamInsight</div>');
                };
    
                //act on "subscribe" button
                $("#groupadd").click(function () {
                    //call subscription function in server code
                    bizEDeck.addSubscription($('#group').val());
                    //add entry in "subscriptions" section
                $('#subs').append($('#group').val() + '<hr />');
                });
    
                // Start the connection
                $.connection.hub.start();
            });
    

    Building the web service that StreamInsight will call to update browsers

    The UI piece was now complete. Next, I wanted a web service that StreamInsight could call and pass in business events that would get pushed to each browser client. I’m leveraging a previously-built StreamInsight WCF adapter that can be used to receive web service requests and call web service endpoints. I built a WCF service and in the underlying class, I pull the list of all connected clients and invoke the JavaScript function.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Runtime.Serialization;
    using System.ServiceModel;
    using System.Text;
    
    using SignalR;
    using SignalR.Infrastructure;
    using SignalR.Hosting.AspNet;
    using StreamInsight.Samples.Adapters.Wcf;
    using Seroter.SI.AzureAppFabricAdapter;
    
    public class NotificationService : IPointEventReceiver
    {
    	//implement the operation included in interface definition
    	public ResultCode PublishEvent(WcfPointEvent result)
    	{
    		//get category from key/value payload
    		string cat = result.Payload["Category"].ToString();
    		//get message from key/value payload
    		string msg = result.Payload["EventMessage"].ToString();
    
    		//get SignalR connection manager
    		IConnectionManager mgr = AspNetHost.DependencyResolver.Resolve<IConnectionManager>();
    		//retrieve list of all connected clients
    		dynamic clients = mgr.GetClients<BizEventController>();
    
    		//send message to all clients for given category
    		clients[cat].addEventMsg(msg);
    		//also send message to anyone subscribed to all events
    		clients["All"].addEventMsg(msg);
    
    		return ResultCode.Success;
    	}
    }
    

    Preparing StreamInsight to receive, aggregate and forward events

    The website is ready, the service is exposed, and all that’s left is to get events and process them. Specifically, I used a WCF adapter to create an endpoint and listen for events from sources, wrote a few queries, and then sent the output to the WCF service created above.

    The StreamInsight application is below. It includes the creation of the embedded server and all other sorts of fun stuff.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    
    using Microsoft.ComplexEventProcessing;
    using Microsoft.ComplexEventProcessing.Linq;
    using Seroter.SI.AzureAppFabricAdapter;
    using StreamInsight.Samples.Adapters.Wcf;
    
    namespace SignalRTest.StreamInsightHost
    {
        class Program
        {
            static void Main(string[] args)
            {
                Console.WriteLine(":: Starting embedded StreamInsight server ::");
    
                //create SI server
                using(Server server = Server.Create("RSEROTERv12"))
                {
                    //create SI application
                    Application app = server.CreateApplication("SeroterSignalR");
    
                    //create input adapter configuration
                    WcfAdapterConfig inConfig = new WcfAdapterConfig()
                    {
                        Password = "",
                        RequireAccessToken = false,
                        Username  = "",
                        ServiceAddress = "http://localhost:80/StreamInsightv12/RSEROTER/InputAdapter"
                    };
    
                    //create output adapter configuration
                    WcfAdapterConfig outConfig = new WcfAdapterConfig()
                    {
                        Password = "",
                        RequireAccessToken = false,
                        Username = "",
                        ServiceAddress = "http://localhost:6412/SignalRTest/NotificationService.svc"
                    };
    
                    //create event stream from the source adapter
                    CepStream<BizEvent> input = CepStream<BizEvent>.Create("BizEventStream", typeof(WcfInputAdapterFactory), inConfig, EventShape.Point);
                    //build initial LINQ query that is a simple passthrough
                    var eventQuery = from i in input
                                     select i;
    
                    //create unbounded SI query that doesn't emit to specific adapter
                    var query0 = eventQuery.ToQuery(app, "BizQueryRaw", string.Empty, EventShape.Point, StreamEventOrder.FullyOrdered);
                    query0.Start();
    
                    //create another query that latches onto previous query
                    //filters out all individual web hits used in later agg query
                    var eventQuery1 = from i in query0.ToStream<BizEvent>()
                                      where i.Category != "Web"
                                      select i;
    
                    //another query that groups events by type; used here for web site hits
                    var eventQuery2 = from i in query0.ToStream<BizEvent>()
                                      group i by i.Category into EventGroup
                                      from win in EventGroup.TumblingWindow(TimeSpan.FromSeconds(10))
                                      select new BizEvent
                                      {
                                          Category = EventGroup.Key,
                                          EventMessage = win.Count().ToString() + " web visits in the past 10 seconds"
                                      };
                    //new query that takes result of previous and just emits web groups
                    var eventQuery3 = from i in eventQuery2
                                      where i.Category == "Web"
                                      select i;
    
                    //create new SI queries bound to WCF output adapter
                    var query1 = eventQuery1.ToQuery(app, "BizQuery1", string.Empty, typeof(WcfOutputAdapterFactory), outConfig, EventShape.Point, StreamEventOrder.FullyOrdered);
                    var query2 = eventQuery3.ToQuery(app, "BizQuery2", string.Empty, typeof(WcfOutputAdapterFactory), outConfig, EventShape.Point, StreamEventOrder.FullyOrdered);
    
                    //start queries
                    query1.Start();
                    query2.Start();
                    Console.WriteLine("Query started. Press [Enter] to stop.");
    
                    Console.ReadLine();
                    //stop all queries
                    query1.Stop();
                    query2.Stop();
                    query0.Stop();
                    Console.Write("Query stopped.");
                    Console.ReadLine();
    
                }
            }
    
            private class BizEvent
            {
                public string Category { get; set; }
                public string EventMessage { get; set; }
            }
        }
    }
    

    Everything is now complete. Let’s move on to testing with a simple event generator that I created.

    Testing the solution

    I built a simple WinForm application that generates business events or a user-defined number of simulated website visits. The business events are passed through StreamInsight, and the website hits are aggregated so that StreamInsight can emit the count of hits every ten seconds.
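
    I haven’t shown the generator code, but conceptually it just calls the WCF input endpoint that the StreamInsight host exposes. Assuming the input adapter uses the same IPointEventReceiver contract shown earlier and a basic HTTP binding (both assumptions on my part), a minimal sender looks something like this:

    using System;
    using System.Collections.Generic;
    using System.ServiceModel;

    using StreamInsight.Samples.Adapters.Wcf;
    using Seroter.SI.AzureAppFabricAdapter;

    class EventGenerator
    {
        static void Main()
        {
            //assumes the input adapter endpoint speaks the same IPointEventReceiver
            //contract shown earlier and listens on a basic HTTP binding
            var factory = new ChannelFactory<IPointEventReceiver>(
                new BasicHttpBinding(),
                new EndpointAddress("http://localhost:80/StreamInsightv12/RSEROTER/InputAdapter"));

            IPointEventReceiver proxy = factory.CreateChannel();

            //build a point event whose payload keys match what the queries expect
            var bizEvent = new WcfPointEvent
            {
                Payload = new Dictionary<string, object>
                {
                    { "Category", "Security" },
                    { "EventMessage", "Production Mail Server offline" }
                }
            };

            ResultCode result = proxy.PublishEvent(bizEvent);
            Console.WriteLine("Publish result: " + result);

            factory.Close();
        }
    }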

    To highlight the SignalR experience, I launched three browser instances with two different group subscriptions. The first two subscribe to all events, and the third one subscribes just to website-based events. For the latter, the client JavaScript function won’t get invoked by the server unless the events are in the “Web” category.

    The screenshot below shows the three browser instances launched (one in IE, two in Chrome).

    2012.03.01signalr06

    Next, I launched my event-generator app and StreamInsight host. I sent in a couple of business (not web) events and hoped to see them show up in two of the browser instances.

    2012.03.01signalr07

    As expected, two of the browser clients were instantly updated with these events, and the other subscriber was not. Next, I sent in a handful of simulated website hit events and observed the results.

    2012.03.01signalr08

    Cool! So all three browser instances were instantly updated with ten-second-counts of website events that were received.

    Summary

    SignalR is an awesome framework for providing real-time, interactive, bi-directional communication between clients and servers. I think there’s a lot of value in using SignalR for dashboards, widgets and event monitoring interfaces. In this post we saw a simple “business event monitor” application that enterprise users could leverage to keep up to date on what’s happening within enterprise systems. I used StreamInsight here, but you could use BizTalk Server or any application that can send events to web services.

    What do you think? Where do you see value for SignalR?

    UPDATE: I’ve made the source code for this project available and you can retrieve it from here.

  • My New Pluralsight Course, “AWS Developer Fundamentals”, Is Now Available

    I just finished designing, building and recording a new course for Pluralsight. I’ve been working with Amazon Web Services (AWS) products for a few years now, and I jumped at the chance to build a course that looked at the AWS services that have significant value for developers. That course is AWS Developer Fundamentals, and it is now online and available for Pluralsight subscribers.

    In this course, I cover the following areas:

    • Compute Services. A walkthrough of EC2 and how to provision and interact with running instances.
    • Storage Services. Here we look at EBS and see examples of adding volumes, creating snapshots, and attaching volumes made from snapshots. We also cover S3 and how to interact with buckets and objects.
    • Database Services. This module covers the Relational Database Service (RDS) with some MySQL demos, SimpleDB and the new DynamoDB.
    • Messaging Services. Here we look at the Simple Queue Service (SQS) and Simple Notification Service (SNS).
    • Management and Deployment. This module covers the administrative components and includes a walkthrough of the Identity and Access Management (IAM) capabilities.

    Each module is chock full of exercises that should help you better understand how AWS services work. Instead of JUST showing you how to interact with services via an SDK, I decided that each set of demos should show how to perform functions using the Management Console, the raw (REST/Query) API, and also the .NET SDK. I think that this gives the student a good sense of all the viable ways to execute AWS commands. Not every application platform has an SDK available for AWS, so seeing the native API in action can be enlightening.
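
    To give you a flavor of the SDK-based demos, here’s roughly what sending a message to an SQS queue looks like with the .NET SDK of that era; the credentials and queue URL are placeholders:

    using System;
    using Amazon;
    using Amazon.SQS;
    using Amazon.SQS.Model;

    class SqsSendExample
    {
        static void Main()
        {
            //placeholder credentials and queue URL
            AmazonSQS sqs = AWSClientFactory.CreateAmazonSQSClient("<accessKey>", "<secretKey>");

            SendMessageRequest request = new SendMessageRequest
            {
                QueueUrl = "https://sqs.us-east-1.amazonaws.com/<account-id>/demo-queue",
                MessageBody = "Hello from the AWS SDK for .NET"
            };

            SendMessageResponse response = sqs.SendMessage(request);
            Console.WriteLine("Message id: " + response.SendMessageResult.MessageId);
        }
    }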

    I hope you take the time to watch it, and if you’re not a Pluralsight subscriber, now’s the time to jump in!

  • Comparing AWS/Box/Azure for Managed File Transfer Provider

    As organizations continue to form fluid partnerships and seek more secure solutions than “give the partner VPN access to our network”, cloud-based managed file transfer (MFT) solutions seem like an important area to investigate. If your company wants to share data with another organization, how do you go about doing it today? Do you leverage existing (aging?) FTP infrastructure? Do you have an internet-facing extranet? Have you used email communication for data transfer?

    All of those previous options will work, but an offsite (cloud-based) storage strategy is attractive for many reasons. Business partners never gain direct access to your systems/environment, the storage in cloud environments is quite elastic to meet growing needs, and cloud providers offer web-friendly APIs that can be used to easily integrate with existing applications. There are downsides related to loss of physical control over data, but there are ways to mitigate this risk through server-side encryption.
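
    As a taste of that API integration, here’s a sketch of pushing a file out to Amazon S3 from .NET (the bucket name, key, credentials, and file path are placeholders). The same handful of lines could run inside an existing line-of-business app, which is what makes these services easy to bolt on:

    using System;
    using Amazon;
    using Amazon.S3;
    using Amazon.S3.Model;

    class PartnerFileDrop
    {
        static void Main()
        {
            //placeholder credentials, bucket, key, and file path
            AmazonS3 s3 = AWSClientFactory.CreateAmazonS3Client("<accessKey>", "<secretKey>");

            PutObjectRequest request = new PutObjectRequest();
            request.WithBucketName("partner-file-exchange")
                   .WithKey("outbound/invoices-2012-03.csv")
                   .WithFilePath(@"C:\data\invoices-2012-03.csv");

            PutObjectResponse response = s3.PutObject(request);
            Console.WriteLine("Uploaded; ETag = " + response.ETag);
        }
    }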

    That said, I took a quick look at three possible options. There are other options besides these, but I’ve got some familiarity with all of these, so it made my life easier to stick to these three. Specifically, I compared the Amazon Web Services S3 service, Box.com (formerly Box.net), and Windows Azure Blob Storage.

    Comparison

    The criteria along the left of the table are primarily from the Wikipedia definition of MFT capabilities, along with a few additional capabilities that I added.

    Feature | Amazon S3 | Box.com | Azure Storage
    Multiple file transfer protocols | HTTP/S (REST, SOAP) | HTTP/S (REST, SOAP) | HTTP/S (REST)
    Secure transfer over encrypted protocols | HTTPS | HTTPS | HTTPS
    Secure storage of files | AES-256 provided | AES-256 provided (for enterprise users) | No out-of-box encryption; up to the developer
    Authenticate users against central factors | AWS Identity & Access Management | Uses Box.com identities, SSO via SAML and ADFS | Through Windows Azure Active Directory (and federation standards like OAuth, SAML)
    Integrate to existing apps with documented API | Rich API | Rich API | Rich API
    Generate reports based on user and file transfer activities | Can set up data access logs | Comprehensive controls | Apparently custom; none found
    Individual file size limit | 5 TB | 2 GB (for business and enterprise users) | 200 GB for block blob, 1 TB for page blob
    Total storage limits | Unlimited | Unlimited (for enterprise users) | 5 PB
    Pricing scheme | Pay monthly for storage, transfer out, requests | Per user | Pay monthly for storage, transfer out, requests
    SLA offered | 99.999999999% durability and 99.99% availability of objects | ? | 99.9% availability
    Other key features | Content expiration policies, versioning, structured storage options | Polished UI tools for users and administrators; integration with apps like Salesforce.com | Access to other Azure services for storage, compute, integration

    Summary

    Overall, there are some nice options out there. Amazon S3 is great for pay-as-you go storage with a very mature foundation and enormous size limits. Windows Azure is new at this, but they provide good identity federation options and good pricing and storage limits. Box.com is clearly the most end-user-friendly option and a serious player in this space. All have good-looking APIs that developers should find easy to work with.

    Have any of you used these platforms for data transfer between organizations?