Author: Richard Seroter

  • ETL in the Cloud with Informatica: Part 4 – Sending Salesforce.com Data to Local Database

    The Informatica Cloud is an integration-as-a-service platform for designing and executing Extract-Transform-Load (ETL) tasks. This is the fourth and final post in a blog series that looked at a few realistic usage scenarios for this platform. In this post, I’ll show you how you can send real-time data changes from Salesforce.com to a local SQL Server database.

    As a reminder, in this four-part blog series, I am walking through the following scenarios:

    Scenario Summary

    I originally tried to do this with a SQL Azure database, but the types of errors I was getting led me to believe that Informatica is not yet using a JDBC driver that supports Azure. So be it. Here’s what I built:

    2012.03.26informatica42

    In this solution, I (1) create the ETL task in the web-based designer, (2) set up Salesforce.com Outbound Messaging to send out an event whenever a new Account is added, (3) receive that event on an endpoint hosted in the Informatica Cloud and push the message to the on-premises agent, and (4) update the local database with the new account.

    Outbound Messaging is such a cool feature of Salesforce.com and a way to have a truly event-driven line of business application. Let’s see how it works.

    Building the ETL Package

    To start with, I  decided to reuse the same CrmAccount table that I created for the last post. This table holds some basic details for a given account.

    2012.03.26informatica30

    Next, I went to the Informatica Cloud task designer and created a new Data Synchronization task. I first need to create the task BEFORE I can set up Outbound Messaging in Salesforce.com. On the first page of the wizard, I defined my ETL and set the operation to Insert.

    2012.03.26informatica43

    On the next wizard page, I reused the Salesforce.com connection that I created in the second post of this blog series. I set the Source Object to Account and saw the simple preview of the accounts currently in Salesforce.com.

    2012.03.26informatica44

    I then set up my target, using the same SQL Server connection that I created in the previous post. I then chose the CrmAccount table and saw that there were no rows in there.

    2012.03.26informatica45

    I didn’t choose any filter of data and moved on to the Field Mapping section. Here, I filled each target field with a value from the source object.

    2012.03.26informatica46

    Finally, on the scheduling tab, I chose the “Run this task in real-time upon receiving an outbound message from Salesforce” option. When selected, this option reveals a URL that Salesforce.com can call from its Outbound Messaging activity.

    2012.03.26informatica47

    That’s it! Now, how about we go get Salesforce.com all set up for this solution?

    Setting up Salesforce.com Outbound Messaging

    In my Salesforce.com Setup console, I went to the Workflow Rules section.

    2012.03.26informatica48

    I then created a brand new Workflow Rule and selected the Account object. I then named the rule, set it to run when records are created or edited and gave it a simple evaluation rule that checks to see if the Account Name has a value.

    2012.03.26informatica49

    On the next page of this wizard, I was given the choice of what to do when that workflow condition is met. Notice that besides Outbound Messaging, there are also options for creating tasks and sending email messages.

    2012.03.26informatica50

    After choosing New Outbound Message, I needed to provide a name for this Outbound Message, the endpoint URL provided to me by the Informatica Cloud, and the data fields that my mapping will expect. In my case, there were five fields that were used in my mapping.

    2012.03.26informatica51

    After saving this configuration, I completed the Workflow Rule and activated it.

    Testing the ETL

    With my Informatica Cloud configuration ready, and Salesforce.com Workflow Rule activated, I went and created a brand new Account record.

    2012.03.26informatica52

    After saving the new record, I went and looked in the Outbound Messaging Delivery Status view and it was empty, meaning that it had already completed! Sure enough, I checked my database table and BOOM, there it was.

    2012.03.26informatica53

    That’s impressive!

    Summary

    One of the trickiest aspects of Salesforce.com Outbound Messaging is that you need a public-facing internet endpoint to push to, even if your receiving app is inside your firewall. By using the Informatica Cloud, you get one! This scenario demonstrated a way to do *instant* data transfer from Salesforce.com to a local database. I think that’s pretty killer.

    I hope you found this series useful. A modern enterprise architecture landscape will include traditional components like BizTalk Server and Informatica (or SSIS for that matter), but will also start to contain cloud-based integration tools. Informatica Cloud should be high on your list of options for integrating both on-premises and cloud applications, especially if you want to stop installing and maintaining integration software!

  • ETL in the Cloud with Informatica: Part 3 – Sending Dynamics CRM Online Data to Local Database

    In Part 1 and Part 2 of this series, I’ve taken a look at doing Extract-Transform-Load (ETL) operations using the Informatica Cloud. This platform looks like a great choice for bulk movement of data between cloud or on-premises systems. So far we’ve seen how to move data from on-premises to the cloud, and then between clouds. In this post, I’ll show you how you can transfer data from a cloud application (Dynamics CRM Online) to a SQL Server database running onsite.

    As a reminder, in this four-part blog series, I am walking through the following scenarios:

    Scenario Summary

    For this demo, I’ll be building a solution that looks like this:

    2012.03.26informatica29

    For this case, (1) I build the ETL package using the Informatica Cloud’s web-based designer, (2) the Cloud Secure Agent retrieves the ETL details when the task is triggered, (3) the data is retrieved from Dynamics CRM Online, and (4) the data is loaded into a SQL Server database.

    You can probably think of many scenarios where this situation will apply. For example, good practices for cloud applications often state that you keep onsite backups of your data. This is one way to do that on a daily schedule. In another case, you may have very complex reporting needs and cannot accomplish them using Dynamics CRM Online’s built-in reporting capability, so a local, transformed replica makes sense.

    Let’s see how to make this happen.

    Setting up the Target Database

    First up, I created a database table in my SQL Server 2008 R2 instance. This table, called CrmAccount, holds a few of the attributes that reside in the Dynamics CRM Online “Account” entity.

    2012.03.26informatica30
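
    The post doesn’t include the table definition itself, but a table along these lines would do the job. Here’s a sketch of the DDL, executed through plain ADO.NET since the rest of this series is .NET-centric; the column names are my own guesses rather than the actual schema:

    using System.Data.SqlClient;

    class CreateCrmAccountTable
    {
        static void Main()
        {
            //hypothetical column list -- the post only says the table holds a few Account attributes
            const string ddl = @"
                CREATE TABLE dbo.CrmAccount (
                    AccountId     NVARCHAR(64)  NOT NULL PRIMARY KEY,
                    AccountName   NVARCHAR(160) NULL,
                    AccountNumber NVARCHAR(40)  NULL,
                    Phone         NVARCHAR(40)  NULL,
                    City          NVARCHAR(80)  NULL
                );";

            using (var conn = new SqlConnection(@"Data Source=.\SQLEXPRESS;Initial Catalog=CrmDemo;Integrated Security=True"))
            using (var cmd = new SqlCommand(ddl, conn))
            {
                //create the target table that the Informatica task will insert into
                conn.Open();
                cmd.ExecuteNonQuery();
            }
        }
    }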

    Next, I added a new Login to my Instance and switched my server to accept both Windows Authentication *and* SQL Server authentication. Why? During some trial runs with this, I couldn’t seem to get integrated authentication to work in the Informatica Cloud designer. When I switched to a local DB account, the connection worked fine.

    After this, I confirmed that I had TCP/IP enabled, since the Cloud Secure Agent connects to my server over that protocol.

    2012.03.26informatica31
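
    To sanity-check that combination (SQL Server authentication plus TCP/IP on the default port), a quick console snippet like the one below is handy. The server name, database, and login are placeholders; the point is that the connection string uses a SQL login and an explicit TCP endpoint, which mirrors how the Secure Agent reaches the instance:

    using System;
    using System.Data.SqlClient;

    class ConnectionSmokeTest
    {
        static void Main()
        {
            //placeholder server, database, and login -- substitute your own values
            //"tcp:" plus an explicit port matches how the Secure Agent connects, and the
            //User ID/Password pair is the SQL Server login created above
            var connectionString =
                "Data Source=tcp:MYSERVER,1433;" +
                "Initial Catalog=CrmDemo;" +
                "User ID=informatica_user;" +
                "Password=<password>;";

            using (var conn = new SqlConnection(connectionString))
            {
                conn.Open();
                Console.WriteLine("Connected to " + conn.DataSource + ", database " + conn.Database);
            }
        }
    }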

    Building the ETL Package

    With all that set up, now we can build our ETL task in the Informatica Cloud environment. The first step in the Data Synchronization wizard is to provide a name for my task and choose the type of operation (e.g. Insert, Update, Upsert, Delete).

    2012.03.26informatica32

    Next, I chose my Source. In this step, I reused the Dynamics CRM Online connection that I created in the first post of the series. After choosing that connection, I selected the Account entity as my Source Object. A preview of the data was then automatically shown.

    2012.03.26informatica33

    With my source in place, I moved on to define my target. In this case, my target is going to involve a new SQL Server connection. To create this connection, I supplied the name of my server, instance (if applicable), database, credentials (for the SQL Server login account) and port number.

    2012.03.26informatica34

    Once I defined the connection, the drop down list (Target Object) was auto-populated with the tables in my database. I selected CrmAccount and saw a preview of my (empty) table.

    2012.03.26informatica35

    On the next wizard page, I decided to not apply any filters on the Dynamics CRM Online data. So, ALL accounts should be copied over to my database table. I was now ready for the data mapping exercise. The following wizard page let me drag-and-drop fields from the source (Dynamics CRM Online) to the target (SQL Server 2008 R2).

    2012.03.26informatica36

    On the last page of the wizard, I chose to NOT run this task on a schedule. I could set this to run every five minutes, or once a week. There’s lots of flexibility in this.

    Testing the ETL

    Let’s test this out. In my list of Data Synchronization Tasks, I can see the tasks from the last two posts, and a new task representing what we created above.

    2012.03.26informatica37

    By clicking the green Run Now button, I can kick off this ETL. As an aside, the Informatica Cloud exposes a REST API where, among other things, you can make a web request that kicks off a task on demand. That’s a neat feature that can come in handy if you have an ETL that runs infrequently, but a need arises for it to run RIGHT NOW. In this case, I’m going with the Run Now button.
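
    I didn’t dig into the REST API in this post, but the idea is simply an authenticated HTTP call that starts a named task. The sketch below shows the shape of such a call from .NET; the host, resource path, and request body are placeholders rather than the documented Informatica Cloud endpoints, so check the API documentation for the real details:

    using System;
    using System.Net;
    using System.Text;

    class RunTaskOnDemand
    {
        static void Main()
        {
            //placeholder host, path, and body -- see the Informatica Cloud REST API docs
            //for the real endpoint, login handshake, and request format
            var request = (HttpWebRequest)WebRequest.Create(
                "https://<informatica-cloud-host>/api/<run-task-resource>");
            request.Method = "POST";
            request.ContentType = "application/json";

            byte[] body = Encoding.UTF8.GetBytes("{ \"taskName\": \"CrmAccountsToSqlServer\" }");
            request.ContentLength = body.Length;
            using (var stream = request.GetRequestStream())
            {
                stream.Write(body, 0, body.Length);
            }

            using (var response = (HttpWebResponse)request.GetResponse())
            {
                Console.WriteLine("Task start request returned " + response.StatusCode);
            }
        }
    }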

    To compare results, I have 14 account records in my Dynamics CRM Online organization.

    2012.03.26informatica38

    I can see in my Informatica Cloud Activity Log that the ETL task completed and 14 records moved over.

    2012.03.26informatica39

    To be sure, I jumped back to my SQL Server database and checked out my table.

    2012.03.26informatica40

    As I expected, I could see 14 new records in my table. Success!

    Summary

    Sending data from a cloud application to an on-premises database is a realistic use case and hopefully this demo showed how easily it can be accomplished with the Informatica Cloud. The database connection is relatively straightforward and the data mapping tool should satisfy most ETL needs.

    In the next post of this series, I’ll show you how to send data, in real-time, from Salesforce.com to a SQL Server database.

  • ETL in the Cloud with Informatica: Part 2 – Sending Salesforce.com Data to Dynamics CRM Online

    In my last post, we saw how the Informatica Cloud lets you create bulk data load (i.e. ETL) tasks using a web-based designer and uses a lightweight local machine agent to facilitate the data exchange. In this post, I’ll show you how to transfer data from Salesforce.com to Dynamics CRM Online using the Informatica Cloud.

    In this four-part blog series, I will walk through the following scenarios:

    Scenario Summary

    In this post, I’ll build the following solution.

    2012.03.26informatica17

    In this solution, (1) I leverage the web-based designer to craft the ETL between Salesforce.com and Dynamics CRM Online, (2) use a locally installed Secure Cloud Agent to retrieve ETL details, (3) pull data from Salesforce.com, and finally (4) move that data into Dynamics CRM Online.

    What’s interesting is that even though this is a “cloud only” ETL, the Informatica Cloud solution still requires the use of the Cloud Secure Agent (installed on-premises) to facilitate the actual data transfer.

    To view some of the setup steps (such as signing up for services and installing required software), see the first post in this series.

    Building the ETL Package

    To start with, I logged into the Informatica Cloud and created a new Data Synchronization task.

    2012.03.26informatica18

    On the next wizard page, I created a new connection type for Salesforce.com and provided all the required credentials.

    2012.03.26informatica19

    With that in place, I could select that connection, the entity (“Contact”) to pull data from, and see a quick preview of that data in my Salesforce.com account.

    2012.03.26informatica20

    On the next wizard page, I configured a connection to my ETL target. I chose an existing Dynamics CRM Online connection, and selected the “Contact” entity.

    2012.03.26informatica21

    Instead of transferring all the data from my Salesforce.com organization to my Dynamics CRM Online organization, I  used the next wizard page to define a data filter. In my case, I’m only going to grab Salesforce.com contacts that have a title of “Architect”.

    2012.03.26informatica22

    For the data mapping exercise, it’s nice that the Informatica tooling automatically links fields through its Automatch capability. In this scenario, I didn’t do any manual mapping and relied solely on Automatch.

    2012.03.26informatica23

    As in my first post, I chose not to schedule this task, but notice that here I *have* to select a Secure Cloud Agent. The agent is responsible for executing the ETL task after retrieving the details of the task from the Informatica Cloud.

    2012.03.26informatica24

    This ETL is now complete.

    Testing the ETL

    In my Data Synchronization Tasks list, I can see my new task. The green Run Now button will trigger the task.

    2012.03.26informatica25

    I have this record in my Salesforce.com application. Notice the “title” of Architect.

    2012.03.26informatica26

    After a few moments, the task ran and I could see in the Informatica Cloud’s Activity Log that it completed successfully.

    2012.03.26informatica27

    To be absolutely sure, I logged into my Dynamics CRM Online account, and sure enough, I now have that one record added.

    2012.03.26informatica28

    Summary

    There are lots of reasons to do ETL between cloud applications. While Salesforce.com and Dynamics CRM Online are competing products, many large organizations are likely going to leverage both platforms for different reasons. Maybe you’ll have your sales personnel use Salesforce.com for traditional sales functions, and use Dynamics CRM Online for something like partner management. Either way, it’s great to have the option to easily move data between these environments without having to install and manage enterprise software on site.

    Next up, I’ll show you how to take Dynamics CRM Online data and push it to an on-premises database.

  • ETL in the Cloud with Informatica: Part 1 – Sending File Data to Dynamics CRM Online

    The more software systems that we deploy to cloud environments, the greater the need will be to have an efficient integration strategy. Integration through messaging is possible through something like an on-premises integration server, or via a variety of cloud tools such as queues hosted in AWS or something like the Windows Azure Service Bus Relay. However, what if you want to do some bulk data movement with Extract-Transform-Load (ETL) tools that cater to cloud solutions? One of the market leaders in the overall ETL market, Informatica, has also established a strong integration-as-a-service offering with its Informatica Cloud. They recently announced support for Dynamics CRM Online as a source/destination for ETL operations, so I got inspired to give their platform a whirl.

    Informatica Cloud supports a variety of sources/destinations for ETL operations and leverages a machine agent (“Cloud Secure Agent”) for securely connecting on-premises environments to cloud environments. Instead of installing any client development tools, I can design my ETL process entirely through their hosted web application. When the ETL process executes, the Cloud Secure Agent retrieves the ETL details from the cloud and runs the task. There is  no need to install or maintain a full server product for hosting and running these tasks. The Informatica Cloud doesn’t actually store any transactional data itself, and acts solely as a passthrough that executes the package (through the Cloud Secure Agent) and moves data around. All in all, neat stuff.

    In this four-part blog series, I will walk through the following scenarios:

    Scenario Summary

    So what are we building in this post?

    2012.03.26informatica01

    What’s going to happen is that (1) I’ll use the Informatica Cloud to define an ETL that takes a flat file from my local machine and copies the data to Dynamics CRM Online, (2) the Secure Cloud Agent will communicate with the Informatica Cloud to get the ETL details, (3) the Secure Cloud Agent retrieves the flat file from my local machine, and finally (4) the package runs and data is loaded into Dynamics CRM Online.

    Sound good? Let’s jump in.

    Setup

    In this first post of the blog series, I’ll outline a few of the setup steps that I followed to get everything up and running. In subsequent posts, I’ll skip over this. First, I used my existing, free, Salesforce.com Developer account. Next, I signed up for a 30-day free trial of Dynamics CRM Online. After that, I signed up for a 30-day free trial of the Informatica Cloud.

    Finally, I downloaded the Informatica agent to my local machine.

    2012.03.26informatica02

    Once the agent is installed, I can manage it through a simple console.

    2012.03.26informatica03

    Building the ETL Package

    To get started, I logged into my Informatica Cloud account and walked through their Data Synchronization wizard. In the first step, I named my Task and chose to do an Insert operation.

    2012.03.26informatica04

    Next, I chose to create a “flat file” connection type. This requires my Agent to have permissions on my file system, so I set the Agent’s Windows Service to run as a trusted account on my machine.

    2012.03.26informatica05

    With the connection defined, I chose to use a comma-delimited formatter and selected the text file in the “temp” directory I had specified above. I could immediately see a preview that showed how my data was parsed.

    2012.03.26informatica06

    On the next wizard page, I chose to create a new target connection. Here I selected Dynamics CRM Online as my destination system, and filled out the required properties (e.g. user ID, password, CRM organization name).

    2012.03.26informatica07

    Note that the Organization Name above is NOT the Organization Unique Name that is part of the Dynamics CRM Online account and viewable from the Customizations -> Developer Resources page.

    2012.03.26informatica08

    Rather, this is the Organization Name that I set up when I signed up for my free trial. Note that this value is also case sensitive. Once I set this connection, an automatic preview of the data in that Dynamics CRM entity was shown.

    2012.03.26informatica09

    On the next wizard page, I kept the default options and did NOT add any filters to the source data.

    2012.03.26informatica10

    Now we get to the fun part. The Field Mapping page is where I set which source fields go to which destination fields. The interface supports drag and drop between the two sides.

    2012.03.26informatica11

    Besides straight up one-to-one mapping, you can also leverage Expressions when conditional logic or field manipulation is needed. In the picture below, you can see that I added a concatenation function to combine the FirstName and LastName fields and put them into a FullName field.

    2012.03.26informatica12

    In addition to Expressions, we also have the option of adding Lookups to the mapping. A lookup allows us to pull in one value (e.g. City) based on another (e.g. Zip) that may be in an entirely different source location. The final step of the wizard involves defining a schedule for running this task. I chose to have “no schedule” which means that this task is run manually.

    2012.03.26informatica13

    And that’s it! I now have an Informatica package that can be run whenever I want.

    Testing the ETL

    We’re ready to try this out. The Tasks page shows all my available tasks, and the green Run Now button will kick the ETL off. Remember that my Cloud Secure Agent must be up and running for this to work. After starting up the job, I was told that it may take a few minutes to launch and run. Within a couple of minutes, I saw a “success” message in my Activity Log.

    2012.03.26informatica15

    But that doesn’t prove anything! Let’s look inside my Dynamics CRM Online application and locate one of those new records.

    2012.03.26informatica16

    Success! My three records came across, and in the record above, we can see that the first name, last name and phone number were transferred over.

    Summary

    That was pretty straightforward. As you can imagine, these ETLs can get much more complicated as you have related entities and such. However, this web-based ETL designer means that organizations will have a much simpler maintenance profile since they don’t have to host and run these ETLs using on-premises servers.

    Next up, I’ll show you how you can move data between two entirely cloud-based environments: Salesforce.com and Dynamics CRM Online.

  • Microsoft Dynamics CRM Online: By the Numbers

    I’ve enjoyed attending Microsoft’s 2012 Convergence Conference, and one action item for me is to take another look at Dynamics CRM Online. Now, one reason that I spend more time playing with Salesforce.com instead of Dynamics CRM Online is that Salesforce.com has a free tier, and Dynamics CRM Online only has a 30-day trial. They really need to change that. Regardless, I’ve also focused more on Salesforce.com because of their market-leading position and the perceived immaturity of Microsoft’s business solutions cloud. After attending a few different sessions here, I have to revisit that opinion.

    I sat through a really fascinating breakout session about how Microsoft operates its (Dynamics) cloud business. The speaker sprinkled various statistics throughout his presentation, so I gathered them all up and have included them here.

    30,000. Number of engineers at Microsoft doing cloud-related work.

    2,000. Number of people managing Microsoft online services.

    1,000. Number of servers that power Dynamics CRM Online.

    99.9%. Guaranteed uptime per month (44 minutes of downtime allowed). Worst case, there is 5-15 minutes worth of data loss (RPO).

    41. Number of global markets in which CRM Online is available for use.

    40+. Number of different cloud services managed by Microsoft Global Foundation Services (GFS). The GFS site says “200 online services and web portal”, but maybe they use different math.

    30. Number of days that the free trial lasts. Seriously, fix this.

    19. Number of servers in each rack that makes up a “pod.” Each “scale group” (which contains all the items needed for a CRM instance) is striped across server racks, and multiple scale groups are collected into pods. While CRM app/web servers may be multi-tenant, each customer’s database is uniquely provisioned and not shared.

    8. Number of months it took the CRM Online team to devise and deliver a site failover solution that requires a single command. Impressive. They make heavy use of SQL Server 2012 “always on” capabilities for their high availability and disaster recovery strategy.

    5. Copies of data that exist for a given customer. You have (1) your primary organization database, (2) a synchronous snapshot database (which is updated at the same time as the primary), (3)(4) asynchronous copies made in the alternate data center (for a given region), and finally, (5) a daily backup to an offsite location. Whew!

    6. Number of data centers that have CRM Online available (California, Virginia, Dublin, Amsterdam, Hong Kong and Singapore).

    0. Amount of downtime necessary to perform all the upgrades in the environment. These include daily RFCs, 0-3 out-of-band releases per month, monthly security patches, bi-monthly update rollups, password changes every 70 days, and twice-yearly service updates. It sounds pretty darn complicated to handle both backwards and forwards compatibility while keeping customers online during upgrades, but it sounds like they pull it off.

    Overall? That’s pretty hearty stuff. Recent releases are starting to bring CRM Online within shouting distance of its competitors and for some scenarios, it may even be a better choice than Salesforce.com. Either way, I have a newfound understanding about the robustness of the platform and will look to incorporate CRM Online into a few more of my upcoming demos.

  • I’m at the Microsoft Convergence conference this week

    From Monday through Wednesday of this week, I’ll be at Microsoft’s Convergence conference in Houston, Texas. This is Microsoft’s annual conference for the Dynamics product line, and this year I’ll be attending as a speaker.

    I’m co-delivering a session entitled Managing Complex Implementations of Microsoft Dynamics CRM. I now have a bit of experience with this because of my day job, so it should be fun to share some of the learnings. We’re going to cover all the things that make a CRM project (or any complex project, for that matter) complex, including “introducing new technology”, “multi-source data migration”, “industry regulations” and more. We’ll then cover some lessons learned from project scoping/planning/estimation exercises and conclude by looking at the ideal team makeup for complex projects.

    All in all, should be a good time. If you happen to be attending this year, stop on by!

  • Doing a Multi-Cloud Deployment of an ASP.NET Web Application

    The recent Azure outage once again highlighted the value in being able to run an application in multiple clouds so that a failure in one place doesn’t completely cripple you. While you may not run an application in multiple clouds simultaneously, it can be helpful to have a standby ready to go. That standby could already be deployed to a backup environment, or could be rapidly deployed from a build server out to a cloud environment.

    https://twitter.com/#!/jamesurquhart/status/174919593788309504

    So, I thought I’d take a quick look at how to take the same ASP.NET web application and deploy it to three different .NET-friendly public clouds: Amazon Web Services (AWS), Iron Foundry, and Windows Azure. Just for fun, I’m keeping my database (AWS SimpleDB) separate from the primary hosting environment (Windows Azure) so that my database could be available if my primary, or backup (Iron Foundry) environments were down.

    My application is very simple: it’s a Web Form that pulls data from AWS SimpleDB and displays the results in a grid. Ideally, this works as-is in any of the below three cloud environments. Let’s find out.
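
    For reference, the code-behind for that page is roughly the following. This is a sketch using the AWS SDK for .NET of that era; the domain name, credentials, and the ResultsGrid control are stand-ins for whatever your page actually uses:

    using System;
    using System.Data;
    using Amazon;
    using Amazon.SimpleDB;
    using Amazon.SimpleDB.Model;

    public partial class _Default : System.Web.UI.Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {
            //placeholder credentials and SimpleDB domain name
            AmazonSimpleDB client = AWSClientFactory.CreateAmazonSimpleDBClient("<accessKey>", "<secretKey>");

            SelectRequest request = new SelectRequest { SelectExpression = "select * from MyDemoDomain" };
            SelectResponse response = client.Select(request);

            //flatten the SimpleDB items into a DataTable the GridView can bind to
            DataTable table = new DataTable();
            table.Columns.Add("ItemName");
            table.Columns.Add("Attributes");

            foreach (Item item in response.SelectResult.Item)
            {
                string attributes = string.Empty;
                foreach (Amazon.SimpleDB.Model.Attribute attr in item.Attribute)
                {
                    attributes += attr.Name + "=" + attr.Value + "; ";
                }
                table.Rows.Add(item.Name, attributes);
            }

            //ResultsGrid is a stand-in for the GridView control on the page
            ResultsGrid.DataSource = table;
            ResultsGrid.DataBind();
        }
    }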

    Deploying the Application to Windows Azure

    Windows Azure is a reasonable destination for many .NET web applications that can run offsite. So, let’s see what it takes to push an existing web application into the Windows Azure application fabric.

    First, after confirming that I had installed the Azure SDK 1.6, I right-clicked my ASP.NET web application and added a new Azure Deployment project.

    2012.03.05cloud01

    After choosing this command, I ended up with a new project in this Visual Studio solution.

    2012.03.05cloud02

    While I can view configuration properties (how many web roles to provision, etc.), I jumped right into Publishing without changing any settings. There was a setting to add an Azure storage account (vs. using local storage), but I didn’t think I had a need for Azure storage.

    The first step in the Publishing process required me to supply authentication in the form of a certificate. I created a new certificate, uploaded it to the Windows Azure portal, took my Azure account’s subscription identifier, and gave this set of credentials a friendly name.

    2012.03.05cloud03

    I didn’t have any “hosted services” in this account, so I was prompted to create one.

    2012.03.05cloud04

    With a host created, I then left the other settings as they were, with the hope of deploying this app to production.

    2012.03.05cloud05

    After publishing, Visual Studio 2010 showed me the status of the deployment that took about 6-7 minutes.

    2012.03.05cloud06

    An Azure hosted service and single instance were provisioned. A storage account was also added automatically.

    2012.03.05cloud07

    I hit an error, so I updated my configuration file to display error details, and that update (which replaces the original deployment) took another 5 minutes. The error was that the app couldn’t load the AWS SDK component that was referenced. So, I switched the AWS SDK dll to “copy local” in the ASP.NET application project and once again redeployed my application. This time it worked fine, and I was able to see my SimpleDB data from my Azure-hosted ASP.NET website.

    2012.03.05cloud08

    Not too bad. Definitely a bit of upfront work to do, but subsequent projects can reuse the authentication-related activities that I completed earlier. The sluggish deployment times really stunt momentum, but realistically, you can do some decent testing locally so that what gets deployed is pretty solid.

    Deploying the Application to Iron Foundry

    Tier3’s Iron Foundry is the .NET-flavored version of VMware’s popular Cloud Foundry platform. Given that you can use Iron Foundry in your own data center, or in the cloud, it’s something that developers should keep a close eye on. I decided to use the Cloud Foundry Explorer that sits within Visual Studio 2010. You can download it from the Iron Foundry site. With that installed, I can right-click my ASP.NET application and choose to Push Cloud Foundry Application.

    2012.03.05cloud09

    Next, if I hadn’t previously configured access to the Iron Foundry cloud, I’d need to create a connection with the target API and my valid credentials. With the connection in place, I set the name of my cloud application and clicked Push.

    2012.03.05cloud18

    In under 60 seconds, my application was deployed and ready to look at.

    2012.03.05cloud19

    What if a change to the application is needed? I updated the HTML, right-clicked my project and chose to Update Cloud Foundry Application. Once again, in a few seconds, my application was updated and I could see the changes. Taking an existing ASP.NET application and moving it to Iron Foundry doesn’t require any modifications to the application itself.

    If you’re looking for a multi-language, on- or off-premises PaaS that is easy to work with, then I strongly encourage you to try Iron Foundry out.

    Deploying the Application to AWS via CloudFormation

    While AWS does not have a PaaS per se, they do make it easy to deploy apps in a PaaS-like way via CloudFormation. With CloudFormation, I can deploy a set of related resources and manage them as one deployment unit.

    From within Visual Studio 2010, I right-clicked my ASP.NET web application and chose Publish to AWS CloudFormation.

    2012.03.05cloud11

    When the wizard launches, I was asked to choose one of two deployment templates (single instance or multiple, load balanced instances).

    2012.03.05cloud12

    After selecting the single instance template, I kept the default values in the next wizard page. These settings include the size of the host machine, security group and name of this stack.

    2012.03.05cloud13

    On the next wizard pages, I kept the default settings (e.g. .NET version) and chose to deploy my application. Immediately, I saw a window in Visual Studio that showed the progress of my deployment.

    2012.03.05cloud14

    In about 7 minutes, I had a finished deployment and a URL to my application was provided. Sure enough, upon clicking that link, I was sent to my web application running successfully in AWS.

    2012.03.05cloud15

    Just to compare to previous scenarios, I went ahead and made a small change to the HTML of the web application and once again chose Publish to AWS CloudFormation from the right-click menu.

    2012.03.05cloud16

    As you can see, it saw my previous template, and as I walked through the wizard, it retrieved any existing settings and allowed me to make any changes where possible. When I clicked Deploy again, I saw that my package was being uploaded, and in less than a minute, I saw the changes in my hosted web application.

    2012.03.05cloud17

    So while I’m still leveraging the AWS infrastructure-as-a-service environment, the use of CloudFormation makes this seem a bit more like an application fabric. The deployments were very straightforward and smooth, arguably the smoothest of all three options shown in this post.

    Summary

    I was able to fairly easily take the same ASP.NET website and, from Visual Studio 2010, deploy it to three distinct clouds. Each cloud has its own steps and processes, but each is fairly straightforward. Because Iron Foundry doesn’t require new VMs to be spun up, it’s consistently the fastest deployment scenario. That can make a big difference during development and prototyping and should be something you factor into your cloud platform selection. Windows Azure has a nice set of additional services (like queuing, storage, integration), and Amazon gives you some best-of-breed hosting and monitoring. Tier 3’s Iron Foundry lets you use one of the most popular open source, multi-environment PaaS platforms for .NET apps. There are factors that would lead you to each of these clouds.

    This is hopefully a good bit of information to know when panic sets in over the downtime of a particular cloud. However, as you build your application with more and more services that are specific to a given environment, this multi-cloud strategy becomes less straightforward. For instance, if an ASP.NET application leverages SQL Azure for database storage, then you are still in pretty good shape when an application has to move to other environments. ASP.NET talks to SQL Server using the same ports and API, regardless of whether it’s using SQL Azure or a SQL instance deployed on an Amazon instance. But, if I’m using Azure Queues (or Amazon SQS for that matter), then it’s more difficult to instantly replace that component in another cloud environment.
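
    To make that point concrete, here’s the kind of thing I mean: the ADO.NET code stays identical and only the connection string changes between a SQL Azure database and a SQL Server instance running on an Amazon VM. The server names, credentials, and the Orders table below are made up for illustration:

    using System;
    using System.Data.SqlClient;

    class ConnectionStringSwap
    {
        static void Main()
        {
            //the same ADO.NET code works against either location; only the connection string differs
            string sqlAzure =
                "Server=tcp:mydemoserver.database.windows.net,1433;Database=AppDb;" +
                "User ID=appuser@mydemoserver;Password=<password>;Encrypt=True;";

            string sqlOnAmazonVm =
                "Server=tcp:ec2-184-73-10-99.compute-1.amazonaws.com,1433;Database=AppDb;" +
                "User ID=appuser;Password=<password>;";

            foreach (string cs in new[] { sqlAzure, sqlOnAmazonVm })
            {
                using (var conn = new SqlConnection(cs))
                using (var cmd = new SqlCommand("SELECT COUNT(*) FROM dbo.Orders", conn))
                {
                    conn.Open();
                    Console.WriteLine(cmd.ExecuteScalar());
                }
            }
        }
    }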

    Keep all these portability concerns in mind when building your cloud-friendly applications!

  • Using SignalR To Push StreamInsight Events to Client Browsers

    I’ve spent some time recently working with the asynchronous web event messaging engine SignalR. This framework uses JavaScript (with jQuery) on the client and ASP.NET on the server to enable very interactive communication patterns. The coolest part is being able to have the server-side application call a JavaScript function on each connected browser client. While many SignalR demos you see have focused on scenarios like chat applications, I was thinking  of how to use SignalR to notify business users of interesting events within an enterprise. Wouldn’t it be fascinating if business events (e.g. “Project X requirements document updated”, “Big customer added in US West region”, “Production Mail Server offline”, “FAQ web page visits up 78% today”) were published from source applications and pushed to a live dashboard-type web application available to users? If I want to process these fast moving events and perform rich aggregations over windows of events, then Microsoft StreamInsight is a great addition to a SignalR-based solution. In this blog post, I’m going to walk through a demonstration of using SignalR to push business events through StreamInsight and into a Tweetdeck-like browser client.

    Solution Overview

    So what are we building? To make sure that we keep an eye on the whole picture while building the individual components, I’ve summarized the solution here.

    2012.03.01signalr05

    Basically, the browser client will first (through jQuery) call a server operation that adds that client to a message group (e.g. “security events”). Events are then sent from source applications to StreamInsight where they are processed. StreamInsight then calls a WCF service that is part of the ASP.NET web application. Finally, the WCF Service uses the SignalR framework to invoke the “addEventMsg()” function on each connected browser client. Sound like fun? Good. Let’s jump in.

    Setting up the SignalR application

    I started out by creating a new ASP.NET web application. I then used the NuGet extension to locate the SignalR libraries that I wanted to use.

    2012.03.01signalr01

    Once the packages were chosen from NuGet, they got automatically added to my ASP.NET app.

    2012.03.01signalr02

    The next thing to do was add the appropriate JavaScript references at the top of the page. Note the last one. It is a virtual JavaScript location (you won’t find it in the design-time application) that is generated by the SignalR framework. This script, which you can view in the browser at runtime, holds all the JavaScript code that corresponds to the server/browser methods defined in my ASP.NET application.

    2012.03.01signalr04

    After this, I added the HTML and ASP.NET controls necessary to visualize my Tweetdeck-like event viewer. Besides a column where each event shows up, I also added a listbox that holds all the types of events that someone might subscribe to. Maybe one set of users just wants security-oriented events, while another wants events related to a given IT project.

    2012.03.01signalr03

    With my look-and-feel in place, I then moved on to adding some server-side components. I first created a new class (BizEventController.cs) that uses the SignalR “Hubs” connection model. This class holds a single operation that gets called by the JavaScript in the browser and adds the client to a given messaging group. Later, I can target a SignalR message to a given group.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Web;
    
    //added reference to SignalR
    using SignalR.Hubs;
    
    /// <summary>
    /// Summary description for BizEventController
    /// </summary>
    
    public class BizEventController : Hub
    {
        public void AddSubscription(string eventType)
        {
            AddToGroup(eventType);
        }
    }
    

    I then switched back to the ASP.NET page and added the JavaScript guts of my SignalR application. Specifically, the code below (1) defines an operation on my client-side hub (that gets called by the server) and (2) calls the server side controller that adds clients to a given message group.

    $(function () {
                //create arrays for use in showing formatted date string
                var days = ['Sun', 'Mon', 'Tues', 'Wed', 'Thur', 'Fri', 'Sat'];
                var months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'June', 'July', 'Aug', 'Sept', 'Oct', 'Nov', 'Dec'];
    
            // create a proxy to the hub defined in the dynamic signalr/hubs file
                var bizEDeck = $.connection.bizEventController;
    
            // Declare a function on the hub so the server can invoke it
                bizEDeck.addEventMsg = function (message) {
                    //format date
                    var receiptDate = new Date();
                    var formattedDt = days[receiptDate.getDay()] + ' ' + months[receiptDate.getMonth()] + ' ' + receiptDate.getDate() + ' ' + receiptDate.getHours() + ':' + receiptDate.getMinutes();
                    //add new "message" to deck column
                $('#deck').prepend('<div>' + message + '<br/>' + formattedDt + ' via StreamInsight</div>');
                };
    
                //act on "subscribe" button
                $("#groupadd").click(function () {
                    //call subscription function in server code
                    bizEDeck.addSubscription($('#group').val());
                    //add entry in "subscriptions" section
                $('#subs').append($('#group').val() + '<hr />');
                });
    
                // Start the connection
                $.connection.hub.start();
            });
    

    Building the web service that StreamInsight will call to update browsers

    The UI piece was now complete. Next, I wanted a web service that StreamInsight could call and pass in business events that would get pushed to each browser client. I’m leveraging a previously-built StreamInsight WCF adapter that can be used to receive web service requests and call web service endpoints. I built a WCF service and in the underlying class, I pull the list of all connected clients and invoke the JavaScript function.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Runtime.Serialization;
    using System.ServiceModel;
    using System.Text;
    
    using SignalR;
    using SignalR.Infrastructure;
    using SignalR.Hosting.AspNet;
    using StreamInsight.Samples.Adapters.Wcf;
    using Seroter.SI.AzureAppFabricAdapter;
    
    public class NotificationService : IPointEventReceiver
    {
    	//implement the operation included in interface definition
    	public ResultCode PublishEvent(WcfPointEvent result)
    	{
    		//get category from key/value payload
    		string cat = result.Payload["Category"].ToString();
    		//get message from key/value payload
    		string msg = result.Payload["EventMessage"].ToString();
    
    		//get SignalR connection manager
    		IConnectionManager mgr = AspNetHost.DependencyResolver.Resolve<IConnectionManager>();
    		//retrieve list of all connected clients
    		dynamic clients = mgr.GetClients<BizEventController>();
    
    		//send message to all clients for given category
    		clients[cat].addEventMsg(msg);
    		//also send message to anyone subscribed to all events
    		clients["All"].addEventMsg(msg);
    
    		return ResultCode.Success;
    	}
    }
    

    Preparing StreamInsight to receive, aggregate and forward events

    The website is ready, the service is exposed, and all that’s left is to get events and process them. Specifically, I used a WCF adapter to create an endpoint and listen for events from sources, wrote a few queries, and then sent the output to the WCF service created above.

    The StreamInsight application is below. It includes the creation of the embedded server and all other sorts of fun stuff.

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    
    using Microsoft.ComplexEventProcessing;
    using Microsoft.ComplexEventProcessing.Linq;
    using Seroter.SI.AzureAppFabricAdapter;
    using StreamInsight.Samples.Adapters.Wcf;
    
    namespace SignalRTest.StreamInsightHost
    {
        class Program
        {
            static void Main(string[] args)
            {
                Console.WriteLine(":: Starting embedded StreamInsight server ::");
    
                //create SI server
                using(Server server = Server.Create("RSEROTERv12"))
                {
                    //create SI application
                    Application app = server.CreateApplication("SeroterSignalR");
    
                    //create input adapter configuration
                    WcfAdapterConfig inConfig = new WcfAdapterConfig()
                    {
                        Password = "",
                        RequireAccessToken = false,
                        Username  = "",
                        ServiceAddress = "http://localhost:80/StreamInsightv12/RSEROTER/InputAdapter"
                    };
    
                    //create output adapter configuration
                    WcfAdapterConfig outConfig = new WcfAdapterConfig()
                    {
                        Password = "",
                        RequireAccessToken = false,
                        Username = "",
                        ServiceAddress = "http://localhost:6412/SignalRTest/NotificationService.svc"
                    };
    
                    //create event stream from the source adapter
                    CepStream<BizEvent> input = CepStream<BizEvent>.Create("BizEventStream", typeof(WcfInputAdapterFactory), inConfig, EventShape.Point);
                    //build initial LINQ query that is a simple passthrough
                    var eventQuery = from i in input
                                     select i;
    
                    //create unbounded SI query that doesn't emit to specific adapter
                    var query0 = eventQuery.ToQuery(app, "BizQueryRaw", string.Empty, EventShape.Point, StreamEventOrder.FullyOrdered);
                    query0.Start();
    
                    //create another query that latches onto previous query
                    //filters out all individual web hits used in later agg query
                    var eventQuery1 = from i in query0.ToStream<BizEvent>()
                                      where i.Category != "Web"
                                      select i;
    
                    //another query that groups events by type; used here for web site hits
                    var eventQuery2 = from i in query0.ToStream<BizEvent>()
                                      group i by i.Category into EventGroup
                                      from win in EventGroup.TumblingWindow(TimeSpan.FromSeconds(10))
                                      select new BizEvent
                                      {
                                          Category = EventGroup.Key,
                                          EventMessage = win.Count().ToString() + " web visits in the past 10 seconds"
                                      };
                    //new query that takes result of previous and just emits web groups
                    var eventQuery3 = from i in eventQuery2
                                      where i.Category == "Web"
                                      select i;
    
                    //create new SI queries bound to WCF output adapter
                    var query1 = eventQuery1.ToQuery(app, "BizQuery1", string.Empty, typeof(WcfOutputAdapterFactory), outConfig, EventShape.Point, StreamEventOrder.FullyOrdered);
                    var query2 = eventQuery3.ToQuery(app, "BizQuery2", string.Empty, typeof(WcfOutputAdapterFactory), outConfig, EventShape.Point, StreamEventOrder.FullyOrdered);
    
                    //start queries
                    query1.Start();
                    query2.Start();
                    Console.WriteLine("Query started. Press [Enter] to stop.");
    
                    Console.ReadLine();
                    //stop all queries
                    query1.Stop();
                    query2.Stop();
                    query0.Stop();
                    Console.Write("Query stopped.");
                    Console.ReadLine();
    
                }
            }
    
            private class BizEvent
            {
                public string Category { get; set; }
                public string EventMessage { get; set; }
            }
        }
    }
    

    Everything is now complete. Let’s move on to testing with a simple event generator that I created.

    Testing the solution

    I built a simple WinForm application that generates business events or a user-defined number of simulated website visits. The business events are passed through StreamInsight, and the website hits are aggregated so that StreamInsight can emit the count of hits every ten seconds.
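
    I haven’t shown the generator code, but conceptually it just calls the WCF input endpoint that the StreamInsight host exposes. Assuming the input adapter uses the same IPointEventReceiver contract shown earlier and a basic HTTP binding (both assumptions on my part), a minimal sender looks something like this:

    using System;
    using System.Collections.Generic;
    using System.ServiceModel;

    using StreamInsight.Samples.Adapters.Wcf;
    using Seroter.SI.AzureAppFabricAdapter;

    class EventGenerator
    {
        static void Main()
        {
            //assumes the input adapter endpoint speaks the same IPointEventReceiver
            //contract shown earlier and listens on a basic HTTP binding
            var factory = new ChannelFactory<IPointEventReceiver>(
                new BasicHttpBinding(),
                new EndpointAddress("http://localhost:80/StreamInsightv12/RSEROTER/InputAdapter"));

            IPointEventReceiver proxy = factory.CreateChannel();

            //build a point event whose payload keys match what the queries expect
            var bizEvent = new WcfPointEvent
            {
                Payload = new Dictionary<string, object>
                {
                    { "Category", "Security" },
                    { "EventMessage", "Production Mail Server offline" }
                }
            };

            ResultCode result = proxy.PublishEvent(bizEvent);
            Console.WriteLine("Publish result: " + result);

            factory.Close();
        }
    }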

    To highlight the SignalR experience, I launched three browser instances with two different group subscriptions. The first two subscribe to all events, and the third one subscribes just to website-based events. For the latter, the client JavaScript function won’t get invoked by the server unless the events are in the “Web” category.

    The screenshot below shows the three browser instances launched (one in IE, two in Chrome).

    2012.03.01signalr06

    Next, I launched my event-generator app and StreamInsight host. I sent in a couple of business (not web) events and hoped to see them show up in two of the browser instances.

    2012.03.01signalr07

    As expected, two of the browser clients were instantly updated with these events, and the other subscriber was not. Next, I sent in a handful of simulated website hit events and observed the results.

    2012.03.01signalr08

    Cool! So all three browser instances were instantly updated with ten-second-counts of website events that were received.

    Summary

    SignalR is an awesome framework for providing real-time, interactive, bi-directional communication between clients and servers. I think there’s a lot of value in using SignalR for dashboards, widgets and event monitoring interfaces. In this post we saw a simple “business event monitor” application that enterprise users could leverage to keep up to date on what’s happening within enterprise systems. I used StreamInsight here, but you could use BizTalk Server or any application that can send events to web services.

    What do you think? Where do you see value for SignalR?

    UPDATE: I’ve made the source code for this project available and you can retrieve it from here.

  • My New Pluralsight Course, “AWS Developer Fundamentals”, Is Now Available

    I just finished designing, building and recording a new course for Pluralsight. I’ve been working with Amazon Web Services (AWS) products for a few years now, and I jumped at the chance to build a course that looked at the AWS services that have significant value for developers. That course is AWS Developer Fundamentals, and it is now online and available for Pluralsight subscribers.

    In this course, I cover the following areas:

    • Compute Services. A walkthrough of EC2 and how to provision and interact with running instances.
    • Storage Services. Here we look at EBS and see examples of adding volumes, creating snapshots, and attaching volumes made from snapshots. We also cover S3 and how to interact with buckets and objects.
    • Database Services. This module covers the Relational Database Service (RDS) with some MySQL demos, SimpleDB and the new DynamoDB.
    • Messaging Services. Here we look at the Simple Queue Service (SQS) and Simple Notification Service (SNS).
    • Management and Deployment. This module covers the administrative components and includes a walkthrough of the Identity and Access Management (IAM) capabilities.

    Each module is chock full of exercises that should help you better understand how AWS services work. Instead of JUST showing you how to interact with services via an SDK, I decided that each set of demos should show how to perform functions using the Management Console, the raw (REST/Query) API, and also the .NET SDK. I think that this gives the student a good sense of all the viable ways to execute AWS commands. Not every application platform has an SDK available for AWS, so seeing the native API in action can be enlightening.
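
    To give you a flavor of the SDK-based demos, here’s roughly what sending a message to an SQS queue looks like with the .NET SDK of that era; the credentials and queue URL are placeholders:

    using System;
    using Amazon;
    using Amazon.SQS;
    using Amazon.SQS.Model;

    class SqsSendExample
    {
        static void Main()
        {
            //placeholder credentials and queue URL
            AmazonSQS sqs = AWSClientFactory.CreateAmazonSQSClient("<accessKey>", "<secretKey>");

            SendMessageRequest request = new SendMessageRequest
            {
                QueueUrl = "https://sqs.us-east-1.amazonaws.com/<account-id>/demo-queue",
                MessageBody = "Hello from the AWS SDK for .NET"
            };

            SendMessageResponse response = sqs.SendMessage(request);
            Console.WriteLine("Message id: " + response.SendMessageResult.MessageId);
        }
    }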

    I hope you take the time to watch it, and if you’re not a Pluralsight subscriber, now’s the time to jump in!

  • Comparing AWS/Box/Azure for Managed File Transfer Provider

    As organizations continue to form fluid partnerships and seek more secure solutions than “give the partner VPN access to our network”, cloud-based managed file transfer (MFT) solutions seem like an important area to investigate. If your company wants to share data with another organization, how do you go about doing it today? Do you leverage existing (aging?) FTP infrastructure? Do you have an internet-facing extranet? Have you used email communication for data transfer?

    All of those previous options will work, but an offsite (cloud-based) storage strategy is attractive for many reasons. Business partners never gain direct access to your systems/environment, the storage in cloud environments is quite elastic to meet growing needs, and cloud providers offer web-friendly APIs that can be used to easily integrate with existing applications. There are downsides related to loss of physical control over data, but there are ways to mitigate this risk through server-side encryption.
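
    As a taste of that API integration, here’s a sketch of pushing a file out to Amazon S3 from .NET (the bucket name, key, credentials, and file path are placeholders). The same handful of lines could run inside an existing line-of-business app, which is what makes these services easy to bolt on:

    using System;
    using Amazon;
    using Amazon.S3;
    using Amazon.S3.Model;

    class PartnerFileDrop
    {
        static void Main()
        {
            //placeholder credentials, bucket, key, and file path
            AmazonS3 s3 = AWSClientFactory.CreateAmazonS3Client("<accessKey>", "<secretKey>");

            PutObjectRequest request = new PutObjectRequest();
            request.WithBucketName("partner-file-exchange")
                   .WithKey("outbound/invoices-2012-03.csv")
                   .WithFilePath(@"C:\data\invoices-2012-03.csv");

            PutObjectResponse response = s3.PutObject(request);
            Console.WriteLine("Uploaded; ETag = " + response.ETag);
        }
    }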

    That said, I took a quick look at three possible options. There are other options besides these, but I’ve got some familiarity with all of these, so it made my life easier to stick to these three. Specifically, I compared the Amazon Web Services S3 service, Box.com (formerly Box.net), and Windows Azure Blob Storage.

    Comparison

    The criteria along the left of the table are primarily from the Wikipedia definition of MFT capabilities, along with a few additional capabilities that I added.

    Feature | Amazon S3 | Box.com | Azure Storage
    Multiple file transfer protocols | HTTP/S (REST, SOAP) | HTTP/S (REST, SOAP) | HTTP/S (REST)
    Secure transfer over encrypted protocols | HTTPS | HTTPS | HTTPS
    Secure storage of files | AES-256 provided | AES-256 provided (for enterprise users) | No out-of-box encryption; up to the developer
    Authenticate users against central factors | AWS Identity & Access Management | Uses Box.com identities, SSO via SAML and ADFS | Through Windows Azure Active Directory (and federation standards like OAuth, SAML)
    Integrate to existing apps with documented API | Rich API | Rich API | Rich API
    Generate reports based on user and file transfer activities | Can set up data access logs | Comprehensive controls | Apparently custom; none found
    Individual file size limit | 5 TB | 2 GB (for business and enterprise users) | 200 GB for block blob, 1 TB for page blob
    Total storage limits | Unlimited | Unlimited (for enterprise users) | 5 PB
    Pricing scheme | Pay monthly for storage, transfer out, requests | Per user | Pay monthly for storage, transfer out, requests
    SLA offered | 99.999999999% durability and 99.99% availability of objects | ? | 99.9% availability
    Other key features | Content expiration policies, versioning, structured storage options | Polished UI tools for users and administrators; integration with apps like Salesforce.com | Access to other Azure services for storage, compute, integration

    Summary

    Overall, there are some nice options out there. Amazon S3 is great for pay-as-you go storage with a very mature foundation and enormous size limits. Windows Azure is new at this, but they provide good identity federation options and good pricing and storage limits. Box.com is clearly the most end-user-friendly option and a serious player in this space. All have good-looking APIs that developers should find easy to work with.

    Have any of you used these platforms for data transfer between organizations?