Category: General Architecture

  • Microsoft CEP Solution Available for Download

    Charles Young points out that Microsoft’s answer for Complex Event Processing is now available for Tech Preview download.  I wrote a summary of what I saw of this solution at TechEd, so now I have to dig through the documentation (separate download for those not brave enough to install the CTP!) and see what’s new.

    Next step, installing and trying to feed BizTalk events into it, or receive events out as BizTalk messages.  Also up next, adding more hours to the day and stopping time.

  • Review: Pro Business Activity Monitoring in BizTalk Server 2009

    I recently finished reading the book Pro Business Activity Monitoring in BizTalk Server 2009 and thought I should write a brief review.

    To start with, this is an extremely well-written, easy to understand investigation of a topic long overdue for a more public unveiling.  Long the secret tool of a BizTalk developer, BAM has never really stretched outside the BizTalk shadow despite its ability to support a wide range of input clients (WF, WCF, custom .NET code).

    This book is organized in a way that first introduces the concepts and use cases of Business Activity Monitoring and then transitions into how to actually accomplish things with the Microsoft BAM platform.  The book closes with an architectural assessment that describes how BAM really works.

    Early in the book, the authors looked at a number of situations (B2B, B2C, CSC, SOA, ESB, BPM, and mashups) and explained the relevance of BAM in each.  This was a wise way to encourage the reader to think about BAM for more than just straight BizTalk solutions.  It also showcased the value of capturing business metrics across applications and tiers.

    The examples in the book were excellent, and one nice touch I liked was that after the VERY first “build a BAM solution” demonstration, there was a solid section explaining how to troubleshoot the various things that might have gone wrong.  Given how often that first demonstration goes wrong for a reader, this was a very thoughtful addition and indicative of the care given to this topic by the authors.

    You’ll also find a quite thorough effort to explain how to use the WCF and WF BAM interceptors, including descriptions of key configuration attributes in addition to numerous examples of those configurations in action.

    The book goes to great lengths to shine a light on aspects of BAM that may have been poorly understood, and it offers concrete ways to address them.  You’ll find suggestions for how to manage the numerous BAM solution artifacts, descriptions of the databases and tables that make up a BAM installation, and one of the only clear write-ups anywhere of the flow of data driven by the SSIS/DTS packages.  The authors also talk about topics such as relationships and continuations, which may not have been clear to developers in the past.

    What else will you find here?  You’ll see how to create all sorts of observation models in Excel, how to exploit the BAM portal or use other reporting environments, how to use either the TPE or the BAM API to feed BAM, a well-explained discussion of archiving, and how to encourage organizational acceptance and adoption of BAM.

    I’d contend that if this book had come out in 2005 (which it could have, given that there have only been a few core changes to the offering since then), you’d see BAM as a mainstream option for Microsoft-based activity monitoring.  That didn’t happen, so countless architects and developers have missed out on a pretty sophisticated architecture that is fairly easy to use.  Will this book change all that?  Probably not, but if you are a BizTalk architect today, or simply find the idea of flexibly modeling, capturing and reporting key business indicators to be compelling, you really should read this book.

  • ESB Toolkit: Executing Multiple Maps In Sequence

    There are a few capabilities advertised in the Microsoft ESB Toolkit for BizTalk Server that I have yet to try out.  One thing that seemed possible, although I hadn’t seen it demonstrated, was the ability to sequentially call a set of BizTalk maps.

    Let’s say that you have maps from “Format1 to Format2” and “Format2 to Format3.”  These are already deployed and running live in production.  Along comes a new scenario where a message comes in and must be transformed from Format1 to Format3.

    There are a few “classic BizTalk” ways to handle this.  First, you could apply one map on the receive port and another on the send port.  Not bad, but this definitely means that this particular receive port can’t be one reused from another solution, as that could cause unintended side effects on others.  Second, you could write an orchestration that takes the inbound message and applies consecutive maps.  This is common, but also requires new bits to be deployed into production.  Third, you could write a new map that directly transforms from Format1 to Format3.  This also requires new bits and may force you to consolidate transformation logic that was unique to each map.

    So what’s the ESB way to do it?  If we see BizTalk as now just a set of services, we can build up an itinerary that directs the bus to execute any number of consecutive maps, each as a distinct service.  This is a cool paradigm that allows me to reuse existing content more freely than before by introducing new ways to connect components that weren’t originally chained together.

    First, we make sure our existing maps are deployed.  In my case, I have two maps that follow the example given above.

    I’ve also gone ahead and created a new receive port/location and send port for this demonstration.  Note that I could have also added a new receive location to an existing receive port.  The ESB service execution is localized to the specific receive location, unlike the “classic BizTalk” model where maps are applied across all of the receive locations.  My dynamic send port has an ESB-friendly subscription.

    We’ll look at the receive location settings in a moment.  First, let’s create the itinerary that makes this magic happen.  The initial shape in our itinerary is the On-Ramp.  Here, I tell the itinerary to use my new receive port.

    Next, I set up a messaging service that the Off-Ramp will use to get its destination URI.  In my  case, I used a STATIC resolver that exploits the FILE adapter and specifies a valid file path.

    Now the games begin.  I next added a new messaging service which is used for transformation.  I set another STATIC resolver, and chose the “Format1 to Format2” map deployed in my application.

    Then we add yet another transformation messaging service, this time telling the STATIC resolver to apply the “Format2 to Format3” map.

    Great.  Finally, we need an Off-Ramp.  We then associate the three previous shapes (messaging service and two transformation services) with this Off-Ramp.  Be sure to verify that the order of transformation resolvers is correct in the Off-Ramp.  You don’t want to accidentally execute the “Format2 to Format3” map first!

    Once our itinerary is connected up and ready to roll, we switch the itinerary status to “deployed” in the itinerary’s property window.  This ensures that the ESB runtime can find this itinerary when it needs it.  To publish the itinerary to the common database, simply choose “Export Model.”

    Fantastic.  Now let’s make sure our BizTalk messaging components are up to snuff.  First, open the FILE receive location and make sure that the ItinerarySelectReceiveXml pipeline is chosen.  Then open the pipeline configuration window and set the resolver key and resolver string.  The itinerary fact key is usually “Resolver.Itinerary” (which tells the pipeline in which resolver object property to find the XML itinerary content) and the resolver connection string itself is ITINERARY-STATIC:\\name=DoubleMap;  The ITINERARY-STATIC directive enables me to do server-side itinerary lookup.  It’ll use the name provided to find my itinerary record in the database and yank out the XML content.  Note that I used a FILE receive location here.  These ESB pipeline components can be used with ANY inbound adapter, which really increases the avenues for publishing itinerary-bound messages to the bus.

    Finally, go to the dynamic send port and make sure the ItinerarySendPassthrough pipeline is chosen.  We need to ensure that the ESB services (like transformation) have a context in which to run.  If you only had the standard passthrough pipeline selected here, you’d be removing the environment (pipelines) in which the ESB components do much of their work.

    That is it.  If we drop a “Format1” message in, we get a “Format3” message out.  And all of this, POTENTIALLY, without deploying a single new BizTalk component.  That said, you may still need to create a new dynamic send port if you don’t already have one to reuse, and would probably want to create a new receive location.  OR, if the itinerary was being looked up via the business rules engine (BRI resolver), then you could just update the existing business rule.  Either way, this is a pretty quick and easy way to do something that wasn’t quick and easy before.

  • Can Software Architecture Attributes Also Be Applied to Business Processes?

    I’m in San Diego attending the Drug Information Association conference with the goal of getting smarter in the functional areas that make up a bio-pharma company.  I’m here with two exceptional architecture colleagues which means that most meals have consisted of us talking shop. 

    During dinner tonight, we were discussing the importance (or imperative, really) of having a central business champion that can envision what they need and communicate that vision to the technical team.  The technical team shouldn’t be telling the business what their vision is.

    Within that conversation, we talked about the value of having good business analysts who deeply understand the business and are in a position to offer actual improvements to the processes they uncover and document.  I then asked if it’s valid to hijack many of the attributes that architects think about in the confines of a technical solution and have a business analyst apply them to a business process.  Maybe it’s crazy, but on first pass, most of the solutions architecture things I spend my day thinking about have a direct correlation to what a good business process should address or mitigate as well:

    • Scalability.  How well does my process handle an increase in input requests?  Is it built to allow for us to ramp up personnel or are there eventual bottlenecks we need to consider now?
    • Flexibility.  Can my process support modifications in sequencing or personnel?  Or did we define a process that only works in a rigid order with little room for the slightest tweak?
    • Reusability. Is the process modular enough that an entire series of steps could be leveraged by another organization that has an identical need?
    • Encapsulation.  If I’ve chained processes together, have I insulated each one from another so that fundamental internal modifications to one process don’t necessarily force a remodeling of a connected process?
    • Security.  Have I defined the explicit roles of the users in my process and identified who can see (or act on) what information as the process moves through its lifecycle?
    • Maintainability.  Is the process efficient and supportable in the long term?
    • Availability.  If someone is sick for two weeks, does the process grind to a halt?  What if a key step in the process itself cannot be completed for a given period of time?  What’s the impact of that?
    • Concurrency.  What happens if multiple people want to work on different pieces of the same process simultaneously?  Should the process support this or does it require a sequential flow?
    • Globalization/localization.  Can this process be applied to a global work force or conversely, does the process allow for local nuances and modifications to be added?

    Just like with solutions architecture, where you often may trade one attribute for another (e.g. “I’ll pick a solution that gives up efficiency because I demand extreme flexibility”), the same can apply to a well-considered business process. 

    So what do you think?  Do the business analysts you work with think along these lines?  Are we properly “future-proofing” our business processes or are we simply documenting the status quo without enough rigor around quality attributes and a vision around the inevitable business/industry changes?  I’ll admit that I haven’t done a significant amount of business process modeling in my career so maybe everyone already does this.  But, I haven’t seen much of this type of analysis in my current environment.

    Or, I just ate too much chicken tikka masala tonight and am completely whacked out.

  • Books I’ve Recently Finished Reading

    Other obligations have quieted down over the past two months and I’ve been able to get back to some voracious reading.  I thought I’d point out a few of the books that I’ve recently knocked out, and let you know what I think of them.

    • SOA Governance.  This is a book by Todd Biske, published by my book’s publisher, Packt.  It follows a make-believe company through their efforts to establish SOA best practices at their organization.  Now, that doesn’t mean that the book reads like a novel, but this isn’t a “reference book” to me as much as an “ideas” book.  When I finished it, I had a better sense of the behavioral changes, roles required and processes that I should consider when evangelizing SOA behavior in my own company.  Todd does a good job identifying the underlying motivations of the people that will enable SOA to succeed or fail within a company.  You’ll find some useful thinking around identifying the “right” services, versioning considerations, SLA definition, and even some useful checklists to verify if you’re asking the right questions at each phase of the service lifecycle.  Whether you’re “doing SOA” or not, this is an easy read that can help you better digest the needs of stakeholders in an enterprise software solution.
    • Mashup Patterns: Designs and Examples for the Modern Enterprise.  I’ve been spending a fair amount of time digging into mashups lately, and it was great to see a book on the topic come out.  The author breaks down the key aspects of designing a mashup (harvesting data, enriching data, assembling results and managing the deliverable).  Each of the 30+ patterns consists of: (a) a problem statement that describes the issue at hand, (b) a conceptual solution to the problem, (c) a “fragility score” which indicates how brittle the solution is, and (d) two or more examples where this solution is applied to a very specific case.  The examples for each pattern are where I found the most value.  They helped drive home the problem being solved and provided a bit more meat on the conceptual solution being offered.  That said, don’t expect this book to tell you WHAT can help you create these solutions.  There is very much the tone of “we just need to get this data from here, combine it with this, and even our business analyst can do it!”  However, nowhere does the author dig into how all this MAGIC really happens (e.g. products, tools, etc).  That was the only weakness of the book to me.  Otherwise, this was quite a well put together book that added a few things to my arsenal of options when architecting solutions.
    • Thank You for Arguing: What Aristotle, Lincoln, and Homer Simpson Can Teach Us About the Art of Persuasion.  I really enjoyed reading this.  In essence, it’s a look at the lost art of rhetoric and covers a wide set of tools we can use to better frame an argument and win it.  The author has a great sense of humor and I found myself actually taking notes while reading the book (which I never really do).  There is a mix of common sense techniques for setting up your own case, but I also found the parts outlining how to spot a bad argument quite interesting.  So, if you want to get noticeably better at persuading others and also become more effective at identifying when someone’s trying to bamboozle you, definitely pick this up.
    • Leaving Microsoft to Change the World.  A co-worker suggested this book to me.  It’s the story of John Wood, a former Microsoft executive during the 90s glory days, who chucked his comfortable lifestyle and started a non-profit organization (Room to Read) with the mission of improving education in the poorest countries in the world.  John’s epiphany came during a backpacking trip through Nepal and seeing the shocking lack of reading materials available to kids who desperately wanted to learn and lift themselves out of poverty.  Even if the topic doesn’t move you, this book has a fascinating look at how to start up a global organization with a focused objective and a shoestring budget.  This is one of those “perspective books” that I try and make sure I read from time to time.
    • Microsoft .NET: Architecting Applications for the Enterprise.  I actually had this book sent to me by a friend at Microsoft.  Authored by Dino Esposito and Andrea Saltarello, this is an excellent look at software architecture.  It starts off with a very clear summary of what architecture really is, and raised a point that struck home for me: architecture should be about the “hard decisions.”  An architect isn’t typically going to get into the weeds on every project, but instead should be seeking out the trickiest or most critical parts of a proposed solution and focusing their energies there.  The book contains a good summary of core architecture patterns and spends much of the time digging into how to design a business layer, data access layer, service layer, and presentation layer.  Clearly this book has a Microsoft bent, but don’t discount it as a valid introduction to architecture for any technologist.  The authors address a wide set of core, technology-agnostic principles in a well-written fashion.

    I’m trying to queue up some books for my company’s annual “summer shutdown” and always looking for suggestions.   Technology, sports, erotic thrillers, you name it.

  • Interview Series: Four Questions With … Charles Young

    This month’s chat in my ongoing series of discussions with “connected systems” thought leaders is with Charles Young.  Charles is a steady blogger, Microsoft MVP, consultant for Solidsoft Ltd,  and all-around exceptional technologist. 

    Those of you who read Charles’ blog regularly know that he is famous for his articles of staggering depth which leave the reader both exhausted and noticeably smarter.  That’s a fair trade off to me.

    Let’s see how Charles fares as he tackles my Four Questions.

    Q: I was thrilled that you were a technical reviewer of my recent book on applying SOA patterns to BizTalk solutions.  Was there anything new that you learned during the read of my drafts?  Related to the book’s topic, how do you convince EAI-oriented BizTalk developers to think in a more “service bus” sort of way?

    A: Well, actually, it was very useful to read the book.    I haven’t really had as much real-world experience as I would like of using the WCF features introduced in BTS 2006 R2.   The book has a lot of really useful tips and potential pitfalls that are, I assume, drawn from real life experience.    That kind of information is hugely valuable to readers…and reviewers.

    With regard to service buses, developers tend to be very wary of TLAs like ‘ESB’.  My experience has been that IT management are often quicker to understand the potential benefits of implementing service bus patterns, and that it is the developers who take some convincing.  IT managers and architects are thinking about overall strategy, whereas the developers are wondering how they are going to deliver on the requirements of the current project.  I generally emphasise that ‘ESB’ is about two things: first, it is about looking at the bigger picture, understanding how you can exploit BizTalk effectively alongside other technologies like WCF and WF to get synergy between these different technologies, and second, it is about first-class exploitation of the more dynamic capabilities of BizTalk Server.  If the BizTalk developer is experienced, they will understand that the more straightforward approaches they use often fail to eliminate some of the more subtle coupling that may exist between different parts of their BizTalk solution.  Relating ESB to previously-experienced pain is often a good way to go.

    Another consideration is that, although BizTalk has very powerful dynamic capabilities, the basic product hasn’t previously provided the kind of additional tooling and metaphors that make it easy to ‘think’ and implement ESB patterns.   Developers have enough on their plates already without having to hand-craft additional code to do things like endpoint resolution.   That’s why the ESB Toolkit (due for a new release in a few weeks) is so important to BizTalk, and why, although it’s open source, Microsoft are treating it as part of the product.   You need these kinds of frameworks if you are going to convince BizTalk developers to ‘think’ ESB.

    Q: You’ve written extensively on the fundamentals of business rules and recently published a thorough assessment of complex event processing (CEP) principles.  These are two areas that a Microsoft-centric manager/developer/architect may be relatively unfamiliar with, given Microsoft’s limited investment in these spaces (so far).  Including these, if you’d like, what are some industry technologies that interest you but don’t have much mind share in the Microsoft world yet?  How do you describe these to others?

    A: Well, I’ve had something of a focus on rules for some time, and more recently I’ve got very interested in CEP, which is, in part, a rules-based approach.  Rule processing is a huge subject.  People get lost in the detail of different types of rules and different applications of rule processing.  There is also a degree of cynicism about using specialised tooling to handle rules.  The point, though, is that the ability to automate business processes makes little sense unless you have a first-class capability to externalise business and technical policies and cleanly separate them from your process models, workflows and integration layers.  Failure to separate policy leads directly to the kind of coupling that plagues so many solutions.  When a policy changes, huge expense is incurred in having to amend and change the implemented business processes, even though the process model may not have changed at all.  So, with my technical architect’s hat on, rule processing technology is about effective separation of concerns.

    If readers remain unconvinced about the importance of rules processing, consider that BizTalk Server is built four-square on a rules engine – we call it the ‘pub-sub’ subscription model which is exploited via the message agent.  It is fundamental to decoupling of services and systems in BizTalk.  Subscription rules are externalised and held in a set of database tables.  BizTalk Server provides a wide range of facilities via its development and administrative tools for configuring and managing subscription rules.  A really interesting feature is the way that BizTalk Server injects subscription rules dynamically into the run-time environment to handle things like correlation onto existing orchestration instances.

    Externalisation of rules is enabled through the use of good frameworks, repositories and tooling.    There is a sense in which rule engine technology itself is of secondary importance.   Unfortunately, no one has yet quite worked out how to fully separate the representation of rules from the technology that is used to process and apply rules.   MS BRE uses the Rete algorithm.   WF Rules adopts a sequential approach with optional forward chaining.    My argument has been that there is little point in Microsoft investing in a rules processing technology (say WF Rules) unless they are also prepared to invest in the frameworks, tooling and repositories that enable effective use of rules engines.

    As far as CEP is concerned, I can’t do justice to that subject here.  CEP is all about the value bound up in the inferences we can draw from analysis of diverse events.  Events, themselves, are fundamental to human experience, locked as we are in time and space.  Today, CEP is chiefly associated with distinct verticals – algorithmic trading systems in investment banks, RFID-based manufacturing processes, etc.  Tomorrow, I expect it will have increasingly wider application alongside various forms of analytics, knowledge-based systems and advanced processing.  Ironically, this will only happen if we figure out how to make it really simple to deal with complexity.  If we do that, then with the massive amount of cheap computing resource that will be available in the next few years, all kinds of approaches that used to be niche interests, or which were pursued only in academia, will begin to come together and enter the mainstream.  When customers start clamouring for CEP facilities and advanced analytics in order to remain competitive, companies like Microsoft will start to deliver.  It’s already beginning to happen.

    Q: If we assume that good architects (like yourself) do not live in a world of uncompromising absolutes, but rather understand that the answer to most technical questions contains “it depends”, what is an example of a BizTalk solution you’ve built that might raise the eyebrows of those without proper context, but makes total sense given the client scenario?

    A: It would have been easier to answer the opposite question.   I can think of one or two BizTalk applications where I wish I had designed things differently, but where no one has ever raised an eyebrow.   If it works, no one tends to complain!

    To answer your question, though, one of the more complex designs I worked on was for a scenario where the BizTalk system had only to handle a few hundred distinct activities a day, but where an individual message might represent a transaction worth many millions of pounds (I’m UK-based).  The complexity lay in the many different processes and sub-processes that were involved in handling different transactions and business lines, the fact that each business activity involved a redemption period that might extend for a few days, or as long as a year, and the likelihood that parts of the process would change during that period, requiring dynamic decisions to be made as to exactly which version of which sub-process must be invoked in any given situation.  The process design was labyrinthine, but we needed to ensure that the implementation of the automated processes was entirely conformant to the detailed process designs provided by the business analysts.  That meant traceability, not just in terms of runtime messages and processing, but also in terms of mapping orchestration implementation directly back to the higher-level process definitions.  I therefore took the view that the best design was a deeply layered approach in which top-level orchestrations were constructed with little more than Group and Send orchestration shapes, together with some Decision and Loop shapes, in order to mirror the highest-level process definition diagrams as closely as possible.  These top-level orchestrations would then call into the next layer of orchestrations which again closely resembled process definition diagrams at the next level of detail.  This pattern was repeated to create a relatively deep tree structure of orchestrations that had to be navigated in order to get to the finest level of functional granularity.  Because the non-functional requirements were so light-weight (a very low volume of messages with no need for sub-second responses, or anything like that), and because the emphasis was on correctness and strict conformance to process definition and policy, I traded the complexity of this deep structure against the ability to trace very precisely from requirements and design through to implementation and the facility to dynamically resolve exactly which version of which sub-process would be invoked in any given situation using business rules.

    I’ve never designed any other BizTalk application in quite the same way, and I think anyone taking a casual look at it would wonder which planet I hail from.   I’m the first to admit the design looked horribly over-engineered, but I would strongly maintain that it was the most appropriate approach given the requirements.   Actually, thinking about it, there was one other project where I initially came up with something like a mini-version of that design.   In the end, we discovered that the true requirements were not as complex as the organisation had originally believed, and the design was therefore greatly simplified…by a colleague of mine…who never lets me forget!

    Q [stupid question]: While I’ve followed Twitter’s progress since the beginning, I’ve resisted signing up for as long as I can. You on the other hand have taken the plunge.  While there is value to be extracted by this type of service, it’s also ripe for the surreal and ridiculous (e.g. Tweets sent from toilets, a cat with 500,000 followers).  Provide an example of a made-up silly use of a Twitter account.

    A: I resisted Twitter for ages.  Now I’m hooked.  It’s a benign form of telepathy – you listen in on other people’s thoughts, but only on their terms.  My suggestion for a Twitter application?  Well, that would have to be marrying Wolfram|Alpha to Twitter, using CEP and rules engine technologies, of course.  Instead of waiting for Wolfram and his team to manually add enough sources of general knowledge to make his system in any way useful to the average person, I envisage a radical departure in which knowledge is derived by direct inference drawn from the vast number of Twitter ‘events’ that are available.  Each tweet represents a discrete happening in the domain of human consciousness, allowing us to tap directly into the very heart of the global cerebral cortex.  All Wolfram’s team need to do is spend their days composing as many Twitter searches as they can think of and plugging them into a CEP engine together with some clever inferencing rules.  The result will be a vast stream of knowledge that will emerge ready for direct delivery via Wolfram’s computation engine.  Instead of being limited to comparative analysis of the average height of people in different countries whose second name starts with “P”, this vastly expanded knowledge base will draw only on information that has proven relevance to the human race – tweet epigrams, amusing web sites to visit, ‘succinct’ ideas for politicians to ponder and endless insight into the lives of celebrities.

    Thanks Charles.  Insightful as always.

  • Delivering and Surviving a Project’s Architecture Peer Review

    Architecture reviews at my company are brutal.  Not brutal in a bad way, per se, but in the sense that if you are not completely prepared and organized, you’ll leave with a slight limp and self doubt that you know anything about anything at any time, ever.

    So what should an architect do when preparing to share their project and corresponding design with a group of their distinguished peers?  I’ve compiled a short list that stems from my own failings as well as observations from the architecture bloodbaths involving other victims.

    During the Project

    • Be a critical thinker on your project.  A vital part of the architect’s job in the early phases of a project is to challenge both assumptions and initial strategies.   This can be difficult when the architect is deeply embedded within a project team and begins to lose a bit of perspective about the overall enterprise architecture objectives.  It’s our responsibility to always wear the “architect” hat (and not slide into the “generic team member” hat) and keep a close watch on where the solution to the business problem is heading.
    • Understand the reasons behind the vision and requirements of the project.  If an architect blindly accepts the scope and requirements of a project, there is a great chance that they will miss an opportunity to offer improvements or force the team to dig further into the need for a particular capability request or even the project itself.  We can only confidently know that a new system will actually satisfy business complaints if we’ve fully digested their underlying needs.  What’s the business problem?  What is the current solution failing to do?  I’ve come across many cases where we eventually discovered that a delivered requirement was actually either (a) something to address a negative behavior of the legacy solution that wouldn’t be relevant in a new solution, (b) a technology requirement for what is actually a business process problem, or (c) a requirement that was dictating a solution versus addressing a core business issue.  We can only determine the validity of a requirement by fully understanding the context of the problem and the stakeholders involved. 
    • Know the project’s team structure, timeline and work streams.  The architect needs to intimately know who the key members of the team are, what their roles are, and the overall plan for project delivery.  This helps us better align with other enterprise initiatives as well as communicate when important parts of the solution will begin to come online.

    Preparing and Delivering the Review

    • Know your audience and their expertise.   Our architecture team contains serious horsepower in the areas of software engineering, computer science, infrastructure, data architecture, collaboration, security strategy and process modeling.  This means that you can’t get away with glossing over areas relevant to an attendee or presenting half-baked concepts that bastardize a particular discipline.
    • Explain the business vision and what role technology plays in solving the problem.   One of the key objectives of an architecture peer review is sharing the business problem and providing enough context about the core issues being addressed to effectively explain why you’ve proposed a particular solution.  Since most of us know that technology is typically not the only part of the problem, it’s important to call out the role of process improvement and logistics changes in a proposed solution.
    • Don’t include any “fluff” slides or slides with content you don’t agree with.  I’ve learned the hard way to not just inject individual slides from decks authored by my business counterparts unless I am fully on board with the content they produced.   A good architecture team is looking for meaty content that makes them think, not vaguely relevant “business value” slides that contain non-measurable metrics or items I can’t share with a straight face.
    • Be direct in any bullet points on the slides.  Don’t beat around the bush in your slide bullets.  Instead of saying “potential challenges with sourcing desired skill sets” you should say “we don’t have the right developers in place.”  Or, saying something like “Solution will be more user friendly” means nothing, while “Solution will be built on a modern web platform that won’t crash daily and consolidates redundant features into a small set of application pages” actually begins to convey what you’re shooting for. 
    • Carefully select your vocabulary so as not to misuse terms.  When you have an experienced set of architects in the room, you have little wiggle room for using overloaded or inappropriate terms.  For instance, I could use the term “data management” in a presentation to my project team without cause for alarm, but our data architects have a clear definition for “data management” that is NOT some sort of catch-all for data-related activities.  In an architecture meeting, terms like authentication, high availability, disaster recovery, reporting, reusability or service all have distinct meanings that must be properly used.
    • Highlight the key business processes and system-oriented use cases.  As you begin to convey the actual aspects of the solution, focus on the business processes that represent what this thing is supposed to do.  Of even more interest to this particular audience, highlight the system use cases and what the expected capabilities are.  This properly frames the capabilities you need and helps the audience think about options for addressing them.
    • Show a system dependency diagram.  Since members of an architecture team are typically dispersed among projects all across the organization, they need to see where your solution fits in the enterprise landscape.  Show your solution and at least the first level of systems that you depend on, or that depend on you. 
    • Know the specific types and formats of data that make up the solution.  You can’t only say that this solution works with lots of data.  What data?  What entities, formats, sizes, sources are we talking about?  Are these enterprise defined entities, local entities, entities that MAY be reusable outside the project?
    • Explain critical interfaces.  What are the key interfaces within the system and between this new system and existing ones?  We need to share the direction, data, and strategy for exposing and consuming both data and functionality.  It’s important to identify which interfaces are new ones vs. reused ones.
    • Spell out the key data sharing strategies employed.  The means for HOW you actually share data is an absolutely critical part of the architect’s job.  Are you sharing data through batch processing or a message bus? Or are you relying on a shared operational data store (ODS) that stores aggregated entities?  Whether you share data through distributed copies, ODSs, or virtual aggregation services has a large impact on this system as well as other systems in the enterprise.
    • List off existing in-house platforms and technologies that are being leveraged.  This helps us outline what the software dependencies are, and which services or systems we were able to get reuse out of.  It also creates discussion around why those platforms were chosen and alternatives that may offer a more effective solution.
    • Outline the core design constraints and strategy.  This is arguably the most important part of the review.  We need to point out the “hard questions” and what our answers were.  As for “constraints”, what are the confines in which we must build a solution?  Do we have to use a specific product purchased by the business team?  Are users of the system geographically dispersed and on mobile devices?  Is the delivery timeline hyper-aggressive, requiring a simplified approach?  My strategy for a given solution reveals how I’ve answered the “hard questions”, which options I considered, and how I reached my conclusions.

      There you go.  The primary reason that I enjoy my job so much is because I work with so many people smarter than me.  Architecture reviews are a healthy part of the architect’s growth and development and only make us better on the next project.  To make my future architecture reviews less frightening, I’m considering a complementary strategy of “donuts for all” which should put my peers into a loopy, sugar-induced coma and enable me to sail through my presentation.

  • Building a RESTful Cloud Service Using .NET Services

    One of the many action items I took away from last week’s TechEd was to spend some time with the latest release of the .NET Services portion of the Azure platform from Microsoft.  I saw Aaron Skonnard demonstrate an example of a RESTful, anonymous cloud service, and I thought that I should try and build the same thing myself.  As an aside, if you’re looking for a nice recap of the “connected system” sessions at TechEd, check out Kent Weare’s great series (Day1, Day2, Day3, Day4, Day5).

    So what I want is a service, hosted on my desktop machine, to be publicly available on the internet via .NET Services.  I’ve taken the SOAP-based “Echo” example from the .NET Services SDK and tried to build something just like that in a RESTful fashion.

    First, I needed to define a standard WCF contract that has the attributes needed for a RESTful service.

    using System.ServiceModel;
    using System.ServiceModel.Web;
    
    namespace RESTfulEcho
    {
        [ServiceContract(
            Name = "IRESTfulEchoContract", 
            Namespace = "http://www.seroter.com/samples")]
        public interface IRESTfulEchoContract
        {
            [OperationContract()]
            [WebGet(UriTemplate = "/Name/{input}")]
            string Echo(string input);
        }
    }
    

    In this case, my UriTemplate attribute means that something like http://<service path>/Name/Richard should result in the value of “Richard” being passed into the service operation.

    Next, I built an implementation of the above service contract where I simply echo back the name passed in via the URI.

    using System;
    using System.ServiceModel;
    
    namespace RESTfulEcho
    {
        [ServiceBehavior(
            Name = "RESTfulEchoService", 
            Namespace = "http://www.seroter.com/samples")]
        class RESTfulEchoService : IRESTfulEchoContract
        {
            public string Echo(string input)
            {
                //write to service console
                Console.WriteLine("Input name is " + input);
    
                //send back to caller
                return string.Format(
                    "Thanks for calling Richard's computer, {0}", 
                    input);
            }
        }
    }
    

    Now I need a console application to act as my “on premises” service host.  The .NET Services Relay in the cloud will accept the inbound requests, and securely forward them to my machine which is nestled deep within a corporate firewall.   On this first pass, I will use a minimum amount of service code which doesn’t even explicitly include service host credential logic.

    using System;
    using System.ServiceModel;
    using System.ServiceModel.Web;
    using System.ServiceModel.Description;
    using Microsoft.ServiceBus;
    
    namespace RESTfulEcho
    {
        class Program
        {
            static void Main(string[] args)
            {
                Console.WriteLine("Host starting ...");
    
                Console.Write("Your Solution Name: ");
                string solutionName = Console.ReadLine();
    
                // create the endpoint address in the solution's namespace
                Uri address = ServiceBusEnvironment.CreateServiceUri(
                    "http", 
                    solutionName, 
                    "RESTfulEchoService");
    
                //make sure to use WEBservicehost
                WebServiceHost host = new WebServiceHost(
                    typeof(RESTfulEchoService), 
                    address);
    
                host.Open();
    
                Console.WriteLine("Service address: " + address);
                Console.WriteLine("Press [Enter] to close ...");
    
                Console.ReadLine();
    
                host.Close();
            }
        }
    }
    

    So what did I do there?  First, I asked the user for the solution name.  This is the name of the solution set up when you register for your .NET Services account.

    Once I have that solution name, I use the Service Bus API to create the URI of the cloud service.  Based on the name of my solution and service, the URI should be:

    http://richardseroter.servicebus.windows.net/RESTfulEchoService

    Note that the URI template I set up in the initial contract means that a fully exercised URI would look like:

    http://richardseroter.servicebus.windows.net/RESTfulEchoService/Name/Richard

    Next, I created an instance of the WebServiceHost.  Do not use the standard “ServiceHost” object for a RESTful service.  Otherwise you’ll be like me and waste way too much time trying to figure out why things didn’t work.  Finally, I open the host and print out the service address to the console.

    Now, nowhere in there are my .NET Services credentials applied.  Does this mean that I’ve just allowed ANYONE to host a service on my solution?  Nope.  The Service Bus Relay service requires authentication/authorization and if none is provided here, then a Windows CardSpace card is demanded when the host is started up.  In my Access Control Service settings, you can see that I have a Windows CardSpace card associated with my .NET Services account.

    Finally, I need to set up my service configuration file to use the new .NET Services WCF bindings that know how to securely communicate with the cloud (and hide all the messy details from me).  My straightforward  configuration file looks like this:

    <configuration>
      <system.serviceModel>
        <bindings>
          <webHttpRelayBinding>
            <binding name="default" openTimeout="00:02:00">
              <security relayClientAuthenticationType="None" />
            </binding>
          </webHttpRelayBinding>
        </bindings>
        <services>
          <service name="RESTfulEcho.RESTfulEchoService">
            <endpoint name="RelayEndpoint"
                      address=""
                      contract="RESTfulEcho.IRESTfulEchoContract"
                      binding="webHttpRelayBinding"
                      bindingConfiguration="default" />
          </service>
        </services>
      </system.serviceModel>
    </configuration>
    

    A few things to point out here.  First, notice that I use the webHttpRelayBinding for the service.  Besides my on-premises host, this is the first mention of anything cloud-related.  Also see that I explicitly created a binding configuration for this service and modified the timeout value from the default of 1 minute up to 2 minutes.  If I didn’t do this, I occasionally got an “Unable to establish Web Stream” error.  Finally, and most importantly to this scenario, see that relayClientAuthenticationType is set to None, which means that this service can be invoked anonymously.

    So what happens when I press “F5” in Visual Studio?  After first typing in my solution name, I am asked to choose a Windows CardSpace card that is valid for this .NET Services account.  Once selected, those credentials are sent to the cloud and the private connection between the Relay and my local application is established.


    I can now open a browser and ping this public internet-addressable space and see a value (my dog’s name) returned to the caller, and the value printed in my local console application.
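
    If you’d rather hit the endpoint from code than from a browser, any plain HTTP client should do the trick since the relayed endpoint is just a URL.  Here’s a minimal sketch, assuming the anonymous configuration shown above and using the URI format from earlier:

    using System;
    using System.Net;
    
    namespace RESTfulEcho
    {
        class EchoClient
        {
            static void Main(string[] args)
            {
                // the relayed, anonymous endpoint defined by the UriTemplate above
                string uri = "http://richardseroter.servicebus.windows.net/RESTfulEchoService/Name/Richard";
    
                using (WebClient client = new WebClient())
                {
                    // simple HTTP GET; no Service Bus assemblies needed on the caller's side
                    Console.WriteLine(client.DownloadString(uri));
                }
            }
        }
    }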

    Neato.  That really is something pretty amazing when you think about it.  I can securely unlock resources that cannot (or should not) be put into my organization’s DMZ, but are still valuable to parties outside our local network.

    Now, what happens if I don’t want to use Windows CardSpace for authentication?  No problem.  For now (until .NET Services is actually released and full ADFS federation is possible with Geneva), the next easiest thing to do is apply username/password authorization.  I updated my host application so that I explicitly set the transport credentials:

    static void Main(string[] args)
    {
        Console.WriteLine("Host starting ...");
    
        Console.Write("Your Solution Name: ");
        string solutionName = Console.ReadLine();
        Console.Write("Your Solution Password: ");
        string solutionPassword = ReadPassword();
    
        // create the endpoint address in the solution's namespace
        Uri address = ServiceBusEnvironment.CreateServiceUri(
            "http", 
            solutionName, 
            "RESTfulEchoService");
    
        // create the credentials object for the endpoint
        TransportClientEndpointBehavior userNamePasswordServiceBusCredential = 
            new TransportClientEndpointBehavior();
        userNamePasswordServiceBusCredential.CredentialType = 
            TransportClientCredentialType.UserNamePassword;
        userNamePasswordServiceBusCredential.Credentials.UserName.UserName = 
            solutionName;
        userNamePasswordServiceBusCredential.Credentials.UserName.Password = 
            solutionPassword;
    
        //make sure to use WEBservicehost
        WebServiceHost host = new WebServiceHost(
            typeof(RESTfulEchoService), 
            address);
        host.Description.Endpoints[0].Behaviors.Add(
            userNamePasswordServiceBusCredential);
    
        host.Open();
    
        Console.WriteLine("Service address: " + address);
        Console.WriteLine("Press [Enter] to close ...");
    
        Console.ReadLine();
    
        host.Close();
    }
    

    Now, I have a behavior explicitly added to the service which contains the credentials needed to successfully bind my local service host to the cloud provider.  When I start the local host again, I am prompted to enter credentials into the console.  Nice.
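
    One note on the snippet above: ReadPassword() is just a small console helper (the SDK samples ship one), not part of the Service Bus API.  If you need a stand-in, a minimal sketch that reads keystrokes without echoing them could look like this (it also assumes using System.Text for the StringBuilder):

    // minimal ReadPassword helper for the Program class above
    static string ReadPassword()
    {
        // read keys without echoing them to the console
        StringBuilder buffer = new StringBuilder();
        ConsoleKeyInfo key = Console.ReadKey(true);
    
        while (key.Key != ConsoleKey.Enter)
        {
            if (key.Key == ConsoleKey.Backspace && buffer.Length > 0)
            {
                buffer.Remove(buffer.Length - 1, 1);
            }
            else if (!char.IsControl(key.KeyChar))
            {
                buffer.Append(key.KeyChar);
            }
            key = Console.ReadKey(true);
        }
    
        Console.WriteLine();
        return buffer.ToString();
    }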

    One last note.  It’s probably stupidity or ignorance on my part, but I was hoping that, like the other .NET Services binding types, I could attach a ServiceRegistrySettings behavior to my host application.  This is what allows me to add my service to the ATOM feed of available services that .NET Services exposes to the world.  However, every time that I add this behavior to my service endpoint above, my service starts up but fails whenever I call it.  I don’t currently have the motivation to solve that one, but if there are restrictions on which bindings can be added to the ATOM feed, that’d be nice to know.
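
    For reference, this is roughly what I was attempting in the host, just before host.Open().  Treat it as a sketch only: the property names are how I remember them from the SDK samples, so consider them assumptions rather than gospel.

    // attempted registry publication for the relay endpoint (this is the piece that
    // currently breaks my service calls); ServiceRegistrySettings comes from Microsoft.ServiceBus
    ServiceRegistrySettings registrySettings = new ServiceRegistrySettings();
    registrySettings.DiscoveryMode = DiscoveryType.Public;   // list the endpoint in the public ATOM feed
    registrySettings.DisplayName = "RESTfulEchoService";     // hypothetical display name
    
    host.Description.Endpoints[0].Behaviors.Add(registrySettings);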

    So, there we have it.  I have an application sitting on my desktop and, if it’s running, anyone in the world could call it.  While that would make our information security team pass out, they should be aware that this is a very secure way to expose this service since the cloud-based relay has hidden all the details of my on-premises application.  All the public consumer knows is a URI in the cloud that the .NET Services Relay then bounces to my local app.

    As I get the chance to play with the latest bits in this release of .NET Services, I’ll make sure to post my findings.

  • TechEd 2009: Day 2 Session Notes (CEP First Look!)

    Missed the first session since Los Angeles traffic is comical and I thought “side streets” was a better strategy than sitting still on the freeway.  I was wrong.

    Attended a few sessions today, with the highlight for me being the new complex event processing engine that’s part of SQL Server 2008 R2.  Find my notes below from today’s session.

    BizTalk Goes Mobile : Collecting Physical World Events from Mobile Devices

    I have admittedly spent virtually no time looking at the BizTalk RFID bits, but working for a pharma company, there are plenty of opportunities to introduce supply chain optimization that both increase efficiency and better ensure patient safety.

    • You have the “systems world” where things are described (how many items exist, attributes), but there is the “real world” where physical things actually exist
      • Can’t find products even though you know they are in the store somewhere
      • Retailers having to close their stores to “do inventory” because they don’t know what they actually have
    • Trends
      • 10 percent of patients given wrong medication
      • 13 percent of US orders have wrong item or quantity
    • RFID
      • Provide real time visibility into physical world assets
      • Put unique identifier on every object
        • E.g. tag on device in box that syncs with receipt so can know if object returned in a box actually matches the product ordered (prevent fraud)
      • Real time observation system for physical world
      • Everything that moves can be tracked
    • BizTalk RFID Server
      • Collects edge events
      • Mobile piece runs on mobile devices and feeds the server
      • Manage and monitor devices
      • Out of the box event handlers for SQL, BRE, web services
      • Direct integration with BizTalk to leverage adapters, orchestration, etc
      • Extendible driver model for developers
      • Clients support “store and forward” model
    • Supply Chain Demonstration
      • Connected RFID reader to WinMo phone
        • Doesn’t have to couple code to a given device; device agnostic
      • Scan part and sees all details
      • Instead of starting with paperwork and trying to find parts, started with parts themselves
      • Execute checklist process with questions that I can answer and even take pictures and attach
    • RFID Mobile
      • Lightweight application platform for mobile devices
      • Enables rapid hardware agnostic RFID and Barcode mobile application development
      • Enables generation of software events from mobile devices (events do NOT have to be RFID events)
    • Questions:
      • How receive events and process?
        • Create “DeviceConnection” object and pass in module name indicating what the source type is
        • Register your handler on the NotificationEvent
        • Open the connection
        • Process the event in the handler
      • How send them through BizTalk?
        • Intermittent connectivity scenario supported
        • Create RfidServerConnector object
        • Initialize it
        • Call post operation with the array of events
      • How get those events from new source?
        • Inherit DeviceProvider interface and extend the PhysicalDeviceProxy class

    Low Latency Data and Event Processing with Microsoft SQL Server

    I eagerly anticipated this session to see how much forethought Microsoft put into their first CEP offering.  This was a fairly sparsely attended session, which surprised me a bit.  That, and the folks who ended up leaving early, suggests that most people here are unaware of this problem/solution space and don’t immediately grasp the value.  Key Takeaway: This stuff has a fairly rich set of capabilities so far and looks well thought out from a “guts” perspective.  There’s definitely a lot of work left to do, and some things will probably have to change, but I was pretty impressed.  We’ll see if Charles agrees, based on my hodgepodge of notes 😉

    • Call CEP the continuous and incremental processing of event streams from multiple sources based on declarative query and pattern specifications with near-zero latency.
    • Unlike DB apps with ad hoc queries, where latency ranges from seconds to hours to days and rates are hundreds of events per second, event-driven apps have continuous standing queries with latency measured in milliseconds (or less) and rates up to tens of thousands of events per second (or more).
    • As latency requirements become stricter, or data rates reach a certain point, then most cost effective solution is not standard database application
      • This is their sweet spot for CEP scenarios
    • Example CEP scenarios …
      • Manufacturing (sensor on plant floor, react through device controllers, aggregate data, 10,000 events per second); act on patterns detected by sensors such as product quality
      • Web analytics, instrument server to capture click-stream data and determine online customer behavior
      • Financial services listening to data feeds like news or stocks and use that data to run queries looking for interesting patterns that find opps to buy or sell stock; need super low latency to respond and 100,000 events per second
      • Power orgs catch energy consumption and watch for outages and try to apply smart grids for energy allocation
      • How do these scenarios work?
        • Instrument the assets for data acquisitions and load the data into an operational data store
        • Also feed the event processing engine where threshold queries, event correlation and pattern queries are run over the data stream
        • Enrich data from data streams for more static repositories
      • With all that in place, can do visualization of trends with KPI monitoring, do automated anomaly detection, real-time customer segmentation, algorithmic trading and proactive condition-based maintenance (e.g. can tell BEFORE a piece of equipment actually fails)
    • Cycle: monitor, manage, mine
      • General industry trends (data acquisition costs are negligible, storage cost is cheap, processing cost is non-negligible, data loading costs can be significant)
      • CEP advantages (process data incrementally while in flight, avoid a separate load step while still doing the processing you want, seamless querying for monitoring, managing and mining)
    • The Microsoft Solution
      • A circular process where data is captured and evaluated against rules, and the results allow for improvement of those rules
    • Deployment alternatives
      • Deploy at multiple places on different scale
      • Can deploy close to data sources (edges)
      • In the mid tier, where data sources are consolidated
      • At the data center, where historical archiving, mining and large-scale correlation happen
    • CEP Platform from Microsoft
      • A series of input adapters accept events from devices, web servers, event stores and databases; standing queries live in the CEP engine, which can also access static reference data; output adapters feed event targets such as pagers and monitoring devices, KPI dashboards, SharePoint UIs, event stores and databases
      • VS 2008 is where event-driven apps are written
      • So from source, through CEP engine, into event targets
      • Can use SDK to write additional adapters for input or output adapters
        • Capture events in the domain format of the source and transform them to the canonical format that the engine understands
      • All queries receive data stream as input, and generate data stream as output
      • Queries can be written in LINQ
    • Events
      • Events have different temporal characteristics; may be point in time events, interval events with fixed duration or interval events with initially known duration
      • Rich payloads capture all properties of an event
    • Event types
      • Use the .NET type system
      • Events are structured and can have multiple fields (see the sketch below)
      • Each field is strongly typed using .NET framework type
      • CEP engine adds metadata to capture temporal characteristics
      • Event SOURCES populate time stamp fields
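    Since event payloads are just .NET types, I imagine defining one looks like an ordinary class — here is a minimal sketch (the field names are mine, and the CTP may impose its own conventions on payload types):

    // Hypothetical event payload: a plain .NET class with strongly typed fields.
    // The engine layers its temporal metadata (timestamps) on top of this, and the
    // event source populates those timestamp fields.
    public class PowerReading
    {
        public string DeviceId;   // which meter/sensor produced the reading
        public double Watts;      // the measured value
    }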
    • Event streams
      • Stream is a possibly infinite series of events
        • Inserting new events
        • Changes to event durations
      • Stream characteristics
        • Event/data arrival patterns
          • Steady rate with end of stream indication (e.g. files, tables)
          • Intermittent, random or burst (e.g. retail scanners, web)
        • Out of order events
          • CEP engine does the heavy lifting when dealing with out-of-order events
    • Event stream adapters
      • Design time spec of adapter
        • For event type and source/sink
        • Methods to handle event and stream behavior
        • Properties to indicate adapter features to engine
          • Types of events, stream properties, payload spec
    • Core CEP query engine
      • Hosts “standing queries”
        • Queries are composable
        • Query results are computed incrementally
      • Query instance management (submit, start, stop, runtime stats)
    • Typical CEP queries
      • Complex type describes event properties
      • Grouping, calculation, aggregation
      • Multiple sources monitored by same query
      • Check for absence of data
    • CEP query features …
      • Calculations
      • Correlation of streams (JOIN)
      • Check for absence (EXISTS)
      • Selection of events from stream (FILTER)
      • Aggregation (SUM, COUNT)
      • Ranking (TOP-K)
      • Hopping or sliding windows
      • Can add NEW domain-specific operators
      • Can do replay of historical data
    • LINQ examples shown (JOIN, FILTER)

    from e1 in MyStream1
    join e2 in MyStream2 on e1.ID equals e2.ID
    where e1.f2 == "foo"
    select new { e1.f1, e2.f4 }

    • Extensibility
      • Domain specific operators, functions, aggregates
      • Code written in .NET and deployed as assembly
      • Query operations and LINQ queries can refer to user defined things
    • Dev Experience
      • VS.NET as IDE
      • Apps written in C#
      • Queries in LINQ
    • Demos
      • Listening on power consumption events from laptop with lots of samples per second
      • Think he said that this client app was hosting the CEP engine in process (vs. using a server instance)
      • Uses Microsoft.ComplexEventProcessing namespace (assembly?)
      • Showed taking the initial stream that simply returns all events and refining the query (through Intellisense!) to set a HoppingWindow of 1 second, then aggregating on top of that to get the average of the stream every second.
        • This was all done, end to end, in 5 total statements of code (see the sketch after this list)
      • He then took that code and replaced the aggregation with a new one that groups by ID so each group can be aggregated separately
      • Showed a tool with a visualized query that lets you step through the execution of that query as it previously ran; you can set a breakpoint with a condition (an event payload value) and run the tool until that scenario is reached
        • Can filter each operator and only see results that match that query filter
        • Can right click and do “root cause analysis” to see only events that potentially contributed to the anomaly result
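    To make the demo concrete, here is roughly how I picture those few statements, using the PowerReading payload sketched earlier — this is my own hedged reconstruction, not the presenter’s code; the member names (Server.Create, HoppingWindow, Avg) and the Linq namespace are assumptions from my notes and may not match the CTP bits.

    using System;
    using Microsoft.ComplexEventProcessing;          // namespace mentioned in the session
    using Microsoft.ComplexEventProcessing.Linq;     // assumed home of the LINQ extensions

    // Host the engine in process (vs. connecting to a server instance), as the demo appeared to do.
    var server = Server.Create("MyCepInstance");

    // powerStream is assumed to be the stream of PowerReading events coming from the input adapter.
    // Carve it into 1-second hopping windows and average each window.
    var averaged =
        from win in powerStream.HoppingWindow(TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(1))
        select new { AvgWatts = win.Avg(e => e.Watts) };

    // The grouping variation shown next: aggregate per device instead of across the whole stream.
    var perDevice =
        from e in powerStream
        group e by e.DeviceId into g
        from win in g.HoppingWindow(TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(1))
        select new { DeviceId = g.Key, AvgWatts = win.Avg(e => e.Watts) };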
    • Same query can be bound to different data sources as long as they deliver the required event type
      • If new version of upstream device became available, could deploy new adapter version and bind it to new equipment
    • Query calls out what data type it requires
    • No changes to query necessary for reuse if all data sources of same type
    • Query binding is a configuration step (no VS.NET)
    • Recap: Event driven apps are fundamentally different from traditional database apps because queries are continuous, consume and produce streams and compute results incrementally
    • Deployment scenarios
      • Custom CEP app development that uses an instance of the engine and puts the app on top of it
      • Embed CEP in app for ISVs to deliver to customers
      • CEP engine is part of appliance embedded in device
      • Put CEP engine into pipeline that populates data warehouse
    • Demo from OSIsoft
      • Power consumption data goes through a CEP query to scrub the data and reduce the rate before feeding their PI System, where another CEP query runs to do complex aggregation/correlation before the data is visualized in a UI
        • They have their own input adapters that take data from servers, run it through queries, and use their own output adapters to feed the PI System

    I have lots of questions after this session.  I’m not fully grasping the role of the database (if there is one).  They didn’t show much specifically around the full lifecycle (rules, results, knowledge, rule improvement), so I’d like to see what my tooling is for this.  It doesn’t look like much business tooling is part of the current solution plan, which might hinder doing any business-driven process improvement.  I liked the LINQ way of querying, and I could see someone writing a business-friendly DSL on top.

    All in all, this will be fun to play with once it’s available.  When is that?  SQL team tells us that we’ll have a TAP in July 2009 with product availability targeted for 1H 2010.

  • TechEd 2009: Day 1 Session Notes

    Good first day.  Keynote was relatively interesting (even though I don’t fully understand why the presenters use fluffy “CEO friendly” slides and language in a room of techies) and had a few announcements.  The one that caught my eye was the public announcement of the complex event processing (CEP) engine being embedded in SQL Server 2008 R2.  In my book I talk about CEP and apply the principles to a BizTalk solution.  However, I’m much happier that Microsoft is going to put a real effort into this type of solution instead of the relative hack that I put together.  The session at TechEd on this topic is Tuesday.  Expect a write up from me.

    Below are some of the session notes from what I attended today.  I’m trying to balance sessions that interest me intellectually, and sessions that help me actually do my job better.  In the event of a tie, I choose the latter.

    Data Governance: A Solution to Privacy Issues

    This session interested me because I work for a healthcare organization and we have all sorts of rules and regulations that direct how we collect, store and use data.  Key Takeaway: New website from Microsoft on data governance at http://www.microsoft.com/datagovernance

    • The low cost of storage and the need to extend offerings with new business models have led to an unprecedented volume of data stored about individuals
    • You need security to achieve privacy, but security is not a guarantee of privacy
    • Privacy, like security, has to be embedded into application lifecycle (not a checkbox to “turn on” at the end)
    • Concerns
      • Data breach …
      • Data retention
        • 66% of data breaches in 2008 involved data that was not known to reside on the affected system at the time of incident
    • Statutory and Regulatory Landscape
      • In EU, privacy is a fundamental right
        • Defined in 95/46/EC
          • Defines rules for transfer of personal data across member states’ borders
        • Data cannot be transported outside of EU unless citizen gives consent or legal framework, like Safe Harbor, is in place
          • Switzerland, Canada and Argentina have legal framework
          • US has “Safe Harbor” where agreement is signed with US Dept of Commerce which says we will comply with EU data directives
        • Even data that may not individually identify you, but could identify an individual when aggregated, is still considered “personal data” and can’t be transferred this way
      • In US, privacy is not a fundamental right
        • Unlike EU, in US you have patchwork of federal laws specific to industries, or specific to a given law (like data breach notification)
        • Personally identifiable information (PII) – info which can be used to distinguish or trace an individual’s identity
          • Like SSN, or drivers license #
      • In Latin America, some countries have adopted EU-style data protection legislation
      • In Asia, there are increased calls for unified legislation
    • How to cope with complexity?
      • Standards
        • ISO/IEC CD 29100 information technology – security techniques – privacy framework
          • How to incorporate best practices and how to build apps with privacy in mind
        • NIST SP 800-122 (Draft) – guidelines for gov’t orgs to identify PII that they might have and provides guidelines for how to secure that information and plan for data breach incident
      • Standards tell you WHAT to do, but not HOW
    • Data governance
      • Exercise of decision making and authority for data related matters (encompasses people, process and IT required for consistent and proper handling across the enterprise)
      • Why DG?
        • Maximize benefits from data assets
          • Improve quality, reliability and availability
          • Establish common data definitions
          • Establish accountability for information quality
        • Compliance
          • Meet obligations
          • Ensure quality of compliance related data
          • Provide flexibility to respond to new compliance requirements
        • Risk Management
          • Protection of data assets and IP
          • Establish appropriate personal data use to optimally balance ROI and risk exposure
      • DG and privacy
        • Look at compliance data requirements (that comes from regulation) and business data requirements
        • Feeds the strategy made up of documented policies and procedure
        • ONLY COLLECT DATA REQUIRED TO DO BUSINESS
          • Consider what info you ask of customers and make sure it has a specific business use
    • Three questions
      • Are you collecting the right data, aligned with business goals, and getting proper consent from users?
      • Are you managing data risk by protecting privacy if you store personal information?
      • Are you handling data in compliance with the rules and regulations that apply?
    • Think about info lifecycle
      • How is data collected, processed and shared and who has access to it at each stage?
        • Who can update it? How do you know about the access to and quality of each attribute?
        • What sort of processing will take place, and who is allowed to execute those processes?
        • What about deletion? How does removal of data at master source cascade?
        • New stage: TRANSFER
          • Starts whole new lifecycle
          • Move from one biz unit to another, between organizations, or out of data center and onto user laptop
    • Data Governance and Technology Framework
      • Secure infrastructure – safeguard against malware, unauthorized access
      • Identity and access control
      • Information protection – while at rest or while in transit; protecting both structured and unstructured data
      • Auditing and reporting – monitoring
    • Action plan
      • Remember that technology is only part of the solution
      • Must locate the sensitive info
      • Catalog it (what is the org impact?)
      • Plan the technical controls
        • Can do a matrix with stages on left (collect/update/process/delete/transfer/storage) and categories at top (infrastructure, identity and lifecycle, info protection, auditing and reporting)
        • For collection, answers across may be “secure both client and web”, “authN/authZ” and “encrypt traffic”
          • Authentication and authorization
        • For update, may log user during auditing and reporting
        • For process, may secure host (infra) and “log reason” in audit/reporting
    • Other tools
      • IT Compliance Management Guide
        • Compliance Planning Guide (Word)
        • Compliance Workbook (Excel)

    Programming Microsoft .NET Services

    I hope to spend a sizeable amount of time this year getting smarter on this topic, so Aaron’s session was a no-brainer today.  Of course I’ll be much happier if I can actually call the damn services from the office (TCP ports blocked).  Must spend time applying the HTTP ONLY calling technique. Key Takeaway: Dig into queues and routers and options in their respective policies and read the new whitepapers updated for the recent CTP release.

    • Initial focus of the offering is on three key developer challenges
      • Application integration and connectivity
        • Communication between cloud and on-premises apps
        • Clearly we’ve solved this problem in some apps (IM, file sharing), but lots of plumbing we don’t want to write
      • Access control (federation)
        • How can our app understand the various security tokens and schemes present in our environment and elsewhere?
      • Message orchestration
        • Coordinate activities happening across locations centrally
    • .NET Service Bus
      • What’s the challenge?
        • Give external users secure access to my apps
        • Unknown scale of integration or usage
        • Services may be running behind firewalls not typically accessible from the outside
      • Approach
        • High scale, high availability bus that supports open Internet protocols
      • Gives us a global naming system in the cloud so we don’t have to deal with the shortage of available IPv4 addresses
      • Service registry provides mapping from URIs to service
        • Can use the AtomPub interface to programmatically push endpoint entries to the cloud
      • Connectivity through relay or direct connect
        • Relay means that you actually go through the relay service in the bus
        • For direct, the relay helps negotiate a direct connection between the parties
      • The NetOnewayRelayBinding and NetEventRelayBinding don’t have an out-of-the-box WCF binding counterpart, but both are set up for the most aggressive network traversal of the new bindings
      • For standard (one way) relay, need TCP 828 open on the receiver side (one way messages through TCP tunnel)
      • Q: Do relay bindings encrypt username/pw credentials sent to the bus? Must be through ACS.
      • Create a specific binding configuration for the binding in order to set the connection mode (see the sketch after this list)
      • There’s a new connection-state-changed event so the client can respond after the connection switches from relayed to direct as a result of relay negotiations driven by the “direct” binding config value
        • Similar thing happens with IM when exchanging files; some clients are smart enough to negotiate direct connections after the session is established
      • Did a quick demo showing performance of around 900 messages per second until the auto switch to direct, when all of a sudden we saw 2600+ messages per second
      • For multi-cast binding (netEventRelayBinding), need same TCP ports open on receivers
      • How do you deal with durability for unavailable subscribers? Answer: queues
      • Now can create queue in SB account, and clients can send messages and listeners pull, even if online at different times
        • Can set how long queue lives using queue policy
        • Also have routers using a router policy; now you can set how you want to route messages to listeners OR queues; the distribution policy says whether to distribute to “all” or to “one” through round-robin
        • Routers can feed queues or even other routers
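    Here is the binding-configuration sketch referenced above: set the relay binding’s connection mode so the relay can negotiate an upgrade to a direct connection, then watch for the state-change notification.  The type and member names (NetTcpRelayBinding, TcpRelayConnectionMode, IHybridConnectionStatus, ConnectionStateChanged) are my best recollection of the relay API, and IMyService/serviceUri are placeholders — the CTP may name these differently.

    // Hedged sketch: hybrid (relayed-then-direct) connectivity with a state-change callback.
    var binding = new NetTcpRelayBinding
    {
        ConnectionMode = TcpRelayConnectionMode.Hybrid   // start relayed, switch to direct if negotiation allows
    };

    var factory = new ChannelFactory<IMyService>(binding, new EndpointAddress(serviceUri));
    var channel = factory.CreateChannel();

    // Ask the channel for its hybrid-connection status and react when the switch to direct happens.
    var status = ((IClientChannel)channel).GetProperty<IHybridConnectionStatus>();
    if (status != null)
    {
        status.ConnectionStateChanged += (s, e) =>
            Console.WriteLine("Relay negotiated a direct connection; throughput should jump.");
    }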
    • .NET Access Control Service
      • Challenges
        • Support many identities, tokens and such without your app having to know them all
      • Approach
        • Automate federation through hosted STS (token service)
        • Model access control as rules
      • Trust established between STS and my app and NOT between my app and YOUR app
      • The STS must transform incoming tokens into claims consumable by your app (right now it really just does authentication and claim transformation)
      • Rules are set via web site or new management APIs
        • Define scopes, rules, claim types and keys
      • When on a solution within the management portal, you manage scopes and set your solution; if you pick workflow, you can manage it in an additional interface
        • E.g. For send rule, anytime there is a username token with X (and auth) then produce output claim with value of “Send”
        • Service bus is looking at “send” and “listen” rules
      • Note that you CAN do unauthenticated senders
    • .NET Workflow Service
      • Challenge
        • Describe long-running processes
      • Approach
        • Small layer of messaging orchestration through the service bus
      • APIs that allow you to deploy, manage and run workflows in the cloud
      • Have reliable, scalable, off-premises host for workflows focused specifically on message orchestration
      • Not a generic WF host; the WF has to be written for the cloud through use of specific activities