How to validate Xml Documents against schemas in BizTalk

proxyman

Category: BizTalk Server 2006 R2. Tags: xml, validation, schema, service, properties, exception

Copied from: http://www.bizbert.com/bizbert/2007/09/01/How+To+Validate+Xml+Documents+Against+Schemas+In+BizTalk.aspx

I got asked a question the other day: How would you validate an incoming message against a schema if the message was the request part of a request-response pair and you wanted to return a response if the request wasn't valid?

In the example given, an orchestration had been exposed as a web service, and the requirement was to validate the incoming message. If the message did not validate they wanted to return a response message with an error message in it.

I gave two of the ways I would do it, but that wasn't what they were expecting: they were expecting the simplest (and computationally slowest) way of doing it. And I realised that many people use this mechanism as they don't know there's any other way.

Why do I say this? I'll explain as I give my solutions.
First of all: The solution that was expected was to use an orchestration to do the validation - as the person explained to me, that was the only way to get the response message back to the same "connection" i.e. have it go back out as a response to the matching request.
As you'll see this is not true.
In this post I'll cover the ways to do validation.
In the next post, I'll cover how you correlate the response back to the client who is waiting for a response.

Let me say one thing: BizTalk is not magic. There is no magic (thanks Nakor). There are simply some COM+ applications, some .NET assemblies, instances of a Windows Service, some database tables... and a lot of unmanaged code.
What gets confusing are all the concepts layered on top of this - BizTalk does its best to "hide" what's really going on from you, and unfortunately a lot of BizTalk developers don't dig any deeper than that.

Schema Validation and SOA
When you create a web service, you are explicitly defining a contract between that service and a client of that service. There are many parts to that interface, but to keep things simple, I'm only interested in the schema part of it - i.e. what message does the interface accept as input, and what message does it return as output. In a doc/lit world, it should be one XML message in, one XML message out.

Options for validating Schemas

1. Validating at the End Point
So where's the best place to validate your schemas? At the end point.
That is, in the Web Service ASMX code itself.
More importantly, if the incoming message isn't valid then you should raise a SOAP fault - you shouldn't return an error message. To me, this is a fundamental tenet of good SOA design.

Think about what happens if you call a method in a class.
Say the method signature was:
string DoSomething(string number)

Assume that this method expects a number passed in as a string, and returns some information about that number (I'll gloss over why you'd ever have a method like this!).

If you pass it "fred" (instead of "123") you'd expect the method to throw an exception - not to return you a string with a message saying an error had occurred.
Why should a Web Service be any different?
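
To make the analogy concrete, here is a minimal sketch of such a method. The class name `NumberInfo` and the "odd/even" return value are made up for illustration; the only point is that bad input produces an exception rather than an error string.

```csharp
using System;

// Hypothetical class, for illustration only.
public static class NumberInfo
{
    // Expects a number passed in as a string and returns some information about it.
    public static string DoSomething(string number)
    {
        int value;
        if (!int.TryParse(number, out value))
        {
            // Invalid input is signalled with an exception, not with a
            // return value that merely describes the error.
            throw new ArgumentException("Expected a numeric string, got: " + number, "number");
        }

        return value + " is " + (value % 2 == 0 ? "even" : "odd");
    }
}
```

Calling `NumberInfo.DoSomething("fred")` throws, so the caller cannot mistake a failure for a result.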

Why go to all the trouble of rolling your own message schema for dealing with invalid messages when you already have a system for returning detailed error information: SOAP Faults?
Additionally, when you're using BizTalk, why would you knowingly allow an invalid message into BizTalk? You wouldn't allow a stranger into your home whilst you checked their credentials, would you? Why waste processor cycles on the BizTalk server (and trips to the MessageBox) dealing with a message it can't process?
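
As a rough sketch of what validating at the end point could look like (this is not the attribute-based sample mentioned below), the web method can validate the request with the standard `System.Xml.Schema` classes and throw a `SoapException` if it fails. The service name, namespace, `Order.xsd` path and the `ThrowFaultIfInvalid` helper are all assumptions made for illustration.

```csharp
using System.Web.Services;
using System.Web.Services.Protocols;
using System.Xml;
using System.Xml.Schema;

// Hypothetical service, namespace and schema file, for illustration only.
[WebService(Namespace = "http://example.org/orders")]
public class OrderService : WebService
{
    [WebMethod]
    public string SubmitOrder(XmlElement request)
    {
        XmlSchemaSet schemas = new XmlSchemaSet();
        schemas.Add(null, Server.MapPath("~/Order.xsd")); // assumed schema location
        ThrowFaultIfInvalid(request, schemas);

        // Only schema-valid requests get this far (e.g. on to BizTalk).
        return "Accepted";
    }

    // Validates the element against the schema set and raises a SOAP fault
    // on the first validation error or warning.
    public static void ThrowFaultIfInvalid(XmlElement request, XmlSchemaSet schemas)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas = schemas;
        // Report warnings too: an element with no matching schema
        // is only reported as a warning, not an error.
        settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
        settings.ValidationEventHandler += delegate(object sender, ValidationEventArgs e)
        {
            throw new SoapException(
                "Request failed schema validation: " + e.Message,
                SoapException.ClientFaultCode);
        };

        // Reading the node to the end drives the validation.
        using (XmlReader reader = XmlReader.Create(new XmlNodeReader(request), settings))
        {
            while (reader.Read()) { }
        }
    }
}
```

Using `SoapException.ClientFaultCode` tells the caller that the fault was caused by the message it sent, which is the whole point of rejecting it at the door. In real code you would load the schema set once rather than on every call.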

[If you want a hassle-free way of validating messages in the Web Service, look at the sample code I posted in this post: Validating Schemas in Web Methods using Attributes.
It provides for a way of decorating a WebMethod with an attribute which does all the validation work for you, so no code needs to be placed in your methods.
Additionally, it explains the problem with using auto-generated schemas in your WSDL (which is what happens when you use the Web Services Publishing Wizard in BizTalk).

Aside 1: I have to add that you also need to question the wisdom of validating a schema. You can never guarantee that a message is valid. It might pass schema validation and still be invalid. Unfortunately, XML Schema Definitions don't allow for a completely unambiguous specification of a message - you have to accept this when you choose to use XSDs and therefore understand the complexities they can add.]

2. Validating in the Pipeline
This is probably the most common way of dealing with things.
You create a custom receive pipeline, with both the XmlDisassembler and XmlValidator components in it, and you set the "Validate document structure" property to true on the XmlDisassembler.
(It's important to know the difference between the two here: the XmlDisassembler will validate the document structure, whereas the XmlValidator will (additionally) validate any restrictions specified in the schema.)
Note: for a send pipeline, you can just use the XmlValidator, and additionally the XmlAssembler if you wish to demote Context Properties into your sent message.

What happens if the document doesn't validate?
In BizTalk 2004, an exception would be thrown in the pipeline - if you didn't handle this then the message would be lost, and the only way you'd know something had gone wrong is when your client timed out (if the process started from a Web Service call), and you found an entry in the event log.
To get around this we ended up using components like Stephane Bouillon's EnhancedValidator, which would wrap the XmlValidator, catch the validation exception and generate a new message which was placed in the MessageBox. We could then write an orchestration or send port which could process this message.

In BizTalk 2006, if you turn off Failed Message Routing then you get the 2004 behaviour.
If you turn on Failed Message Routing, then an Error Report is created by demoting certain Context Properties, promoting some new Context Properties to indicate that the message is now an Error Report, and dropping this message in the MessageBox (it's important to realise that an Error Report is not a new message - it's the received message with some additional Context Properties).

3. Validating in an Orchestration
This is the option that people seem to go for when they want to send a response back to a waiting Web Service client, as it's the easiest way of doing this.
For validating in an orchestration, you have to write the validation code in a C# class, and call that class from within your orchestration (examples of how to validate an XML instance against a schema in C# can be found here).
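
A minimal sketch of such a helper class follows, using the standard `XmlReaderSettings` validation support in .NET 2.0. The class and method names are made up, and taking the message and schema as strings is just an assumption to keep the example self-contained (from an orchestration you might pass something like the `OuterXml` of an `XmlDocument` variable).

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper, for illustration only.
public static class SchemaValidationHelper
{
    // Returns the list of validation problems; an empty list means the
    // instance validated against the schema.
    public static IList<string> Validate(string xml, string xsd)
    {
        List<string> errors = new List<string>();

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        // Report warnings too, so a root element with no matching
        // schema does not slip through silently.
        settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(xsd)));

        // Collect every problem instead of stopping at the first one.
        settings.ValidationEventHandler += delegate(object sender, ValidationEventArgs e)
        {
            errors.Add(e.Severity + ": " + e.Message);
        };

        // Reading the document to the end drives the validation.
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { }
        }

        return errors;
    }
}
```

Collecting the errors rather than throwing lets the orchestration decide what to do with them, for example building the response message from the list. A document that is not even well-formed will still throw an `XmlException` from `Read()`.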
It's interesting to note that if you use a Transform in your orchestration, this will also cause the schema to be semi-validated - under the covers, BizTalk is using XslTransform and XPathDocument classes, which need valid XML to work correctly. However, this might be a bit late to discover that your message is invalid.
Personally, I've never seen the point of doing this validation in an orchestration - why write code when it's written for you already in the XmlValidator component?? ;-)

If you've found another mechanism for validating instances (or I've missed one) please let me know in the comments, or using the mail icon at the bottom of the page.

The next post will cover how to respond when the request part of a request/response pair of messages is invalid - that is, how you send a response back to the client.