The Inherent Flaws of SOAP over HTTP - Where HTTP Fails SOAP

Latest recommended article published on 2022-05-10 09:56:39

houzy

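This article explains that because HTTP itself has no request identifier mechanism, it cannot support connection sharing under high concurrency, which limits its use in critical domains.
The article recommends carrying SOAP over other protocols as well, such as IIOP, MQ, and JMS, and mixing them with HTTP. This preserves interoperability while still meeting the needs of enterprise applications. It is also the core idea behind the ESB (Enterprise Service Bus). To judge whether a product is a true ESB, check whether it supports multiple transport protocols; the number of transport protocols supported directly reflects the caliber of an ESB product.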

Web services allow for the delivery of SOAP messages over any protocol. A common misconception is that all SOAP messages must be transmitted over HTTP. While that approach is useful in many cases, there are situations where it makes sense to use alternatives. This paper investigates situations where HTTP does not scale sufficiently for enterprise Web service deployments and looks at available alternatives.

HTTP and Scalability

HTTP was designed for serving Web pages under the assumption that the protocol would only be required to send a request and receive a response. This paradigm has worked very well for the World Wide Web and has been ubiquitously accepted as its standard protocol. When a person makes a request on an interactive Web site, they are typically interacting with an application server tier (J2EE, .NET, or scripting languages such as Python and Perl). Each person is running a browser client and doing only one thing at a time on that particular site. In a typical enterprise, however, an application server fronts a number of back-office systems that provide critical business services. The application server usually supports a number of users concurrently. This implies that it typically needs to make a number of concurrent requests on those back-office systems.

Many people believe that a move to service-oriented architecture (SOA) implies a move to SOAP/HTTP as the ubiquitous protocol throughout the enterprise. What few seem to realize, however, is that the SOAP/HTTP approach has inherent scalability limitations under certain circumstances. Simple Web browsers have been the de facto HTTP client to date, and they are in essence single-threaded clients as far as the server is concerned, making only one request at a time over a given connection. This has created a perception that HTTP can be scaled as needed. To date, it has been scaled only for communication between browsers and application/Web servers, typically through clustering, replication, and the use of hardware load balancers. Unfortunately, scaling communications between an application server and back-office services cannot be solved satisfactorily using the same techniques.

For example, assume we have an online banking system with support for up to 4000 concurrent users. The Web tier comprises a cluster of application server instances behind a hardware HTTP load balancer. In order to fulfill the online banking business function, there are three Unix-hosted services and a mainframe-hosted service utilized by the application server. In a world where SOAP/HTTP is the only protocol, the application server will have to support an incoming connection from the browser, and one additional connection out to each of the four back-office services for each concurrent user. This is because HTTP demands that you wait for a response before you send your next request over that same connection. It has no concept of a request identifier, which is a core requirement to enable connection sharing.

One could of course just serialize the requests, awaiting each response before sending the next request on each given connection. However, this is a waste of resources because the back-office server is not doing anything with this connection until the response is sent back to the client application, and most back-office systems have the capacity to handle multiple concurrent requests.

Interleaving requests over a single connection would be the ideal. It would allow an enterprise to achieve the same level of concurrency while using fewer resources. One would send a number of requests to a server over a single connection and receive responses as they become available. The client can correlate responses based on a request identifier. This would allow responses to be returned as soon as they are ready (which may differ from the order in which requests were sent). Unfortunately, the HTTP specification forbids such interleaving.
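The correlation step described above can be illustrated with a small in-memory sketch. It simulates a shared connection rather than opening one, and the names `Response`, `InterleavedClient`, and `request_id` are hypothetical, chosen only for this illustration; HTTP/1.1 itself offers no such field.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    request_id: int  # hypothetical correlation field; HTTP/1.1 has no equivalent
    body: str


class InterleavedClient:
    """Tracks in-flight requests on one shared connection (simulated)."""

    def __init__(self):
        self._next_id = 0
        self._pending = {}  # request_id -> label of the waiting caller

    def send(self, caller):
        """Register a request and return the identifier sent along with it."""
        self._next_id += 1
        self._pending[self._next_id] = caller
        return self._next_id

    def receive(self, response):
        """Match a response to its caller, whatever order it arrives in."""
        caller = self._pending.pop(response.request_id)
        return caller, response.body


client = InterleavedClient()
balance_id = client.send("balance-query")
history_id = client.send("history-query")

# Responses arrive in the opposite order to the requests.
arrivals = [Response(history_id, "12 transactions"), Response(balance_id, "100.00")]
matched = [client.receive(r) for r in arrivals]
print(matched)
```

Because each response carries the identifier of its request, the client needs no ordering guarantee from the server, which is what makes one connection shareable among many callers.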

The obvious conclusion is that a standards-based protocol that allows for request interleaving is needed. This would allow the sharing of a single connection between the application server and each back-office system. In the example previously outlined, suppose each application server can open at most 1000 connections at once (its file descriptor limit). In our SOAP/HTTP world, every client consumes five of them (one incoming connection from the browser plus one outgoing connection to each of the four back-office services), so each application server would be limited to concurrently supporting only 1000/5, or 200, clients. A typical workaround for this problem is to add application servers. If enough are added to support 1000 clients, the problem propagates to the back-office servers, which are now maxed out on the number of connections they can keep open. Creating pools of back-office server instances is prohibitively expensive, especially if they are hosted on a mainframe.
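The connection budget in this example can be worked through in a few lines. The function name and the shared-connection case are assumptions added for comparison: the article gives only the 1000/5 = 200 figure, and the second result assumes an idealized protocol in which one outbound connection per back-office service serves all clients.

```python
def max_clients_per_server(connection_limit, backend_count, shared_backend_connections):
    """Concurrent clients one application server can keep connected.

    Each client always costs one inbound connection. Without sharing, it
    also costs one outbound connection per back-office service. With
    sharing (assumed here), the server holds a single outbound connection
    per service for all of its clients.
    """
    if shared_backend_connections:
        return connection_limit - backend_count
    return connection_limit // (1 + backend_count)


LIMIT = 1000   # file descriptor limit from the example
BACKENDS = 4   # three Unix-hosted services plus one mainframe-hosted service

http_only = max_clients_per_server(LIMIT, BACKENDS, shared_backend_connections=False)
shared = max_clients_per_server(LIMIT, BACKENDS, shared_backend_connections=True)
print(http_only)  # 200
print(shared)     # 996
```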

This problem has been solved in the past with connection concentrators, but because we cannot interleave HTTP POST requests, HTTP-based communication cannot be concentrated. Clearly, HTTP is not capable of scaling in such an environment.

HTTP/1.1 supports a feature known as "request pipelining." Pipelining allows a client to send multiple requests over a given connection without having to wait for each response. However, it is not as useful as interleaving, because the HTTP/1.1 specification (see the first entry in the References section) mandates: "A client that supports persistent connections MAY 'pipeline' its requests (i.e., send multiple requests without waiting for each response). A server MUST send its responses to those requests in the same order that the requests were received."
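The cost of the in-order rule can be seen in a toy timing model. The sketch below assumes all requests are sent at time 0, the server works on them concurrently, and transmission time is negligible; the service times are invented for illustration.

```python
from itertools import accumulate


def in_order_delivery(service_times):
    """Delivery times when responses must follow request order (pipelining).

    A response cannot leave before every earlier response has left, so each
    delivery time is the running maximum of the service times so far.
    """
    return list(accumulate(service_times, max))


def interleaved_delivery(service_times):
    """Delivery times when each response is returned as soon as it is ready."""
    return list(service_times)


service_times = [5, 1, 2]  # illustrative units; the first request is the slow one
print(in_order_delivery(service_times))     # [5, 5, 5]
print(interleaved_delivery(service_times))  # [5, 1, 2]
```

Under these assumptions a single slow request at the head of the pipeline delays every response behind it, while interleaving lets the fast responses through immediately.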

Pipelining was designed to streamline the downloading of elements within Web pages over the Internet, supporting only HTTP request types that may be reissued without any change to the server state (idempotent requests). The HTTP/1.1 specification is very clear about this point: "Clients SHOULD NOT pipeline requests using non-idempotent methods or non-idempotent sequences of methods (see section 9.1.2)."

Web services, on the other hand, typically use the HTTP POST request type, which can be non-idempotent. Therefore, pipelining cannot be used.

Roy T. Fielding (the primary architect of the HTTP/1.1 protocol) spoke at ApacheCon in November 2002 about a new protocol that he is working on called "Waka." In his presentation (see the second entry in the References section) he described Waka as "…a new protocol to solve HTTP's current problems in a generic way."

He went on to mention support for interleaved data and metadata delivery. Waka has not yet been fully specified, so the details on how Waka intends to support interleaving are not yet available. You can track the progress of Waka at the project Web site (see the third entry in the References section). At this time there are no implementations of Waka available.

Scaling Web Services (SOAP) in the Back Office

Clearly, a protocol is needed that allows the interleaving of requests over a single connection. HTTP could be extended to support request identifiers, but modifications to this standard have taken years to be accepted because of the sheer number of deployments. Solving this problem within the bounds of a SOAP-based specification - WS-ReliableMessaging, for example - will always be subject to the limitations that HTTP imposes. A variety of alternatives to HTTP exist. Some are described below.

MQSeries

MQSeries is a widely deployed enterprise messaging system from IBM. It has been in production for many years and has proven its robustness and scalability in enterprise deployments. It has traditionally been used in single-threaded applications (based on the age of many deployments), but there is no reason why an application could not have multiple threads posting to the same queue, even before responses are read back from a reply queue. This would solve the problem described, but MQSeries is proprietary and expensive. It is much better suited to asynchronous communication, and demands a pair of queues for pseudo-synchronous communication.
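The queue-pair pattern described here can be sketched with in-process queues. This is not the MQSeries API: Python's `queue.Queue` stands in for the request and reply queues, and the names `back_office_worker`, `post_request`, and the `req-N` correlation identifiers are hypothetical.

```python
import queue
import threading

request_queue = queue.Queue()  # stand-in for the request queue
reply_queue = queue.Queue()    # stand-in for the reply queue


def back_office_worker():
    """Stand-in service: reads requests and posts replies tagged with the same ID."""
    while True:
        message = request_queue.get()
        if message is None:  # sentinel: shut down
            break
        correlation_id, payload = message
        reply_queue.put((correlation_id, payload.upper()))


def post_request(correlation_id, payload):
    request_queue.put((correlation_id, payload))


worker = threading.Thread(target=back_office_worker)
worker.start()

# Several threads post to the same queue before any reply has been read.
senders = [
    threading.Thread(target=post_request, args=(f"req-{i}", f"payload {i}"))
    for i in range(3)
]
for sender in senders:
    sender.start()
for sender in senders:
    sender.join()

# Replies are matched by correlation ID, so their arrival order does not matter.
replies = dict(reply_queue.get(timeout=5) for _ in range(3))

request_queue.put(None)
worker.join()
print(sorted(replies.items()))
```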

JMS

JMS is a standard messaging interface designed as part of the J2EE specification. It does not specify any details about wire-level implementation, so two separate JMS implementations are unlikely to interoperate. With a JMS-based solution, therefore, all communication must take place using the same implementation; however, the API is widely accepted and adopted, and it fits naturally into the J2EE domain. Just like MQSeries, JMS is asynchronous, thereby allowing interleaving of requests. JMS is completely Java-centric, but many of the core back-office systems in production today are not, which means that successful integration of these systems with JMS can prove to be a challenge.

IIOP

The Object Management Group (OMG) adopted the Internet Inter-ORB Protocol (IIOP) as part of the CORBA 2.0 specification. A number of groups have adopted IIOP as their standard protocol, not least Sun Microsystems, which adopted it as the standard protocol for Java RMI. Given its CORBA heritage, IIOP implementations exist in a variety of languages and span commercial, free, and open source software. Most implementations of IIOP have matured to a point where they interoperate seamlessly with each other, and IIOP has proven itself in some of the most demanding environments such as telecommunications provisioning and network management. IIOP offers support for multiple qualities of service, including optimal delivery of large messages over TCP/IP (which would be ideal for SOAP). Just like HTTP it natively supports a request-reply paradigm, but in addition it allows for the interleaving of requests, replies, and fragments thereof, all over a single connection.

SOAP over IIOP

IIOP presents a very strong case for adoption as the protocol of choice within the back office. It's one of the few standards-based protocols (thus offering a wire-level interoperable transport) proven to scale in the enterprise. If one wanted to build a Web services framework that supported SOAP messages over IIOP, integrating open-source projects such as Apache AXIS with the JDK IIOP stack might do it.

The Need for an Enterprise Service Bus

A key difference between a Web services toolkit and an Enterprise Service Bus (ESB) is the ability to switch message format and protocol as necessary. For the sake of this discussion, we are talking only about switching the underlying protocol used to deliver SOAP messages, and we're working under the assumption that there is no need to integrate with legacy systems that expose endpoints with other message formats. The programming model should insulate developers from the protocol and transport, which should be a deployment option rather than a decision made at development time.
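One way to picture transport as a deployment option is a lookup keyed on the endpoint address. The sketch below is an assumption-laden illustration, not any ESB product's API: the classes, the `TRANSPORTS` registry, the `iiop://` scheme, and the `bank.example` endpoints are all hypothetical, and the transports are stubs that perform no network I/O.

```python
from abc import ABC, abstractmethod


class Transport(ABC):
    """What business code sees: a way to deliver a SOAP envelope."""

    @abstractmethod
    def send(self, soap_envelope):
        ...


class HttpTransport(Transport):
    def send(self, soap_envelope):
        return "sent via HTTP"  # stub: a real transport would POST the envelope


class IiopTransport(Transport):
    def send(self, soap_envelope):
        return "sent via IIOP"  # stub: a real transport would use an ORB


# Hypothetical registry mapping an endpoint's scheme to a transport.
TRANSPORTS = {"http": HttpTransport, "iiop": IiopTransport}


def transport_for(endpoint_url):
    """Pick the transport from deployment configuration, not from code."""
    scheme = endpoint_url.split(":", 1)[0]
    return TRANSPORTS[scheme]()


def submit_payment(endpoint_url, envelope):
    """Business logic: unaware of which protocol carries the message."""
    return transport_for(endpoint_url).send(envelope)


print(submit_payment("http://bank.example/payments", "<Envelope/>"))
print(submit_payment("iiop://bank.example/payments", "<Envelope/>"))
```

Switching the back-office link from HTTP to another transport then means changing the configured endpoint address, while `submit_payment` stays untouched.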

Most ESBs, with minimal additional development effort, allow you to build distributed systems that communicate using any of the following:
· SOAP over HTTP
· SOAP over MQSeries
· SOAP over JMS

However, SOAP is only one among many protocols with which enterprise applications need to deal.

An ESB must also allow you to expose your business logic over more than one protocol/transport. In particular, it should be capable of exposing the endpoint over an enterprise-strength transport without sacrificing support for SOAP/HTTP. The vast majority of data (in the back office) should be transmitted over an enterprise-strength transport while still allowing for use of SOAP/HTTP where applicable. The forthcoming WS-Addressing standard from W3C will support metadata within its endpoint reference (EPR) construct, thus making it possible to describe and reference endpoints regardless of the protocol and transport used.

In some cases, however, you will not be able to control the volumes of requests coming from SOAP/HTTP-based clients. In such situations the ESB should provide you with a relay that accepts SOAP messages over the HTTP transport and sends them over the enterprise-strength transport to the ESB-enabled back-office server. This is effectively a concentrator implementation, instance-pooling inexpensive relays (typically behind a hardware HTTP load balancer) rather than attempting to create pools of expensive and complex back-office systems.

In the past, the learning curve associated with the middleware technologies involved was a barrier to accessing back-office systems. With the advent of Web services, this barrier has been lowered substantially; this is a step in the right direction and is essential for effective deployment of an SOA. An ESB insulates the developer from the middleware used in deployment. Given this empowerment granted by the ESB, the traditional vendor lock-ins in the back office can be removed. To be truly services oriented you should not be beholden to any individual middleware vendor, regardless of the scalability requirements. Most ESB vendors will provide proprietary alternatives to HTTP where scalability requirements demand it; for example, IBM is encouraging the use of MQ to deliver SOAP messages. Alas, all direct consumers of the service then need to have that vendor's technology installed. Your ESB should provide you with a higher-quality, standards-based alternative, such as SOAP/IIOP.

Conclusion

SOAP/HTTP has one major thing going for it: its ubiquity and widespread support. While HTTP does a great job at serving Web pages, it is not an enterprise-strength protocol, and does not scale well in the back office. Clearly an open, interoperable, standards-based, enterprise-strength protocol is needed here. The most widely deployed protocol that fulfills all of these criteria today is the OMG's IIOP.

One thing an ESB allows you to do that a Web services toolkit cannot is adopt an SOA using SOAP without sacrificing the qualities of service required in the back office. In other words, an ESB allows you to apply SOAP where it fits best without forcing you to also apply it to problems for which it's a poor solution. An ESB should allow the developer to preserve the loose coupling that SOAP affords us, while taking advantage of the qualities of service demanded in the back office that are available with IIOP.

References
· R. T. Fielding, et al. "Hypertext Transfer Protocol -- HTTP/1.1", Internet RFC 2616, June 1999: www.w3.org/Protocols/rfc2616/rfc2616-sec8.html#sec8.1.2
· R. Fielding, ApacheCon presentation, November 2002: http://gbiv.com/protocols/waka/200211_fielding_apachecon.ppt
· Waka protocol progress page: www.apache.org/~fielding/waka/

by Frank Lynch; Mark Fynes

from : http://webservices.sys-con.com/read/114115.htm