Thursday, December 29, 2011

SOAP vs. REST









Developers new to web services are often intimidated by the parade of technologies and concepts required to understand them: REST, SOAP, WSDL, XML Schema, RELAX NG, UDDI, MTOM, XOP, WS-I, WS-Security, WS-Addressing, WS-Policy, and a host of other WS-* specifications that seem to multiply like rabbits. Add to that the Java specifications, such as JAX-WS, JAX-RPC, SAAJ, etc., and the conceptual weight becomes heavy indeed. In this series of articles I hope to shed some light on the dark corners of web services and help navigate the sea of alphabet soup (1). Along the way I'll also cover some tools for developing web services, and create a simple web service as an example. In this article I will give a high-level overview of both SOAP and REST.

Introduction

There are currently two schools of thought in developing web services: the traditional, standards-based approach (SOAP) and the conceptually simpler, trendier new kid on the block (REST). Choosing between the two will be your first decision in designing a web service, so it is important to understand the pros and cons of each. It is also important, in the sometimes heated debate between the two philosophies, to separate reality from rhetoric.

SOAP


In the beginning there was...SOAP. Developed at Microsoft in 1998, the inappropriately-named "Simple Object Access Protocol" was designed to be a platform- and language-neutral alternative to previous middleware technologies like CORBA and DCOM. Its first public appearance was an Internet public draft (submitted to the IETF) in 1999; shortly thereafter, in December of 1999, SOAP 1.0 was released. In May of 2000 the 1.1 version was submitted to the W3C, where it formed the heart of the emerging Web Services technologies. The current version is 1.2, which became a W3C Recommendation in 2003 (a second edition followed in 2007). The examples given in this article will all be SOAP 1.2.

Together with WSDL and XML Schema, SOAP has become the standard for exchanging XML-based messages. SOAP was also designed from the ground up to be extensible, so that other standards could be integrated into it--and there have been many, often collectively referred to as WS-*: WS-Addressing, WS-Policy, WS-Security, WS-Federation, WS-ReliableMessaging, WS-Coordination, WS-AtomicTransaction, WS-RemotePortlets, and the list goes on. Hence much of the perceived complexity of SOAP, as in Java, comes from the multitude of standards which have evolved around it. This should not be reason to be too concerned: as with other things, you only have to use what you actually need.

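The basic structure of SOAP is like any other message format (including HTML itself): header and body. In SOAP 1.2 this would look something like

<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">
  <env:Header>
    <!-- optional header blocks -->
  </env:Header>
  <env:Body>
    <!-- message payload -->
  </env:Body>
</env:Envelope>

Note that the <Header> element is optional here, but the <Body> is mandatory.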
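
To make the header/body layout concrete, here is a minimal Python sketch that assembles a SOAP 1.2 envelope with the standard library. The `build_envelope` helper and the `Ping` payload in the `http://www.example.org/demo` namespace are assumptions made up for illustration; only the envelope namespace comes from the SOAP 1.2 specification.

```python
import xml.etree.ElementTree as ET

SOAP12_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace


def build_envelope(payload, header_blocks=None):
    """Wrap a payload element in a SOAP 1.2 Envelope; add a Header only if needed."""
    ET.register_namespace("env", SOAP12_NS)
    envelope = ET.Element(f"{{{SOAP12_NS}}}Envelope")
    if header_blocks:
        # The Header is optional, so it is created only when there are blocks to carry.
        header = ET.SubElement(envelope, f"{{{SOAP12_NS}}}Header")
        header.extend(header_blocks)
    # The Body is mandatory and always follows the Header.
    body = ET.SubElement(envelope, f"{{{SOAP12_NS}}}Body")
    body.append(payload)
    return envelope


# Hypothetical payload in a made-up application namespace.
payload = ET.Element("{http://www.example.org/demo}Ping")
envelope = build_envelope(payload)
print(ET.tostring(envelope, encoding="unicode"))
```

Calling `build_envelope` without header blocks yields an envelope whose only child is the Body, which mirrors the rule that the Header may be omitted.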

The SOAP <Header>


SOAP uses special attributes in the standard "soap-envelope" namespace to handle the extensibility elements that can be defined in the header. The most important of these is the mustUnderstand attribute. By default, any element in the header can be safely ignored by the SOAP message recipient unless the mustUnderstand attribute on the element is set to "true" (or "1", which is the only value recognized in SOAP 1.1). A good example of this would be a security token element that authenticates the sender/requestor of the message. If for some reason the recipient is not able to process these elements, a fault should be delivered back to the sender with a fault code of MustUnderstand.
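
As a rough illustration of the mustUnderstand rule, the following Python sketch lists the mandatory header blocks a receiver does not recognize. The `not_understood` function, the `sec:Token` and `log:Trace` header blocks, and their namespaces are all hypothetical; a real SOAP engine would perform this check for you.

```python
import xml.etree.ElementTree as ET

SOAP12_NS = "http://www.w3.org/2003/05/soap-envelope"
MUST_UNDERSTAND = f"{{{SOAP12_NS}}}mustUnderstand"

# Hypothetical message: one mandatory security block and one optional trace block.
MESSAGE = """
<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">
  <env:Header>
    <sec:Token xmlns:sec="http://www.example.org/security"
               env:mustUnderstand="true">abc</sec:Token>
    <log:Trace xmlns:log="http://www.example.org/logging">42</log:Trace>
  </env:Header>
  <env:Body/>
</env:Envelope>
"""


def not_understood(envelope_xml, understood_tags):
    """Return the qualified tags of mandatory header blocks this node cannot process."""
    root = ET.fromstring(envelope_xml)
    header = root.find(f"{{{SOAP12_NS}}}Header")
    if header is None:
        return []
    missing = []
    for block in header:
        # SOAP 1.2 accepts both "true" and "1" as the mandatory marker.
        mandatory = block.get(MUST_UNDERSTAND, "false") in ("true", "1")
        if mandatory and block.tag not in understood_tags:
            missing.append(block.tag)
    return missing


unknown = not_understood(MESSAGE, understood_tags={"{http://www.example.org/logging}Trace"})
print(unknown)  # ['{http://www.example.org/security}Token']
```

A non-empty result is the situation in which the receiver would answer with a MustUnderstand fault rather than process the body.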

Because SOAP is designed to be used in a network environment with multiple intermediaries (SOAP "nodes", as identified by the <Node> element), it also defines two special XML attributes: role, which manages which intermediary should process a given header element, and relay, which indicates that the element should be passed to the next node if it is not processed in the current one.
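
The interplay of role and relay can be sketched as a small decision function. This is a simplified reading of the SOAP 1.2 forwarding rules, not a complete implementation: it ignores mustUnderstand faults and header blocks that a node reinserts, and the `forward_header_block` name and its parameters are assumptions made for this example. The three role URIs are the ones defined by SOAP 1.2.

```python
SOAP12_NS = "http://www.w3.org/2003/05/soap-envelope"
ROLE_NEXT = SOAP12_NS + "/role/next"                   # played by every node on the path
ROLE_NONE = SOAP12_NS + "/role/none"                   # played by no node
ROLE_ULTIMATE = SOAP12_NS + "/role/ultimateReceiver"   # the final destination


def forward_header_block(role, relay, processed, node_roles):
    """Decide whether an intermediary passes a header block on to the next node.

    role       -- value of the block's role attribute (None if absent)
    relay      -- True if the block carries relay="true"
    processed  -- True if this node understood and processed the block
    node_roles -- application-defined roles this intermediary plays besides "next"
    """
    target = role or ROLE_ULTIMATE  # an absent role means the ultimate receiver
    targeted = target != ROLE_NONE and (target == ROLE_NEXT or target in node_roles)
    if not targeted:
        return True    # not addressed to this node: pass it along untouched
    if processed:
        return False   # consumed here, so it is removed from the message
    return relay       # ignored blocks survive only when relay is set


# A block aimed at "next" that this node ignored is dropped unless relay is set.
print(forward_header_block(ROLE_NEXT, relay=False, processed=False, node_roles=set()))
print(forward_header_block(ROLE_NEXT, relay=True, processed=False, node_roles=set()))
```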

The SOAP <Body>

The SOAP body contains the "payload" of the message, which is defined by the WSDL's <message> part. If there is an error that needs to be transmitted back to the sender, a single <Fault> element is used as a child of the <Body>.

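The SOAP <Fault>

The <Fault> is the standard element for error handling. When present, it is the only child element of the SOAP <Body>. The structure of a fault looks like:

<env:Fault>
  <env:Code>
    <env:Value>env:Sender</env:Value>
    <env:Subcode>
      <env:Value>m:MessageTimeout</env:Value>
    </env:Subcode>
  </env:Code>
  <env:Reason>
    <env:Text xml:lang="en">Sender Timeout</env:Text>
  </env:Reason>
  <env:Detail>
    <m:MaxTime>P5M</m:MaxTime>
  </env:Detail>
</env:Fault>

Here, only the <Code> and <Reason> child elements are required, and the <Subcode> child of <Code> is also optional. The body of the Code/Value element is a fixed enumeration with the values: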

VersionMismatch: this indicates that the node that "threw" the fault found an invalid element in the SOAP envelope, either an incorrect namespace, incorrect local name, or both.
MustUnderstand: as discussed above, this code indicates that a header element with the attribute mustUnderstand="true" could not be processed by the node throwing the fault. A NotUnderstood header block should be provided to detail all of the elements in the original message which were not understood.
DataEncodingUnknown: the data encoding specified in the envelope's encodingStyle attribute is not supported by the node throwing the fault.
Sender: This is a "catch-all" code indicating that the message sent was not correctly formed or did not have the appropriate information to succeed.
Receiver: Another "catch-all" code indicating that the message could not be processed for reasons attributable to the processing of the message rather than to the contents of the message itself.

Subcodes, however, are not restricted and are application-defined; these will commonly be defined when the fault code is Sender or Receiver. The <Reason> element is there to provide a human-readable explanation of the fault. The optional <Detail> element is there to provide additional information about the fault, such as (in the example above) the timeout value. <Fault> also has optional children <Node> and <Role>, indicating which node threw the fault and the role that the node was operating in (see the role attribute above), respectively.
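
For debugging, it can help to pull the pieces of a fault apart programmatically. The sketch below parses a timeout fault with Python's standard library; the `parse_fault` helper is hypothetical, and the `m` namespace URI and `MaxTime` element are assumed for the example since the detail content is application-defined.

```python
import xml.etree.ElementTree as ET

NS = {"env": "http://www.w3.org/2003/05/soap-envelope"}

# Sample fault; the "m" namespace and MaxTime element are illustrative assumptions.
FAULT_XML = """
<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope"
              xmlns:m="http://www.example.org/timeouts">
  <env:Body>
    <env:Fault>
      <env:Code>
        <env:Value>env:Sender</env:Value>
        <env:Subcode><env:Value>m:MessageTimeout</env:Value></env:Subcode>
      </env:Code>
      <env:Reason><env:Text xml:lang="en">Sender Timeout</env:Text></env:Reason>
      <env:Detail><m:MaxTime>P5M</m:MaxTime></env:Detail>
    </env:Fault>
  </env:Body>
</env:Envelope>
"""


def parse_fault(envelope_xml):
    """Return the fault's code, subcode, reason and detail, or None if there is no fault."""
    root = ET.fromstring(envelope_xml)
    fault = root.find("env:Body/env:Fault", NS)
    if fault is None:
        return None
    detail = fault.find("env:Detail", NS)
    return {
        "code": fault.findtext("env:Code/env:Value", namespaces=NS),
        # Subcode is optional, so this may be None.
        "subcode": fault.findtext("env:Code/env:Subcode/env:Value", namespaces=NS),
        "reason": fault.findtext("env:Reason/env:Text", namespaces=NS),
        # Detail content is application-defined, so just collect each child's text.
        "detail": [child.text for child in detail] if detail is not None else [],
    }


print(parse_fault(FAULT_XML))
```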

SOAP Encoding

Section 5 of the SOAP 1.1 specification describes SOAP encoding, which was originally developed as a convenience for serializing and de-serializing data types to and from other sources, such as databases and programming languages. Problems, however, soon arose with complications in reconciling SOAP encoding and XML Schema, as well as with performance. The WS-I organization finally put the nail in the coffin of SOAP encoding in 2004 when it released the first version of the WS-I Basic Profile, declaring that only literal XML messages should be used (R2706). With the wide acceptance of WS-I, some of the more recent web service toolkits do not provide any support for (the previously ubiquitous) SOAP encoding at all.

A Simple SOAP Example

Putting it all together, below is an example of a simple request-response in SOAP for a stock quote. Here the transport binding is HTTP.

The request:

POST /StockPrice HTTP/1.1
Host: example.org
Content-Type: application/soap+xml; charset=utf-8
Content-Length: nnn


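<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope"
   xmlns:s="http://www.example.org/stock-service">
   <env:Body>
     <s:GetStockQuote>
       <s:TickerSymbol>IBM</s:TickerSymbol>
     </s:GetStockQuote>
   </env:Body>
</env:Envelope>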

The response:

HTTP/1.1 200 OK
Content-Type: application/soap+xml; charset=utf-8
Content-Length: nnn


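<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope"
   xmlns:s="http://www.example.org/stock-service">
   <env:Body>
     <s:GetStockQuoteResponse>
       <s:StockPrice>45.25</s:StockPrice>
     </s:GetStockQuoteResponse>
   </env:Body>
</env:Envelope>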

If you play your cards right, you may never have to actually see a SOAP message in action; every SOAP engine out there will do its best to hide it from you unless you really want to see it. If something goes wrong in your web service, however, it may be useful to know what one looks like for debugging purposes.
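
If you do want to see what goes over the wire, a request like the one above can be assembled by hand. The following Python sketch builds, but deliberately does not send, an HTTP POST carrying a SOAP 1.2 envelope. The `build_soap_request` helper and the `GetStockQuote`/`TickerSymbol` element names are assumptions for illustration, not a real service's contract.

```python
import urllib.request

SOAP12_NS = "http://www.w3.org/2003/05/soap-envelope"


def build_soap_request(url, body_xml):
    """Build (but do not send) an HTTP POST carrying a SOAP 1.2 envelope."""
    envelope = (
        f'<env:Envelope xmlns:env="{SOAP12_NS}">'
        f"<env:Body>{body_xml}</env:Body>"
        "</env:Envelope>"
    )
    return urllib.request.Request(
        url,
        data=envelope.encode("utf-8"),
        method="POST",  # an envelope in the request body needs POST, not GET
        headers={"Content-Type": "application/soap+xml; charset=utf-8"},
    )


# Hypothetical payload: the element names are illustrative only.
payload = (
    '<s:GetStockQuote xmlns:s="http://www.example.org/stock-service">'
    "<s:TickerSymbol>IBM</s:TickerSymbol>"
    "</s:GetStockQuote>"
)
request = build_soap_request("http://example.org/StockPrice", payload)
print(request.get_method(), request.full_url)
print(request.get_header("Content-type"))
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would actually send it; that step is left out here because example.org does not host such a service.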

REST

Much in the way that Ruby on Rails was a reaction to more complex web application architectures, the emergence of the RESTful style of web services was a reaction to the more heavy-weight SOAP-based standards. In RESTful web services, the emphasis is on simple point-to-point communication over HTTP using plain old XML (POX).

The term "REST" comes from Roy Fielding's famous doctoral dissertation, which describes the concept of Representational State Transfer (REST). REST is an architectural style that can be summed up as four verbs (GET, POST, PUT, and DELETE from HTTP 1.1) and the nouns, which are the resources available on the network (referenced in the URI). The verbs have the following operational equivalents:

HTTP      CRUD Equivalent
==============================
GET       read
POST      create, update, delete
PUT       create, update
DELETE    delete

A service to get the details of a user called 'dsmith', for example, would be handled using an HTTP GET to http://example.org/users/dsmith. Deleting the user would use an HTTP DELETE, and creating a new one would most likely be done with a POST. The need to reference other resources would be handled using hyperlinks (the XML equivalent of HTML's href is XLink's xlink:href) and separate HTTP request-responses.
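
The verb-to-operation mapping for the user resource can be sketched in a few lines of Python. The `user_request` helper is hypothetical, and sending the POST to the collection URI (so the server picks the new resource's address) is an assumption about how such a service might be laid out. The requests are built but not sent.

```python
import urllib.request

BASE_URL = "http://example.org/users"  # collection URI from the example above


def user_request(action, username=None, body=None):
    """Map a CRUD action on a user resource to an unsent HTTP request."""
    if action == "create":
        # Assumed convention: POST to the collection and let the server assign the URI.
        return urllib.request.Request(BASE_URL, data=body, method="POST")
    methods = {"read": "GET", "update": "PUT", "delete": "DELETE"}
    return urllib.request.Request(
        f"{BASE_URL}/{username}", data=body, method=methods[action]
    )


for action in ("read", "update", "delete"):
    req = user_request(action, "dsmith")
    print(action, "->", req.get_method(), req.full_url)

created = user_request("create", body=b"<user><name>dsmith</name></user>")
print("create ->", created.get_method(), created.full_url)
```

The resource (the noun) stays the same across calls; only the HTTP method (the verb) changes.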

A Simple RESTful Service

Re-writing the stock quote service above as a RESTful web service provides a nice illustration of the differences between SOAP and REST web services.

The request:

GET /StockPrice/IBM HTTP/1.1
Host: example.org
Accept: text/xml
Accept-Charset: utf-8

The response:

HTTP/1.1 200 OK
Content-Type: text/xml; charset=utf-8
Content-Length: nnn



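<s:Quote xmlns:s="http://www.example.org/stock-service">
  <s:TickerSymbol>IBM</s:TickerSymbol>
  <s:StockPrice>45.25</s:StockPrice>
</s:Quote>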

Though slightly modified (to include the ticker symbol in the response), the RESTful version is still simpler and more concise than the RPC-style SOAP version. In a sense, as well, RESTful web services are much closer in design and philosophy to the Web itself.
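
On the client side, consuming a plain-XML response needs nothing beyond an XML parser. A minimal sketch, assuming the response uses `Quote`, `TickerSymbol` and `StockPrice` elements in the stock-service namespace (these names and the `parse_quote` helper are illustrative guesses, not a published schema):

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

NS = {"s": "http://www.example.org/stock-service"}

# Assumed response shape; the element names are hypothetical.
RESPONSE_XML = """
<s:Quote xmlns:s="http://www.example.org/stock-service">
  <s:TickerSymbol>IBM</s:TickerSymbol>
  <s:StockPrice>45.25</s:StockPrice>
</s:Quote>
"""


def parse_quote(xml_text):
    """Extract the ticker symbol and price from a plain-XML quote document."""
    root = ET.fromstring(xml_text)
    symbol = root.findtext("s:TickerSymbol", namespaces=NS)
    # Decimal keeps the price exact instead of going through binary floating point.
    price = Decimal(root.findtext("s:StockPrice", namespaces=NS))
    return symbol, price


print(parse_quote(RESPONSE_XML))
```

There is no envelope to unwrap here: the document root is the payload itself.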

Defining the Contract

Traditionally, the big drawback of REST vis-a-vis SOAP was the lack of any way of specifying a description/contract for the web service. This, however, has changed now that WSDL 2.0 defines a full complement of non-SOAP bindings (all the HTTP methods, not just GET and POST) and WADL has emerged as an alternative to WSDL. This will be discussed in more detail in coming articles.

Summary and Pros/Cons

SOAP and RESTful web services have a very different philosophy from each other. SOAP is really a protocol for XML-based distributed computing, whereas REST adheres much more closely to a bare metal, web-based design. SOAP by itself is not that complex; it can get complex, however, when it is used with its numerous extensions (guilt by association).

To summarize their strengths and weaknesses:

*** SOAP ***

Pros:

Language, platform, and transport agnostic
Designed to handle distributed computing environments
Is the prevailing standard for web services, and hence has better support from other standards (WSDL, WS-*) and tooling from vendors
Built-in error handling (faults)
Extensibility

Cons:

Conceptually more difficult, more "heavy-weight" than REST
More verbose
Harder to develop, requires tools

*** REST ***

Pros:

Language and platform agnostic
Much simpler to develop than SOAP
Small learning curve, less reliance on tools
Concise, no need for additional messaging layer
Closer in design and philosophy to the Web

Cons:

Assumes a point-to-point communication model--not usable for distributed computing environments where a message may go through one or more intermediaries
Lack of standards support for security, policy, reliable messaging, etc., so services that have more sophisticated requirements are harder to develop ("roll your own")
Tied to the HTTP transport model

Web Service

A web service is a collection of protocols and standards used for exchanging data between applications.

Three types of Web Services

1) XML-RPC (Remote Procedure Call)
2) SOAP (Simple Object Access Protocol)
3) REST (Representational State Transfer)

XML-RPC

XML-RPC is a simple, portable way to make remote procedure calls over HTTP.


SOAP

The "Simple Object Access Protocol" was designed to be a platform- and language-neutral alternative to previous middleware technologies like CORBA and DCOM.

REST

In RESTful web services, the emphasis is on simple point-to-point communication over HTTP using plain old XML (POX).


WSDL


As communications protocols and message formats are standardized in the web community, it becomes increasingly possible and important to be able to describe the communications in some structured way. WSDL addresses this need by defining an XML grammar for describing network services as collections of communication endpoints capable of exchanging messages. WSDL service definitions provide documentation for distributed systems and serve as a recipe for automating the details involved in applications communication.
A WSDL document defines services as collections of network endpoints, or ports. In WSDL, the abstract definition of endpoints and messages is separated from their concrete network deployment or data format bindings. This allows the reuse of abstract definitions: messages, which are abstract descriptions of the data being exchanged, and port types, which are abstract collections of operations. The concrete protocol and data format specifications for a particular port type constitute a reusable binding. A port is defined by associating a network address with a reusable binding, and a collection of ports defines a service. Hence, a WSDL document uses the following elements in the definition of network services:
  • Types – a container for data type definitions using some type system (such as XSD).
  • Message – an abstract, typed definition of the data being communicated.
  • Operation – an abstract description of an action supported by the service.
  • Port Type – an abstract set of operations supported by one or more endpoints.
  • Binding – a concrete protocol and data format specification for a particular port type.
  • Port – a single endpoint defined as a combination of a binding and a network address.
  • Service – a collection of related endpoints.
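
The relationships between these elements can be pictured as a small object model. The Python sketch below is only an illustration of how the pieces nest, not a WSDL parser; every class, service name and address in it is hypothetical, and messages are referenced by name rather than modeled in full.

```python
from dataclasses import dataclass, field


@dataclass
class Operation:
    name: str
    input_message: str    # name of the abstract input message
    output_message: str   # name of the abstract output message


@dataclass
class PortType:
    name: str
    operations: list = field(default_factory=list)


@dataclass
class Binding:
    name: str
    port_type: PortType   # the abstract interface being bound
    protocol: str         # concrete protocol and data format


@dataclass
class Port:
    name: str
    binding: Binding
    address: str          # network address of the endpoint


@dataclass
class Service:
    name: str
    ports: list = field(default_factory=list)


# One abstract port type reused by two concrete bindings (all names are made up).
quotes = PortType(
    "StockQuotePortType",
    [Operation("GetStockQuote", "GetStockQuoteRequest", "GetStockQuoteResponse")],
)
soap_binding = Binding("StockQuoteSoapBinding", quotes, "SOAP 1.2 over HTTP")
http_binding = Binding("StockQuoteHttpBinding", quotes, "HTTP GET")
service = Service(
    "StockQuoteService",
    [
        Port("SoapPort", soap_binding, "http://example.org/StockPrice"),
        Port("HttpPort", http_binding, "http://example.org/StockPrice/IBM"),
    ],
)

for port in service.ports:
    print(port.name, "|", port.binding.protocol, "|", port.address)
```

Both ports share the same port type, which is the reuse of abstract definitions that the separation of abstract and concrete parts is meant to allow.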