A guide to the elements of the SOAP and WSDL specifications and how SOAP and WSDL interact.
WSDL stands for Web Services Description Language. It is essentially an abstract interface definition, paired with concrete bindings that spell out the on-the-wire formatting of the messages.
Here is an example WSDL file that we will use in this guide:
<?xml version="1.0" encoding="UTF-8"?>
...
<xsd:element name="User" type="UserRecordType"/>
...
<xsd:element name="name" type="xsd:string"/>
<xsd:element name="id" type="xsd:int"/>
...
<part name="user" element="User"/>
...
<part name="return" type="xsd:int"/>
...
<binding name="RecordBindings" type="RecordOperations">
  <soap:binding style="document" transport="http://schemas.xmlsoap.org/soap/http"/>
  ...
  <soap:operation style="document" soapAction="addRecord"/>
  ...
  <soap:operation style="document" soapAction="deleteRecord"/>
  ...
</binding>
...
<port name="RecordServicePort" binding="RecordBindings">
  ...
</port>
The root element (disregard the <?xml ...?> prolog) of any WSDL must be the wsdl:definitions element:
This declares the named WSDL (in this case TestWSDL) and defines any namespaces that will be used in the rest of the document. You will almost always see the following namespace declarations:
xmlns="http://schemas.xmlsoap.org/wsdl/"
This is the WSDL namespace. It is often declared as the default namespace so that the WSDL elements do not need namespace prefixes.
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
This is the XSD (XML Schema Definition) namespace. It is used to declare schemas and to reference the built-in simple types.
xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
This is the SOAP binding namespace. It is used to tie this WSDL to SOAP messages (see below).
Sometimes you will see targetNamespace declarations, which essentially place any defined objects into the specified namespace, so they will have to be accessed through that URI. This is especially useful when you have multiple WSDL files which may define similar operations or types.
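As a rough sketch, the root element for the example could look like the following. The name TestWSDL comes from the text above and the three namespace URIs are the standard WSDL 1.1, XSD, and WSDL SOAP binding namespaces; the comment marks where the remaining sections would go.

```xml
<definitions name="TestWSDL"
             xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:xsd="http://www.w3.org/2001/XMLSchema"
             xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/">
  <!-- types, message, portType, binding, and service elements go here -->
</definitions>
```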
User-defined types may be declared in one of two ways: either in the WSDL itself, or in a separate schema file. To import a schema file, use the wsdl:import element:
<import namespace="[URI in which to place the declared types]" location="[file name]"/>
This will pull the given schema into the WSDL, and all of its types will be available through the given namespace URI. The other way to declare types is to place the schema in a wsdl:types element. The types element contains one child: the xsd:schema element. For more information on schemas and schema structure, please see the XSD data types spec listed in the references at the end of this guide.
For our purposes in our example, we will deal with a simple user-defined type.
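A types section consistent with the example might look like the sketch below. The three xsd:element lines are taken from the example file; wrapping name and id in a complexType called UserRecordType with a sequence is an assumption inferred from the type="UserRecordType" reference. The xsd prefix is assumed to be bound on the enclosing definitions element. A stricter schema would also set a targetNamespace and refer to its types through a prefix.

```xml
<types>
  <xsd:schema>
    <!-- Global element that message parts can refer to -->
    <xsd:element name="User" type="UserRecordType"/>
    <!-- Assumed structure of the user-defined type -->
    <xsd:complexType name="UserRecordType">
      <xsd:sequence>
        <xsd:element name="name" type="xsd:string"/>
        <xsd:element name="id" type="xsd:int"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:schema>
</types>
```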
Messages are the basis for all input/output for a WSDL and its operations. They form the core of all data transfer mechanisms for WSDLs.
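For the example WSDL, the message section would plausibly read as follows. The two part lines are taken from the example file; the message names RecordInput and RecordOperationResult are the ones used later in this guide, and pairing each part with its message this way is an assumption.

```xml
<!-- Input message: one part, defined as an element -->
<message name="RecordInput">
  <part name="user" element="User"/>
</message>
<!-- Output message: one part, defined as a type -->
<message name="RecordOperationResult">
  <part name="return" type="xsd:int"/>
</message>
```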
All messages are essentially maps of named parts. In this case, we have two messages, each with one part. Below, we will show an example of messages with more than one part.
Parts must have a name and either an element or type declaration. If the part is defined as an element via the element="foo" directive, then it must directly correspond to a <xsd:element name="foo" type="..."> definition in a schema. If the part is defined as a type via the type="fooType" directive, then it must correspond to either a <xsd:complexType name="fooType"> or a <xsd:simpleType name="fooType"> in a schema.
The distinction between type and element here is slight but important. Parts that are elements will contain the specified element, whereas parts that are types will become that specified type. This will become important when we begin to go over actual instance documents in the SOAP message section below.
Operations in WSDL are grouped by interface grouping, or port type. These operations are specified in abstract in the WSDL and the concrete implementation essentially implements this abstract interface. The concrete specifications are supplied in the bindings section.
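A port type section matching the example could be sketched like this. The port type, operation, and message names are the ones used elsewhere in this guide; the exact layout is a reconstruction, not the original listing.

```xml
<portType name="RecordOperations">
  <operation name="addRecord">
    <input message="RecordInput"/>
    <output message="RecordOperationResult"/>
  </operation>
  <operation name="deleteRecord">
    <input message="RecordInput"/>
    <output message="RecordOperationResult"/>
  </operation>
</portType>
```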
This defines a port type grouping called RecordOperations with two operations: addRecord and deleteRecord. Both of these operations take the same input and output: a RecordInput message in and a RecordOperationResult out.
A port type declaration can have 0 to N operations, although you will typically have at least one operation for each port type declaration. Each operation can define an input message, an output message, and as many fault messages as necessary. The combination of input and/or output declarations determines the operation types as well as the types of errors the operation can return:
Request-Response is the most commonly used standard WSDL operation. A request is sent, the operation is executed, and a response is returned. While the operation is being executed, the client (the side that sent the request) waits for the response. This can cause problems with HTTP, which has a timeout period, after which the socket will be reset with an error. Any number of faults (including none) can be specified. These dictate what sorts of error responses the client can expect to receive from the operation, much like a throws clause in Java or an exception specification in C++.
One-way is the other standard supported WSDL operation type. A request is sent and the operation is executed, but the client need not wait for a response, because no response will ever be sent. This is useful for operations that perform some asynchronous task and do not need to return a success condition or any data.
Solicit-Response and Notification operations are only supported by extensions to the WSDL bindings or by specialized WSDL implementations.
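In WSDL 1.1 the operation type follows from which of input and output are present and in what order. The sketch below shows the four shapes side by side. The operation names logRecord, confirmRecord, and recordChanged, and the fault RecordFault with its message RecordFaultMessage, are hypothetical and only serve as illustration.

```xml
<!-- Request-Response: input first, then output, plus optional faults -->
<operation name="addRecord">
  <input message="RecordInput"/>
  <output message="RecordOperationResult"/>
  <fault name="RecordFault" message="RecordFaultMessage"/>
</operation>

<!-- One-way: input only, no response is ever sent -->
<operation name="logRecord">
  <input message="RecordInput"/>
</operation>

<!-- Solicit-Response: output first, then input -->
<operation name="confirmRecord">
  <output message="RecordOperationResult"/>
  <input message="RecordInput"/>
</operation>

<!-- Notification: output only -->
<operation name="recordChanged">
  <output message="RecordOperationResult"/>
</operation>
```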
These diagrams illustrate the differences between the operation types:
The binding section attaches an abstract interface to a concrete messaging structure. By far the most common type of binding is a SOAP binding (discussed below in the SOAP section). The first child element of a WSDL binding is a concrete binding element, whose namespace dictates the concrete binding to use. Different concrete binding elements expect different attributes. The next elements under the WSDL binding are the operations, which should match the operations specified in the port type element. Under each operation is a concrete operation element, whose namespace and structure are dictated by the specific concrete implementation (such as the SOAP binding namespace in our example). The same input, output, and fault elements should be present here as were declared in the port type section above.
These concrete bindings are all wrapped up into a named binding, which will be exposed on a port (see the next section).
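A binding section for the example might look like the sketch below. The binding, soap:binding, and soap:operation lines come from the example file; the surrounding operation, input, and output elements with use="literal" are reconstructed from the description in this guide. The soap prefix is assumed to be bound to the WSDL SOAP binding namespace.

```xml
<binding name="RecordBindings" type="RecordOperations">
  <!-- Concrete binding element: SOAP over HTTP, document style by default -->
  <soap:binding style="document" transport="http://schemas.xmlsoap.org/soap/http"/>
  <operation name="addRecord">
    <soap:operation style="document" soapAction="addRecord"/>
    <input>
      <soap:body use="literal"/>
    </input>
    <output>
      <soap:body use="literal"/>
    </output>
  </operation>
  <operation name="deleteRecord">
    <soap:operation style="document" soapAction="deleteRecord"/>
    <input>
      <soap:body use="literal"/>
    </input>
    <output>
      <soap:body use="literal"/>
    </output>
  </operation>
</binding>
```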
This WSDL defines a binding called RecordBindings, tied concretely to a SOAP binding and using HTTP as its transport mechanism. The style set in the soap:binding applies to all of the operations in this binding unless an individual soap:operation overrides it. The binding contains two operations, addRecord and deleteRecord, which were defined in the port type section. Both are bound concretely as SOAP operations and will be sent in document style. The input and output of both operations are bound to SOAP bodies and will be sent without any encoding (literal). Neither operation defines any specific faults. Note that the operations set their own style, although it matches the document style set in the soap:binding. In this case the style attributes on the operations could be removed, but the style CAN be overridden at the operation level if desired. Note also that different operations can specify different styles and encodings, and can differ in whether their parts are elements or types; even the request and the response of a single operation can use different options. However, it is highly recommended that all of the SOAP options be kept consistent for ease of use, maintainability, and compliance.
The service section of the WSDL exposes this binding to the outside world.
In order to make an operation available to the outside world, it must be exposed by a port. A port in a WSDL and a port in TCP are similar concepts. In TCP, you can have multiple ports on an IP address that are entry points to services on a single machine. In WSDL, one server can expose operations on different ports. Each port is then bound with a concrete address element, which, like the bindings above, is declared in a different namespace and has its own attributes. The most commonly used address is the SOAP address, as in this example:
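A service section for the example might look like this. The service name, port name, binding name, and address are the ones given in this guide; the soap prefix is assumed to be bound to the WSDL SOAP binding namespace.

```xml
<service name="RecordService">
  <port name="RecordServicePort" binding="RecordBindings">
    <!-- Concrete address element from the SOAP binding namespace -->
    <soap:address location="http://localhost:8090/RecordOps"/>
  </port>
</service>
```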
This WSDL snippet creates a WSDL service named RecordService and populates it with one port named RecordServicePort. This port ties the RecordBindings binding declared above to an HTTP address: http://localhost:8090/RecordOps. A WSDL service can have multiple ports, which can tie different binding objects to different addresses. This WSDL service is the external endpoint used to access the ports and the operations defined in the WSDL.
For a SOAP address, the port ties a binding to an HTTP URI. This URI is sent in the HTTP message as the location. When that location is received, the server knows which binding is attached to that address, and from the message (either through the RPC wrapper element or through the SOAPAction header) it knows which operation in that binding to invoke. It follows that only one binding can be mapped to a particular location, because otherwise the system wouldn't know which binding to pick. However, a single binding can be mapped to multiple addresses, because the system would still know which binding to use based on the address.
Figure 2: The Connected Pieces of the WSDL Puzzle.
SOAP stands for Simple Object Access Protocol.
SOAP is a type of on-the-wire formatting that can encapsulate entire object trees as XML text. It is typically used with an HTTP-based transport to call an operation in a web service (i.e. through a WSDL). SOAP is a very open specification and how a SOAP message is structured is largely a function of its usage and environment.
All SOAP messages, though, follow a similar format. The SOAP specification currently has two versions, 1.1 and 1.2. Version 1.2 is a popular choice, but as it has not yet been fully adopted by the industry, this article will stick with SOAP 1.1 for its examples.
The base SOAP element is the Envelope. This element can hold an optional Header element, which can in turn hold any number or kind of child elements. The Envelope must then hold a Body element. The Body can appear in either a request or a response; in a response it may carry a Fault element in place of the normal return data. Faults are only allowed in a response.
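The overall shape of a SOAP 1.1 message can be sketched as follows. In this and the other message examples, the soap prefix is bound to the SOAP 1.1 envelope namespace, which is different from the WSDL SOAP binding namespace that the same prefix refers to inside a WSDL file. In SOAP 1.1, a Fault travels inside the Body.

```xml
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Header>
    <!-- optional, application-defined header entries -->
  </soap:Header>
  <soap:Body>
    <!-- request or response payload, or a soap:Fault in an error response -->
  </soap:Body>
</soap:Envelope>
```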
The Header element contains optional headers for the service. The SOAP specification defines no header entries of its own, only attributes that can be placed on them, most notably mustUnderstand: a header entry marked mustUnderstand="1" must be understood and processed by the receiving service, which otherwise has to reject the message with a fault. All other headers are defined by the specific application.
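A minimal sketch of an application-defined header entry that uses the mustUnderstand attribute is shown below. The Transaction element, its namespace urn:example:transaction, and its value are hypothetical.

```xml
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Header>
    <!-- The receiver must understand this entry or reject the message -->
    <t:Transaction xmlns:t="urn:example:transaction" soap:mustUnderstand="1">5</t:Transaction>
  </soap:Header>
  <soap:Body>
    <!-- payload -->
  </soap:Body>
</soap:Envelope>
```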
Basically the Body element contains the data for the web service. For requests, it can contain which operation to call, and it always holds all of the data needed for the call. For responses, it can contain the operation that was called and will always contain the return information when the operation completed successfully. Faults, on the other hand, are present when an error or exception occurs. This can be anything from an operation not being found to invalid data to internal system problems.
The Body element can contain any child element, but usually only one child element is allowed. The child element can be formatted in a number of different ways, depending on the scenario (see below).
The Fault element MUST contain at least the fault code and the fault string, and it MAY contain the fault actor. The detail MUST be present when the error relates to the processing of the Body contents. SOAP defines a small set of fault codes to use, but any valid qualified name can be used. The two most commonly used are soap:Client and soap:Server. soap:Client errors describe a problem with the received message or the client communication. soap:Server errors describe a problem that occurred on the server during execution of the service. In any case, the fault code is a fully qualified name composed of a namespace prefix (which must already be defined) and a local name (which must be a valid name within that namespace).
For example, these fault codes are invalid:
These are valid:
The fault string is any summary of the error, and the fault actor is a URI defining the source of the fault.
The detail is essentially an element that specifies detail about the fault, and generally contains an XML element that matches one of the fault messages specified in the corresponding operation's definition (see Operations in the WSDL section above). It is almost always present, although it can be omitted in certain circumstances (e.g. if the fault was encountered outside the SOAP Body element), and it can be empty.
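Putting the pieces together, a fault response might look like the sketch below. The fault string, the actor URI, and the RecordError detail element with its namespace urn:example:record-errors are hypothetical. The code soap:Client is valid here because the soap prefix is declared on the Envelope; by the rule described above, a code whose prefix is never declared would not be.

```xml
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <soap:Fault>
      <faultcode>soap:Client</faultcode>
      <faultstring>User record is missing the id element</faultstring>
      <faultactor>http://localhost:8090/RecordOps</faultactor>
      <detail>
        <!-- Application-defined detail, ideally matching a fault message in the WSDL -->
        <e:RecordError xmlns:e="urn:example:record-errors">
          <e:reason>id is required</e:reason>
        </e:RecordError>
      </detail>
    </soap:Fault>
  </soap:Body>
</soap:Envelope>
```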
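SOAP is very often sent as an HTTP payload. This uses standard HTTP (with its benefits and limitations) to send the SOAP message as a text payload. The standard HTTP request consists of a request line of the form <method> <location> HTTP/<version>, followed by the header lines, a blank line, and the payload.
The standard response has the same layout, except that its first line is a status line carrying the HTTP version, a status code, and a reason phrase.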
The request <method> is usually POST or GET (although other methods are available). The <location> maps to the address URI in the WSDL as described above. The <version> is usually 1.1 (although it can also be 1.0). The headers are all in the form <headerName>: <headerValue>, and the header value can also be a comma-delimited list. The headers end with a blank line, which separates them from the actual payload. A few headers are almost always present: Content-Length, Content-Type, and Host. The SOAPAction header is frequently present in web service applications as well.
An example HTTP request with a SOAP payload might be:
An example HTTP response to the above request might look like:
See the HTTP specification for more information on message formats and types.
The only other peculiarity with SOAP over HTTP is that any error in processing the SOAP request will result in a response with an HTTP status code of 500 (Internal Server Error) and a SOAP fault containing the error information. All other HTTP-related errors will be sent back with the appropriate HTTP status code as defined by the HTTP specification.
The most common usage of WSDL is with SOAP bindings. However, there are many types of bindings and many permutations of possible SOAP messages, which can be a bit confusing.
SOAP messages are categorized into two main styles: document and RPC. Document styles are based around sending XML documents back and forth while RPC (Remote Procedure Call) messages are based around sending function calls in and getting a return value. To further complicate matters, message parts (as described above) can either be a type or an element. This will also affect how the SOAP message is formatted. Finally, SOAP messages can either be encoded or literal (no encoding). The encoding rarely affects the SOAP message a great deal, but in some circumstances (HREFs for example) encoding can change the resulting SOAP message.
Here are the five different SOAP styles:
Figure 3: The Five SOAP Styles
Document style SOAP messages are based around XML documents. The SOAP Body element, in effect, becomes the root element of the document. This means that document style messages are really not supposed to have more than one part, because the message is supposed to be a document, not a parameter list.
If the part is a type, then the SOAP Body element becomes that type. For the XSD and WSDL message:
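A sketch of such a case follows. The type fooType with its children sub1 and sub2 and the element MyElement are the names used in this section; the string types of sub1 and sub2, the message name TestInput, the part name body, and the sample values are assumptions.

```xml
<!-- XSD, inside the types section -->
<xsd:complexType name="fooType">
  <xsd:sequence>
    <xsd:element name="sub1" type="xsd:string"/>
    <xsd:element name="sub2" type="xsd:string"/>
  </xsd:sequence>
</xsd:complexType>
<xsd:element name="MyElement" type="fooType"/>

<!-- WSDL message whose single part is a type -->
<message name="TestInput">
  <part name="body" type="fooType"/>
</message>

<!-- Resulting document-style request: the Body itself takes on fooType -->
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <sub1>value1</sub1>
    <sub2>value2</sub2>
  </soap:Body>
</soap:Envelope>
```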
The response would look similar, except with the output message instead. Note that the SOAP Body element becomes the fooType type and since the fooType type holds a sequence of two elements: sub1 and sub2, the SOAP Body will hold two elements: sub1 and sub2.
If the part is an element, then the SOAP Body will contain that element as a child. For the above XSD and this WSDL message section (the WSDL binding section is the same):
The resulting document-style SOAP message would look like:
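As an illustration, assume an element MyElement of type fooType, where fooType is a sequence of sub1 and sub2. The message name TestInput, the part name body, and the sample values are hypothetical.

```xml
<!-- WSDL message whose single part is an element -->
<message name="TestInput">
  <part name="body" element="MyElement"/>
</message>

<!-- Resulting document-style request: the Body contains the element -->
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <MyElement>
      <sub1>value1</sub1>
      <sub2>value2</sub2>
    </MyElement>
  </soap:Body>
</soap:Envelope>
```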
The element name will be printed as a fully qualified name. In this example <MyElement> is a simple local name with no namespace, but if a namespace were present, it would get printed as <ns0:MyElement xmlns:ns0="myURI">. In general, the subelements of this parent element are NOT fully qualified. Setting elementFormDefault="qualified" on the xsd:schema element tells the schema to qualify the children of all elements, but it is not common to use it.
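With a namespace on the element, the same kind of message might look like the sketch below. The URI myURI and the prefix ns0 come from the text above; the children sub1 and sub2 and their values are illustrative, and they stay unqualified because only the top-level element is qualified by default.

```xml
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <ns0:MyElement xmlns:ns0="myURI">
      <!-- Local child elements are unqualified by default -->
      <sub1>value1</sub1>
      <sub2>value2</sub2>
    </ns0:MyElement>
  </soap:Body>
</soap:Envelope>
```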
Note that with multiple parts, one COULD define different elements and those elements could be placed in the Body in sequence. This is usually what happens when multiple parts are specified for document-style, element messages, but still bear in mind that this is not standard nor advised. Also, note that the part name doesn't matter in the slightest. There is no place that the part name gets printed in document-style SOAP messages, not even with elements.
There is one more form of document-style SOAP messages called document-wrapped. If you look at the above messages, you will notice that there is no indication of which operation to call. Looking at the WSDLs above, there are two pieces of information necessary to locate a web service: the URI or "location", and the name of the operation. The URI is provided by the transport protocol, in this case the HTTP location in the HTTP request line. However, the operation name is still missing. As you will see below with RPC, the operation name CAN be sent in the SOAP payload, but another common (and, for document-style messages, usually necessary) way to transmit the operation name is via the SOAPAction header in the HTTP header section.
This can be cumbersome and prone to error, because we are now relying on the transport to relay the information about which specific operation to call. The transport directing us to the proper WSDL port makes a lot of sense, but after that, the transport's duties are finished. However, as you will see below, RPC style SOAP messages have their own drawbacks. Fortunately there is a pretty clever way of getting the best of both worlds here.
If we take a document-style SOAP message with an element part, and make sure that the element is named exactly after the operation, then we, in effect, transmit the operation as part of the SOAP message, making the SOAPAction HTTP header unnecessary. This style of SOAP message is really document-style with an element part, but it is commonly referred to as doc-wrapped and is most commonly used in .NET applications.
Here's an example of doc-wrapped.
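The sketch below uses the operation name doTestOperation; the parameter names param1 and param2, their types, the message name, the part name parameters, and the sample values are assumptions.

```xml
<!-- XSD: the wrapper element carries the operation name -->
<xsd:element name="doTestOperation">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="param1" type="xsd:string"/>
      <xsd:element name="param2" type="xsd:int"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>

<!-- WSDL message with a single element part -->
<message name="doTestOperationRequest">
  <part name="parameters" element="doTestOperation"/>
</message>

<!-- Resulting request: the operation name is visible in the Body -->
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <doTestOperation>
      <param1>some text</param1>
      <param2>42</param2>
    </doTestOperation>
  </soap:Body>
</soap:Envelope>
```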
The SOAPAction header can be omitted (but, if present, it MUST be equal to the operation name, in this case doTestOperation), and the parameters to this operation are easy to specify and read. In effect, this style gets many of the benefits of RPC without a lot of the drawbacks (we'll cover RPC in the next section). For this reason, it is a very common SOAP style, used in many applications (as noted, most commonly in .NET).
So, we've covered the document styles and the drawbacks. Basically, document styles are based around sending XML documents, whereas web services in general tend to represent function calls. These documents have problems representing simple function parameters and there is also the problem of needing a transport level header to specify which function to invoke. Document-style messages can also only specify one part and send a single XML document in the message.
RPC, on the other hand, was created to represent function calls. RPC-style SOAP messages contain a wrapping element that specifies the operation name, and that element contains one child element for each of the function parameters. This allows multiple parts to be specified as simple types, complex types, or elements. It also means that the SOAPAction header is unnecessary and can be omitted with RPC-style messages.
Both RPC with typed parts and with element parts have the same structure, with the only difference being the information under the part element.
Here is the general format for RPC-style SOAP messages.
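A sketch of an RPC-style request with one typed part and one element part follows. The operation name testOperation, the part names typedPart and elementPart, the wrapper namespace urn:example:wrapper, and the sample values are hypothetical; fooType is assumed to be a sequence of sub1 and sub2, and MyElement an element of that type.

```xml
<!-- WSDL message with two parts -->
<message name="TestInput">
  <part name="typedPart" type="fooType"/>
  <part name="elementPart" element="MyElement"/>
</message>

<!-- Resulting RPC-style request. The wrapper namespace would come from the
     namespace attribute of the SOAP input body declaration in the binding. -->
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <ns0:testOperation xmlns:ns0="urn:example:wrapper">
      <!-- Typed part: the part element itself becomes fooType -->
      <typedPart>
        <sub1>value1</sub1>
        <sub2>value2</sub2>
      </typedPart>
      <!-- Element part: the part element contains the element -->
      <elementPart>
        <MyElement>
          <sub1>value3</sub1>
          <sub2>value4</sub2>
        </MyElement>
      </elementPart>
    </ns0:testOperation>
  </soap:Body>
</soap:Envelope>
```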
You'll notice that, like document-style, the part with the element contains the fully-qualified element, while the part that's a type becomes the specified type.
You'll also notice that RPC, unlike document-style, treats web service calls as functions with parameters and return values. Whereas document-style just passes around documents for requests and responses, RPC passes around function name, parameters, and result. The operation name in the request is qualified by a namespace specified by the SOAP input body declaration in the WSDL as the namespace attribute. By convention, the operation name in the response will have the tag Response appended to it to show that it is indeed a response, and it also is qualified by a namespace, in this case the namespace attribute in the SOAP output body declaration. This is also known as the wrapper namespace, or the on-the-wire namespace, because it is only used when the SOAP message is being sent and received. After the SOAP message is parsed, that namespace is no longer relevant.
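A matching response might look like the sketch below. The operation name testOperation, the part name return, its value, and the wrapper namespace urn:example:wrapper are hypothetical; the Response suffix follows the convention described above.

```xml
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <!-- Operation name plus "Response", qualified by the output wrapper namespace -->
    <ns0:testOperationResponse xmlns:ns0="urn:example:wrapper">
      <return>0</return>
    </ns0:testOperationResponse>
  </soap:Body>
</soap:Envelope>
```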
There is one more "axis" to consider when formatting and parsing a SOAP message: encoding. This is specified in the SOAP input and output body tags in the WSDL for each concrete operation in a binding. The use attribute is set either to literal, to specify no encoding, or to encoded, to specify that an encoding scheme will be used to format the message. The only encoding scheme in real use is SOAP encoding. SOAP encoding mainly allows for references and SOAP arrays, among some other lesser features. It will also generally add an attribute specifying the type of each element, using the xsi:type attribute (xsi being the XML Schema instance namespace).
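To make the effect of encoding concrete, here is a sketch of an RPC/encoded request that uses xsi:type and a shared reference. The operation name, part names, wrapper namespace, the id value, and the element name sharedValue are hypothetical; the encodingStyle URI is the standard SOAP 1.1 encoding namespace.

```xml
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
               xmlns:xsd="http://www.w3.org/2001/XMLSchema"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               soap:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <soap:Body>
    <ns0:testOperation xmlns:ns0="urn:example:wrapper">
      <!-- Each value carries its type explicitly -->
      <count xsi:type="xsd:int">3</count>
      <!-- Both parts point at the same value through an href -->
      <first href="#id1"/>
      <second href="#id1"/>
    </ns0:testOperation>
    <!-- The referenced value is serialized once, outside the wrapper -->
    <sharedValue id="id1" xsi:type="xsd:string">shared text</sharedValue>
  </soap:Body>
</soap:Envelope>
```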
The use of any encoding is not WS-I compliant because there isn't any way to standardize on encodings (because they are WSDL extensions by their definition). This doesn't mean that it is illegal in WSDL to use encodings, but in real-world applications, its use is limited due to the lack of standardization.
Figure 4: SOAP Style Combinations
Note: Typed parts must refer to XSD-defined types (xsd:simpleType or xsd:complexType definitions). Element parts must refer to an XSD-defined element (xsd:element definition).
So, the question becomes, which style should we use? The different styles have their pros and cons. There is also the WS-I profile to consider. Although this profile isn't a "specification" per se, it is essentially a standard that WSDL/SOAP users can rally around. The SOAP and WSDL specifications are somewhat vague in places and downright ambiguous in others (for example, whether or not multiple parts are allowed in doc-literal isn't clearly stated in the WSDL or SOAP specifications). The WS-I profile is a way to answer some of these questions by limiting the types of WSDL and SOAP messages allowed in order to clear up ambiguities and irregularities. WS-I compliance is by no means necessary, but you'll find that the WS-I compliant styles are more common in real-world applications.
All of the above styles (except for document/encoded, which is never used) are common. Doc-wrapped is newer, but it is definitely gaining ground in standard usage because it gains the benefits of RPC and can still be validated as a full document. RPC messages are useful when mapping web service calls to C++ or Java methods, because they can be overloaded with different parameters, and the parameters are pretty easy to match up to C++ and Java function arguments. Document-literal can be useful when passing full XML documents around to web service implementations that don't implement function calls (BPEL, HTTP servlets, etc.). It is also useful when you need overloaded functions but only have one parameter, or when you don't want the overhead of the extra RPC elements.
WS-I Profile 1.2: http://www.ws-i.org/Profiles/BasicProfile-1.2.html
WSDL 1.1 spec: http://www.w3.org/TR/wsdl
SOAP 1.1 spec: http://www.w3.org/TR/2000/NOTE-SOAP-20000508/
XSD data types spec: http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/datatypes.html
Guide to choosing WSDL styles: http://www-128.ibm.com/developerworks/webservices/library/ws-whichwsdl/