Examining SOAP

Summary

Introduction

While it is entirely possible to create Web Services without ever needing to understand the Simple Object Access Protocol (SOAP) messages exchanged between the provider and its consumers, developers work far more effectively when they understand the structure and format of those messages.

It is important to understand that the ASP.NET Web Services implementation of SOAP attempts to make available only those features of SOAP that make sense within the context of Web Services. For example, ASP.NET Web Services are by default bound to HTTP. The SOAP specification does not require HTTP: SOAP can also be used with Simple Mail Transfer Protocol (SMTP) and File Transfer Protocol (FTP).

What is SOAP?

SOAP is a message format that allows two software systems to communicate with each other - regardless of what software or hardware platforms the two participating computers are using. This task is accomplished using industry standards such as HTTP and XML. The communication typically (but not necessarily) takes the form of a request for information and then the response to that request.

SOAP stands for Simple Object Access Protocol. It is 'Simple' in comparison with other protocols such as DCOM and CORBA. 'Protocol' denotes a system in which two parties agree on the format of the messages they exchange. 'Object Access' is a bit misleading, because SOAP is used to call procedures rather than to access objects. Although you can use SOAP to access object properties and methods, SOAP has nothing to do with object-oriented programming per se, and it does not require that a procedure be part of a class definition. SOAP is simply a message format that enables RPC between two remote computers.

SOAP and XML

RPC (Remote Procedure Call) is a technology that enables an application to call a procedure that resides somewhere else. This 'somewhere else' can be across the room, across the LAN, or across the world over the Internet. SOAP uses XML to express RPCs.

Because SOAP expresses RPCs as XML, those calls are easy to read, from both a human and a computer perspective. This helps in two scenarios. First, because XML is human-readable, debugging can be fairly easy. Second, a developer can quickly create a filter that reads a SOAP message and modifies or monitors it as required. From a computer's perspective, SOAP is plain text and its specification is open.

The beauty of SOAP is that because it is just creating XML in a specific way (according to the SOAP specification), any programming language or scripting engine that can accept, parse, concatenate and return strings from procedures can provide and consume Web Services. With its roots in XML, SOAP is the perfect way to communicate between applications because there is no affinity for one programming language, platform, hardware type, or network.
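As a rough illustration of that point, the following Python sketch assembles a call using nothing more than string concatenation, then parses the result to confirm that it is well-formed XML. The namespace URIs are the placeholders used in this article's samples, the method and parameter names are borrowed from the CalcNumbers sample later in the article, and build_call is a hypothetical helper, not part of any SOAP toolkit.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Placeholder namespace URIs, as in this article's samples. A real service
# would use the URIs defined by its SOAP version and service description.
ENVELOPE_NS = "SomeURL"
METHOD_NS = "AnotherURL"


def build_call(method, params):
    """Assemble a SOAP-style call with plain string concatenation (hypothetical helper)."""
    parts = ['<SOAP-ENV:Envelope xmlns:SOAP-ENV="' + ENVELOPE_NS + '">']
    parts.append("<SOAP-ENV:Body>")
    parts.append("<m:" + method + ' xmlns:m="' + METHOD_NS + '">')
    for name, value in params:
        # escape() guards against &, < and > in parameter values.
        parts.append("<" + name + ">" + escape(str(value)) + "</" + name + ">")
    parts.append("</m:" + method + ">")
    parts.append("</SOAP-ENV:Body>")
    parts.append("</SOAP-ENV:Envelope>")
    return "".join(parts)


message = build_call("CalcNumbers", [("lFirstNumber", 10), ("lSecondNumber", 20)])

# Parsing the text back shows that plain strings were enough to produce XML.
root = ET.fromstring(message)
print(root.tag)  # prints {SomeURL}Envelope
```

A production client would normally rely on a SOAP library rather than hand-built strings; the sketch only shows that nothing beyond string handling is strictly required.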

Because SOAP is XML, it is not tied to a particular port or transport. SOAP is often sent over HTTP on port 80, but it can be transmitted over any port and carried by other protocols such as SMTP or FTP.

What is important to remember is that when SOAP is sent as plain text over HTTP on port 80, you can offer Web Services to the world without requiring your system administrator to open additional holes (data ports) in your firewall. This is an important benefit over other RPC technologies, which require other ports to be opened to function properly. Every additional open port is an added security risk.
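To make the HTTP binding concrete, here is a minimal Python sketch that wraps an envelope in an ordinary HTTP POST request. The endpoint URL and the SOAPAction value are hypothetical, and the request is only constructed, not sent. The text/xml content type and the SOAPAction header are the ones commonly used for SOAP 1.1 over HTTP.

```python
import urllib.request

# Hypothetical endpoint and action; substitute the values of a real service.
ENDPOINT = "http://localhost/CalcService.asmx"
SOAP_ACTION = '"http://example.org/CalcNumbers"'

envelope = (
    '<SOAP-ENV:Envelope xmlns:SOAP-ENV="SomeURL">'
    "<SOAP-ENV:Body/>"
    "</SOAP-ENV:Envelope>"
)

# A SOAP call over HTTP is an ordinary POST whose body is the envelope.
request = urllib.request.Request(
    ENDPOINT,
    data=envelope.encode("utf-8"),
    headers={
        "Content-Type": "text/xml; charset=utf-8",
        "SOAPAction": SOAP_ACTION,
    },
    method="POST",
)

print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would send it; that step is left out here
# because the endpoint above is made up.
```

Because the http scheme defaults to port 80, the request passes through the same firewall opening as regular Web traffic.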

SOAP vs. Other RPC Technologies

Over the last decade, several RPC technologies emerged, such as DCOM and CORBA. One problem with DCOM and CORBA is that when implementing one, you are tied to a single vendor. Another problem is that both DCOM and CORBA rely on a predictable environment. Still another problem is that systems based on DCOM or CORBA are fairly difficult to build when compared with the rather open and text-based approach of SOAP.

However, from a performance perspective, it is generally believed that DCOM and CORBA still have the advantage over SOAP when you are building a system that involves primarily server-to-server communication within the same organization. These situations demand high speed and do not require the openness of Web Services. Across an enterprise, however, SOAP might have the advantage of being open (thereby encouraging cross-system compatibility), discoverable (thereby encouraging reuse), and self-documenting (further encouraging reuse). Each enterprise must weigh performance against openness based on its own goals and risks.

In short, when there is a need to communicate with different systems across an unstable network such as the Internet, SOAP fits the bill nicely.

SOAP's Benefits and Drawbacks

SOAP is the RPC of choice for Web Services for the following reasons:

SOAP allows each platform to address the following features in the way it deems best:

Components of a SOAP Message

This section focuses on the SOAP features that ASP.NET Web Services use. Just like a Web page, a SOAP message has several major sections, each serving a special purpose. There are three major sections to a SOAP message: the envelope, the header, and the body.

The following figure illustrates the composition of a SOAP message:

SOAP Envelope

The SOAP envelope is the XML contained in the SOAP message, which includes the SOAP header and the SOAP body, as well as additional sub-elements you can define. The envelope is therefore referred to as the XML payload. To send the envelope over a transport protocol, the intended recipient's addressing information is added in the form that protocol's standard requires.

SOAP Header

The first element you might see in a SOAP envelope is the SOAP header. It is optional, and its purpose is to carry additional information that is not part of the method call but might be important to your application. Think of the SOAP header as a type of metadata that gives context to the method call or the Web Service response. For example, you could include customer account information (server ID, login, client ID, and so on) here to notify the Web Service about the client who is using the service. The server could use this information for billing or logging purposes, for authentication, and so forth. Another possible use for the SOAP header is transactions. The message could contain information to synchronize the partners in a two-phase commit protocol.

The following is a sample SOAP message with a header:

<SOAP-ENV:Envelope xmlns:SOAP-ENV="SomeURL">

    <SOAP-ENV:Header>
        <ACCT:AccountID xmlns:ACCT="AnotherURL" SOAP-ENV:root="1">12345</ACCT:AccountID>
    </SOAP-ENV:Header>

    <SOAP-ENV:Body>
        <!-- SOAP body comes here -->
    </SOAP-ENV:Body>

</SOAP-ENV:Envelope>
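On the receiving side, a header entry such as the AccountID above can be read with any XML parser. The following Python sketch assumes the placeholder namespace URIs from the sample (SomeURL and AnotherURL); read_account_id is a hypothetical helper, not part of ASP.NET or any SOAP library.

```python
import xml.etree.ElementTree as ET

# Same shape as the sample above, using its placeholder namespace URIs.
MESSAGE = """\
<SOAP-ENV:Envelope xmlns:SOAP-ENV="SomeURL">
    <SOAP-ENV:Header>
        <ACCT:AccountID xmlns:ACCT="AnotherURL">12345</ACCT:AccountID>
    </SOAP-ENV:Header>
    <SOAP-ENV:Body/>
</SOAP-ENV:Envelope>
"""

# Prefix-to-URI map used only for lookups; the prefixes are local to this sketch.
NS = {"env": "SomeURL", "acct": "AnotherURL"}


def read_account_id(xml_text):
    """Return the AccountID header value, or None when it is absent."""
    root = ET.fromstring(xml_text)
    header = root.find("env:Header", NS)
    if header is None:
        # The header is optional, so a message without one is still valid.
        return None
    entry = header.find("acct:AccountID", NS)
    return entry.text if entry is not None else None


print(read_account_id(MESSAGE))  # prints 12345
```

A server could pass the returned value to its billing, logging, or authentication code before it processes the body.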

There are several rules to keep in mind when dealing with SOAP headers:

ASP.NET Web Services allow you to specify whether the SOAP header should be sent only on the request message, only on the response message, or on both the request and response.

SOAP Body

The contents of the SOAP body depend on the nature of the SOAP message: whether it is a method call (request) or a response to that call. The body's contents also change when an error or exception occurs (referred to as a fault).

SOAP Body: Call

The call is responsible for specifying the method name to be executed and all the parameters that must be passed to the method. To do this, the first child node beneath the Body node is the actual name of the method. The method's parameters are then listed as child nodes of the method node. The following example illustrates:

<SOAP-ENV:Envelope xmlns:SOAP-ENV="SomeURL">

    <SOAP-ENV:Header>
        ...
    </SOAP-ENV:Header>

    <SOAP-ENV:Body xmlns:MyMethod="...">
        <MyMethod:CalcNumbers>
            <lFirstNumber xsi:type="long">10</lFirstNumber>
            <lSecondNumber xsi:type="long">20</lSecondNumber>
        </MyMethod:CalcNumbers>
    </SOAP-ENV:Body>

</SOAP-ENV:Envelope>

Note that the first child node of the Body node is the method name. Under that node are two additional child nodes that represent the parameters to the CalcNumbers method. As a point of reference, the above XML corresponds to the following C# method signature:

[WebMethod]
public long CalcNumbers(long lFirstNumber, long lSecondNumber);
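To show how a provider might turn such a call into an actual procedure invocation, here is a small Python sketch of a dispatcher. It is an illustration only and not how ASP.NET implements it. The namespace URI MethodURL stands in for the elided "..." in the sample, the xsi:type attributes are left out so that no extra namespace declaration is needed, and the body of calc_numbers is an assumption (multiplication, chosen because the sample response in this article returns 200 for the inputs 10 and 20).

```python
import xml.etree.ElementTree as ET

# "MethodURL" is a stand-in for the elided namespace URI in the sample.
CALL = """\
<SOAP-ENV:Envelope xmlns:SOAP-ENV="SomeURL">
    <SOAP-ENV:Body xmlns:MyMethod="MethodURL">
        <MyMethod:CalcNumbers>
            <lFirstNumber>10</lFirstNumber>
            <lSecondNumber>20</lSecondNumber>
        </MyMethod:CalcNumbers>
    </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
"""


def calc_numbers(lFirstNumber, lSecondNumber):
    # Assumed behavior: the article does not show the method body.
    return lFirstNumber * lSecondNumber


# Maps the method name found in the message to the procedure that serves it.
METHODS = {"CalcNumbers": calc_numbers}


def dispatch(xml_text):
    """Find the method node, collect its parameters, and invoke the procedure."""
    root = ET.fromstring(xml_text)
    body = root.find("{SomeURL}Body")
    call = body[0]  # the first child of Body names the method
    method_name = call.tag.split("}", 1)[-1]  # drop the "{namespace}" part
    # Child nodes of the method node are the parameters, matched by name.
    args = {param.tag: int(param.text) for param in call}
    return method_name, METHODS[method_name](**args)


print(dispatch(CALL))  # prints ('CalcNumbers', 200)
```

Matching parameters by element name rather than by position means that the order of the child nodes does not matter to this dispatcher.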

SOAP Body: Response

After the Web Service provider receives, interprets, and processes the method call, it sends a response or fault message. Assuming that the processing was successful, the response to the previous example might be the following:

<SOAP-ENV:Envelope xmlns:SOAP-ENV="SomeURL">

    <SOAP-ENV:Header>
        ...
    </SOAP-ENV:Header>

    <SOAP-ENV:Body xmlns:MyMethod="...">
        <MyMethod:CalcNumbersResponse>
            <value>200</value>
        </MyMethod:CalcNumbersResponse>
    </SOAP-ENV:Body>

</SOAP-ENV:Envelope>

Note that the child node of the Body node has the term Response appended to the method name to denote it as a response.
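The naming convention can be sketched in a few lines of Python. The helper build_response below is hypothetical, and the namespace URIs are placeholders (MethodURL stands in for the elided "..." in the sample). It builds the response element by appending Response to the method name and wraps the result in a value node, as in the sample above.

```python
import xml.etree.ElementTree as ET

# Placeholder URIs; "MethodURL" stands in for the elided "..." in the sample.
ENV_NS = "SomeURL"
METHOD_NS = "MethodURL"

# Ask ElementTree to serialize with the same prefixes the article uses.
ET.register_namespace("SOAP-ENV", ENV_NS)
ET.register_namespace("MyMethod", METHOD_NS)


def build_response(method_name, result):
    """Wrap a return value in <MethodNameResponse> inside an envelope body."""
    envelope = ET.Element("{%s}Envelope" % ENV_NS)
    body = ET.SubElement(envelope, "{%s}Body" % ENV_NS)
    # The response node is the method name with "Response" appended.
    response = ET.SubElement(body, "{%s}%sResponse" % (METHOD_NS, method_name))
    ET.SubElement(response, "value").text = str(result)
    return ET.tostring(envelope, encoding="unicode")


print(build_response("CalcNumbers", 200))
```

The serializer places all namespace declarations on the root element, which differs in layout from the sample but describes the same XML.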

SOAP Body: Fault

If the Web Service provider cannot process the method call successfully, it sends a fault message instead of a response. For the previous example, the fault might be the following:

<SOAP-ENV:Envelope xmlns:SOAP-ENV="SomeURL">

    <SOAP-ENV:Header>
        ...
    </SOAP-ENV:Header>

    <SOAP-ENV:Body xmlns:MyMethod="...">
        <SOAP-ENV:Fault>
            <faultcode>SOAP-ENV:Server</faultcode>
            <faultstring>Server Error</faultstring>
            <detail xmlns:MyMethod="...">
                <MyMethod:MyErrorMessage>Numbers were out of range</MyMethod:MyErrorMessage>
                <MyMethod:ErrorCode>1234</MyMethod:ErrorCode>
            </detail>
        </SOAP-ENV:Fault>
    </SOAP-ENV:Body>

</SOAP-ENV:Envelope>

Note the faultcode, faultstring, and detail sub-elements under the parent <SOAP-ENV:Fault> node. There are four possible fault elements: faultcode, faultstring, faultactor, and detail.
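A consumer typically checks the body for a Fault node before it reads a result. The Python sketch below shows one way to turn a fault like the sample above into an exception. SoapFault and raise_if_fault are hypothetical names, and the namespace URIs are placeholders (MethodURL stands in for the elided "..." in the sample).

```python
import xml.etree.ElementTree as ET

# Same shape as the sample fault; "MethodURL" replaces the elided "...".
FAULT = """\
<SOAP-ENV:Envelope xmlns:SOAP-ENV="SomeURL">
    <SOAP-ENV:Body>
        <SOAP-ENV:Fault>
            <faultcode>SOAP-ENV:Server</faultcode>
            <faultstring>Server Error</faultstring>
            <detail xmlns:MyMethod="MethodURL">
                <MyMethod:MyErrorMessage>Numbers were out of range</MyMethod:MyErrorMessage>
                <MyMethod:ErrorCode>1234</MyMethod:ErrorCode>
            </detail>
        </SOAP-ENV:Fault>
    </SOAP-ENV:Body>
</SOAP-ENV:Envelope>
"""


class SoapFault(Exception):
    """Carries the fault sub-elements to the calling code (hypothetical class)."""

    def __init__(self, code, message, detail):
        super().__init__("%s: %s" % (code, message))
        self.code = code
        self.message = message
        self.detail = detail


def raise_if_fault(xml_text):
    """Return the parsed envelope, or raise SoapFault when the body holds a fault."""
    root = ET.fromstring(xml_text)
    fault = root.find("{SomeURL}Body/{SomeURL}Fault")
    if fault is None:
        return root
    detail = fault.find("detail")
    items = {}
    if detail is not None:
        # Key each application-specific detail entry by its local name.
        items = {child.tag.split("}", 1)[-1]: child.text for child in detail}
    raise SoapFault(fault.findtext("faultcode"), fault.findtext("faultstring"), items)


try:
    raise_if_fault(FAULT)
except SoapFault as err:
    print(err.code, err.detail["ErrorCode"])  # prints SOAP-ENV:Server 1234
```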

Supported Data Types

The SOAP specification allows many data types and structures to be passed in a SOAP message. For simple types (integers, strings, longs, and so on) and enumerations, it defers to the Structures and Datatypes definitions of the XML Schema specification. The SOAP specification itself defines structs, arrays, and generic compound types. Generic compound types resemble a typical XML document and will probably be what you use when you pass objects and ASP.NET DataSets via Web Services.

Note that ASP.NET Web Services classes take care of serializing your data for you and making decisions about how best to represent your data in the SOAP message. 

Single-Reference vs. Multi-Reference Accessors

A single-reference accessor is a piece of data referred to only once in the SOAP message. The following listing shows an example:

<xyz:PurchaseOrder>
    <CustomerName>Yazan Diranieh</CustomerName>
    <ShipTo>
        <Street>...</Street>
        <City>...</City>
        <State>...</State>
        <Zip>...</Zip>
    </ShipTo>
    <PurchaseLineItems>
        <Order>
            <Product>Foo</Product>
            <Price>9.99</Price>
            <Quantity>1</Quantity>
        </Order>
        <Order>
            <Product>Bar</Product>
            <Price>99.99</Price>
            <Quantity>1</Quantity>
        </Order>
    </PurchaseLineItems>
</xyz:PurchaseOrder>

The product information for Foo and Bar is used only once in this document. However, when transmitting a large amount of data such as a dataset, chances are some of the data will repeat itself. What if you had 50 or 1000 orders sent via a SOAP message? To prevent redundant data from being sent, multi-reference accessors are used:

<xyz:PurchaseOrder>
    <CustomerName>Yazan Diranieh</CustomerName>
    <ShipTo>
        <Street>...</Street>
        <City>...</City>
        <State>...</State>
        <Zip>...</Zip>
    </ShipTo>
    <PurchaseLineItems>
        <Order>
            <SelectedProduct href="#1001"/>
            <Price>9.99</Price>
            <Quantity>1</Quantity>
        </Order>
        <Order>
            <SelectedProduct href="#1002"/>
            <Price>99.99</Price>
            <Quantity>1</Quantity>
        </Order>
    </PurchaseLineItems>
</xyz:PurchaseOrder>

<xyz:PurchaseOrder>
    <CustomerName>John Smith</CustomerName>
    <ShipTo>
        <Street>...</Street>
        <City>...</City>
        <State>...</State>
        <Zip>...</Zip>
    </ShipTo>
    <PurchaseLineItems>
        <Order>
            <SelectedProduct href="#1002"/>
            <Price>99.99</Price>
            <Quantity>7</Quantity>
        </Order>
    </PurchaseLineItems>
</xyz:PurchaseOrder>

<Product id="1001">
    <Product>Foo</Product>
    <Price>9.99</Price>
</Product>
<Product id="1002">
    <Product>Bar</Product>
    <Price>99.99</Price>
</Product>

Product 1002 is ordered by both customers. Its product information would have to be repeated in each customer's order if the <SelectedProduct> tag did not reference it with href="#1002".
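The receiver has to follow each href back to the element that carries the matching id. The Python sketch below shows that lookup on a trimmed-down version of the listing above. The enclosing Body element is added only so that the fragment has a single root, and resolve_products is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the listing above, wrapped in one root element.
BODY = """\
<Body>
    <PurchaseOrder>
        <CustomerName>Yazan Diranieh</CustomerName>
        <PurchaseLineItems>
            <Order><SelectedProduct href="#1001"/><Quantity>1</Quantity></Order>
            <Order><SelectedProduct href="#1002"/><Quantity>1</Quantity></Order>
        </PurchaseLineItems>
    </PurchaseOrder>
    <PurchaseOrder>
        <CustomerName>John Smith</CustomerName>
        <PurchaseLineItems>
            <Order><SelectedProduct href="#1002"/><Quantity>7</Quantity></Order>
        </PurchaseLineItems>
    </PurchaseOrder>
    <Product id="1001"><Product>Foo</Product><Price>9.99</Price></Product>
    <Product id="1002"><Product>Bar</Product><Price>99.99</Price></Product>
</Body>
"""


def resolve_products(xml_text):
    """Return (product name, quantity) per order, following href to id."""
    root = ET.fromstring(xml_text)
    # Index every element that carries an id so that each lookup is direct.
    by_id = {el.get("id"): el for el in root.iter() if el.get("id") is not None}
    resolved = []
    for order in root.iter("Order"):
        href = order.find("SelectedProduct").get("href")
        target = by_id[href.lstrip("#")]  # "#1002" refers to id="1002"
        resolved.append((target.findtext("Product"), int(order.findtext("Quantity"))))
    return resolved


print(resolve_products(BODY))  # prints [('Foo', 1), ('Bar', 1), ('Bar', 7)]
```

The product data for Bar appears once in the message but is resolved for two different orders, which is the saving that multi-reference accessors provide.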