SOAP and Web Services
By Mark Volkmann, OCI Partner
Simple Object Access Protocol (SOAP) and web services are emerging technologies that are getting a lot of press. What problems do they solve? What does their use mean for other distributed architectures (DAs) such as CORBA, DCOM, and EJB?
SOAP provides a new way of creating distributed applications where remote services are invoked by sending XML-based requests to a server and results are returned in XML-based responses. This is the SOAP "message protocol."
While SOAP doesn't require use of a specific transport protocol (called "underlying protocol" in the official documentation), HTTP is currently the most common.
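To make the message protocol concrete, here is a minimal sketch in Python of how a request message might be assembled. It assumes the SOAP 1.1 envelope namespace and borrows the service namespace, operation name, and part number from the example request shown later in this article. The function name `build_request` is made up for illustration and is not part of any SOAP toolkit.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def build_request(service_ns, operation, params):
    """Return a SOAP request envelope as a string of XML (illustrative helper)."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The first child of Body names the operation being invoked.
    call = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


request_xml = build_request(
    "http://www.ociweb.com/inventory/Inventory",
    "getStockInfo",
    {"partNumber": "A14872"},
)
print(request_xml)
```

Nothing here depends on HTTP. The resulting string could be carried by any transport that can deliver text.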
IBM, Microsoft, DevelopMentor and UserLand Software originally submitted SOAP to the W3C as a note. A group within the W3C called the "XML Protocol (XMLP) Working Group" is actively working to advance this to a W3C recommendation.
The first working draft was published on July 9, 2001. It is a fairly mature document as W3C working drafts go. Representatives from many more companies than the original submitters are participating, including BEA Systems, Compaq, Hewlett Packard, Intel, IONA, Novell, Oracle and Sun Microsystems. SOAP is not likely to reach the recommendation stage until 2002.
For those unfamiliar with the process, the order of documents generated within the W3C is "note," "working draft," "candidate recommendation," "proposed recommendation," and "recommendation."
A related W3C Note entitled "SOAP Messages with Attachments" builds on the first note and specifies how SOAP messages can include attachments, such as binary image data. This allows all the data needed by a service to be sent in a single request.
When Is It Appropriate To Use SOAP?
SOAP is useful for invoking code that exhibits at least one of the following characteristics.
- Service code is outside the firewall. Other DAs have difficulty calling across firewall boundaries. SOAP does this easily by using HTTP and communicating on port 80, which is typically open.
- Client and server programming languages differ. No DA, not even CORBA, has been implemented in as many different programming languages as SOAP. Of course this is because SOAP doesn't provide all the services that CORBA does, so it is much easier to implement. All that a language needs to implement SOAP is support for HTTP and XML.
- Client and server DAs differ. Other DAs have difficulty communicating with each other. SOAP can be used to wrap services implemented in other DAs, making them accessible to each other. Die-hard fans of other DAs may dismiss many of the other stated benefits of using SOAP, but this remains a compelling reason to use it for at least some subset of the services that make up many applications.
- Services are coarse-grained. Remote calls are always more expensive than local calls, and the difference is even more pronounced with SOAP than with other DAs. Coarse-grained services require clients to make fewer remote calls, so they cost less than achieving the same functionality through a series of fine-grained services.
- Service performance is not critical. Other DAs are currently faster than SOAP and are likely to remain so. It is perfectly reasonable to conclude that SOAP is not suitable for applications that have strict performance requirements.
Web Services Overview
SOAP is just one part of the concept of web services. Below is a summary of some of the other important parts.
Web Services Description Language (WSDL)
WSDL describes web service requests and responses using XML. It is similar to CORBA IDL but can also include the location of services via a URL.
There are two main types of descriptions: service interfaces and service implementations. Separating these allows multiple implementations of the same interface.
WSDL service descriptions can be cataloged and searched in a registry such as UDDI.
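As a rough illustration of the interface/implementation split, the Python sketch below reads a trimmed WSDL 1.1 fragment and pulls out the operation names (interface) and the service URL (implementation). The fragment is deliberately incomplete (it omits messages and bindings), and the names `InventoryPortType`, `InventoryService`, and the `/glue/inventory` path are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical WSDL 1.1 fragment; a real file also defines messages and bindings.
WSDL_FRAGMENT = """\
<definitions name="Inventory"
    targetNamespace="http://www.ociweb.com/inventory/Inventory"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:tns="http://www.ociweb.com/inventory/Inventory">
  <portType name="InventoryPortType">
    <operation name="getStockInfo"/>
  </portType>
  <service name="InventoryService">
    <port name="InventoryPort" binding="tns:InventoryBinding">
      <soap:address location="http://localhost:8004/glue/inventory"/>
    </port>
  </service>
</definitions>
"""

NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",
}


def describe(wsdl_text):
    """Return (operation names, service locations) found in a WSDL document."""
    root = ET.fromstring(wsdl_text)
    operations = [
        op.get("name") for op in root.findall("wsdl:portType/wsdl:operation", NS)
    ]
    locations = [
        address.get("location")
        for address in root.findall("wsdl:service/wsdl:port/soap:address", NS)
    ]
    return operations, locations


operations, locations = describe(WSDL_FRAGMENT)
print(operations, locations)
```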
Universal Description, Discovery, and Integration (UDDI)
UDDI provides a registry for web services similar to the CORBA Naming and Trader services. Clients can register and search for several types of information distinguished by different "colored" pages.
White Pages contain information about service providers such as business name, description and contact information. Access to descriptions of the services offered by each provider (yellow pages) is also supplied.
Yellow Pages contain high-level service information and references to low-level information (green pages).
Examples of high-level service information include the service name, a human-readable service description, a list of categories to which the service belongs, and a key used to access a description of the business providing the service.
Services can be listed by several taxonomies such as North American Industry Classification System (NAICS), Universal Standard Products and Services Classification (UNSPSC), and geographical location.
Green Pages contain low-level information needed to invoke services. This includes business processes, service descriptions and binding information. Binding information supplies the service location and protocol used to communicate with the service. Much of this data can be supplied by referencing a WSDL file.
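The relationship between the three kinds of pages can be pictured with a toy in-memory model. The Python sketch below is not the UDDI API or its data schema. The class names, fields, and sample entries are all assumptions, chosen only to show how a yellow-pages entry points up to a business and down to binding details.

```python
from dataclasses import dataclass, field


@dataclass
class Business:  # "white pages" entry
    key: str
    name: str
    contact: str


@dataclass
class Binding:  # "green pages" entry
    location: str
    protocol: str
    wsdl_url: str


@dataclass
class Service:  # "yellow pages" entry
    name: str
    description: str
    categories: list
    business_key: str  # links back to the white pages
    bindings: list = field(default_factory=list)  # links down to the green pages


class ToyRegistry:
    def __init__(self):
        self.businesses = {}
        self.services = []

    def publish(self, business, service):
        self.businesses[business.key] = business
        self.services.append(service)

    def find(self, category):
        """Return (service, business) pairs for services listed under a category."""
        return [
            (service, self.businesses[service.business_key])
            for service in self.services
            if category in service.categories
        ]


# Hypothetical sample data.
registry = ToyRegistry()
registry.publish(
    Business("b1", "Example Parts Co.", "info@example.com"),
    Service(
        "getStockInfo",
        "Returns inventory details for a part",
        ["inventory"],
        "b1",
        [Binding("http://example.com/soap", "SOAP over HTTP", "http://example.com/inventory.wsdl")],
    ),
)
for service, business in registry.find("inventory"):
    print(business.name, service.name, service.bindings[0].location)
```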
IBM and Microsoft currently host free, public UDDI repositories. HP will host one by the end of 2001. Services advertised to any of them are replicated to the others within 24 hours, typically much faster. These UDDI registries are accessible to anyone with web access. More restrictive UDDI registries can also be created.
A consortium of 36 companies including IBM, Microsoft and Ariba created UDDI. As of May 2001, there were 260 member companies. The consortium plans to hand over control of the UDDI specification to a standards body at some point in the future.
Electronic Business XML (ebXML)
ebXML is primarily targeted toward B2B communication. Its goal is to allow businesses to automate the following with no human involvement.
- Find partners that support specific business processes
- Enter into trading partner agreements with them
- Invoke their web services
ebXML is attempting to define standard business processes. It is also defining standard message structures for carrying out those processes based on "SOAP Messages with Attachments." It defines its own registry for publishing and finding business processes and services, but implementations could use UDDI.
ebXML is supported by these standards organizations.
- United Nations Centre for Trade Facilitation and Electronic Business (UN/CEFACT)
- Organization for the Advancement of Structured Information Standards (OASIS)
- Object Management Group (OMG)
Several major vendors including IBM, Oracle and Sun Microsystems also support ebXML.
.NET, from Microsoft, has basically the same set of requirements as ebXML. It is a framework of server products. BizTalk is one of them. .NET builds on other standards including:
- SOAP for the message protocol
- WSDL to describe services
- UDDI for the service registry
- XLANG to model business processes
Which Web Service Components Are Necessary?
When the location and message structure of a web service are known at development time, SOAP can be used independently. When the location of a WSDL file is known, these details can be discovered at run-time.
UDDI is useful when suitable web services need to be found at run-time, perhaps based on criteria such as availability, performance, reliability, and cost.
ebXML and .NET are useful when applications wish to use standard business processes, not just individual web services.
Why Do SOAP Messages Look So Complicated?
On the surface, it seems that all that is needed in SOAP request and response messages is "plain" XML.
There are three aspects of typical SOAP messages that make them seem more complex than that:
- HTTP headers
- XML Namespaces
- XML Schema
It's a good idea to gain some understanding of these apart from SOAP, since they are helpful in other contexts. Once you do, SOAP messages won't seem so cryptic.
An example of a SOAP request message is provided later.
HTTP headers serve many purposes. They are needed to specify the target host and port. They are also needed to specify the message intent through a header called "SOAPAction." Typically, providing the service name conveys this. Firewalls can filter HTTP SOAP traffic based on this header.
To block all HTTP SOAP traffic, firewalls can block all HTTP messages with a "Content-Type" header of "text/xml".
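The filtering described above comes down to a simple decision on two headers. The following Python sketch shows one way that decision might look. The function names and the idea of an "allowed actions" set are assumptions for illustration, not the behavior of any particular firewall product.

```python
def is_soap_request(headers):
    """Treat any HTTP message whose Content-Type is text/xml as SOAP traffic."""
    normalized = {name.lower(): value for name, value in headers.items()}
    media_type = normalized.get("content-type", "").split(";")[0].strip().lower()
    return media_type == "text/xml"


def allow_request(headers, allowed_actions):
    """Pass non-SOAP traffic; pass SOAP traffic only for listed SOAPAction values."""
    if not is_soap_request(headers):
        return True
    normalized = {name.lower(): value for name, value in headers.items()}
    # The SOAPAction value is usually sent as a quoted string.
    action = normalized.get("soapaction", "").strip().strip('"')
    return action in allowed_actions


headers = {
    "Content-Type": "text/xml",
    "SOAPAction": '"urn:inventory:getStockInfo"',
}
print(allow_request(headers, {"urn:inventory:getStockInfo"}))  # True
print(allow_request(headers, set()))  # False: an empty list blocks all SOAP traffic
```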
Quoting from the specification, "The SOAPAction HTTP request header can be used to indicate the intent of the SOAP HTTP request."
Different SOAP implementations use this in different ways. More will be said about SOAP implementations later in this article.
Apache SOAP doesn't use the SOAPAction header at all. The service being invoked is specified in the namespace URI of the first child element in the SOAP request body.
GLUE uses SOAPAction to specify the Java class and method that implements the service.
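The body-based approach can be sketched in a few lines of Python. This only illustrates the idea of reading the namespace URI of the first child of the SOAP body. It is not Apache SOAP's actual code, and the function name `target_of` is an assumption.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace

REQUEST = """\
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <n:getStockInfo xmlns:n="http://www.ociweb.com/inventory/Inventory">
      <partNumber>A14872</partNumber>
    </n:getStockInfo>
  </soap:Body>
</soap:Envelope>
"""


def target_of(request_xml):
    """Return (namespace URI, local name) of the first element in the SOAP body."""
    root = ET.fromstring(request_xml)
    body = root.find(f"{{{SOAP_ENV}}}Body")
    first_child = body[0]
    # ElementTree spells qualified names as "{namespace-uri}local-name".
    if first_child.tag.startswith("{"):
        namespace_uri, local_name = first_child.tag[1:].split("}", 1)
    else:
        namespace_uri, local_name = "", first_child.tag
    return namespace_uri, local_name


print(target_of(REQUEST))
```

A server using this style would look up the namespace URI to find the service and use the local name as the method to call.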
Namespaces have two primary uses, as follows:
- Associating a context with specific XML elements/attributes
- Associating an XML Schema file with them so that XML, such as a SOAP message, can be validated
The first use can be compared with Java packages. Java packages group sets of related classes. XML namespaces group sets of related XML elements and attributes.
XML Schemas provide a means for validating XML documents in a more detailed way than is possible with Document Type Definitions (DTDs). Among other things, they allow specification of data types for element text and attribute values.
SOAP messages can use XML Schema to specify the data types of elements within SOAP request and response messages using the "xsi:type" attribute. The same can be accomplished by associating an XML Schema file with the SOAP message.
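To show what a receiver can do with "xsi:type", the Python sketch below converts element text into native values based on the declared type. It makes two simplifying assumptions that are labeled in the code: the 2001 XML Schema instance namespace is used, and the type prefix (such as "xsd") is taken on trust rather than resolved against the in-scope namespace declarations. The `quantity` element is invented for the example.

```python
import xml.etree.ElementTree as ET

# Assumes the 2001 XML Schema instance namespace.
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# Only a handful of XML Schema simple types, enough for illustration.
CONVERTERS = {
    "string": str,
    "int": int,
    "double": float,
    "boolean": lambda text: text in ("true", "1"),
}

FRAGMENT = """\
<item xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <partNumber xsi:type="xsd:string">A14872</partNumber>
  <quantity xsi:type="xsd:int">42</quantity>
</item>
"""


def typed_value(element):
    """Convert element text using its xsi:type, defaulting to string."""
    declared = element.get(XSI_TYPE, "xsd:string")
    # Simplification: ignore the prefix and trust that it refers to XML Schema.
    local_name = declared.split(":")[-1]
    return CONVERTERS[local_name](element.text)


values = {child.tag: typed_value(child) for child in ET.fromstring(FRAGMENT)}
print(values)  # {'partNumber': 'A14872', 'quantity': 42}
```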
Example SOAP Request
Here's an example of a SOAP request using HTTP as the transport protocol.
This fictional service retrieves detailed information about a particular item in a company inventory. A company could provide a service like this to selected suppliers that are given permission to automatically ship items when their inventory quantity drops below specified levels.
- HTTP Headers: Lines 1-7
- XML Namespaces: Lines 11-13, 17
- XML Schema: Line 20
    POST /glue HTTP/1.1
    Content-Type: text/xml
    User-Agent: GLUE/1.0
    Host: localhost:8004
    Connection: Keep-Alive
    SOAPAction: "urn:inventory:getStockInfo"
    Content-Length: 683
    <?xml version='1.0' encoding='UTF-8'?>
    <n:getStockInfo xmlns:n="http://www.ociweb.com/inventory/Inventory"
    <partNumber xsi:type='xsd:string'>A14872</partNumber>
Use of a SOAP toolkit such as IBM Web Services Toolkit (WSTK), Apache Axis, or GLUE is not required to send and receive SOAP messages. SOAP client developers can write their own code to:
- Create SOAP request messages (XML)
- Wrap them in HTTP requests
- Send them to a SOAP server
- Read the HTTP response
- Parse the enclosed SOAP response message (XML)
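The client-side steps above can be sketched with nothing but the Python standard library. This is a hand-rolled illustration under several assumptions: the URL is pieced together from the Host and POST lines of the example request, the response message shape (`getStockInfoResponse` with a `quantity` element) is invented, and the function names are made up. The `call` function needs a live server, so the demonstration at the bottom only builds the request and parses a canned response.

```python
import urllib.request
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace

REQUEST_XML = (
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    "<soap:Body>"
    '<n:getStockInfo xmlns:n="http://www.ociweb.com/inventory/Inventory">'
    "<partNumber>A14872</partNumber>"
    "</n:getStockInfo>"
    "</soap:Body>"
    "</soap:Envelope>"
)

# Hypothetical response; the real service's reply format is not shown in the article.
CANNED_RESPONSE = (
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    "<soap:Body>"
    '<n:getStockInfoResponse xmlns:n="http://www.ociweb.com/inventory/Inventory">'
    "<quantity>42</quantity>"
    "</n:getStockInfoResponse>"
    "</soap:Body>"
    "</soap:Envelope>"
)


def wrap_in_http(url, soap_xml, soap_action):
    """Put the SOAP message in the body of an HTTP POST request."""
    return urllib.request.Request(
        url,
        data=soap_xml.encode("utf-8"),
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": f'"{soap_action}"',
        },
        method="POST",
    )


def parse_response(response_xml):
    """Collect the child elements of the first element in the response Body."""
    root = ET.fromstring(response_xml)
    result = root.find(f"{{{SOAP_ENV}}}Body")[0]
    return {child.tag: child.text for child in result}


def call(url, soap_xml, soap_action):
    """Send the request and parse the HTTP response (requires a running server)."""
    request = wrap_in_http(url, soap_xml, soap_action)
    with urllib.request.urlopen(request) as response:
        return parse_response(response.read())


request = wrap_in_http("http://localhost:8004/glue", REQUEST_XML, "urn:inventory:getStockInfo")
print(request.get_method(), request.full_url)
print(parse_response(CANNED_RESPONSE))
```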
SOAP service developers can write their own code to:
- Receive HTTP requests
- Parse the enclosed SOAP request message (XML)
- Perform the work of the service
- Create a SOAP response message (XML)
- Wrap it in an HTTP response
- Return it to the client
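The service-side steps can be sketched the same way. The Python function below covers the middle of the list: it parses a request body, performs the work, and builds the response message. Receiving the HTTP request and returning the HTTP response are left out. In practice this function could be called from something like the `do_POST` method of a `BaseHTTPRequestHandler` in Python's `http.server` module. The stock data, the response element names, and the `OPERATIONS` table are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://www.ociweb.com/inventory/Inventory"

# Hypothetical data standing in for a real inventory system.
STOCK = {"A14872": 42}


def get_stock_info(part_number):
    """The actual work of the service, free of any SOAP details."""
    return {"partNumber": part_number, "quantity": STOCK.get(part_number, 0)}


# Maps operation names to functions that take the request element.
OPERATIONS = {
    "getStockInfo": lambda call: get_stock_info(call.findtext("partNumber")),
}


def handle(request_body):
    """Turn the bytes of a SOAP request into the bytes of a SOAP response."""
    # Parse the enclosed SOAP request message.
    root = ET.fromstring(request_body)
    call = root.find(f"{{{SOAP_ENV}}}Body")[0]
    operation = call.tag.split("}")[-1]
    # Perform the work of the service.
    result = OPERATIONS[operation](call)
    # Create a SOAP response message.
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    response = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}Response")
    for name, value in result.items():
        ET.SubElement(response, name).text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


REQUEST = (
    b'<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    b"<soap:Body>"
    b'<n:getStockInfo xmlns:n="http://www.ociweb.com/inventory/Inventory">'
    b"<partNumber>A14872</partNumber>"
    b"</n:getStockInfo></soap:Body></soap:Envelope>"
)
print(handle(REQUEST).decode("utf-8"))
```

An unknown operation raises `KeyError` in this sketch. A real service would report the problem to the client in a SOAP fault instead.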
SOAP packages can automate all of these steps in a way that is tailored to a specific programming language.
For example, Java-based SOAP packages such as GLUE allow Java primitive types and objects to be passed to SOAP services. The XML representation of the parameters is automatically generated. All the details of constructing, sending and receiving HTTP messages are handled. The XML in SOAP responses is automatically parsed and turned into Java objects. No SAX or DOM (popular XML programming APIs) programming is needed to work with SOAP.
The SOAP specification only defines the content of the messages passed between clients and servers. It does not define an API for creating and passing SOAP messages.
This has its pros and cons. The good news is that each language-specific SOAP toolkit can be tailored to take advantage of the strengths and style of a particular programming language. The bad news is that experience gained in using SOAP from one programming language doesn't transfer over to using SOAP from a different language.
Is SOAP Object-Oriented?
The push toward service-oriented architectures certainly seems to move us away from pure OO. We see this in the recommended approach for using EJBs where clients are supposed to communicate with object-oriented entity beans through service-oriented session beans.
Classes that implement SOAP services are similar to EJB session beans in this respect. Does this mean SOAP is not OO? Not exactly. While the mechanism for invoking SOAP services is not particularly OO, the implementation of the services still can be.
Replacement For Other DAs?
SOAP should not be viewed as a replacement for other DAs such as CORBA, DCOM and EJB. While SOAP could be used instead of these DAs, there are many reasons to continue using them.
The use of HTTP and the need to construct and parse XML documents make it unlikely that SOAP will ever be as efficient as other DAs.
To get the best of both worlds, consider implementing most services using a non-SOAP DA. Those can be used inside a firewall for maximum performance. Perhaps only a subset of those services will require access from outside the firewall. SOAP services that simply wrap calls to those services can provide that access.
Another reason to create SOAP wrapping services is to allow calls across DAs. For example, an EJB-based application that needs to utilize a CORBA service can do so by invoking a SOAP service, which invokes the CORBA service.
Why Mix Business Logic With DA Code?
You separate business logic from GUI code.
You separate business logic from data access code.
Why not separate your business logic from code that locates remote services and invokes them?
Suppose you have a class called Portfolio that provides business logic for operating on a stock portfolio. Don't make this class CORBA, EJB or SOAP-specific. When you need to access Portfolio functionality from DAs, create classes like PortfolioCORBA, PortfolioEJB, and PortfolioSOAP.
These classes could possibly be automatically generated from the business logic classes. It may not be feasible to mix usage of these classes, since they may be written to take advantage of specific features of a DA, such as transactions.
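A minimal Python sketch of this separation might look like the following. The Portfolio and PortfolioSOAP class names come from the text above. Everything else (the methods, the `urn:example:portfolio` namespace, and the message shapes) is an assumption made up for illustration.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
PORTFOLIO_NS = "urn:example:portfolio"  # hypothetical service namespace


class Portfolio:
    """Business logic only: knows nothing about SOAP, CORBA, or EJB."""

    def __init__(self):
        self._shares = {}

    def buy(self, symbol, shares):
        self._shares[symbol] = self._shares.get(symbol, 0) + shares

    def shares_of(self, symbol):
        return self._shares.get(symbol, 0)


class PortfolioSOAP:
    """Adapter: translates SOAP messages into calls on a Portfolio."""

    def __init__(self, portfolio):
        self._portfolio = portfolio

    def handle(self, request_xml):
        call = ET.fromstring(request_xml).find(f"{{{SOAP_ENV}}}Body")[0]
        shares = self._portfolio.shares_of(call.findtext("symbol"))
        return (
            f'<soap:Envelope xmlns:soap="{SOAP_ENV}"><soap:Body>'
            f'<p:sharesOfResponse xmlns:p="{PORTFOLIO_NS}">'
            f"<shares>{shares}</shares>"
            "</p:sharesOfResponse></soap:Body></soap:Envelope>"
        )


portfolio = Portfolio()
portfolio.buy("IBM", 100)

request_xml = (
    f'<soap:Envelope xmlns:soap="{SOAP_ENV}"><soap:Body>'
    f'<p:sharesOf xmlns:p="{PORTFOLIO_NS}"><symbol>IBM</symbol></p:sharesOf>'
    "</soap:Body></soap:Envelope>"
)
print(PortfolioSOAP(portfolio).handle(request_xml))
```

Because Portfolio has no SOAP dependencies, it can be unit tested directly, and a PortfolioCORBA or PortfolioEJB wrapper could reuse it unchanged.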
The concept of web services is built on many different specifications, some of which work together and some of which compete.
Lower-level specifications such as SOAP and WSDL are the most mature, in that they are specified well enough for implementations to be created. High-level specifications such as ebXML and .NET are much more ambitious and are not quite ready for prime-time use.
Tools for working with SOAP, WSDL, and UDDI are still maturing, but have made great strides in the past year. For a good example, see GLUE from The Mind Electric.
Using SOAP, WSDL, and UDDI can benefit your applications, regardless of whether ebXML and .NET are being used. As proof of this, consider that other DAs do not attempt to standardize business processes and messaging structures.
Start learning more about web services today so you'll be ready to take advantage of them as tool support improves and the number of advertised web services increases. For a list of free, publicly available web services, see http://www.xmethods.com.
Software Engineering Tech Trends (SETT) is a regular publication featuring emerging trends in software engineering.