Protocol Mechanisms
The next step is to look at the tasks that an application protocol
must perform and how it goes about performing them. Although an
exhaustive exposition might identify a dozen (or so) areas, the ones
we're interested in are:
- framing, which tells how the beginning and ending of each message
  is delimited;
- encoding, which tells how a message is represented when exchanged;
- error reporting, which tells how errors are described;
- multiplexing, which tells how independent parallel exchanges are handled;
- user authentication, which tells how the peers at each end of the
  connection are identified and verified; and,
- transport security, which tells how the exchanges are protected
  against third-party interception or modification.
A notable absence in this list is naming; we'll explain why later on.
There are three commonly used approaches to delimiting messages:
octet-stuffing, octet-counting, and connection-blasting.
An example of a protocol that uses octet-stuffing is SMTP. Commands
in SMTP are line-oriented (each command ends in a CR-LF pair). When
an SMTP peer sends a message, it first transmits the "DATA" command,
then it transmits the message, then it transmits a "." (dot)
followed by a CR-LF. If the message contains any lines that begin
with a dot, the sending SMTP peer sends two dots; similarly, when
the other SMTP peer receives a line that begins with a dot, it
discards the dot, and, if the line is empty, then it knows it's
received the entire message. Octet-stuffing has the property that
you don't need the entire message in front of you before
you start sending it. Unfortunately, it's slow because both the sender
and receiver must scan each line of the message to see if they
need to transform it.
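The dot-stuffing transformation described above can be sketched as a pair of functions (the helper names are ours, not SMTP's):

```python
# A minimal sketch of SMTP-style dot-stuffing. Lines beginning with a
# dot are escaped with an extra dot; a lone dot terminates the message.

def stuff(message: str) -> str:
    """Escape dot-initial lines, then append the lone-dot terminator."""
    lines = message.split("\r\n")
    escaped = ["." + line if line.startswith(".") else line for line in lines]
    return "\r\n".join(escaped) + "\r\n.\r\n"

def unstuff(wire: str) -> str:
    """Reverse the transformation, stopping at the lone-dot terminator."""
    out = []
    for line in wire.split("\r\n"):
        if line == ".":          # a bare dot marks the end of the message
            break
        if line.startswith("."):
            line = line[1:]      # discard the escape dot
        out.append(line)
    return "\r\n".join(out)
```

Note that both functions must touch every line, which is exactly the scanning cost the text complains about.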
An example of a protocol that uses octet-counting is HTTP. Commands
in HTTP consist of a request line followed by headers and a body.
The headers contain an octet count indicating how large the body is.
The properties of octet-counting are the inverse of octet-stuffing:
before you can start sending a message you need to know the length
of the whole message, but you don't need to look at the content of
the message once you start sending or receiving.
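By contrast, octet-counting framing in the style of HTTP's Content-Length header can be sketched like this (a simplified model, not a full HTTP parser):

```python
# A sketch of octet-counting framing: the header carries the body's
# length, so neither side ever inspects the body's content.

def frame(body: bytes) -> bytes:
    """Prefix the body with its octet count, HTTP-header style."""
    header = f"Content-Length: {len(body)}\r\n\r\n".encode()
    return header + body

def deframe(wire: bytes) -> bytes:
    """Read the count from the header, then take exactly that many octets."""
    header, _, rest = wire.partition(b"\r\n\r\n")
    count = int(header.split(b":")[1])
    return rest[:count]
```

Because only the count is examined, the body may be arbitrary binary data; the price is that `frame` must know the full length up front.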
An example of a protocol that uses connection-blasting is FTP.
Commands in FTP are line-oriented, and when it's time to exchange a
message, a new TCP connection is established to transmit the
message. Both octet-counting and connection-blasting have the
property that the messages can be arbitrary binary data; however,
the drawback of the connection-blasting approach is that the peers
need to communicate IP addresses and TCP port numbers, which may be
"transparently" altered by NATs[11] and network bugs. In addition,
if the messages being exchanged are small (say less than 32k), then
the overhead of establishing a connection for each message
contributes significant latency during data exchange.
There are many schemes used for encoding data (and many more
encoding schemes have been proposed than are actually in use). Fortunately,
only a few are burning brightly on the radar.
The messages exchanged using SMTP are encoded using the 822-style[12].
The 822-style divides a message into textual headers
and an unstructured body. Each header consists of a name and a value
and is terminated with a CR-LF pair. An additional CR-LF separates
the headers from the body.
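A minimal parser for this structure might look as follows (folded continuation lines and repeated headers are deliberately omitted from this sketch):

```python
# A sketch of parsing an 822-style message: name/value headers,
# a blank line, then an unstructured body.

def parse_822(wire: str):
    """Split a message into a header dict and its body."""
    head, _, body = wire.partition("\r\n\r\n")
    headers = {}
    for line in head.split("\r\n"):
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers, body
```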
It is this structure that HTTP uses to indicate the length of the
body for framing purposes. More formally, HTTP uses MIME, an
application of the 822-style to encode both the data itself (the
body) and information about the data (the headers). That is,
although HTTP is commonly viewed as a retrieval mechanism for
HTML[13],
it is really a retrieval mechanism for objects encoded
using MIME, most of which are either HTML pages or referenced
objects such as GIFs.
An application protocol needs a mechanism for conveying error
information between peers. The first formal method for doing this
was defined by SMTP's "theory of reply codes". The basic idea is
that an error is identified by a three-digit string, with each
position having a different significance:
- the first digit indicates success or failure, either permanent or
  transient;
- the second digit indicates the part of the system reporting the
  situation (e.g., the syntax analyzer); and,
- the third digit identifies the actual situation.
Operational experience with SMTP suggests that the range of error
conditions is larger than can be comfortably encoded using a
three-digit string (i.e., you can report on only 10 different things
going wrong for any given part of the system). So,
[14] provides a
convenient mechanism for extending the number of values that can
occur in the second and third positions.
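The theory of reply codes can be sketched as a simple decoder (the outcome names and the syntax-analyzer example follow SMTP's conventions; the function name is ours):

```python
# A sketch of decoding a three-digit reply code, digit by digit.

def decode_reply(code: str):
    """Map each digit of a reply code to its significance."""
    outcome = {"2": "success", "3": "intermediate success",
               "4": "transient failure", "5": "permanent failure"}
    return {
        "outcome": outcome.get(code[0], "unknown"),
        "subsystem": code[1],   # e.g., '0' is the syntax analyzer in SMTP
        "detail": code[2],      # identifies the actual situation
    }
```

The single-digit `detail` field is precisely where the "only 10 different things" limit bites, and why the extension mechanism of [14] is needed.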
Virtually all of the application protocols we've discussed thus far
use the three-digit reply codes, although there is less coordination
between the designers of different application protocols than most
would care to admit.
Finally, in addition to conveying a reply code, most protocols also
send a textual diagnostic suitable for human, not machine,
consumption. (More accurately, the textual diagnostic is suitable
for people who can read a widely-used variant of the English
language.) Since reply codes reflect both positive and negative
outcomes, there have been some innovative uses made of the text
accompanying positive responses, e.g., prayer wheels.
Few application protocols today allow independent parallel exchanges
over the same connection. In fact, the more widely-implemented
approach is to allow pipelining, e.g., command pipelining[15] in
SMTP or persistent connections in HTTP 1.1. Pipelining allows a
client to make multiple requests of a server, but requires the
requests to be processed serially. (Note that a protocol needs to
explicitly provide support for pipelining, since, without explicit
guidance, many implementors produce systems that don't handle
pipelining properly; typically, an error in a request causes
subsequent requests in the pipeline to be discarded).
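The essence of pipelining is writing every request before reading any reply, then consuming the replies in order. A sketch, assuming a socket-like object with `sendall`/`recv` (the helper names are hypothetical):

```python
# A sketch of command pipelining: all requests go out back-to-back,
# then one reply is read per request, in order.

def pipeline(sock, commands):
    """Send every command without waiting, then collect replies serially."""
    for cmd in commands:
        sock.sendall(cmd + b"\r\n")      # no round-trip wait between sends
    replies = []
    for _ in commands:
        replies.append(read_line(sock))  # server answers in request order
    return replies

def read_line(sock):
    buf = b""
    while not buf.endswith(b"\r\n"):
        buf += sock.recv(1)
    return buf
```

Note that the reply-matching relies entirely on ordering, which is why an error mid-pipeline typically forces the remaining requests to be discarded.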
Pipelining is a powerful method for reducing network latency. For
example, without persistent connections, HTTP's framing mechanism is
really closer to connection-blasting than octet-counting, and it
enjoys the same latency and efficiency problems.
In addition to reducing network latency (the pipelining effect),
parallelism also reduces server latency by allowing multiple requests
to be processed by multi-threaded implementations. Note that if you
allow any form of asynchronous exchange, then support for
parallelism is also required, because exchanges aren't necessarily
occurring under the synchronous direction of a single peer.
Unfortunately, when you allow parallelism, you also need a flow
control mechanism to avoid starvation and deadlock. Otherwise, a
single set of exchanges can monopolize the bandwidth provided by the
transport layer. Further, if the peer is resource-starved, then it
may not have enough buffers to receive a message and deadlock
results.
The flow control mechanism used by TCP is based on sequence numbers
and a sliding window: each receiver manages a sliding window that
indicates the number of data octets that may be transmitted before
receiving further permission. However, it's now time for the third
shoe of multiplexing to drop: segmentation. If you do flow control
then you also need a segmentation mechanism to fragment messages
into smaller pieces before sending and then re-assemble them as
they're received.
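The segmentation step can be sketched as splitting a message into fragments no larger than the peer's advertised window, each tagged with a continuation flag so the receiver knows when re-assembly is complete (the representation is ours, for illustration):

```python
# A sketch of segmentation: fragment a message to fit the current
# window, marking all but the last fragment as "more to come".

def segment(message: bytes, window: int):
    """Yield (more_to_come, fragment) pairs sized to the window."""
    for start in range(0, len(message), window):
        fragment = message[start:start + window]
        more = start + window < len(message)
        yield (more, fragment)
```

The receiver simply concatenates fragments until it sees one with the flag clear.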
All three of the multiplexing issues (parallelism, flow control, and
segmentation) have an impact on how the protocol does framing. Earlier,
we defined framing as "how to tell the beginning and end of each
message"; in addition, we need to be able to identify independent
messages, send messages only when flow control allows us to, and
segment them if they're larger than the available window (or too
large for comfort).
Perhaps for historical (or hysterical) reasons, most application
protocols don't do authentication. That is, they don't authenticate
the identity of the peers on the connection or the authenticity of
the messages being exchanged. Or, if authentication is done, it is
domain-specific for each protocol. For example, FTP and HTTP use
entirely different models and mechanisms for authenticating the
initiator of a connection. (Independent of mainstream HTTP, there is
a little-used variant[16] that authenticates the messages it
exchanges.)
A few years ago,
SASL[17]
(the Simple Authentication and Security
Layer) was developed to provide a framework for authenticating
protocol peers. SASL lets you describe how an authentication
mechanism works, e.g., an OTP[18] (One-Time Password) exchange.
It's then up to each protocol designer to specify how SASL exchanges are
conveyed by the protocol. For example,
[19] explains how SASL works with SMTP.
A notable exception to the SASL bandwagon is HTTP, which defines
its own
authentication mechanisms[20]. There is little reason why
SASL couldn't be introduced to HTTP, although to avoid
race-conditions with the use of OTP, the persistent connection
mechanism of HTTP 1.1 must be used.
SASL has an interesting feature in that in addition to explicit
protocol exchanges to authenticate identity, it can also use
implicit information provided from the layer below. For example, if
the connection is running over IPsec[21],
then the credentials of
each peer are known and verified when the TCP connection is
established.
HTTP is the first widely used protocol to make use of transport
security to encrypt the data sent on the connection. The current
version of this mechanism,
TLS[22],
is also available for SMTP and
other application protocols such as
ACAP[23]
(the Application Configuration Access Protocol).
The key difference between the original mechanism and TLS is one of
provisioning. In the initial approach, a world-wide web server would
listen on two ports, one for plaintext traffic and the other for
secured traffic; in contrast, a server implementing an application
protocol that is TLS-enabled listens on a single port for plaintext
traffic; once a connection is established, the use of TLS is
negotiated by the peers.
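The single-port negotiation works roughly as follows: the client asks, in the plaintext protocol, to upgrade the existing connection, and only wraps it in TLS if the server agrees. A sketch using Python's `ssl` module (the "STARTTLS" command name and "220" acceptance code follow SMTP's convention; other protocols use their own):

```python
# A sketch of in-band TLS negotiation on a single port, in contrast
# to running a separate, always-encrypted port.
import ssl

def start_tls(sock, hostname):
    """Ask the server to switch the existing connection to TLS."""
    sock.sendall(b"STARTTLS\r\n")
    reply = sock.recv(512)
    if not reply.startswith(b"220"):   # server declined the upgrade
        raise RuntimeError("TLS not negotiated")
    context = ssl.create_default_context()
    return context.wrap_socket(sock, server_hostname=hostname)
```

The appeal is provisioning: one well-known port serves both secured and unsecured clients.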
Let's briefly compare the properties of the three main
connection-oriented application protocols in use today:
Mechanism             SMTP         FTP          HTTP
Framing               Stuffing     Blasting     Counting
Encoding              822          Binary       MIME
Error Reporting       3-digit      3-digit      3-digit
Multiplexing          pipelining   no           pipelining
User Authentication   SASL         user/pass    user/pass
Transport Security    TLS          no           TLS (nee SSL)
Note that the username/password mechanisms used by FTP and HTTP are
entirely different with one exception: both can be termed a
"username/password" mechanism.
These three choices are broadly representative: as more protocols
are considered, the patterns are reinforced. For example,
POP[24] uses octet-stuffing,
but IMAP[25] uses octet-counting,
and so on.
Copyright © 1999, 2000 media.org.