Protocol Mechanisms
The next step is to look at the tasks that an application protocol
must perform and how it goes about performing them. Although an
exhaustive exposition might identify a dozen (or so) areas, the ones
we're interested in are:
- framing, which tells how the beginning and ending of each message
  is delimited;
- encoding, which tells how a message is represented when exchanged;
- error reporting, which tells how errors are described;
- multiplexing, which tells how independent parallel exchanges are handled;
- user authentication, which tells how the peers at each end of the
  connection are identified and verified; and,
- transport security, which tells how the exchanges are protected
  against third-party interception or modification.
A notable absence in this list is naming; we'll explain why later on.
There are three commonly used approaches to delimiting messages:
octet-stuffing, octet-counting, and connection-blasting.
An example of a protocol that uses octet-stuffing is SMTP. Commands
in SMTP are line-oriented (each command ends in a CR-LF pair). When
an SMTP peer sends a message, it first transmits the "DATA" command,
then it transmits the message, then it transmits a "." (dot)
followed by a CR-LF. If the message contains any lines that begin
with a dot, the sending SMTP peer sends two dots; similarly, when
the other SMTP peer receives a line that begins with a dot, it
discards the dot, and, if the line is empty, then it knows it's
received the entire message. Octet-stuffing has the property that
you don't need the entire message in front of you before
you start sending it. Unfortunately, it's slow because both the sender
and receiver must scan each line of the message to see if they
need to transform it.
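The dot-stuffing transformation described above can be sketched as a pair of functions (the helper names are ours, not SMTP's):

```python
# A minimal sketch of SMTP-style dot-stuffing. Lines beginning with a
# dot are escaped with an extra dot; a lone dot terminates the message.

def stuff(message: str) -> str:
    """Escape dot-initial lines, then append the lone-dot terminator."""
    lines = message.split("\r\n")
    escaped = ["." + line if line.startswith(".") else line for line in lines]
    return "\r\n".join(escaped) + "\r\n.\r\n"

def unstuff(wire: str) -> str:
    """Reverse the transformation, stopping at the lone-dot terminator."""
    out = []
    for line in wire.split("\r\n"):
        if line == ".":          # a bare dot marks the end of the message
            break
        if line.startswith("."):
            line = line[1:]      # discard the escape dot
        out.append(line)
    return "\r\n".join(out)
```

Note that both functions must touch every line, which is exactly the scanning cost the text complains about.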
An example of a protocol that uses octet-counting is HTTP. Commands
in HTTP consist of a request line followed by headers and a body.
The headers contain an octet count indicating how large the body is.
The properties of octet-counting are the inverse of octet-stuffing:
before you can start sending a message you need to know the length
of the whole message, but you don't need to look at the content of
the message once you start sending or receiving.
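By contrast, octet-counting framing in the style of HTTP's Content-Length header can be sketched like this (a simplified model, not a full HTTP parser):

```python
# A sketch of octet-counting framing: the header carries the body's
# length, so neither side ever inspects the body's content.

def frame(body: bytes) -> bytes:
    """Prefix the body with its octet count, HTTP-header style."""
    header = f"Content-Length: {len(body)}\r\n\r\n".encode()
    return header + body

def deframe(wire: bytes) -> bytes:
    """Read the count from the header, then take exactly that many octets."""
    header, _, rest = wire.partition(b"\r\n\r\n")
    count = int(header.split(b":")[1])
    return rest[:count]
```

Because only the count is examined, the body may be arbitrary binary data; the price is that `frame` must know the full length up front.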
An example of a protocol that uses connection-blasting is FTP.
Commands in FTP are line-oriented, and when it's time to exchange a
message, a new TCP connection is established to transmit the
message. Both octet-counting and connection-blasting have the
property that the messages can be arbitrary binary data; however,
the drawback of the connection-blasting approach is that the peers
need to communicate IP addresses and TCP port numbers, which may be
"transparently" altered by NATs[11] and network bugs. In addition,
if the messages being exchanged are small (say less than 32k), then
the overhead of establishing a connection for each message
contributes significant latency during data exchange.
There are many schemes used for encoding data (and many more
encoding schemes have been proposed than are actually in use). Fortunately,
only a few are burning brightly on the radar.
The messages exchanged using SMTP are encoded using the 822-style[12].
The 822-style divides a message into textual headers
and an unstructured body. Each header consists of a name and a value
and is terminated with a CR-LF pair. An additional CR-LF separates
the headers from the body.
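A minimal parser for this structure might look as follows (folded continuation lines and repeated headers are deliberately omitted from this sketch):

```python
# A sketch of parsing an 822-style message: name/value headers,
# a blank line, then an unstructured body.

def parse_822(wire: str):
    """Split a message into a header dict and its body."""
    head, _, body = wire.partition("\r\n\r\n")
    headers = {}
    for line in head.split("\r\n"):
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers, body
```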
It is this structure that HTTP uses to indicate the length of the
body for framing purposes. More formally, HTTP uses MIME, an
application of the 822-style to encode both the data itself (the
body) and information about the data (the headers). That is,
although HTTP is commonly viewed as a retrieval mechanism for
HTML[13],
it is really a retrieval mechanism for objects encoded
using MIME, most of which are either HTML pages or referenced
objects such as GIFs.
An application protocol needs a mechanism for conveying error
information between peers. The first formal method for doing this
was defined by SMTP's "theory of reply codes". The basic idea is
that an error is identified by a three-digit string, with each
position having a different significance:
- the first digit indicates success or failure, either permanent or
  transient;
- the second digit indicates the part of the system reporting the
  situation (e.g., the syntax analyzer); and,
- the third digit identifies the actual situation.
Operational experience with SMTP suggests that the range of error
conditions is larger than can be comfortably encoded using a
three-digit string (i.e., you can report on only 10 different things
going wrong for any given part of the system). So,
[14] provides a
convenient mechanism for extending the number of values that can
occur in the second and third positions.
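The theory of reply codes can be sketched as a simple decoder (the outcome names and the syntax-analyzer example follow SMTP's conventions; the function name is ours):

```python
# A sketch of decoding a three-digit reply code, digit by digit.

def decode_reply(code: str):
    """Map each digit of a reply code to its significance."""
    outcome = {"2": "success", "3": "intermediate success",
               "4": "transient failure", "5": "permanent failure"}
    return {
        "outcome": outcome.get(code[0], "unknown"),
        "subsystem": code[1],   # e.g., '0' is the syntax analyzer in SMTP
        "detail": code[2],      # identifies the actual situation
    }
```

The single-digit `detail` field is precisely where the "only 10 different things" limit bites, and why the extension mechanism of [14] is needed.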
Virtually all of the application protocols we've discussed thus far
use the three-digit reply codes, although there is less coordination
between the designers of different application protocols than most
would care to admit.
Finally, in addition to conveying a reply code, most protocols also
send a textual diagnostic suitable for human, not machine,
consumption. (More accurately, the textual diagnostic is suitable
for people who can read a widely-used variant of the English
language.) Since reply codes reflect both positive and negative
outcomes, there have been some innovative uses made of the text
accompanying positive responses, e.g., prayer wheels.
Few application protocols today allow independent parallel exchanges
over the same connection. In fact, the more widely-implemented
approach is to allow pipelining, e.g., command pipelining[15] in
SMTP or persistent connections in HTTP 1.1. Pipelining allows a
client to make multiple requests of a server, but requires the
requests to be processed serially. (Note that a protocol needs to
explicitly provide support for pipelining, since, without explicit
guidance, many implementors produce systems that don't handle
pipelining properly; typically, an error in a request causes
subsequent requests in the pipeline to be discarded).
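The essence of pipelining is writing every request before reading any reply, then consuming the replies in order. A sketch, assuming a socket-like object with `sendall`/`recv` (the helper names are hypothetical):

```python
# A sketch of command pipelining: all requests go out back-to-back,
# then one reply is read per request, in order.

def pipeline(sock, commands):
    """Send every command without waiting, then collect replies serially."""
    for cmd in commands:
        sock.sendall(cmd + b"\r\n")      # no round-trip wait between sends
    replies = []
    for _ in commands:
        replies.append(read_line(sock))  # server answers in request order
    return replies

def read_line(sock):
    buf = b""
    while not buf.endswith(b"\r\n"):
        buf += sock.recv(1)
    return buf
```

Note that the reply-matching relies entirely on ordering, which is why an error mid-pipeline typically forces the remaining requests to be discarded.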
Pipelining is a powerful method for reducing network latency. For
example, without persistent connections, HTTP's framing mechanism is
really closer to connection-blasting than octet-counting, and it
enjoys the same latency and efficiency problems.
In addition to reducing network latency (the pipelining effect),
parallelism also reduces server latency by allowing multiple requests
to be processed by multi-threaded implementations. Note that if you
allow any form of asynchronous exchange, then support for
parallelism is also required, because exchanges aren't necessarily
occurring under the synchronous direction of a single peer.
Unfortunately, when you allow parallelism, you also need a flow
control mechanism to avoid starvation and deadlock. Otherwise, a
single set of exchanges can monopolize the bandwidth provided by the
transport layer. Further, if the peer is resource-starved, then it
may not have enough buffers to receive a message and deadlock
results.
The flow control mechanism used by TCP is based on sequence numbers
and a sliding window: each receiver manages a sliding window that
indicates the number of data octets that may be transmitted before
receiving further permission. However, it's now time for the third
shoe of multiplexing to drop: segmentation. If you do flow control
then you also need a segmentation mechanism to fragment messages
into smaller pieces before sending and then re-assemble them as
they're received.
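The segmentation step can be sketched as splitting a message into fragments no larger than the peer's advertised window, each tagged with a continuation flag so the receiver knows when re-assembly is complete (the representation is ours, for illustration):

```python
# A sketch of segmentation: fragment a message to fit the current
# window, marking all but the last fragment as "more to come".

def segment(message: bytes, window: int):
    """Yield (more_to_come, fragment) pairs sized to the window."""
    for start in range(0, len(message), window):
        fragment = message[start:start + window]
        more = start + window < len(message)
        yield (more, fragment)
```

The receiver simply concatenates fragments until it sees one with the flag clear.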
All three of the multiplexing issues (parallelism, flow control, and
segmentation) have an impact on how the protocol does framing. Earlier,
we defined framing as "how to tell the beginning and end of each
message"; in addition, we need to be able to identify independent
messages, send messages only when flow control allows us to, and
segment them if they're larger than the available window (or too
large for comfort).
Perhaps for historical (or hysterical) reasons, most application
protocols don't do authentication. That is, they don't authenticate
the identity of the peers on the connection or the authenticity of
the messages being exchanged. Or, if authentication is done, it is
domain-specific for each protocol. For example, FTP and HTTP use
entirely different models and mechanisms for authenticating the
initiator of a connection. (Independent of mainstream HTTP, there is
a little-used variant[16] that authenticates the messages it
exchanges.)
A few years ago,
SASL[17]
(the Simple Authentication and Security
Layer) was developed to provide a framework for authenticating
protocol peers. SASL lets you describe how an authentication
mechanism works, e.g., an OTP[18] (One-Time Password) exchange.
It's then up to each protocol designer to specify how SASL exchanges are
conveyed by the protocol. For example,
[19] explains how SASL works with SMTP.
A notable exception to the SASL bandwagon is HTTP, which defines
its own
authentication mechanisms[20]. There is little reason why
SASL couldn't be introduced to HTTP, although to avoid
race-conditions with the use of OTP, the persistent connection
mechanism of HTTP 1.1 must be used.
SASL has an interesting feature in that in addition to explicit
protocol exchanges to authenticate identity, it can also use
implicit information provided from the layer below. For example, if
the connection is running over IPsec[21],
then the credentials of
each peer are known and verified when the TCP connection is
established.
HTTP is the first widely used protocol to make use of transport
security to encrypt the data sent on the connection. The current
version of this mechanism,
TLS[22],
is also available for SMTP and
other application protocols such as
ACAP[23]
(the Application Configuration Access Protocol).
The key difference between the original mechanism and TLS is one of
provisioning. In the initial approach, a world-wide web server would
listen on two ports, one for plaintext traffic and the other for
secured traffic; in contrast, a server implementing an application
protocol that is TLS-enabled listens on a single port for plaintext
traffic; once a connection is established, the use of TLS is
negotiated by the peers.
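The single-port negotiation works roughly as follows: the client asks, in the plaintext protocol, to upgrade the existing connection, and only wraps it in TLS if the server agrees. A sketch using Python's `ssl` module (the "STARTTLS" command name and "220" acceptance code follow SMTP's convention; other protocols use their own):

```python
# A sketch of in-band TLS negotiation on a single port, in contrast
# to running a separate, always-encrypted port.
import ssl

def start_tls(sock, hostname):
    """Ask the server to switch the existing connection to TLS."""
    sock.sendall(b"STARTTLS\r\n")
    reply = sock.recv(512)
    if not reply.startswith(b"220"):   # server declined the upgrade
        raise RuntimeError("TLS not negotiated")
    context = ssl.create_default_context()
    return context.wrap_socket(sock, server_hostname=hostname)
```

The appeal is provisioning: one well-known port serves both secured and unsecured clients.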
Let's briefly compare the properties of the three main
connection-oriented application protocols in use today:
Mechanism             SMTP         FTP          HTTP
Framing               Stuffing     Blasting     Counting
Encoding              822          Binary       MIME
Error Reporting       3-digit      3-digit      3-digit
Multiplexing          pipelining   no           pipelining
User Authentication   SASL         user/pass    user/pass
Transport Security    TLS          no           TLS (nee SSL)
Note that the username/password mechanisms used by FTP and HTTP are
entirely different with one exception: both can be termed a
"username/password" mechanism.
These three choices are broadly representative: as more protocols
are considered, the patterns are reinforced. For example,
POP[24] uses octet-stuffing,
but IMAP[25] uses octet-counting,
and so on.
Copyright © 1999, 2000 media.org.