Wednesday, January 17, 2007

Traffic Contracts

Asynchronous Transfer Mode (ATM) associates a traffic contract with every individual cell stream or virtual circuit. A traffic contract is a formal characterization of the desired emission behavior (if you are traffic shaping) or expected arrival behavior (if you are traffic policing) of an event stream. ATM isn't the only technology to use traffic contracts as one of the mechanisms to provide quality of service (QOS); Frame Relay comes to mind, and IP is also slowly providing similar capabilities, particularly for QOS sensitive applications like Voice Over IP (VOIP). But ATM is the one with which I am the most familiar, and its relatively simple traffic contracts are easily adaptable to other applications, from embedded systems to enterprise Service Oriented Architectures (SOA).

ATM offers several classes of service, but the three that I dealt with exclusively in my work were unspecified bit rate (UBR), constant bit rate (CBR), and variable bit rate (VBR). Each class of service has its own particular form of traffic contract.

The UBR contract is really no contract at all. The event stream makes no promises about how it will behave, and the network makes no promises that any of your events will get to their destination. If the network gets congested, the events in the UBR streams are the first to go. UBR event streams get no respect, but they are also the cheapest to lease from the network service provider, since everything is strictly best effort. Most IP traffic tunnelled over an ATM network is sent UBR. UBR is effectively what you are using if you are not policing or shaping traffic at all in your application.

The CBR contract is the simplest real traffic contract. The event stream guarantees that it will never send events above a specified rate, more or less, and the network guarantees that it will not drop the events in the stream as long as the stream abides by its contract. The problem with CBR contracts is that they are expensive. In order to meet the traffic contract, the network has to reserve the peak bandwidth specified in the traffic contract at all times. Customers paying for CBR service pay for the full bandwidth, 24x7, whether they use it or not. The only applications that really make sense for CBR service are highly time-synchronized event streams of a constant rate, like uncompressed voice, which is exactly what we used it for in the applications on which I worked. Voice encodings like G.711 send an eight-bit voice sample eight thousand times a second whether or not you are speaking. In ATM, because CBR traffic is often very jitter-sensitive, it is typically transmitted through the network at a very high priority, something like a Presidential motorcade with police escort running all the red lights.
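To make the arithmetic concrete, here is a minimal sketch in Java of what a single G.711 channel implies for a CBR contract, assuming one octet per sample and AAL1 framing with 47 payload octets per cell; the framing assumption and the class name are mine for illustration.

// Rough arithmetic for a single G.711 voice channel carried over ATM CBR,
// assuming one octet per sample and AAL1 framing with 47 payload octets
// per cell; these framing assumptions are illustrative.
public class G711Arithmetic {
    public static void main(String[] args) {
        final double samplesPerSecond = 8000.0;   // G.711 sample rate
        final double octetsPerSample = 1.0;       // eight-bit samples
        final double payloadOctetsPerCell = 47.0; // AAL1 payload (assumed)
        double octetsPerSecond = samplesPerSecond * octetsPerSample;
        double cellsPerSecond = octetsPerSecond / payloadOctetsPerCell;
        // Prints a PCR of roughly 170 cells per second for one channel.
        System.out.printf("PCR is about %.0f cells per second%n", cellsPerSecond);
    }
}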

Somewhere in between UBR and CBR is VBR. VBR is a more complex traffic contract in which the event stream guarantees that it will never send events above a specified peak rate, that on average it will send at a typically much lower sustained rate, and that if it does send at the higher rate, it will limit the number of events it sends before dropping back to the lower rate. VBR event streams get higher priority than UBR streams, cost less than CBR streams, and are allowed to occasionally burst at a higher rate when the situation demands it. In the application I worked on, we sent control messages through VBR streams. The control messages got priority over any UBR (typically IP) traffic, yet were guaranteed not to interfere with the CBR voice traffic. If some heavy-duty messaging was required, the VBR stream could briefly emit bursts of messages at a higher rate to clear its buffers. A typical application for the VBR class of service is compressed video.

The parameters of both the CBR and VBR traffic contracts are mapped into the parameters increment (i) and limit (l) of the Generic Cell Rate Algorithm (GCRA), which I described in such detail in my article by the same name that the reader's eyes will probably bleed. The literature uses the notation

GCRA(i,l)

when defining this mapping.
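Before we get to the mapping, here is a minimal sketch in Java of what a GCRA(i,l) throttle looks like, using the virtual scheduling formulation; the class and method names are mine for illustration, not the interface of any particular library.

// A minimal sketch of GCRA(i, l) in the virtual scheduling formulation:
// an event conforms if it does not arrive more than "limit" ticks before
// its theoretical (expected) arrival time. Illustrative names only.
public class Gcra {

    private final long increment; // i: expected inter-arrival time in ticks
    private final long limit;     // l: how much earlier than expected an event may arrive
    private long expected = 0;    // theoretical arrival time of the next event

    public Gcra(long increment, long limit) {
        this.increment = increment;
        this.limit = limit;
    }

    // True if an event arriving at time "now" (in ticks) would conform.
    public boolean conforms(long now) {
        return now >= (expected - limit);
    }

    // Record a conforming event and schedule the next expected arrival.
    public void commit(long now) {
        expected = Math.max(now, expected) + increment;
    }

    // Convenience: admit the event if it conforms, otherwise reject it
    // and leave the throttle state unchanged.
    public boolean admit(long now) {
        if (!conforms(now)) {
            return false;
        }
        commit(now);
        return true;
    }
}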

The traffic contract associated with the CBR class of service has two parameters: the peak cell rate (PCR) in cells per second, and the cell delay variation tolerance (CDVT) in microseconds. Of course, when you adapt the CBR traffic contract to your own application, you'll use whatever units you find convenient. The PCR places an upper bound on the bandwidth used by the event stream. The CDVT provides some slack in the contract to accommodate jitter.

The CBR traffic contract is simply mapped by computing the increment as

1/PCR

because the inverse of the rate is the inter-arrival time, which is what the increment is, and using the CDVT directly as the limit. This is represented as

GCRA(1/PCR, CDVT)

in our notation.

Of course we don't literally compute the reciprocal of the PCR, because that gives us a result in units of a fraction of a second. Instead, we divide the PCR into the number of ticks per second of whatever time unit we are using. If we are using microseconds in our implementation of the GCRA, then we compute

1 000 000/PCR

to get the increment in microseconds. The choice of time unit for a clock tick, or equivalently the choice of frequency, of our GCRA implementation depends on several factors, for example, the dynamic range of the data type we are using to store time durations in ticks, the maximum rates we need to support, the granularity of the underlying clock in our application, and the maximum time intervals we need to compute. I have implemented GCRA and other rate control algorithms using milliseconds, microseconds, nanoseconds, and CPU clock ticks, and used both 32-bit and 64-bit arithmetic. There is no right answer for all applications, and frequently the correct choice is a compromise that may limit the rate control algorithm in terms of its precision, accuracy, or performance.
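As a sketch, using microsecond ticks and the illustrative Gcra class above (again, my names, not any particular library interface), the CBR mapping might look like this.

// Map a CBR contract (PCR in cells per second, CDVT in microseconds)
// onto GCRA(1000000/PCR, CDVT) using microsecond ticks and the
// illustrative Gcra sketch above.
public class CbrExample {

    public static Gcra fromCbr(long pcrCellsPerSecond, long cdvtMicroseconds) {
        long incrementMicroseconds = 1000000L / pcrCellsPerSecond;
        return new Gcra(incrementMicroseconds, cdvtMicroseconds);
    }

    public static void main(String[] args) {
        // A PCR of 4000 cells per second and a CDVT of 100 microseconds
        // yields GCRA(250, 100) in microsecond ticks.
        Gcra cbr = fromCbr(4000, 100);
        System.out.println(cbr.admit(0));   // true: the first cell conforms
        System.out.println(cbr.admit(200)); // true: 50 microseconds early, within the CDVT
        System.out.println(cbr.admit(300)); // false: 200 microseconds early, exceeds the CDVT
        System.out.println(cbr.admit(450)); // true: conforms once enough time has passed
    }
}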

The traffic contract associated with the VBR class of service has four parameters: the same PCR and CDVT as for CBR, plus the sustained cell rate (SCR) in cells per second, and the maximum burst size (MBS) in cells. The SCR is the long term average rate of the event stream. The MBS is the largest number of events the event stream can burst at PCR. The VBR contract uses two GCRAs, one that enforces the PCR and the CDVT, and one that enforces the SCR and the MBS. The VBR event stream must conform to both contracts simultaneously.

The first GCRA

GCRA(1/PCR, CDVT)

is the same as the contract used for the CBR class of service. The increment for the second contract is easy: it's the reciprocal of the SCR, or

1/SCR

just as with the PCR. The event stream isn't restricted to just emitting at either PCR or SCR. It can emit events at any rate from zero to PCR. This is accommodated by computing a burst tolerance (BT). The BT is computed as

(MBS - 1) * ((1/SCR) - (1/PCR))

where

((1/SCR) - (1/PCR))

is the difference in inter-arrival times between events emitted at SCR and events emitted at PCR, and

(MBS - 1)

is the number of intervals between MBS cells. (There's actually a more complicated explanation for this formula that has to do with the fact that BT is not computed from MBS but rather vice versa, but let's stick with this simpler version for now.) The VBR contract is represented as
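As a sketch, still using microsecond ticks and illustrative names of my own, the burst tolerance falls directly out of the contract parameters.

// Compute the burst tolerance (BT) in microsecond ticks from the VBR
// contract parameters, following BT = (MBS - 1) * ((1/SCR) - (1/PCR)).
public class BurstToleranceExample {

    public static long burstToleranceMicroseconds(long pcr, long scr, long mbs) {
        long sustainedIncrement = 1000000L / scr; // 1/SCR in microseconds
        long peakIncrement = 1000000L / pcr;      // 1/PCR in microseconds
        return (mbs - 1) * (sustainedIncrement - peakIncrement);
    }

    public static void main(String[] args) {
        // PCR = 4000 cells/s, SCR = 1000 cells/s, MBS = 10 cells:
        // BT = 9 * (1000 - 250) = 6750 microseconds.
        System.out.println(burstToleranceMicroseconds(4000, 1000, 10));
        // When SCR equals PCR the difference is zero and BT degenerates to 0.
        System.out.println(burstToleranceMicroseconds(4000, 4000, 10));
    }
}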

GCRA(1/SCR, CDVT + BT)

or equivalently

GCRA(1/SCR, CDVT + ((MBS-1) * ((1/SCR) - (1/PCR))))

where the quantity

CDVT + BT

is the limit. Why do we add CDVT? Because it is perfectly valid for SCR to equal PCR. (I remember vividly having to explain this at length to an engineer for another ATM switch vendor, and why this wasn't equivalent to the CBR class of service in typical ATM chip sets. I may have climbed on my soapbox and waved the TM 4.0 spec around. I apologize if you are reading this.) If this is the case, then unless we add CDVT, the second GCRA becomes

GCRA(1/SCR, 0)

or, because PCR==SCR, equivalently

GCRA(1/PCR, 0)

which is a more restrictive traffic contract than the first GCRA, which is clearly not our intent. What we really want is for the PCR==SCR case to degenerate to

GCRA(1/PCR, CDVT)

and adding the CDVT to the BT accomplishes this. (I didn't find the TM 4.0 specification to be terribly clear on this. It wasn't until I was debugging my traffic shaping firmware with a broadband network analyzer that cost more than I made in a year that I figured this out.)
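Putting the pieces together, a VBR contract can be sketched as a pair of the illustrative throttles above, where a cell is admitted only if it conforms to both; this is my own illustration, not the interface of the classes described below. Note that when SCR equals PCR, the burst tolerance computes to zero and the second throttle degenerates to GCRA(1/PCR, CDVT), as intended.

// A sketch of a VBR contract as two GCRA instances: one enforcing PCR
// against CDVT, the other enforcing SCR against CDVT plus BT. A cell
// is admitted only if it conforms to both. Illustrative only.
public class VbrExample {

    private final Gcra peak;      // GCRA(1/PCR, CDVT)
    private final Gcra sustained; // GCRA(1/SCR, CDVT + BT)

    public VbrExample(long pcr, long cdvt, long scr, long mbs) {
        long peakIncrement = 1000000L / pcr;
        long sustainedIncrement = 1000000L / scr;
        long burstTolerance = (mbs - 1) * (sustainedIncrement - peakIncrement);
        this.peak = new Gcra(peakIncrement, cdvt);
        this.sustained = new Gcra(sustainedIncrement, cdvt + burstTolerance);
    }

    // Both throttles must conform before either one is committed, so a
    // rejected cell leaves the state of both throttles unchanged.
    public boolean admit(long nowMicroseconds) {
        if (peak.conforms(nowMicroseconds) && sustained.conforms(nowMicroseconds)) {
            peak.commit(nowMicroseconds);
            sustained.commit(nowMicroseconds);
            return true;
        }
        return false;
    }
}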

The Digital Aggregates Buckaroo open source library of Java classes implements the GCRA as a Throttle in the class GenericCellRateAlgorithm with a tick of one microsecond, and that class can be used directly if you care to figure out your own increment and limit parameters. The class CellRateThrottle uses one or two GenericCellRateAlgorithm objects to provide either a CBR or a VBR traffic contract, depending on what constructor you use. These classes are useful for rate controlling in terms of events per second, for example messages per second, packets per second, and the like.

But if you want to emit variable length packets and rate control in terms of bytes per second, calling the methods in these classes for every byte in a packet is not efficient. The classes BandwidthAlgorithm and BandwidthThrottle extend the prior classes by allowing you to specify a traffic contract in terms of bytes per second, using a tick of one nanosecond, and indicate how many bytes you are going to emit in each packet. Because you are presumably handing your packet to some underlying operating system, or at least device driver, to emit at its own rate, these Throttles exhibit burstier behavior than their more ATM-centric counterparts, but the long term effective emission rate is correct.
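The idea can be sketched by letting a single commit cover an entire packet: the expected time advances by the packet length times the per-byte increment, rather than the throttle being called once per byte. This is only my illustration of the approach, not the BandwidthThrottle interface itself.

// A sketch of per-packet rate control in bytes per second using
// nanosecond ticks: admitting a packet of n bytes advances the
// expected time by n per-byte increments, instead of calling the
// throttle once per byte. Illustrative only.
public class BandwidthSketch {

    private final long incrementPerByte; // nanoseconds per byte (truncated)
    private final long limit;            // tolerance in nanoseconds
    private long expected = 0;           // expected time of the next byte

    public BandwidthSketch(long bytesPerSecond, long limitNanoseconds) {
        this.incrementPerByte = 1000000000L / bytesPerSecond;
        this.limit = limitNanoseconds;
    }

    // Admit an entire packet of the given length if its first byte conforms.
    public boolean admit(long nowNanoseconds, int packetBytes) {
        if (nowNanoseconds < (expected - limit)) {
            return false; // too early: the packet does not conform
        }
        expected = Math.max(nowNanoseconds, expected)
                + ((long) packetBytes * incrementPerByte);
        return true;
    }
}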

The Digital Aggregates Desperado open source library of C++ classes has similar tools, but the Java versions represent a couple more years of pondering on my part. I consider the C++ classes to be deprecated, and will sooner or later port the Java implementations into the C++ library.

I believe that rate control becomes more and more crucial the bigger, more complex, and the more distributed your real-time system becomes. Even if you are implementing, say, an enterprise service bus (ESB) in a single JVM, applying traffic contracts to the various components that may interact with one another is an important step in delivering a scalable, reliable, and predictable system that can be debugged in the lab, troubleshot at a customer site, and understood by others. It is even more important if you are building a true distributed system where many traffic sources may be aggregated into a single traffic sink.

Sources

Buckaroo, Digital Aggregates Corp., 2007

Desperado, Digital Aggregates Corp., 2007

Chip Overclock, "The Generic Cell Rate Algorithm", January 2007

Chip Overclock, "Rate Control Using Throttles", January 2007

Chip Overclock, "Traffic Management", December 2006

N. Giroux, et al., Traffic Management 4.0, tm-0056.000, ATM Forum, April 1996, pp. 62-67

J. L. Sloan, "ATM Traffic Management", Digital Aggregates Corp., August 2005

1 comment:

Chip Overclock said...

In this article I mention that the Java Throttles available in Buckaroo represent a couple of years more thinking on my part over their C++ counterparts available in Desperado. The Desperado "imperial" release, circa 2007/02/01, has updated Throttles more similar in both function and interface to the Buckaroo Throttles.