Tuesday, January 16, 2007

The Generic Cell Rate Algorithm

The Generic Cell Rate Algorithm (GCRA) is remarkably adaptable. I first saw it described in the ATM Forum Traffic Management 4.0 specification about a decade ago, when I was writing traffic shaping firmware for ATM equipment at Bell Laboratories. But substitute just about anything for the word cell and you can use the GCRA to shape or police any kind of traffic, from data packets on a socket, to messages between tasks in an embedded system, to solicitations and notifications in a Service Oriented Architecture (SOA). This is why I like to talk about the GCRA as applying more generally to rate controlling events instead of cells, packets, or messages.

TM 4.0 offers two different explanations of how the GCRA works: the leaky bucket and the virtual scheduler. The two explanations are exactly equivalent, but to this day I really do not like term "leaky bucket", because for lots of folks it seems to suggest that there is some kind of data buffering going on when there isn't (or if there is, it isn't the GCRA doing the buffering). So I prefer to use the term "virtual scheduler", even though when I implement the GCRA, I tend to use the leaky bucket flowchart and terminology from TM 4.0.

As I mentioned briefly in my article "Rate Control Using Throttles", the GCRA has but two parameters: the increment and the limit. The increment is the expected inter-arrival time (IAT) between two successive events in an event stream. If you ideally want to transmit one hundred events evenly dispersed over every second, then you are describing a 100Hz event source whose emissions have an IAT of 1/100th of a second. Your increment would be 0.01 seconds. TM 4.0 specifies a time granularity of one microsecond as a standard for ATM traffic contracts, so in that case you would represent that increment as (1 000 000 / 100) or 10 000 microseconds. Of course, you can implement the GCRA in whatever time units you find appropriate for your application: nanoseconds, microseconds, milliseconds, or even some convenient time interval based on the frequency of your CPU clock. For this reason, I like to talk about time in the GCRA in units of ticks. So given an increment of i, the GCRA expects an ideal event source to emit an event every i ticks.

The increment assumes a smooth flow of data. Life is seldom that simple. Neither software, nor firmware, nor even hardware is capable of emitting events with perfect precision (although the old TDM-based land line telephone system sure tried). And once events are handled by other components, for example, nodes on a network, additional variation is inevitably inserted. This is especially true when multiple event sources are aggregated together. The aggregator (for example, a network router) can only emit one event at a time onto the aggregated output channel. So it selects an event from one input channel, hence delaying all the pending events on all the other input channels. It may select the next event to emit based on many factors, such as the quality of service parameters of the traffic contracts of the individual event streams, service level agreements with the event stream owners (a.k.a. paying customers), or simply round-robin.

Because small variations in the expected IAT of events, also known as jitter, are inevitable, a traffic policer implemented using the GCRA must provide for some slack in its traffic contract. This slack is the second parameter, the limit. The limit is the total cumulative time that successive events in an event stream may deviate from the expected IAT. The limit has a number of applications. When the GCRA is used for traffic policing incoming event streams, the limit prevents the policer from dropping events which have been jittered but are still within the tolerance of the limit. This allows, for example, a policer to pass otherwise well behaved event streams just because they got a little rowdy once in a while ("Events will be events" as my sainted mother never said). When used for traffic shaping, the limit allows the shaper to emit events in short bursts where the IAT between a small number of successive events is smaller than the specified increment. This allows, for example, a shaper to drain a perilously full memory buffer onto the network instead of dropping events on the floor.

I should mention here that jitter is quite different from drift, which is a more or less consistent, or at least larger, variation in the IAT than is associated with jitter. A frequent source of drift is a lack of a common time synchronization source between network nodes. This means each node has a different sense of the passage of time, kind of like that episode of Star Trek. Or maybe it was Twilight Zone. Anyway, because the nodes are timed to different clocks, the event source node believes it is emitting events more or less on time, while the event sink node may believe that every single event is arriving a little early. As we shall see, a GCRA used for traffic policing events streams on the sink node has limited patience with drift. Limited in fact by the limit. The limit is a time budget, and every single early event uses up more of that budget. Once the limit is exhausted, the traffic police will start dropping events, or at least marking them for possible deletion downstream, which is a kind of Scarlet Letter in the networking world.

Although it may sound complicated, the typical implementation of the GCRA is simple and efficient. This will be a little more obvious when we see how the increment and the limit play together, and see how little state the GCRA actually has to maintain and what simple calculations it has to do. The discussion below refers to the following set of diagrams, which show various scenarios of an event stream and a GCRA with an increment i and a limit l. (Click on the thumbnail to see a full size version.) Such a GCRA is typically represented in the literature using the notation GCRA(i, l). Yeah, the variables kinda suck, but I'm trying to stick with the TM 4.0 terminology.
GCRA and Virtual Scheduler
Figure [1] shows a time line as the GCRA sees it. It expects to see events E0, E1, etc. arrive evenly spread exactly i ticks apart, where i is the increment.

Figure [2] shows just such a well behaved event stream, with the actual events A0, A1, etc. all arriving exactly on time. These are the kind of events that got beat up in school and called "mama's boys".

Figure [3] shows event A2 arriving j ticks late. A2 is always arriving late, and always has some good excuse: its car broke down, it had to drop the kids of at school, whatever. You would think the GCRA would give the event stream some credit by maybe increasing its limit by j ticks. Or maybe it would penalize event A2. But instead, when an event arrives late, the GCRA shifts the entire time line down so that the IAT of all subsequent events will be measured relative to the new time line established by the late arriving A2.

Figure [4] shows that event A2 has been somehow dropped by the network and was never seen by the GCRA. The GCRA doesn't care one whit about the fact that A2 was dropped. From its perspective, A3 is the next event, and it arrived i ticks late. So once again the time line is shifted down so that the IAT of all subsequent events will be measured relative to the new time line established by the arrival of A3.

Figure [5] shows that although A1 and A3 arrived right on schedule, A2 was jittered in its passage though the network. Because A2 arrived late, the time line is shifted down just as before. Now, even though A3 arrived right on time as far as the original time line was concerned, the GCRA sees it as arriving early from the perspective of the new time line, just k ticks after A2. So while A2 arrived j ticks late, it is A3, which arrived (i-k) ticks early, which may be penalized. The GCRA compares (i-k) to the limit l and if it is less, A3 is not dropped and the event stream goes on its merry way. However, something interesting happens to all subsequent events A4, A5 and on to infinity. Even through they are arriving spot on in terms of the original time line, they are all (i-k) ticks early from the perspective of the new time line. The limit l has been reduced by (i-k) ticks, and will remain so until an event arrives on or after its expected time on the new time line.

Because jitter is a more or less random mechanism, and it is about as likely that events in the same event stream arrive late as they do early, assuming they were emitted with the correct timing from their original event source, this works. When an event arrives early, part of the limit budget is consumed. But when a subsequent event in the same event stream arrives on time or late, the time line is shifted down and all prior sins are forgiven.

This is also why drift is a problem. If an event source is emitting events too quickly for any reason (drift being just a common one), each early event uses up part of the limit budget until it is exhausted. This can happen because a single event arrived very very early. But it can also happen gradually, as each successive event uses up a tiny fraction of the limit. This can result in network links that appear to work just fine for days, until they suddenly and mysteriously quit working altogether. This is one of the reasons that telephony networks carrying real-time voice samples are carefully time synchronized to each other and to common timing sources of very high precision, typically atomic clocks. Timing synchronization can be a big deal.

Figure [6] shows event A3 arriving early but within the limit of the GCRA. But A4 arrives equally early, and its expected arrival time E4 is beyond the limit of the GCRA. A4 is dropped, as would be all subsequent events, until one arrives at or after its expected time on the current time line. In this example, A5 arrives at its expected time E5, and the limit budget is restored.

Finally, it is instructive to show some code snippets for a GCRA implementation just to show how simple it is. Below are some Java snippets (with some distracting details removed) borrowed from the GenericCellRateAlgorithm class in the Digital Aggregates Buckaroo library. First, there is the state maintained by the GCRA. The Java class maintains this state in long variables because they contain time durations in ticks. Because ticks are typically in units of microseconds or nanoseconds, the large range of the 64-bit integer is necessary. Since most modern processors implement 64-bit integers directly, this isn't typically a problem (although I have implemented 32-bit versions of the GCRA on embedded systems).

long increment;
long limit;
long now; // time of this event
long then; // time of prior event
long x = 0; // expected ticks
long x1 = 0; // actual ticks

This next snippet computes the delay (if any) for an event when the current time is now, and the prior event was emitted then. The difference between now and then is the IAT if the current event were to be emitted.

long delay = 0;
long iat = now - then;
if (x <= iat) {
x1 = 0;
} else {
x1 = x - iat;
if (x1 > limit) {
delay = x1 - limit;
if (delay == 0) {
then = now;
x = x1 + increment;

If the delay is zero, the event conforms to the traffic contract implemented by the GCRA and may be emitted, in which case the state of the GCRA has to be updated with the expected IAT of the next event. If the delay is greater than zero, the emission of the event must be delayed (traffic shaping) or the event violates the traffic contract of its event stream and may be dropped (traffic policing).

That's it! No loops. No complicated math. No floating point.

This has been a whirlwind tour of the Generic Cell Rate Algorithm. But the GCRA is seldom used directly. One or more GCRAs are typically combined, with their increment and limit parameters computed from traffic contract parameters in more useful forms, like events per second. We will talk about how traffic contracts are mapped into one or more GCRAs in a later article.


Buckaroo, Digital Aggregates Corp., 2007

Chip Overclock, "Rate Control Using Throttles", January 2007

N. Giroux, et al., Traffic Management 4.0, tm-0056.000, ATM Forum, April 1996, pp. 62-67

J. L. Sloan, "ATM Traffic Management", Digital Aggregates Corp., August 2005


Anonymous said...

I have always felt that the rate policing mechanism you discuss is unfair. I think ATM might have done better (in terms of adoption) with a mechanism that did not inflict an unjustly severe punishment on traffic streams with minor violations. As you correctly point out, enforcing the maximum rate limit based on a single nearest-neighbor spacing places a heavy burden on the traffic shaping function in the transmitter. To continue to label subsequent cells non-conforming until the entire stream shifts to the new "beat", is punishment of biblical proportions.
Surely an algorithm that provides some credit to earlier long spacing and/or does not count discarded cells when determining the conformity of the next cell would have made the technology more acceptable for end-users.

I characterize the method as unfair since the burden seems to fall mostly on the transmitters (users), while giving wide latitude to the policer (network).


Chip Overclock said...

I wasn't privy to the discussion that resulted in the TM 4.0 spec, and given your experience in optical networking, you probably have a better clue than I do. But I'd guess the GCRA described in TM 4.0, which is the basis for both ATM shaping and policing, was designed to be easily implementable in hardware for large numbers of concurrent virtual circuits, carrying minimal state per circuit. Nearest neighbor spacing has the benefit of being very easy to implement, perfect for a chip set that has to handle tens of thousands of concurrent VCs running at line rates like OC-3 or OC-12. In this respect, I'm sure the GCRA was a compromise. It's been a while since I've looked at technologies like Frame Relay, but I expect its developers faced the same design points. In my quest for a simple yet adaptable shaping/policing algorithm, I grew to like the GCRA, although I agree with your criticisms. I pondered this a lot while writing this article and drawing up its diagrams. There were moments when I wondered how ATM worked at all! However, you comment about the burden falling mostly on the transmitters captures the flavor of ATM traffic management very well. This is probably why I spent a few late nights getting the traffic shaping firmware working correctly.

Todd H said...

Hey, your description of GCRA is great, especially the diagram! I am working on some code that implements this algorithm and was having trouble understanding how it all worked until I came across this page.

One question with regards to entry [6] in the diagram - could event A5 have showed up at time E4 and still been accepted as conforming?

The reason I ask is I was looking through the ATM Traffic Management specification you mentioned in the description and in the GCRA flowchart they provide it looks like the Theoretical Arrival Time (TAT) does not get updated when a packet is non-conforming. IE, when A4 showed up too early and is dropped the TAT is left unchanged (=E4).

So couldn't the following sequence also work : 1) A3 arrives within limit so TAT set to E4 2) A4 arrives and is non-conforming so TAT left unchanged 3) A5 arrives at time E4 and is conforming and TAT set to E5.

Todd Hayton

Chip Overclock said...

I believe your comment reveals a hole in my thinking on this topic.

I used to live and breathe TM 4.0 while writing firmware for an ATM switch and an ATM network interface. But almost all of my work with the GCRA was applied to traffic shaping, not policing. After pondering it for a bit (and perusing the spec), I think you're absolutely right. TM 4.0 indicates that throwing away a non-conforming cell has exactly the effect you describe: the TAT isn't updated, so that following cell may well conform.

Well done, and thanks for the insightful comment!

Chip Overclock said...

I'm a little embarrassed to admit to a senior moment: the software I've written in C++ (Desperado) and Java (Buckaroo) to implement the GCRA works exactly as Todd describes, but I didn't think to go look at it before replying to Todd's comment, or, apparently, before writing the article. The GCRA SW I've implemented operates in two phases, with a first "attempt" phase the indicates whether the emission conforms to the GCRA, and a second "commit" phase that updates the GCRA state IFF the application decides to emit the event to the traffic shaped stream.

Thanks again to Todd and others for keeping me honest!