## 2014-05-10

### Crumbs of coroutines

Playing with handmade lexical scanners and parsers you soon discover how cool it would be if you could use coroutines, but unfortunately language like C and C++ haven't such a feature, nor they have a general gear to manage continuations — even though setjmp/longjmp can be thought as what you need to begin, but they maybe do not bring you to the end, not always at least.

I wanted to program this example (by Simon Tatham). There, you can see a “function” which decodes a stream compressed with a sort of run-length encoding, and a “function” which reads a stream and builds two kind of tokens, WORD (a sequence of uppercase or lowercase letters) and PUNCTUATION (everything but lowercase or uppercase letters). I made it first in 68k assembly, since it's still the assembly I know better, since I dislike x86, and since I can test it on my old AmigaOS programming environment (emulated with e-UAE).

The core idea is in the following snippet:

 move.l a4,-(sp)
lea    (4,pc),a4   ; the idea, GenAm disagrees with
rts

which jumps to the code pointed by a4, putting in the same a4 the address of the first instruction following rts. This means that a4 contains the code of what to execute next — sort of continuation, with only one information, namely the address to where to resume execution, but extremely simplified: the correct behaviour of the two routines is ensured by the “correct” usage of processor registers, as if they were static local variables.

It is up to the compiler (me…) to fill properly a4 the first time one of the coroutines is activated. Since the decompression code emits when it has a character ready, and parser code reads a character whenever it needs one, “activation” is symmetric. In fact, you can enter the consumer first, telling the next code is the producer, or you can enter the producer first, telling the next code to be executed is the consumer — you tell this by properly loading a4.

The following gist (better look than the embedding) shows the complete code you could assembly on Amiga (running on a 68k, real or emulated). The routines are getc and parse, where getc decompresses the stream and “emits” the bytes, and parse tokenizes the incoming stream. Bugs apart, it worked… They both expect a4 to be properly loaded with the address of where to continue. (skip the code)

Instead of lea (4,pc),a4 I had to write the instruction coded by hand as two 16bit words, since the assembler refused to put a 16bit offset not computed by it. The word 49FA encodes the lea instruction, the addressing mode “relative to pc, 16bit displacement” as source, and the a4 register as destination, then the displacement follows, and it's enough to load into a4 the address of the instruction following the rts.

There's some AmigaOS specific code (it opens the dos.library to call I/O functions), but the part you maybe are interested in, should be easily identificable. Each time you see

 move.l a4,-(sp)
dc.w $49FA,$0004
rts

it jumps to where it left the previous code. Do not get confused by this false rts and the rts that allows the code to be back to the “main” — theoretically from both routines.

First activation from the “main” is symmetric (passing the correct “continuation” and calling one of the routines), as said:

 lea    (getc,pc),a4
bsr    parse

works as it would work

 lea    (parse,pc),a4
bsr    getc

If we change the decompressing code (e.g. because we implement something able to decode a gzip stream), we could start the game like this:

 lea   (getc_gzip,pc),a4
bsr   parse

Nothing special, but worth saying.

The trick used to jump to the address a4 loading in the same a4 the address where to go next, is not needed in other assembly languages. For instance, MMIX has a GO instruction perfect for coroutines — and in PowerPC assembly likely the link register and instructions using/manipulating it could be helpful.

Next, I will try something in MMIX, then in x86 (or viceversa).