I have already skimmed the topic with two posts:
The promised next post was supposed to tackle the same matter using D. E. Knuth's MMIX, mainly because of its GO and PUSHGO instructions.
#! /usr/bin/perl
use strict;
while (<>) {
    if (/##(\d{8})##/) {
        # ask date(1) for the Italian long form of the YYYYMMDD date
        my $r = `LC_TIME="it_IT.utf8" date -d$1 +"%A %d %B %Y"`;
        chomp $r;    # drop the trailing newline printed by date
        s/##\d{8}##/$r/;
    }
    print $_;
}

In the generated HTML, the sequence ## marks the date, extracted as YYYYMMDD (by taking the proper substring). I had to set LC_TIME because I normally set my locale to en_GB.utf8 (I try to keep my system consistent about the language and avoid the mixture you get when locale-aware and locale-unaware software run side by side), but I needed Italian names for weekdays and months.
Following the very same idea as the previous post, I've implemented the same thing for x86. Don't worry about the details (I am no x86 fan), except that there are a few checks I had not done in the m68k version, which assumed the compressed stream is not corrupted. But it's just noise, not worth considering.
Intel x86 assembly instructions suck, but I admit I don't know them very well: likely there is some cool feature I haven't used, one that would change my mind once I learn it. Rants end.
Since x86 does not have many registers, since I've used the C library (assembled with nasm, linked with gcc/ld) so the x86 calling conventions apply, and since I wanted to avoid special-purpose registers (ECX, ESI, EDI…) as “global” variable storage, there are extra push and pop instructions to preserve values across calls to library functions, while each coroutine also assumes that the registers it's interested in are not trashed.
The register EBP can be used for “shared” (or global) storage; in fact, I've used it to store the pointer to the token buffer, and the “continuation” address.
The yield/resume feature is done with this code (kept in a macro):
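mov ebx,[ebp]            ; EBX = address where the other coroutine resumes
mov dword [ebp],$+9      ; save our resume point: 7 bytes (this mov) + 2 (jmp ebx)
jmp ebx                  ; switch to the other coroutine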
First the next address is put into EBX, then it's replaced by the address of the instruction following the jump, then the jump to the address in EBX is performed.
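The same swap-and-jump can be sketched in C. This is only an illustration, not the code from the gist: it relies on the GNU C "labels as values" extension (gcc and clang), keeps both coroutines inside one function because a computed goto cannot leave it, and the names `YIELD`, `pingpong`, `slot` and `ch` are made up for the example. The local `slot` plays the role of `[ebp]`.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Paste tokens after expansion so that __LINE__ gives each yield a unique label. */
#define CAT2(a, b) a##b
#define CAT(a, b) CAT2(a, b)

/* Analogue of the three instructions: fetch the other side's resume point,
   store our own (the statement right after the jump), then jump. */
#define YIELD(slot)                          \
    do {                                     \
        void *next_ = (slot);                \
        (slot) = &&CAT(resume_, __LINE__);   \
        goto *next_;                         \
        CAT(resume_, __LINE__): ;            \
    } while (0)

/* Hypothetical demo: copy src to dst in upper case, truncating to cap - 1
   characters. Returns the number of characters stored. */
size_t pingpong(const char *src, char *dst, size_t cap)
{
    void *slot = &&writer;  /* plays the role of [ebp]: where the other side resumes */
    size_t n = 0;
    int ch = 0;             /* value handed from reader to writer; 0 = end of input */

    assert(cap > 0);

    /* reader coroutine: runs first, hands over one character per yield */
    for (; *src != '\0'; src++) {
        ch = (unsigned char)*src;
        YIELD(slot);
    }
    ch = 0;
    YIELD(slot);            /* let the writer terminate the string */
    return n;

writer:                     /* writer coroutine: entered via the initial slot value */
    for (;;) {
        if (ch == 0)
            dst[n] = '\0';
        else if (n + 1 < cap)
            dst[n++] = (char)toupper(ch);
        YIELD(slot);
    }
}
```

Calling `pingpong("ab", buf, sizeof buf)` bounces between the two loops once per character. Each `YIELD` must sit on its own source line, since the label name is derived from `__LINE__`.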
That's all. Readers interested in the whole code can find it at this gist, but I doubt it's worth it. It'd be far more interesting to study an implementation that could be used for real, e.g. as the result of compiling high-level language code.
Different calling conventions can make it easier, but then you need extra code to call external functions; sticking to the common C calling convention on a system is the key to accessing a lot of code with (almost) no glue of any kind. Rants end, again; guess when they began.
This may work fine for two coroutines. Let's reason about a third coroutine. Does it still work? No: the single slot holds only one resume address at a time. If you need to set up another cooperation, e.g. between the parser which extracts tokens (really a “lexical scanner”) and a grammar parser (i.e. a proper parser), you're fucked.
E.g. our parser, at some point, needs to give control to another routine instead of got_token, namely the one which understands the grammar. Thus, for each coroutine pair we need a “slot” similar to the one in [ebp]. A hypothetical JCONT macro would be more complex, and would take into account at least the coroutine we want to give control to. E.g.
parse:
JCONT parse,getc
test eax,eax
...
.wend:
mov eax,TWORD
; the grammar_parser'd like to have ptr to buffer too
; ... but this could be a global, as it is
JCONT parse,grammar_parser
...
If there's a hashmap slot for each coroutine, then we need to initialize it first, and somewhere we will likely need a reset too. The macro could look, in pseudocode, something like
get_slot_of %2
mov ebx,[ebp]
mov dword [ebp],$+9
store_slot_for %1
jmp ebx
Just an idea, at a very late hour.
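To make the idea a bit more concrete, here is a rough C sketch with one slot per coroutine, kept in a plain array rather than a hashmap. It is my own guess at how such a JCONT could behave, not a tested assembly design: it uses the GNU C "labels as values" extension, and the ids `SRC`, `LEX`, `GRAM` and the function `count_words` are invented for the example. The lexer is the hub: it pulls characters from the source and pushes word tokens to a trivial "grammar" that just counts them.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define CAT2(a, b) a##b
#define CAT(a, b) CAT2(a, b)

enum { SRC, LEX, GRAM, NCORO };   /* hypothetical coroutine ids */

/* JCONT self,target: save our resume point in our own slot, then jump to
   wherever the target last stopped (or to its entry point the first time). */
#define JCONT(self, target)                      \
    do {                                         \
        slot[self] = &&CAT(back_, __LINE__);     \
        goto *slot[target];                      \
        CAT(back_, __LINE__): ;                  \
    } while (0)

int count_words(const char *text)
{
    /* Initialization step: every slot starts at its coroutine's entry point. */
    void *slot[NCORO] = {
        [SRC] = &&src_entry, [LEX] = &&lex_entry, [GRAM] = &&gram_entry
    };
    int ch = 0;         /* character handed from SRC to LEX; 0 = end of input */
    int token_len = 0;  /* word length handed from LEX to GRAM; 0 = end of input */
    int len = 0;        /* length of the word the lexer is collecting */
    int words = 0;

    assert(text != NULL);

lex_entry:              /* LEX: runs first */
    for (;;) {
        JCONT(LEX, SRC);                 /* ask for the next character */
        if (ch != 0 && !isspace(ch)) {
            len++;
            continue;
        }
        if (len > 0) {                   /* a word just ended: hand it over */
            token_len = len;
            len = 0;
            JCONT(LEX, GRAM);
        }
        if (ch == 0) {                   /* tell the grammar we are done */
            token_len = 0;
            JCONT(LEX, GRAM);
        }
    }

src_entry:              /* SRC: produces one character per resume */
    for (;;) {
        ch = (unsigned char)*text;
        if (ch != 0)
            text++;
        JCONT(SRC, LEX);
    }

gram_entry:             /* GRAM: counts word tokens until the end marker */
    for (;;) {
        if (token_len == 0)
            return words;
        words++;
        JCONT(GRAM, LEX);
    }
}
```

With this layout `count_words("hi  yo")` passes through all three coroutines and no resume address is overwritten by an unrelated switch, which is exactly what the single `[ebp]` slot cannot guarantee. A reset would simply mean refilling the array with the entry labels.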
Playing with handmade lexical scanners and parsers, you soon discover how cool it would be if you could use coroutines. Unfortunately, languages like C and C++ have no such feature, nor do they have a general mechanism for managing continuations; setjmp/longjmp can be thought of as what you need to begin with, but they may not take you all the way, at least not always.
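A small sketch of what setjmp/longjmp do give you: a one-shot "escape" continuation back to a frame that is still live. The names `scan`, `sum_digits` and `on_error` are made up for the illustration. What they do not give you is the other half of a coroutine, i.e. resuming `sum_digits` where it stopped once its frame has been abandoned.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

static jmp_buf on_error;   /* hypothetical name: the saved escape point */

/* Hypothetical scanner step: sums decimal digits, bails out on anything else. */
static int sum_digits(const char *s)
{
    int total = 0;
    for (; *s != '\0'; s++) {
        if (*s < '0' || *s > '9')
            longjmp(on_error, 1);   /* jump back to the still-live setjmp frame */
        total += *s - '0';
    }
    return total;
}

/* Returns the digit sum, or -1 if the input contains a non-digit. */
int scan(const char *s)
{
    assert(s != NULL);
    if (setjmp(on_error) != 0)
        return -1;                  /* we got here through longjmp */
    return sum_digits(s);
}
```

`scan("123")` returns normally through `sum_digits`, while `scan("12x")` unwinds straight to the `setjmp` call. Jumping the other way, into a function that has already returned, is undefined behaviour, which is why this is only a starting point for continuations.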