Okay, cooked up a more interesting example of multi-Cog Forth.
It's not protein folding, but it's a little more flexible than what I posted before. This example will spawn Forth tasks on all eight Cogs (counting the console task on Cog 0) and show it by blinking LEDs from each task. It allocates 7 USER task areas starting from $6000, for a total memory footprint of slightly over a kilobyte.
First, some toolkit words for setting up and starting tasks:
\ Given the base address of a task's USER area, does initial setup.
\ This doesn't set the BASE or other fields you need to do console
\ I/O, and sets the stack sizes very small.
: init-user ( user-addr -- )
[noparse][[/noparse] hex ]
dup 90 + over 10 + !
70 + over 14 + ! ;
\ Handy holder for the size of a task's USER area, for this example.
: task-user-size 0D0 ;
\ Given a USER address and an XT, sets the first-word field.
: set-first-word ( user-addr xt -- )
swap 18 + ! ;
\ Given a prepared USER address, spawns a new Cog running Forth.
: start-forth-task ( user-addr -- )
[noparse][[/noparse] decimal ]
16 lshift 'kernel 2 lshift or 8 or
2 hubop ;
\ Kills every Cog but the console Cog (0).
: killall
8 1 do i cogstop loop ;
And some demo program code:
\ Blinks an LED, 0-7, depending on which Cog we're on.
: cog-blink
1 cogid lshift ledemit \ Put a 1 in our cog ID's position
500 milliseconds wait
0 ledemit
500 milliseconds wait ;
\ Blinks forever. This is separate from the above so that
\ cog-blink can be tested.
: cog-blink-forever
begin
cog-blink
again ;
\ Given the addr of a task's USER area, sets the task up
\ to run cog-blink-forever when it starts.
: make-blink-task ( user-addr -- )
dup init-user
[noparse][[/noparse]'] cog-blink-forever set-first-word ;
\ Spawns tasks on Cogs 1-7, offset by a quarter second.
\ Net effect: chasing LEDs.
: many-tasks
[noparse][[/noparse] hex ]
7 0 do
\ Compute task address
task-user-size i * 6000 +
\ Set up the task
dup init-user
dup make-blink-task
\ Spawn
start-forth-task
100 milliseconds wait
loop ;
Well my initial test code isn't pretty but here it is. I have to rush off and start the day but I will get back later. This is fun.
*Peter*
hex
: .hex ( char -- )
30 + dup
39 > if 7 + then
emit ;
: .byte ( byte -- )
FF and 10 /mod .hex .hex ;
: .word
FFFF and 100 /mod .byte .byte ;
: .chars ( adrh adrl -- )
do i c@
dup 7F > if 7F and then
dup 20 < if drop 2E then
emit
loop ;
\ dump hex bytes + ascii from main memory
: dump ( adr cnt -- )
over + swap
do i 0F and 0=
if cr i .word ." : " then
i c@ .byte space
i 1+ 0F and 0=
if space space i 1+ i 0F - .chars then
loop ;
\ dump words from cog memory
: ldump ( adr cnt -- )
over + swap
do i 1F and 0= if cr i .word ." : " then
i l@ dup 10 rshift .word .word space
4 +loop ;
: ftype ( addr cnt -- )
over + swap do i c@ dup 0= if 20 + then emit loop ;
\ list dictionary words
: WORDS
0 latest @
begin
8 -
\ 4 words/line
over 3 > if swap drop 0 swap cr then
dup .word ." : "
dup 1+ 7 ftype
space space swap 1+ swap
nfa>lfa h@ dup
while
repeat
2drop ;
Dead-on and reasonably well-factored for "quick-and-dirty" code. My only tip would be to use cfa>nfa instead of 8 -; the dictionary layout is going to change soon.
So, with a cfa>nfa nfa>lfa pair, you can get from the XT to the link field.
The next release will include .R, which will simplify your life some (by eliminating the need for .byte). I'll also go ahead and include more of my programming words (DUMP, FORGET, and possibly SEE if I can get it working right).
Thanks Cliff, ok now I'm back in front of the propeller I can cleanup that bit of code and correct some mistakes too. The WORDS formats each word so that it displays the CFA,code field,attributes,count, and of course the name.
Next bit of code I will try to do something useful....
*Peter*
\ PBJ'S q&d CLB propeller forth extensions
hex
\ convert start and cnt to form suitable for "DO"
: bounds ( start cnt -- end start )
over + swap ;
: spaces ( cnt -- )
0 do space loop ;
: .hex ( nibble -- )
30 + dup
39 > if 7 + then
emit ;
\ Print as 2 hex digits regardless of current base
: .byte ( byte -- )
FF and 10 /mod .hex .hex ;
: .word ( 16bits -- )
FFFF and 100 /mod .byte .byte ;
: .long
dup 10 rshift .word .word ;
: .chars ( adrh adrl -- )
do i c@ 7F and
dup 20 < if drop 2E then emit
loop ;
\ dump hex bytes + ascii from main memory
: dump ( adr cnt -- )
bounds
do i 10 bounds do
i 0F and 0= if cr i .word ." : " then
i c@ .byte space
loop
2 spaces i 10 + i .chars
10 +loop ;
\ dump words from cog memory
: ldump ( wadr wcnt -- )
bounds
do i 7 and 0= if cr i .word ." : " then
i l@ .long space
loop ;
: ftype ( addr cnt -- )
bounds do i c@ dup 0= if 20 + then emit loop ;
: .head ( cfa -- )
.word ." =[noparse][[/noparse]" \ print cfa
dup h@ .word ." ] "
cfa>nfa
dup c@ 80 and if ." I" else ." ." then
dup c@ 40 and if ." C" else ." ." then
dup c@ 20 and if ." S" else ." ." then
dup c@ 1F and .byte \ print count+atr
space dup 1+ 7 ftype \ print name
4 spaces ;
\ list dictionary words
: WORDS
0 latest @
begin
\ 2 words/line
over 1 and 0= if cr then
dup .head
swap 1+ swap
nfa>lfa h@ dup
while
repeat
2drop ;
It looks like your Forth is really working well. I don't know much about Forth, but it looks like the kind of thing that I'd really like -- terse and RPN (cuts to the chase, doesn't it?). Are compiler optimizations even possible in such a direct language? It looks very intriguing. I think Chuck Moore had the right idea here.
I was reading earlier today about a new chip called SEAForth24. It has 24 processors which execute·1 billion native Forth instructions each, per second. I think I was reading that it comes in a 240-pin BGA and costs under $20 at 1K units. It seems aimed at very high-volume apps. If I remember from a while back, Chuck Moore designed this chip using his own IC layout tools written in...· Color Forth! It was interesting how he used·his hierarchical language to make a hierarchical chip. He made GDSII (Gerber for chips) tiles which were like macros that he could arrange hierarchically. Outputting the database was no thought for his tools. It seems Chuck Moore is all about identifying redundancies and eliminating them.
huh, that SEAForth24 is an interesting chip.· One aspect I find especially interesting is that it uses asynchronis digital logic. (or more accuratly local/self clocking logic)· The posted power consumption is pretty dang low for a chip with·the equivelent of a ~1GHz clock!·· I wonder how well this type of clocking would adapt to propeller style shared memory interconnect?· I wonder how they get the chip to work in applications that need·precise pacing?· (didn't see anything like the Prop's CNT register)·
People are rediscovering Forth chips all over the place. I read three papers last night alone on the subject, released in the past year.
10 years of silence and they come back...what changed? FPGAs. One of the papers included the complete Verilog for a Forth core inline in the paper, in true Literate Programming style. It was very impressive and makes me want to dust off my Verilog. (FPGAs are so much more capable these days; cores will be the next source to open.)
Chuck's design tools are neat, but the next wave of chips will come from free tools, I promise you that.
Chip, as to Forth:
Forth tends to be terse, but in the hands of a capable programmer it's also one of the most readable languages out there -- because you can conform the entire language to your problem domain. Want infix, object-orientation, auto-vectorization? No sweat.
Forth optimizations are a well-explored field, and are quite straightforward. They tend to fall into a few discrete areas:
- Finding commonly repeated word sequences (swap + over, for example) and synthesizing them into a native-code primitive automatically (so-called superinstructions).
- Inlining short words to save on procedure-call overhead.
- Compiling "hot" words to native code, often on the fly.
- Stack analysis to convert the code to a register-to-register form, on architectures where that makes sense.
If you combine these three, you can compile the entire Forth system to native code, with performance rivaling a good C compiler. All three of these have been available in production systems since the 1980s, and I've used all three of these in other (non-Forth) VMs in the past. (Smalltalk, specifically, which is also a language that allows the programmer to change any aspect on-the-fly, making traditional optimizations difficult.)
On the Propeller, superinstructions are doable -- I've got nearly 200 longs of Cog RAM free in the current kernel. Native compilation is harder, since we can't execute directly from shared or external RAM. I am already working on inlining.
(I've worked with code to page short native-code sections in and out of Cog RAM, but Forth-style threaded interpreters are already very fast. The copy operation is invariably slower than executing an interpreted form -- and threaded code takes up vastly less storage space. I've discussed some potential optimizations for PropellerForth on my tech blog, and there'll be more to come.)
"On the Propeller, superinstructions are doable -- I've got nearly 200 longs of Cog RAM free in the current kernel. Native compilation is harder, since we can't execute directly from shared or external RAM. I am already working on inlining."
Actually... time for me to fess up.
Check the new thread I'm posting in a minute or two "Large memory model for Propeller Assembly language programs"
Folks, PropellerForth now supports multitasking "both ways" -- running Forth words on multiple Cogs, and cooperative multitasking on a single Cog.
Each task requires 40 bytes of shared RAM to hold some state, plus stacks (plus another 80 for tasks that talk to the terminal). Only 4 of these bytes are specific to the task switcher.
I demonstrated running Forth code on multiple Cogs earlier, and will post the cooperative task code shortly. In summary form, the system is quite similar to most RTOSes:
- Tasks are linked together into task cycles (mostly transparently to the user), which are scheduled round-robin.
- Control is passed from task to task using the traditional Forth word PAUSE. (If the standard library is made task-aware it'll also be passed at any blocking operation, like some I/O primitives.)
- The code is entirely in Forth, so task switches take several microseconds (see numbers below). This could be optimized or moved to native code, but it's well within the parameters for my application (a full-duplex serial driver).
Using a minimal test environment (total overhead of 60 bytes per task), I've successfully run 50 tasks on a single Cog. All tasks were sharing code and some global data, but had thread-local copies of necessary variables.
The test routine:
: blink-forever
begin
u0 ledemit \ writes the address of this task to the debug LEDs
pause
again ;
For a ring of 42 tasks (chosen because it's an amusing number and also fits in a buffer I'd allocated), 1,000 context switches from this routine took 87,424,528 cycles. That's 1.09s, or 1.09ms per iteration -- so 26.01us per task activation. With one task (but with task switching), it took 26ms (the same 26us per activation).
With task switching disabled, it took 16.4ms for the same 1,000 iterations, suggesting a task-switch overhead of 10us.
This code will be included in the next release of PropellerForth (which, it should be noted, will include sources!).
There's a newer release there than what I've posted in this thread, though it's a build from earlier today and lacks the in-Cog tasker.
I've also set up code hosting and issue tracking on Google Code Hosting, so folks can feel free to file bugs against me there. (The project page is linked from the homepage above.)
Future updates will be posted on the website; I find forum threads to be really bad repositories for information, so I'm going to quit updating this one unless anyone has specific questions or issues they'd like to discuss.
Cliff L. Biffle said...
Folks, PropellerForth now supports multitasking "both ways" -- running Forth words on multiple Cogs, and cooperative multitasking on a single Cog.
Each task requires 40 bytes of shared RAM to hold some state, plus stacks (plus another 80 for tasks that talk to the terminal). Only 4 of these bytes are specific to the task switcher.
I demonstrated running Forth code on multiple Cogs earlier, and will post the cooperative task code shortly. In summary form, the system is quite similar to most RTOSes:
- Tasks are linked together into task cycles (mostly transparently to the user), which are scheduled round-robin.
- Control is passed from task to task using the traditional Forth word PAUSE. (If the standard library is made task-aware it'll also be passed at any blocking operation, like some I/O primitives.)
- The code is entirely in Forth, so task switches take several microseconds (see numbers below). This could be optimized or moved to native code, but it's well within the parameters for my application (a full-duplex serial driver).
Using a minimal test environment (total overhead of 60 bytes per task), I've successfully run 50 tasks on a single Cog. All tasks were sharing code and some global data, but had thread-local copies of necessary variables.
The test routine:
: blink-forever
begin
u0 ledemit \ writes the address of this task to the debug LEDs
pause
again ;
For a ring of 42 tasks (chosen because it's an amusing number and also fits in a buffer I'd allocated), 1,000 context switches from this routine took 87,424,528 cycles. That's 1.09s, or 1.09ms per iteration -- so 26.01us per task activation. With one task (but with task switching), it took 26ms (the same 26us per activation).
With task switching disabled, it took 16.4ms for the same 1,000 iterations, suggesting a task-switch overhead of 10us.
This code will be included in the next release of PropellerForth (which, it should be noted, will include sources!).
I've managed to run 256 threads on a single core. All were running a really simple cycle-burner so I could measure their performance.
For the folks who'd rather not read through a page of Forth code, here are the results. Sure wish mCode supported tables.
Benchmark Times (all in cycles)
Single Task:
Tasker disabled, one task: 1,536,528
Tasker enabled, one task: 2,304,528
Having the tasker enabled imposes roughly a 768 cycle penalty for each call to PAUSE, even with only one task. That's what I get for writing the engine in an afternoon; I'll optimize it later. (For those playing along at home, that's 9.6us.)
Performance scales almost perfectly linearly with the number of tasks -- that is, doubling the number of tasks halves your speed. The scheduler is deterministic O(n) for the number of tasks, by design -- though this may change when I get the thread prioritization code working.
The system has no hard limit to the number of tasks; it's bounded only by available RAM, and at 256 tasks, I had -4 bytes free. (I allowed one task to reuse some of my interpreter's thread-locals. It's cheating, but it worked.) I am deliberately overallocating, giving each task 96 (decimal) bytes.
Astute readers have noticed that every time ends in 528. I think it's weird too. You can try to replicate it with the sources below.
The Code
This will be built into the next PropellerForth release, but interested parties can try it now. Paste the code into your terminal emulator, or put it in a text file and send it as ASCII.
Note that this is not exactly the code I ran for my benchmarks above. It's slightly better factored, and runs a bit slower. Readers can match (or beat) my numbers by inlining the task-id display code.
First, the multitasking engine:
hex \ the number base of champions
\ Given a pointer to the base of a task's USER area,
\ sets up a pretty generic, spacious task.
\ Note: you cannot PAUSE directly to this task, you
\ must first activate it with start-task
: create-task ( user-addr -- user-addr )
10 over ! \ Set the number base for I/O...
dup 7C + over 10 + ! \ ...the stack pointer
dup 17C + over 14 + ! \ ...the return pointer
next-task @ over 24 + ! \ ...the next task in the cycle
dup next-task ! ; \ ...and set it to our next task
\ Given a pointer to the base of a task's USER area,
\ sets up a dinky task suitable only for trivial things.
\ Note: you cannot PAUSE directly to this task, you
\ must first activate it with start-task
: create-small-task ( user-addr -- user-addr )
10 over !
dup 28 + over 10 + !
dup 48 + over 14 + !
next-task @ over 24 + !
dup next-task ! ;
\ Suspends the current task, passing control to the
\ next (which may be the same task)
: pause
rsp@ \ stash our return-stack pointer on the param stack.
sp@ saved-sp ! \ store our stack pointer in a USER thread-local
next-task @ usp! \ activate the next task
saved-sp @ sp! \ restore our stack pointer (we're in the next task)
rsp! ; \ restore our return stack pointer and proceed
\ Given a pointer to a USER area (prepared with
\ create-task or create-small-task) and the XT of
\ a word to run, starts a new task, immediately
\ giving it control.
: start-task ( user-addr xt -- )
rsp@
rot rot 2>r
sp@ saved-sp !
r> r>
dup next-task !
usp!
sp0 @ !
sp0 @ sp!
rsp0 @ rsp!
execute \ Start the task's initial word
bye ; \ If we ever return, halt -- something is wrong
\ Creates a small task using memory from the
\ dictionary.
: allot-small-task ( -- user-addr )
align \ make sure we're at a word boundary
here 60 allot \ make room
create-small-task ;
Now, some code to multitask:
\ Shows a semi-unique value for the
\ current task, to show that we've switched
: show-task-id
u0 \ address of user area
5 rshift \ these bits are not interesting, drop them
ledemit ;
\ Simply keeps showing our task ID.
\ This is used for the non-interactive tasks.
: show-task-id-forever
begin
show-task-id
pause
again ;
\ Show the task ID and pause a specified
\ number of times. Used for the interactive
\ task (so we get control back).
: show-task-id-repeatedly ( count -- )
0 do
show-task-id
pause
loop ;
: spawn-task
allot-small-task
[noparse][[/noparse]'] show-task-id-forever start-task ;
: tasks ( count -- )
0 do spawn-task loop ;
To use, enter something like the following:
32 tasks
1000 show-task-id-repeatedly
If you'd like to see how long it takes, enter this word:
mCode does support tables, apparently. Chip posts them occasionally (example here). I can't figure out how he does it, though, and I've been meaning to ask him.
I didn't know one way or another, I just copied from Word and pasted it in.
Phil Pilgrim (PhiPi) said...
Cliff,
mCode does support tables, apparently. Chip posts them occasionally (example here). I can't figure out how he does it, though, and I've been meaning to ask him.
BTW: I type as I'm thinking, so please don't take any offense at my writing style www.norfolkhelicopterclub.co.uk
You'll always have as many take-offs as landings, the trick is to be sure you can take-off again ;-)
Comments
It's not protein folding, but it's a little more flexible than what I posted before. This example will spawn Forth tasks on all eight Cogs (counting the console task on Cog 0) and show it by blinking LEDs from each task. It allocates 7 USER task areas starting from $6000, for a total memory footprint of slightly over a kilobyte.
First, some toolkit words for setting up and starting tasks:
And some demo program code:
To use:
To stop:
*Peter*
Dead-on and reasonably well-factored for "quick-and-dirty" code. My only tip would be to use cfa>nfa instead of 8 -; the dictionary layout is going to change soon.
So, with a cfa>nfa nfa>lfa pair, you can get from the XT to the link field.
The next release will include .R, which will simplify your life some (by eliminating the need for .byte). I'll also go ahead and include more of my programming words (DUMP, FORGET, and possibly SEE if I can get it working right).
Next bit of code I will try to do something useful....
*Peter*
It looks like your Forth is really working well. I don't know much about Forth, but it looks like the kind of thing that I'd really like -- terse and RPN (cuts to the chase, doesn't it?). Are compiler optimizations even possible in such a direct language? It looks very intriguing. I think Chuck Moore had the right idea here.
I was reading earlier today about a new chip called SEAForth24. It has 24 processors which execute·1 billion native Forth instructions each, per second. I think I was reading that it comes in a 240-pin BGA and costs under $20 at 1K units. It seems aimed at very high-volume apps. If I remember from a while back, Chuck Moore designed this chip using his own IC layout tools written in...· Color Forth! It was interesting how he used·his hierarchical language to make a hierarchical chip. He made GDSII (Gerber for chips) tiles which were like macros that he could arrange hierarchically. Outputting the database was no thought for his tools. It seems Chuck Moore is all about identifying redundancies and eliminating them.
Anyway, I'm anxious to see how your Forth goes.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Chip Gracey
Parallax, Inc.
Anywho...
Marty
10 years of silence and they come back...what changed? FPGAs. One of the papers included the complete Verilog for a Forth core inline in the paper, in true Literate Programming style. It was very impressive and makes me want to dust off my Verilog. (FPGAs are so much more capable these days; cores will be the next source to open.)
Chuck's design tools are neat, but the next wave of chips will come from free tools, I promise you that.
Chip, as to Forth:
Forth tends to be terse, but in the hands of a capable programmer it's also one of the most readable languages out there -- because you can conform the entire language to your problem domain. Want infix, object-orientation, auto-vectorization? No sweat.
Forth optimizations are a well-explored field, and are quite straightforward. They tend to fall into a few discrete areas:
- Finding commonly repeated word sequences (swap + over, for example) and synthesizing them into a native-code primitive automatically (so-called superinstructions).
- Inlining short words to save on procedure-call overhead.
- Compiling "hot" words to native code, often on the fly.
- Stack analysis to convert the code to a register-to-register form, on architectures where that makes sense.
If you combine these three, you can compile the entire Forth system to native code, with performance rivaling a good C compiler. All three of these have been available in production systems since the 1980s, and I've used all three of these in other (non-Forth) VMs in the past. (Smalltalk, specifically, which is also a language that allows the programmer to change any aspect on-the-fly, making traditional optimizations difficult.)
On the Propeller, superinstructions are doable -- I've got nearly 200 longs of Cog RAM free in the current kernel. Native compilation is harder, since we can't execute directly from shared or external RAM. I am already working on inlining.
(I've worked with code to page short native-code sections in and out of Cog RAM, but Forth-style threaded interpreters are already very fast. The copy operation is invariably slower than executing an interpreted form -- and threaded code takes up vastly less storage space. I've discussed some potential optimizations for PropellerForth on my tech blog, and there'll be more to come.)
Here's one that works.
This will be included in the next PropellerForth release, but for now, that should do 'ya.
"On the Propeller, superinstructions are doable -- I've got nearly 200 longs of Cog RAM free in the current kernel. Native compilation is harder, since we can't execute directly from shared or external RAM. I am already working on inlining."
Actually... time for me to fess up.
Check the new thread I'm posting in a minute or two "Large memory model for Propeller Assembly language programs"
Each task requires 40 bytes of shared RAM to hold some state, plus stacks (plus another 80 for tasks that talk to the terminal). Only 4 of these bytes are specific to the task switcher.
I demonstrated running Forth code on multiple Cogs earlier, and will post the cooperative task code shortly. In summary form, the system is quite similar to most RTOSes:
- Tasks are linked together into task cycles (mostly transparently to the user), which are scheduled round-robin.
- Control is passed from task to task using the traditional Forth word PAUSE. (If the standard library is made task-aware it'll also be passed at any blocking operation, like some I/O primitives.)
- The code is entirely in Forth, so task switches take several microseconds (see numbers below). This could be optimized or moved to native code, but it's well within the parameters for my application (a full-duplex serial driver).
Using a minimal test environment (total overhead of 60 bytes per task), I've successfully run 50 tasks on a single Cog. All tasks were sharing code and some global data, but had thread-local copies of necessary variables.
The test routine:
For a ring of 42 tasks (chosen because it's an amusing number and also fits in a buffer I'd allocated), 1,000 context switches from this routine took 87,424,528 cycles. That's 1.09s, or 1.09ms per iteration -- so 26.01us per task activation. With one task (but with task switching), it took 26ms (the same 26us per activation).
With task switching disabled, it took 16.4ms for the same 1,000 iterations, suggesting a task-switch overhead of 10us.
This code will be included in the next release of PropellerForth (which, it should be noted, will include sources!).
www.cliff.biffle.org/software/propeller/forth/
There's a newer release there than what I've posted in this thread, though it's a build from earlier today and lacks the in-Cog tasker.
I've also set up code hosting and issue tracking on Google Code Hosting, so folks can feel free to file bugs against me there. (The project page is linked from the homepage above.)
Future updates will be posted on the website; I find forum threads to be really bad repositories for information, so I'm going to quit updating this one unless anyone has specific questions or issues they'd like to discuss.
Very kewl.
I've managed to run 256 threads on a single core. All were running a really simple cycle-burner so I could measure their performance.
For the folks who'd rather not read through a page of Forth code, here are the results. Sure wish mCode supported tables.
Benchmark Times (all in cycles)
Single Task:
Tasker disabled, one task: 1,536,528
Tasker enabled, one task: 2,304,528
Having the tasker enabled imposes roughly a 768 cycle penalty for each call to PAUSE, even with only one task. That's what I get for writing the engine in an afternoon; I'll optimize it later. (For those playing along at home, that's 9.6us.)
Multiple Tasks:
2: 4,560,528
4: 9,072,528
8: 18,096,528
16: 36,144,528
32: 72,240,528
64: 144,432,528
128: 288,816,528
256: 577,584,528
Performance scales almost perfectly linearly with the number of tasks -- that is, doubling the number of tasks halves your speed. The scheduler is deterministic O(n) for the number of tasks, by design -- though this may change when I get the thread prioritization code working.
The system has no hard limit to the number of tasks; it's bounded only by available RAM, and at 256 tasks, I had -4 bytes free. (I allowed one task to reuse some of my interpreter's thread-locals. It's cheating, but it worked.) I am deliberately overallocating, giving each task 96 (decimal) bytes.
Astute readers have noticed that every time ends in 528. I think it's weird too. You can try to replicate it with the sources below.
The Code
This will be built into the next PropellerForth release, but interested parties can try it now. Paste the code into your terminal emulator, or put it in a text file and send it as ASCII.
Note that this is not exactly the code I ran for my benchmarks above. It's slightly better factored, and runs a bit slower. Readers can match (or beat) my numbers by inlining the task-id display code.
First, the multitasking engine:
Now, some code to multitask:
To use, enter something like the following:
If you'd like to see how long it takes, enter this word:
It's used like this:
Happy hacking!
Post Edited (Cliff L. Biffle) : 11/12/2006 8:36:46 AM GMT
mCode does support tables, apparently. Chip posts them occasionally (example here). I can't figure out how he does it, though, and I've been meaning to ask him.
-Phil
Chip Gracey
Parallax, Inc.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Cheers,
Simon
BTW: I type as I'm thinking, so please don't take any offense at my writing style
www.norfolkhelicopterclub.co.uk
You'll always have as many take-offs as landings, the trick is to be sure you can take-off again ;-)