P2 Taqoz V2.8: lutLongs, Value Type Variables, Locals --- now faster

Christof Eb. · 2022-11-03 09:45

Hi,
EDIT: new update in post #3.
This is an update to https://forums.parallax.com/discussion/174531/taqoz-reloaded-2-8-better-readability-named-local-variables#latest

and https://forums.parallax.com/discussion/173841/p2-taqoz-forth-v2-8-some-tools-value-simple-local-variables-dcf77#latest

lutLongs ( n ) works like longs, but the variable is located at the end of LUT memory and is therefore private to each cog.

value ( n ) creates a value-type variable, which places its content onto the stack by default. If the name is preceded by "to", the top of stack is stored in the variable. If it is preceded by "+to", the top of stack is added to the variable.
"to" and "+to" set a flag variable, which is now a lutLong. This makes values much faster than before.
Be sure to clear the flag tosetL with the word "toInit" if code is RUN in a new cog!
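
To make the flag mechanism easier to picture, here is a small model in Python (not Taqoz code). It assumes that "to" and "+to" only set a per-cog flag and that the next use of a value reads and clears that flag. The names Cog, Value, FETCH, STORE and ADD are made up for this sketch; only tosetL and toInit come from the post.

```python
# Toy model of VALUE / to / +to. This is not the Taqoz source; the flag
# encoding below is an assumption made for illustration only.
FETCH, STORE, ADD = 0, 1, 2          # hypothetical flag states


class Cog:
    def __init__(self):
        self.stack = []              # data stack
        self.toset = FETCH           # stands in for the per-cog flag tosetL

    def to_init(self):               # models "toInit"
        self.toset = FETCH

    def to(self):                    # models "to"
        self.toset = STORE

    def plus_to(self):               # models "+to"
        self.toset = ADD


class Value:
    def __init__(self, initial):
        self.content = initial

    def execute(self, cog):
        # The flag is consumed by exactly one use of a value.
        mode, cog.toset = cog.toset, FETCH
        if mode == FETCH:
            cog.stack.append(self.content)
        elif mode == STORE:
            self.content = cog.stack.pop()
        else:
            self.content += cog.stack.pop()


cog = Cog()
cog.to_init()                        # clear the flag before first use
counter = Value(10)

cog.stack.append(5)
cog.plus_to()
counter.execute(cog)                 # like: 5 +to counter
counter.execute(cog)                 # like: counter   (pushes 15)

cog.stack.append(7)
cog.to()
counter.execute(cog)                 # like: 7 to counter
counter.execute(cog)                 # like: counter   (pushes 7)
```

In this model a flag left in the STORE or ADD state would make the next plain use of a value pop the stack instead of pushing, which is why clearing the flag in a freshly started cog matters.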

Named locals make Forth source much more readable, and the code is often easier to write. They behave like values.
A modified stack diagram serves as the syntax for defining the variable names.
{: starts the definitions of names of initialized local variables
, starts the definitions of names of non-initialized local variables
-- starts comment
} ends the modified stack diagram
These locals are of type VALUE and can be written with a preceding to or +to.

: printSum {: numberL# , sumL# -- } \ 1 local initialized, 2 locals total
   0 to sumL#
   numberL# 1+ 0 do 
      i +to sumL#
   loop
   sumL# .
;

Implementation of the locals:
When locals are used, a stack frame of 5 (always!) longs is created by moving the stack pointer of the auxiliary stack, which lives in LUT memory and is private to the cog. The locals are located in that frame. At the end of the word the stack pointer is moved back.
It took me a while to find a way to implement named locals. One difficulty is that the stack diagram comes after the colon definition has already begun. In Taqoz the dictionary has 2 separate parts, so it is possible to define ALIAS names for the locals while compilation of the new word is already running. The compiler then finds and uses the new names immediately. No modification of the compiler itself is necessary.
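
The frame handling can be pictured with a short Python model (again not Taqoz code). It assumes that the frame is reserved by bumping a pointer by 5 and that the initialized locals are filled from the data stack with the last declared one taking the top of stack; that fill order, and the names AuxStack, enter, leave and print_sum, are assumptions of this sketch.

```python
FRAME_SIZE = 5                        # the post always reserves 5 longs


class AuxStack:
    """Stands in for the auxiliary stack area in LUT memory."""

    def __init__(self, size=32):
        self.mem = [0] * size
        self.sp = 0                   # index of the next free long

    def enter(self, data_stack, n_init):
        """Reserve a frame and fill the first n_init locals from the stack."""
        base = self.sp
        self.sp += FRAME_SIZE
        # Assumed order: the last initialized local receives the top of stack.
        for slot in reversed(range(n_init)):
            self.mem[base + slot] = data_stack.pop()
        return base

    def leave(self):
        """Release the frame, as the modified ';' does at the end of a word."""
        self.sp -= FRAME_SIZE


def print_sum(aux, data_stack):
    """Model of printSum: slot 0 is numberL#, slot 1 is sumL#."""
    base = aux.enter(data_stack, n_init=1)
    aux.mem[base + 1] = 0                         # 0 to sumL#
    for i in range(aux.mem[base + 0] + 1):        # numberL# 1+ 0 do ... loop
        aux.mem[base + 1] += i                    # i +to sumL#
    result = aux.mem[base + 1]
    aux.leave()
    return result


aux = AuxStack()
total = print_sum(aux, [10])
```

Because the frame position is derived from the stack pointer, anything else that moved the pointer inside the same word would shift the frame, which matches the first limitation listed below.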

Limitations:

The auxiliary stack must not be used within the same word, because its stack pointer marks the position of the stack frame. It may be used in other words.
To move the stack pointer back, the word has to end with the modified ";".
The number of local variables is limited to 5 per word by choice. I had started with only 3 and then increased it. 5 seems to cover most cases.
The ALIAS function might overwrite old words, and the local names will still be found by the compiler after the definition.
Decompiling with SEE or DECOMP will show the most recent ALIAS names for the local variables, so it is advisable to use these tools directly after the word definition.
If there is an error while compiling, the name of a local will be displayed instead of the name of the word, and you have to FORGET the faulty word.

Speed comparison with some fibonacci code:

toset in LUT and with locals in LUT
fiboAsm:      1836311903   1177    100 pc to Assembler
fiboCog:      1836311903    585     49 pc to Assembler
fiboForth:    1836311903   7313    621 pc to Assembler
fiboValues:   1836311903  83617   7104 pc to Assembler
fiboLocals:   1836311903  59329   5040 pc to Assembler
fiboLut:      1836311903  77057   6546 pc to Assembler
fiboGlobals:  1836311903  53209   4520 pc to Assembler ok
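
The "pc to Assembler" column appears to be the timing column scaled so that fiboAsm is 100, with the fraction cut off. The following Python lines reproduce the column from the timings under that assumption; the dictionary name timings is only for this sketch.

```python
# Timing column copied from the table above (units as printed by the post).
timings = {
    "fiboAsm": 1177,
    "fiboCog": 585,
    "fiboForth": 7313,
    "fiboValues": 83617,
    "fiboLocals": 59329,
    "fiboLut": 77057,
    "fiboGlobals": 53209,
}

baseline = timings["fiboAsm"]

# Integer division, assumed here because it matches every row of the table.
percent = {name: t * 100 // baseline for name, t in timings.items()}

for name, pc in percent.items():
    print(f"{name:<12} {pc:>5} pc to Assembler")
```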

So now the different types of variables are more or less in the same league. Traditional stack Forth code is still roughly 7 to 11 times faster than the variable-based versions in this table, because in this case the values are held in cog registers and the word codes used are in cog RAM too. But can you read this?

: fiboForth ( n -- f )  \ Fibonacci series, returns the last result
    0 1 
    ROT 1 - FOR 
        SWAP OVER  
        + 
    NEXT
    SWAP DROP
;

I think, this is more readable:

: fiboLocals {: n , a b c -- f }
    0 to a   1 to b   0 to c
    n for
        b to a
        c to b
        a +to c
    next
    c 
;
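
As a cross-check that both listings compute the same number, here are the two loops transcribed into Python. This assumes that for ... next (and FOR ... NEXT) executes its body exactly as many times as the count given; the function names are only for this sketch.

```python
def fibo_locals(n):
    """Transcription of fiboLocals: three named variables a, b, c."""
    a, b, c = 0, 1, 0
    for _ in range(n):
        a = b          # b to a
        b = c          # c to b
        c += a         # a +to c
    return c


def fibo_stack(n):
    """Transcription of fiboForth: two stack items, n - 1 iterations."""
    below, top = 0, 1                      # 0 1
    for _ in range(n - 1):                 # ROT 1 - FOR
        below, top = top, below + top      # SWAP OVER +
    return top                             # SWAP DROP


result_locals = fibo_locals(46)
result_stack = fibo_stack(46)
print(result_locals, result_stack)
```

For n = 46 both functions return 1836311903, the value printed in the benchmark table.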

Have Fun, Christof

Christof Eb. · 2022-11-05 08:33

Found a Bug, updated.

Christof Eb. · 2023-06-16 08:00

A further and major update:
Perhaps you know the feeling: after a certain discovery, things just fall into place.
Here the game-changing discovery was how to place assembler routines in LUT RAM. https://forums.parallax.com/discussion/175395/p2-taqoz-v2-8-placing-assembler-routines-into-lut-ram#latest
This opens up great possibilities:
1. There is room in LUT for some routines: more than 256 assembler instructions, much more than is available for cogmod. So we can have routines there permanently.
2. We can have more small routines.
3. LUT routines can be accessed without the HUB bottleneck.
4. LUT routines don't need to be stacked on the Forth return stack, so they have less overhead.
5. If we use LUT RAM for the routines, we can use register variables in COG RAM, which can do math and can be accessed directly.

So now we can combine readability with speed!

We have 5 named local variables, which are now held in COG RAM.

Example:

: fiboLocals {: n , a b c -- f }
    0 to_a   1 to_b   0 to_c
    n for
        b to_a
        c to_b
        a +to_c
    next
    c 
; 
46 lap fiboLocals lap . .lap
forgetLocals

The first line with {: defines 4 local variables, named "n", "a", "b" and "c". The ones before the comma (here only "n") are filled from the stack. The ones after the comma are not initialized. The text from -- up to } is a comment.
The new thing is that for each variable there are 3 (!) new words:
"name" just puts the value of the register variable onto the stack.
"to_name" pops the value from the stack into the register.
"+to_name" adds the TOS to the register variable.
As each word needs only one hub access (for its word code), which covers operation AND operand, this is now 3 times faster than using normal HUB variables!
{: takes care that the previous local variables are saved onto the L-stack. If you used {: in a word, they are restored before the exit at ";".
The registers can also be accessed with the fixed words "a>", ">a" and "+>a", which are the basis for the alias names.
It is recommended to use "forgetLocals" right after the ; of a new routine that uses {: . This helps to keep the dictionary lean.
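
The save and restore behaviour can be pictured with one more Python model (not Taqoz code). It assumes that {: pushes all 5 registers onto the L-stack before filling the initialized locals, and that ";" pops them back. The names RegCog, open_locals, close_locals and make_words are made up for this sketch, and the fill order from the stack is an assumption.

```python
NUM_REGS = 5                           # the post provides 5 register locals


class RegCog:
    def __init__(self):
        self.regs = [0] * NUM_REGS     # stands in for the COG RAM registers
        self.lstack = []               # stands in for the L-stack
        self.stack = []                # data stack

    def open_locals(self, n_init):     # models "{:"
        self.lstack.append(list(self.regs))        # save the caller's locals
        for slot in reversed(range(n_init)):       # assumed fill order
            self.regs[slot] = self.stack.pop()

    def close_locals(self):            # models the exit at ";"
        self.regs = self.lstack.pop()              # restore the caller's locals


def make_words(cog, slot):
    """Return the three words for one register: name, to_name, +to_name."""
    def fetch():
        cog.stack.append(cog.regs[slot])

    def store():
        cog.regs[slot] = cog.stack.pop()

    def add():
        cog.regs[slot] += cog.stack.pop()

    return fetch, store, add


cog = RegCog()
n, to_n, plus_to_n = make_words(cog, 0)

# The caller keeps 3 in its first local.
cog.stack.append(3)
to_n()

# A called word with one initialized local receives 40 and adds 2 to it.
cog.stack.append(40)
cog.open_locals(n_init=1)
cog.stack.append(2)
plus_to_n()
n()                                    # leaves 42 on the data stack
cog.close_locals()

n()                                    # caller's local again: pushes 3
```

Each of the three words carries both the operation and the register it works on, which is the property the post credits for the speed gain.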
Have fun!
Christof

Christof Eb. · 2023-07-11 13:38

Updated, had a bug.
