Tachyon V4 "DAWN" - exploring new worlds

ErNa · 2017-03-06 21:35

When every character counts, a character is not necessarily an accumulator .....

In Introduction to Tachyon I read:

WS2812  ( addr cnt  -- ) ( COGREGS: 4=txmask )
      The WS2812 RGB LED driver will transmit the buffer specified using the output mask that is stored in COGREG4.

COGREGS now is not a register, but the plural of the word COGREG and COGREG4 is the fourth COGREG, where the COGREGs form an array of longs in the cog. By convention those registers are used by dedicated words, like WAITPNE, if I understand correctly.

COGREG@    ( adr -- long )    X    Fetch the contents of the indexed COGREG which is offset from REG0
COGREG!    ( long adr -- )    X

Isn't "adr" in this case really the index of the cog register? I ask because:

0 COGREG@ 1 COGREG@ -							--- calculate high period

while

0 COGREG .

prints the actual address of the register in the cog.

Peter Jakacki · 2017-03-06 22:09

There is the "@" symbol so COGREG@ is pronounced as "COGREG FETCH"

COGREG ( index -- cogadr ) --- Convert register index to address of "register" in the cog
COGREG@ ( index -- long ) --- Fetch the contents of the "register" == COGREG COG@

From the kernel source you can see that there are other locations addressable as a fixed COGREG and even before REG0 there are fixed registers for SPI operations too accessed with a negative index. Thinking about that, I may convert them all to a more efficient positive index since a 15-bit unsigned literal can be represented by a single wordcode.

' COGREG - Registers used by PASM modules to hold parameters such as I/O masks and bit counts etc
COGREGS
REG0            long 0
REG1            long 0
REG2            long 0
REG3            long 0
AREG
REG4            long 0
' COGREG 5
txticks         long (sysfreq / baud )  ' set transmit baud rate
txmask          long |<txd                      ' change mask to reassign transmit
' COGREG 7 = TASK REGISTER POINTER
regptr          long registers          ' used by REG
unext           long doNEXT             ' could redirect code if used
' COGREG 10
' rearranged these register to follow REG0 so that they can be directly accessed by COGREG instruction
' COGREG 16
lapcnt          res 1
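
The difference between a register index and a cog address can be modelled outside Tachyon. The Python sketch below is only an illustration, not kernel code: `REG0_ADDR` is a made-up base address, the helper names are invented for the example, and the register order simply follows the listing above.

```python
# Model of COGREG indexing. REG0_ADDR is a hypothetical base address;
# the real value depends on where the kernel assembles REG0.
REG0_ADDR = 0x100

# Register order taken from the kernel listing (index 0 = REG0).
COGREG_NAMES = ["REG0", "REG1", "REG2", "REG3", "REG4",
                "txticks", "txmask", "regptr", "unext"]

cog = [0] * 512  # cog memory modelled as 512 longs


def cogreg(index):
    """COGREG ( index -- cogadr ): convert an index to a cog address."""
    return REG0_ADDR + index


def cog_fetch(cogadr):
    """COG@ ( cogadr -- long ): read a long from cog memory."""
    return cog[cogadr]


def cogreg_fetch(index):
    """COGREG@ ( index -- long ): the same as COGREG followed by COG@."""
    return cog_fetch(cogreg(index))


def cogreg_store(value, index):
    """COGREG! ( long index -- ): write a long to the indexed register."""
    cog[cogreg(index)] = value & 0xFFFFFFFF


cogreg_store(1 << 30, COGREG_NAMES.index("txmask"))
print(cogreg(0))        # address of REG0, like `0 COGREG .`
print(cogreg_fetch(6))  # contents of txmask, like `6 COGREG@ .`
```

So `0 COGREG .` shows where the register lives, while `0 COGREG@ .` shows what it holds.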

ErNa · 2017-03-06 22:13

OK, Peter, thanks for the quick response. Did you ever think about writing one-liners on Twitter? That would open new horizons, as there is no way right or left! Peter for President ;-)

So: is REG0 an alias for COGREG0? I didn't dare to dive into the kernel, but I fear I will have to.

proplem · 2017-03-06 22:29

Peter - did you mention that EXTEND-V4.FTH leaves a #30404 on the stack when building?

Peter Jakacki · 2017-03-06 23:04

proplem wrote: »

Peter - did you mention that EXTEND-V4.FTH leaves a #30404 on the stack when building?

I'm not seeing this value although $76C4 looks like a data address == f32cmd (from .VARS)

( 0006 $1900  ok )   TACHYON V4
  Propeller .:.:--TACHYON--:.:. Forth V4.1 DAWN 410170227.0000
( 1564 $2DE4  ok )   

   End of source code, 0000  errors found  Load time = 1196935696 cycles at 96MHz  or 3878.145ms 
Code bytes used = 5348
CODE:$2DE4 =11236 bytes   NAME:$5D82 =5758 bytes   DATA:$76D4 =196 bytes    =12190 bytes free    Data Stack (0)
( 1565 $2DE4  ok )   
( 1566 $2DE4  ok )   
( 1567 $2DE4  ok )   ep 64 > IF roms $1F00 $FF EFILL SAVEROMS THEN

                    COPY ROMS from $2360 for 3,160 
\
( 1568 $2DE4  ok )   
( 1569 $2DE4  ok )   --- 
( 1570 $2DE4  ok )   0 U@ DROP
( 1571 $2DE4  ok )   
( 1572 $2DE4  ok )   FORGET SAVEROMS

( 1573 $2D52  ok )   
( 1574 $2D52  ok )   ?BACKUP
BACKUP |
( 1575 $2D52  ok )   AUTORUN BOOT
.STATS

CODE:$2D52 =11090 bytes   NAME:$5D8E =5746 bytes   DATA:$76D4 =196 bytes    =12348 bytes free    Data Stack (0)
( 1576 $2D52  ok )   lsroms

*** ROMS ***
0,848 VGA32x15  
0,352 HSUART    
1,900 F32       
( 1577 $2D52  ok )   
( 1578 $2D52  ok )   
( 1579 $2D52  ok )   .S
 Data Stack (0)
( 1580 $2D52  ok )

( 1583 $2D52  ok )   .VARS

76D0 fnumB
76CC fnumA
76C8 result
76C4 f32cmd
1F00 romsz
76B2 polls
76B0 timerjob
0007 SUN
0006 SAT
0005 FRI
0004 THU
0003 WED
0002 TUE
0001 MON
76AB _lt
2640 _rtc
76A2 rtcbuf
7698 time
7694 runtime
7684 wdt
7680 tid
767E timers
0080 ep
767D eeadr
767C i2cflg
0007 white
0006 cyan
0005 magenta
0004 blue
0003 yellow
0002 green
0001 red
0000 black
7678 brg
0003 @CE
0002 @MISO
0001 @MOSI
0000 @SCK
0004 @CNT
001D *SDA
001C *SCL
7670 _ctr
766C radix
7624 locals
01FF VSCL
01FE VCFG
01FD PHSB
01FC PHSA
01FB FRQB
01FA FRQA
01F9 CTRB
01F8 CTRA
01F7 DIRB
01F6 DIRA
01F5 OUTB
01F4 OUTA
01F3 INB
01F2 INA
01F1 CNT
01F0 PAR
01F0 SPR
7622 NULL$
761E boot
761D delim
761C lastkey
7618 @B
7614 @A
7610 ulong
0000 PCB
0000 FALSE
7610 @rest
0075 @WORD
009A names
7800 BUFFERS
00B6 @org
0000 OFF
00B8 dmm
00A8 errors
00A0 uhere
00A2 uthere
00B0 flags
008C prompt
00A6 autorun
00B2 keypoll
0092 rx
0694 id
( 1584 $2D52  ok )
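
As a quick check of the value proplem reported: in Tachyon notation #30404 is decimal and $76C4 is hexadecimal, and they are the same number. A small Python check, with a few `.VARS` entries copied from the dump above (the dictionary itself is just for this example):

```python
# Subset of the .VARS dump above: hex address -> name.
VARS = {0x76D0: "fnumB", 0x76CC: "fnumA", 0x76C8: "result", 0x76C4: "f32cmd"}

leftover = 30404             # the "#30404" left on the stack (decimal)
print(f"${leftover:04X}")    # $76C4
print(VARS.get(leftover))    # f32cmd
```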

ErNa · 2017-03-06 23:16

Next line, next question: RUNMOD runs a module, loaded before with LOADMOD. In the LCD-Driver I find this LOC:
ALIAS RUNMOD PLOT
Sounds like: what was RUNMOD before is PLOT now. PLOT is a fast internal plot module to set Pixel X, Y

: !LCD  lcdmem 4 COGREG! 4 3 COGREG! [PLOT]

I now understand it as follows:
The procedure to initialize the LCD is named !LCD to indicate: this is an initialization.
Next the pointer to the Pixelmap (lcdmem) is stored to the 4th COGREG, a 4 goes to the third COGREG, then: [PLOT]
Later, PLOT is called without [], so does [] replace LOADMOD + RUNMOD?

Peter Jakacki · 2017-03-06 23:27

@ErNa - whenever we want to RUN a MODule we first load it, and by convention all module names are enclosed in [ ]. So in the case of the PLOT module we load it with [PLOT], after which it can be executed with the RUNMOD opcode, which essentially transfers execution to the start address of the module that was loaded. But when reading source code this is confusing, because RUNMOD could be running any kind of module that was loaded, so to make the code read more clearly we create an ALIAS of RUNMOD called PLOT. So [PLOT] is part of the kernel and loads the PLOT module into a fixed cog location, where it can be called by RUNMOD but referenced as PLOT.
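
The load-then-run convention can be pictured with a toy model. This Python sketch is an illustration only: the names mirror the thread, but the slot, the pixel set and the functions are stand-ins invented for the example, not how the kernel implements modules.

```python
# Toy model of a module slot in the cog (illustrative only).
module_slot = None   # the fixed cog location that holds the loaded module
pixels = set()       # stand-in for the graphics buffer


def plot_module(x, y):
    """Stand-in for the PASM PLOT module: set pixel (x, y)."""
    pixels.add((x, y))


def load_plot():
    """[PLOT]: load the PLOT module into the module slot."""
    global module_slot
    module_slot = plot_module


def runmod(*args):
    """RUNMOD: run whichever module is currently loaded."""
    return module_slot(*args)


PLOT = runmod        # ALIAS RUNMOD PLOT: a second name for the same word

load_plot()          # done once, e.g. in !LCD
PLOT(3, 4)           # reads as "plot", but executes RUNMOD
```

In this model `[PLOT]` does the loading and `PLOT` is just a more readable name for RUNMOD, so the brackets replace the load step only.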

BTW, V4.2 does away with COGREGs

Well, not exactly but I am able to move all these registers to the start of cog memory at location 1. So since all COGREGs are now actually the index + 1 I get rid of the indexing operation and simply refer to the locations 1,2,3 etc and use COG@ and COG! instead. This not only saves memory but is much faster too. V3 had to have these cogregs in upper cog memory since opcodes directly indexed the first 256 longs in cog memory but wordcode addresses all of cog and hub memory.
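
To make the V4.2 change concrete, here is a small Python comparison of the two addressing schemes. It assumes only what the post says (register n ends up at cog location n + 1); `REG0_ADDR` is a made-up V4.1 base address and both function names are invented for the example.

```python
REG0_ADDR = 0x100    # hypothetical V4.1 address of REG0 (not the real value)


def v41_address(index):
    """V4.1: COGREG adds the index to the address of REG0 at run time."""
    return REG0_ADDR + index


def v42_address(index):
    """V4.2 as described: register n lives at cog location n + 1."""
    return index + 1


for index in range(4):
    print(index, v41_address(index), v42_address(index))
```

Under that assumption, what was written as `4 COGREG!` would become a plain `5 COG!`, with no indexing step left to execute.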

D.P · 2017-03-07 04:00

Peter Jakacki wrote: »

.
.
BTW, V4.2 does away with COGREGs Well, not exactly but I am able to move all these registers to the start of cog memory at location 1. So since all COGREGs are now actually the index + 1 I get rid of the indexing operation and simply refer to the locations 1,2,3 etc and use COG@ and COG! instead. This not only saves memory but is much faster too. V3 had to have these cogregs in upper cog memory since opcodes directly indexed the first 256 longs in cog memory but wordcode addresses all of cog and hub memory.

Oh good, I like clean and simple.

ErNa · 2017-03-07 07:42

OMG, I really didn't expect Peter to present us such a mess of COGREGs, indeed, he's the last one I expected it from! In contrast to the situation after Obama. But now, after I got a glimpse of how it works, a clean, neat solution will come into existence. All nonconformal elements will be banned.

D.P · 2017-03-07 17:22

Peter did whatever he had to in order to get Tachyon the most "performance" from P1. FTP, HTTP, TELNET access while running a user routine and access to a serial console (just in case) has not been matched or even attempted in any other methodology on the P1. Tachyon is only getting better. It will be great to see this methodology applied to other chips and eventually P2 silicon.

MJB · 2017-03-07 18:47

Peter Jakacki wrote: »

.
.
BTW, V4.2 does away with COGREGs Well, not exactly but I am able to move all these registers to the start of cog memory at location 1. So since all COGREGs are now actually the index + 1 I get rid of the indexing operation and simply refer to the locations 1,2,3 etc and use COG@ and COG! instead. This not only saves memory but is much faster too. V3 had to have these cogregs in upper cog memory since opcodes directly indexed the first 256 longs in cog memory but wordcode addresses all of cog and hub memory.

with all those changes I better wait until 5.0 before I reread/relearn the Tachyon source.
All my precious (old) Tachyon knowledge is rendered useless ;-)

No need to reply Peter ... :-) ...

Great, that there is progress ..
With a P2 available much earlier, there probably would not be such a sophisticated Tachyon for P1 ...

ErNa · 2017-03-07 20:35

Once Tachyon, ever Tachyon! Fighting to enter into "once"!

Peter Jakacki · 2017-03-07 23:54

ErNa wrote: »

Once Tachyon, ever Tachyon! Fighting to enter into "once"!

I see your fighting as struggling with a different way of thinking. When once everything was strict and fixed it was easy to learn and remember, and easy to teach (and mark). However that strict and fixed approach is not so flexible for the job at hand yet isn't the tool made to fit the job and not the job fit the tool? Every "job" is different and it would be nice to have a tool that could be made to fit the job. That's what Tachyon does but of course that makes Tachyon harder to "learn". However if you understand your job, you will understand what tool you need rather than trying to use the one fixed tool. So the tool is adapted to fit the job which is also why Tachyon is not "ANSI Forth" which is rather fixed for working best on PCs, not embedded realtime control over varying CPU architectures. The Propeller is a very different beast to our PC CPUs, and we like it that way.

On the forum we see new ones commonly describe a problem by the limits of the solution they envisage. Of course we wouldn't do that, would we? But we do something similar when we try to solve problems with fixed tools: in the end we are describing the problem in terms of the limitations of the tool. That must mean that we never see the problem clearly, and never arrive at an optimal solution. To me "less is more", so a simple solution is one that shows we understand the problem.

You may have noticed that I try not to refer to sample code when interfacing new chips. One of the reasons is because we start thinking in terms of the limits of the tool and the thinking of whoever wrote that code, and not see what needs to be done or not done. The solution is usually much simpler than portrayed. So I just look at the datasheet and start interacting with these chips at the lower levels, trying different things, testing the limits etc. Once I know that part is working then those new functions become part of the "language" as if the tool is being shaped. Remember that the tool and target application are one and the same unlike a PC compiler and a binary blob. Start at the lower levels and build on that foundation and remember it's not really work, it's play, so have fun!

ErNa · 2017-03-08 15:43

Hi Peter, I like working top down and bottom up, and from a bird's-eye view. As every character counts and I try to inhale your coding style, things are going forward slowly.
You commented out

: LCD   ' VCHAR uemit W! ;

what is the reason to do so?

Peter Jakacki · 2017-03-08 16:05

ErNa wrote: »
Hi Peter, I like working top down and bottom up, and from a bird's-eye view. As every character counts and I try to inhale your coding style, things are going forward slowly.
You commented out
: LCD   ' VCHAR uemit W! ;
what is the reason to do so?

LCD diverts character output to the LCD screen using VCHAR rather than the serial console. In Spin the tick is a comment, but here the tick symbol is an immediate operation which reads in the following word and returns with the code field address or run address. In the LCD word we then take that result and write it to the uemit vector, which the EMIT primitive uses as a redirection vector.

We would "pronounce" the code:
' VCHAR uemit W!
as Code Field Address of VCHAR uemit Word Store
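
The redirection idea can be sketched in Python. This is a model under assumptions, not Tachyon's implementation: `console_emit`, the two lists and the way the default output is represented are invented for the example; only the roles of `uemit`, `EMIT`, `VCHAR` and `LCD` come from the thread.

```python
output = []          # what reached the serial console
lcd_screen = []      # what reached the LCD


def console_emit(ch):
    """Default output (assumed): send the character to the serial console."""
    output.append(ch)


def vchar(ch):
    """Stand-in for VCHAR: draw the character on the LCD."""
    lcd_screen.append(ch)


uemit = console_emit  # the redirection vector


def emit(ch):
    """EMIT: send one character through whatever uemit points to."""
    uemit(ch)


def lcd():
    """LCD: store the address of VCHAR in uemit (' VCHAR uemit W!)."""
    global uemit
    uemit = vchar


emit("A")            # goes to the console
lcd()
emit("B")            # now goes to the LCD
```

The tick plays the role of "take a reference to the word instead of calling it", which is why `lcd()` assigns `vchar` rather than invoking it.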

Peter Jakacki · 2017-03-08 17:32

A few years ago I wrote a version of Conway's Game of Life to display on a serial terminal. Here it is again revisited with some extras running under Tachyon V4 of course.
Get the code in the Tachyon V4 folder
This is part of the code that processes the keyboard shortcuts you can use to interact with it:

	'R' =[ RANDOM	]
	'G' =[ GLIDER	]
	'D' =[ DIEHARD	]
	'A' =[ ACORN	]
	'L' =[ LWSS	]
	'F' =[ FPENT	]
	'W' =[ WIDER	]
	'E' =[ EXPAND	]
	'<' =[ SHRINK	]
	'>' =[ WIDER	]
	'V' =[ VOID	]
	$20 =[ PROPLIFE ]
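
Read as a table, the listing maps each key to one word. The Python sketch below models that as a dictionary dispatch. It assumes that `'R' =[ RANDOM ]` means "if the key is R, run RANDOM", which is my reading of the listing, and the actions are placeholders that only record their names.

```python
log = []  # names of the actions that were run, in order


def make_action(name):
    """Return a placeholder action that only records its own name."""
    def action():
        log.append(name)
    return action


# Key/word pairs copied from the listing above; $20 is the space key.
BINDINGS = [
    ("R", "RANDOM"), ("G", "GLIDER"), ("D", "DIEHARD"), ("A", "ACORN"),
    ("L", "LWSS"), ("F", "FPENT"), ("W", "WIDER"), ("E", "EXPAND"),
    ("<", "SHRINK"), (">", "WIDER"), ("V", "VOID"), (" ", "PROPLIFE"),
]
KEYMAP = {key: make_action(word) for key, word in BINDINGS}


def handle_key(key):
    """Run the action bound to key; return False if the key is unbound."""
    action = KEYMAP.get(key)
    if action is None:
        return False
    action()
    return True


handle_key("G")
handle_key("x")   # unbound: ignored
handle_key(" ")
print(log)        # ['GLIDER', 'PROPLIFE']
```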

ErNa · 2017-03-08 22:57

One more question: where is LCDBITBLT called? And if it isn't, why not?
Or was that just in response to proplem's driver?

Peter Jakacki · 2017-03-08 23:43

ErNa wrote: »

One more question: where is LCDBITBLT called? And if it isn't, why not?
Or was that just in response to proplem's driver?

--- transfer buffer to LCD
--- start from upper left corner and move down to bottom right and set ch
pub LCDBITBLT ( -- )

This code would normally be called from the main program to update the screen but in a games program I would call this from another cog to continually update the LCD screen since the drawing routines only access the graphics buffer in hub RAM, not the LCD directly. Otherwise it is possible to call this routine at the end of the VCHAR routine which draws the font. So you could redefine VCHAR to reference this when it draws a character, and also when it clears the screen.

: VCHAR
	SWITCH
	$0D CASE VCR BREAK
	$0A CASE VLF BREAK
	$0C CASE VCLS LCDBITBLT BREAK
	$01 CASE HOME BREAK
	SWITCH@ LCDCH LCDBITBLT
	;

Other update alternatives include only updating the screen on a special character so in the original VCHAR we could have:

0 CASE LCDBITBLT BREAK

or even simply just call it from the main program but I would rename LCDBITBLT to something more readable then, say DISPLAY for instance.
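
The control flow of the redefined VCHAR can be modelled in Python to show when the screen gets refreshed. This is a sketch with stand-in functions that only record what was called; it assumes that BREAK leaves the word, so the final LCDCH/LCDBITBLT line runs only for characters that matched no CASE.

```python
events = []  # record of what the display code did, in order


def vcr():
    events.append("VCR")


def vlf():
    events.append("VLF")


def vcls():
    events.append("VCLS")


def home():
    events.append("HOME")


def lcdch(code):
    events.append(("LCDCH", code))


def lcdbitblt():
    events.append("LCDBITBLT")


def vchar(code):
    """Model of the redefined VCHAR: control codes first, then draw."""
    if code == 0x0D:
        vcr()
    elif code == 0x0A:
        vlf()
    elif code == 0x0C:
        vcls()
        lcdbitblt()      # refresh after clearing the screen
    elif code == 0x01:
        home()
    else:
        lcdch(code)
        lcdbitblt()      # refresh after drawing a character


for code in (0x41, 0x0D, 0x0C):
    vchar(code)
print(events)
```

Refreshing on every character is the simplest option but also the slowest, which is why the post also suggests a dedicated cog or a special update character.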

Peter Jakacki · 2017-03-09 02:10

I couldn't resist speeding up and improving the Game of Life, now with a touch of color. So much better that way, don't you think?

David Betz · 2017-03-09 02:30

(deleted)

Peter Jakacki · 2017-03-09 03:06

Yes, it's doing about 15fps on a 64x32 but slows down to 11fps on the 128x32. The screen update is desynchronized from the generation update in this demo but at 64 wide the next generation takes about 77ms, still room for improvement though.
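
For reference, the quoted figures convert as follows. This is plain arithmetic on the numbers in the post, written in Python; the variable names are only for the example.

```python
generation_ms = 77                        # quoted for the 64-wide universe
generations_per_second = 1000 / generation_ms
print(round(generations_per_second, 1))   # 13.0

frame_ms_64 = 1000 / 15                   # 15 fps on 64x32
frame_ms_128 = 1000 / 11                  # 11 fps on 128x32
print(round(frame_ms_64, 1), round(frame_ms_128, 1))   # 66.7 90.9
```

At 64 wide a screen frame (about 67 ms) is shorter than a generation (about 77 ms), which fits the remark that the two updates are not synchronized.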

David Betz · 2017-03-09 03:16

Peter Jakacki wrote: »

Yes, it's doing about 15fps on a 64x32 but slows down to 11fps on the 128x32. The screen update is desynchronized from the generation update in this demo but at 64 wide the next generation takes about 77ms, still room for improvement though.

I know this is off topic for this thread but do you know if anyone has done a PASM version of Life that runs on all 8 COGs at once? Or maybe 7 with one driving VGA?

Peter Jakacki · 2017-03-09 05:28

In the LIFE demo I treat the universe as a shingle on which to print text using the Propeller font. So I went one step further and made it autoscroll and display when it reaches the right hand side so that it works like a character LCD. Just select the width and make it an output device and all Tachyon character output is directed to the shingle instead of normal character output. Notice the actual text output in the bottom left hand corner.

kwinn · 2017-03-09 14:27

@Peter Jakacki

Any idea why I get these three download windows every time I open this thread?

Peter Jakacki · 2017-03-10 13:33

kwinn wrote: »

@Peter Jakacki

Any idea why I get these three download windows every time I open this thread?

I went back to my two posts that had YouTube links using the YouTube forum icon, removed the tags and simply inserted the full https URL without any tags, and it seems fine. Now if David Betz can edit his post and perhaps remove the YouTube tagged link altogether then there should be no more problems. Bump can still get around to removing the YouTube tag feature though.

David Betz · 2017-03-11 12:42

Peter Jakacki wrote: »

Now if David Betz can edit his post and perhaps remove the youtube tagged link altogether then there should be no more problems.

done

kwinn · 2017-03-11 18:42

Hurray, no more download popups.

ErNa · 2017-03-11 22:57

What is the idea of having I as a loop index, K as a second level loop index and J as a third level loop index? Could you just help, Peter?

Peter Jakacki · 2017-03-12 00:54

ErNa wrote: »

What is the idea of having I as a loop index, K as a second level loop index and J as a third level loop index? Could you just help, Peter?

This is all part of Forth being Forth as we are not assigning a variable for a loop index as we do in other languages perhaps. In Forth the loop index is normally sitting on the return stack along with the loop limit. The word to read that loop index is called "I", so it is a control word, not the name of a variable. In Tachyon however I avoid corrupting the return stack, a sure recipe for disaster in embedded real-time control systems, and so the index and limit values have a special loop stack. One advantage of this is that it becomes a very easy matter to index into this stack from any subroutine called from within the loop. The top loop stack item will always be the current "I", the third loop stack item will be the "I" for the next outside loop, and "J" for the one outside that and so on.

( 0004 $33B2 S08 )   5 0 DO I . SPACE LOOP
0 1 2 3 4 
( 0005 $33B2 S08 )   3 0 DO 5 0 DO I . SPACE J . 3 SPACES LOOP LOOP
0 0   1 0   2 0   3 0   4 0   0 1   1 1   2 1   3 1   4 1   0 2   1 2   2 2   3 2   4 2

So in Tachyon the loop stack permits this as well

( 0006 $33B2 S08 )   : MYSUB 5 0 DO I . SPACE J . 3 SPACES LOOP ;
( 0007 $33CA S08 )   3 0 DO MYSUB LOOP
0 0   1 0   2 0   3 0   4 0   0 1   1 1   2 1   3 1   4 1   0 2   1 2   2 2   3 2   4 2
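
Why the subroutine can still see the caller's index is easier to follow with a model of the separate loop stack. The Python sketch below is an illustration, not Tachyon's implementation: it keeps one [limit, index] pair per loop level, whereas the kernel stores limit and index as individual longs, and all function names are invented for the example.

```python
loop_stack = []   # separate from the data and return stacks
out = []          # collected "I J" pairs


def do(limit, start):
    """DO: push a new [limit, index] pair onto the loop stack."""
    loop_stack.append([limit, start])


def I():
    """I: index of the innermost loop."""
    return loop_stack[-1][1]


def J():
    """J: index of the next outer loop."""
    return loop_stack[-2][1]


def loop():
    """LOOP: bump the index; return True to repeat, else pop and stop."""
    loop_stack[-1][1] += 1
    if loop_stack[-1][0] > loop_stack[-1][1]:
        return True
    loop_stack.pop()
    return False


def mysub():
    """: MYSUB 5 0 DO I . SPACE J . 3 SPACES LOOP ;"""
    do(5, 0)
    while True:
        out.append(f"{I()} {J()}")
        if not loop():
            break


do(3, 0)          # 3 0 DO MYSUB LOOP
while True:
    mysub()
    if not loop():
        break
print("   ".join(out))
```

Because calling `mysub` does not touch `loop_stack`, `J` inside the subroutine still finds the caller's index one level down. On a shared return stack the return address would sit between the two loop frames.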

There's also another stack in Tachyon that DO LOOP and also FOR NEXT use just for fast branching addresses and that is another "Tachyonistic" way of doing things.
Forth Stacks: Data, Return
Tachyon Stacks: Data, Return, Loop, Branch.

' I ( -- index ) The current loop index is at a fixed address just under the loop limit at top of an ascending loop stack
I                       mov     X,loopstk+1
                        jmp     #PUSHX

LOOP    if_nc           mov     X,#1                    ' default loop increment of 1
                        add     loopstk+1,X             ' increment index
                        cmps    loopstk,loopstk+1 wz,wc
BRANCH  if_a            mov     IP,branchstk            ' Branch to the address that is saved in branch
        if_a            jmp     unext
                        jmpret  LPOPX_ret,forNEXT+1 wc  ' discard top of loop stack index
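
The LOOP primitive above can be paraphrased in Python. This is my reading of the listing, sketched under assumptions: the flat list layout (limit first, index second) mirrors `loopstk` and `loopstk+1`, the function name is invented, and what happens to the branch stack when the loop exits is not modelled.

```python
def loop_primitive(loopstk, branchstk, increment=1):
    """Return the address to branch to, or None once the loop is finished.

    loopstk[0] is the limit and loopstk[1] the index of the current loop.
    """
    loopstk[1] += increment          # add  loopstk+1,X
    if loopstk[0] > loopstk[1]:      # cmps loopstk,loopstk+1 ... if_a
        return branchstk[0]          # mov  IP,branchstk
    del loopstk[0:2]                 # discard this loop's limit and index
    return None


loopstk = [5, 0]                     # 5 0 DO ...
branchstk = [0x1234]                 # hypothetical address of the loop body
passes = 1                           # the body runs once before LOOP is reached
while loop_primitive(loopstk, branchstk) is not None:
    passes += 1
print(passes)                        # 5
```

The point of the branch stack shows up in the `return branchstk[0]` line: the loop-back target is already at hand, so nothing has to be computed or read from the code stream.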
