TACHYON O/S V3.0 JUNO - Furiously Fast Forth, FAT32+LAN+VGA+RS485+OBEX ROMS+FP+LMM+++

Peter Jakacki · 2012-07-18 10:06

richaj45 wrote: »

Hello:

So how is your coding doing 2M baud?
Why is it so much faster than the standard Full Duplex 4 port object?
Is the serial driver full duplex?
When do you think the whole of the Tachyon code will be posted?

Thanks in advance for your patience with all these questions.

cheers,
rich

There's nothing really difficult in getting the Prop to handle a couple of megabaud, but getting it to do so full-duplex, or worse still four full-duplex channels from the one cog, is a huge ask. I dedicate a cog just to handle the receive and buffering side, while the Tachyon VM cog bit-bashes the transmit side directly, as it's faster to do it that way. So effectively you have full-duplex without the timing problems that normally arise from getting a cog to jmpret here and there, which introduces jitter in both receive and transmit, as is the case with most implementations. I've looked at the timing for a lot of these implementations and while they are fine at low speed they are way out, asymmetrical, and jittery at high speeds.

The code I have on Google Docs is "live" all the time and will always be the most current version, as it is the one I work off. If I make a change and someone has the document open, their document will reflect that change immediately. However, to keep a few grumblers happy I have zipped up and attached the current version.

I am still a few days or so away from having a fully functional system running. At present I need to finish off the whole text interpreting thing with its number processing and compilation modes, and then I will look at integrating SD and VGA etc.

TACHYON.zip

EDIT: Now tested and working at 3M baud!

Peter Jakacki · 2012-07-18 16:55

Do you wonder what the memory map looks like for Tachyon? I made a quick one up in Tachyon's on-line document but here is what it looks like:

Out of this almost 4K bytes there is another 800 bytes or so that I could reclaim again so that really brings the memory footprint closer to 3kB. Since Tachyon bytecode is so efficient this might only go up another few k when I add in the rest of the high-level functions.

Peter Jakacki · 2012-07-21 01:34

Here's an update. I've had a busy week but still managed to get a lot more going. The current version runs at a default of 3M baud (I use Minicom on Linux) and includes break detection to reboot the Prop. The word and number parsers are working well, as are the formatted print functions. Numbers encountered in the terminal text input in Forth are normally converted in the current number base (decimal, hex, binary etc) but I also allow a very flexible format for these numbers as I like to use them this way.
Number prefixes include: $ for hex, % for binary, # for decimal, while suffixes can be h for hex, d for decimal, and b for binary. As long as it begins or ends in a digit or a valid symbol you can use any other symbol mixed between the digits, for instance:
#12:45 is decimal for 1245
%1100_0101 is binary for $C5
5_000_000d or #5,000,000 is what it looks like etc
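
The number rules above can be sketched roughly as follows, in Python rather than Forth. This is only an illustration: the name `parse_number` and the precedence (prefix first, then suffix, then the current base) are assumptions, not Tachyon's actual parser. In particular, applying the suffix rule only when the token starts with a digit is a guess at how something like 22d avoids being read as a hex number.

```python
# Hypothetical model of the flexible number format; not Tachyon source.
PREFIXES = {"$": 16, "%": 2, "#": 10}
SUFFIXES = {"h": 16, "d": 10, "b": 2}

def parse_number(token, base=16):
    """Convert a token such as '#12:45' or '5_000_000d' to an integer.

    base is the current number base, used when no prefix or suffix applies.
    """
    body = token
    if body and body[0] in PREFIXES:
        # A leading symbol forces the base.
        base, body = PREFIXES[body[0]], body[1:]
    elif len(body) > 1 and body[0].isdigit() and body[-1] in SUFFIXES:
        # A trailing lower-case letter forces the base (assumed rule).
        base, body = SUFFIXES[body[-1]], body[:-1]
    # Any other symbol mixed between the digits is treated as a separator.
    digits = "".join(ch for ch in body if ch.isalnum())
    if not digits:
        raise ValueError("no digits in " + repr(token))
    return int(digits, base)
```

With this sketch, `parse_number("#12:45")` gives 1245 and `parse_number("%1100_0101")` gives 0xC5, matching the examples above.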

BTW, a hex dump listing of the full 64K of memory takes 4 seconds, including interpreting the text "CNT@ 0 $10000 DUMP CNT@ SWAP - #80,000 / DECIMAL ." At 56 characters/line and 4,096 lines that equates to an average of about 17.44us per character, which also includes the 4us or so to send the character! I know that Spin would never ever keep up with that!
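
For anyone who wants to check the arithmetic, here is a small Python sketch of the calculation. The 80 MHz clock is inferred from the #80,000 divisor that turns CNT ticks into milliseconds, and the 10 bits per character (8N1 framing) is an assumption.

```python
# Back-of-envelope check of the dump timing; figures are from the post,
# the clock rate and 8N1 framing are assumptions.
CLOCK_HZ = 80_000_000
ticks_per_ms = CLOCK_HZ // 1000        # the "#80,000 /" divisor

chars = 56 * 4096                      # characters in the 64K dump listing
elapsed_s = 4.0                        # reported run time
us_per_char = elapsed_s / chars * 1e6  # average time per character

bit_time_us = 1e6 / 3_000_000          # one bit at 3M baud
char_time_us = 10 * bit_time_us        # start bit + 8 data bits + stop bit

print(f"{chars} chars, {us_per_char:.2f} us each, "
      f"{char_time_us:.2f} us of that on the wire")
```

So on these assumptions roughly 3.3us of each character is pure transmission time, and the rest is the interpreter fetching and formatting the data.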

The next step is to implement compilation mode with its associated dictionary handlers. Hopefully I will have that done this weekend.

prof_braino · 2012-07-21 09:24

3M baud is crazy fast. Didn't somebody say 9600 should be fast enough for anybody?

I didn't check, are you doing any error checking? Sal said something about a limit at which point it starts to get unstable, have you noticed anything like that?

Peter Jakacki · 2012-07-21 17:14

prof_braino wrote: »

3M baud is crazy fast. Didn't somebody say 9600 should be fast enough for anybody?

I didn't check, are you doing any error checking? Sal said something about a limit at which point it starts to get unstable, have you noticed anything like that?

At 3M baud the terminal screen updates blindingly fast and unless it's a long listing all it does is blink a new screen into existence! Since a cog is devoted to receive only, it is not handicapped with trying to maintain buffering and timing for the transmitter as well. To do this in Spin would require three cogs: one for Spin, one for receive, one for transmit. However in Tachyon I can transmit straight from Tachyon's cog and leave the other cog to handle the receive. I can probably go faster still, but 3M works reliably.

BTW, I have updated the top post with a zipped attachment of the latest files which I will keep up-to-date with the latest "release" version whereas the on-line document will always be the live and bleeding edge.

Peter Jakacki · 2012-07-22 22:44

Progress report....

The VGA_512x384_Bitmap object has now been incorporated into the source as there is sufficient memory available despite the fact that this object requires 24KB just for the bitmap. Being able to include objects such as these was one of the requirements of Tachyon which is also why it had to have a very small footprint despite the need for speed. Access to the object at present is via the color and pixel pointers but at a later stage I will write a Graphics demo object in Forth to showcase Tachyon.

I mentioned in the thread that Tachyon does not follow the traditional method of handling text input, which normally would accumulate into a text input buffer (TIB) and after an <enter> would be processed word by word in an interpretive fashion. One of the drawbacks of this is that branch and loop structures, which are meant to be compiled, cannot be entered in an interpretive fashion and you just end up with the "compile only" warning. What I've been able to do is to skip the TIB and just parse words as they are encountered in the input stream and compile these immediately into the current code compilation location. When an <enter> is encountered the bytecodes are then executed, which means that it is now possible to type in interactively such statements as:
20 1 DO CR I 0 DO 2A EMIT LOOP LOOP<enter>
without having to resort to creating a special word, invoking it, and then forgetting (deleting) it. The interactive statements also behave exactly like they would inside a definition and at the same speed. Of course words such as IF and THEN etc are tagged as immediate words and are not compiled but instead execute immediately and control the compilation.
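
The compile-as-you-go idea can be modelled in a few lines of Python. This is a toy sketch, not Tachyon's implementation: the names `compile_line` and `execute`, the tuple "bytecodes", and the tiny word set are all made up for illustration. What it does show is DO and LOOP acting as immediate words at compile time, with the whole line only being run once it is complete.

```python
# Toy model: compile each word as it arrives, execute on <enter>.
PRIMS = {"I", "EMIT", "CR"}    # hypothetical minimal word set

def compile_line(line, base=16):
    """Compile one input line, word by word, into (op, arg) pairs."""
    code, ctrl = [], []        # compiled code, compile-time control stack
    for word in line.split():
        if word == "DO":       # immediate: compile it and note the loop start
            code.append(("do", None))
            ctrl.append(len(code))
        elif word == "LOOP":   # immediate: compile a branch back to the start
            code.append(("loop", ctrl.pop()))
        elif word in PRIMS:
            code.append(("prim", word))
        else:                  # anything else is a number in the current base
            code.append(("lit", int(word, base)))
    return code

def execute(code):
    """Run compiled code; returns everything that was emitted as a string."""
    data, loops, out = [], [], []
    pc = 0
    while pc < len(code):
        op, arg = code[pc]
        pc += 1
        if op == "lit":
            data.append(arg)
        elif op == "do":       # ( limit start -- ) moved to the loop stack
            start = data.pop()
            limit = data.pop()
            loops.append([start, limit])
        elif op == "loop":
            loops[-1][0] += 1
            if loops[-1][0] < loops[-1][1]:
                pc = arg       # branch back to the top of the loop body
            else:
                loops.pop()
        elif arg == "I":
            data.append(loops[-1][0])
        elif arg == "EMIT":
            out.append(chr(data.pop()))
        elif arg == "CR":
            out.append("\n")
    return "".join(out)

print(execute(compile_line("4 1 DO CR I 0 DO 2A EMIT LOOP LOOP")))
```

Running it prints a small triangle of asterisks, one row per pass of the outer loop, without any named definition ever being created.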

My next step which I am presently on is allowing new definitions to be created to which end I am testing the dictionary and vector table handlers. This stage of the testing will probably be completed in the next 24 hours.

Has anyone tried loading Tachyon onto a board yet? Please let me know if you had any problems. The same goes for viewing the on-line Google document source. BTW, I am using one of my Pixie boards at the moment so I can play with the VGA graphics especially when the Tachyon kernel is developed enough (soon) that I can just communicate to it via Bluetooth or ZigBee and interactively develop my applications.

Peter Jakacki · 2012-07-23 01:34

Here's a simple benchmark for Spin vs Tachyon in terms of raw HUB memory access. I increment each byte of the 24K of bitmap memory which of course involves a hub read and a hub write as well as looping. So both fragments of code are limited by the hub access time and shouldn't really perform any differently as hub access is like wading through molasses. However the Spin loop takes 368ms to execute vs the 137ms it takes Tachyon to do the same. I will look at some other routines later and compare the differences.

' Spin: Increment 24kB of video memory byte by byte - takes 368ms with 50 bytes of code

repeat
    i := Pixels
    x := CNT
    repeat $6000
      byte[i++] += 1
    y := CNT
    x := (y-x)/80000
    coms.tx($0D)
    coms.tx($0A)
    coms.dec(x)

' Tachyon: Increment 24kB of video memory byte by byte - takes 137ms with 32 bytes of code

100 0 DO CNT@ $8000 $2000 DO 1 I C+! LOOP CNT@ SWAP - 80,000 / CR U. LOOP

Dave Hein · 2012-07-23 06:19

If I replace "byte[i++] += 1" with "byte[i++]++" in Spin I get 294 msec. I get 132 msec in TrimSpin.

prof_braino · 2012-07-23 11:56

Dave Hein wrote: »

If I replace "byte[i++] += 1" with "byte[i++]++" in Spin I get 294 msec. I get 132 msec in TrimSpin.

As I understand it, TrimSpin is a subset of Spin. What kind of applications do you use it for?

For example, forth is often used for cases where we want drivers maybe eventually in assembler, but don't want to write the whole program in assembler. And/or we want to develop and test interactively.

Can you do everything spin can do using your subset? This is a cool idea.

Peter Jakacki · 2012-07-23 15:48

Dave Hein wrote: »

If I replace "byte[i++] += 1" with "byte[i++]++" in Spin I get 294 msec. I get 132 msec in TrimSpin.

Well, if it comes down to optimizations then if I replace "1 I C+!" with "I C++" then I get 88ms in Tachyon

Phil Pilgrim (PhiPi) · 2012-07-23 16:14

If Tachyon is capable of handling the full repertoire of Spin commands -- either natively or with a combination of native words -- I wonder if it might be a better target virtual machine for the Spin compiler than the current p-code interpreter.

-Phil

Dave Hein · 2012-07-23 16:41

prof_braino wrote: »

As I understand it, TrimSpin is a subset of Spin. What kind of applications do you use it for?
Can you do everything spin can do using your subset?

The current approach on TrimSpin is to write a Spin interpreter that implements all of the instructions, but runs about twice as fast. However, because this doesn't fit in a cog, the interpreter is trimmed down to only the bytecodes that are used by a particular program. This is done by profiling the bytecodes in an existing binary file, and generating an interpreter that matches it.

I'm not sure which applications would use it other than those that need to run just a little bit faster.

Peter, the 88 msec time is very impressive.

Peter Jakacki · 2012-07-24 22:33

Dave Hein wrote: »

The current approach on TrimSpin is to write a Spin interpreter that implements all of the instructions, but runs about twice as fast. However, because this doesn't fit in a cog, the interpreter is trimmed down to only the bytecodes that are used by a particular program. This is done by profiling the bytecodes in an existing binary file, and generating an interpreter that matches it.

I'm not sure which applications would use it other than those that need to run just a little bit faster.

Peter, the 88 msec time is very impressive.

I think you are on the right track with having your own Spin interpreter and trimming the way you do. I assume that what can't fit in the cog but is needed is then implemented at a higher-level?

As I develop and test Tachyon further I keep optimizing the kernel. I just improved the erase and fill operations simply by creating a helper word with two PASM instructions so that I can now fill or erase the 24kB of memory in 34ms.

: CLS        pixels W@ $6000 ERASE ;

In the kernel I also had code for some fast single bytecode constants using mov X,#n and jmp #PUSHX instructions but now I have reduced this to one instruction per "entry" so that constants 0...8 take but 10 instructions total.

As of this moment the runtime compiler seems to be working fine although there are a lot of little things that need to be tidied up. Here is a little test code that I use to list the words in the dictionary. BTW, I have a whitespace after each line except the last to prevent execution of the line of code which normally is executed on a <CR>. I will tidy up little things like this in the process of testing and developing.

: NFA>CFA  BEGIN C@++ $7F > UNTIL ;
: WORDS      
    names W@ 
    BEGIN 
    CR DUP .W 3A EMIT BL EMIT 
    DUP NFA>CFA 1- DUP 3 ADO I C@ .B BL EMIT LOOP 
    3 + 
    SWAP .STR BL EMIT 
    DUP C@ 0= 
    UNTIL 
    DROP CR 
    ;

\ we will now execute WORDS
_WORDS ok

13C2: 8A C1 53 WORDS 
13CA: 8A C1 52 NFA>CFA 
13D4: 80 00 06 RESET 
13DC: 80 01 06 ?EXIT 
13E4: 80 04 06 0EXIT 
13EC: 80 06 06 EXIT 
13F3: 80 07 06 NOP 
13F9: 80 08 06 3DROP 
1401: 80 09 06 2DROP 
1409: 80 0A 06 DROP 
1410: 80 0C 06 ?DUP 
1417: 80 0D 06 DUP 
141D: 80 0F 06 2DUP 
1424: 80 11 06 OVER 
142B: 80 13 06 3RD 
1431: 80 15 06 4TH 
1437: 80 19 06 SWAP 
143E: 80 1C 06 ROT 
1444: 80 17 06 NIP 
144A: 80 25 06 1+ 
144F: 80 27 06 1- 
1454: 80 21 06 + 
1458: 80 23 06 - 
145C: 80 2E 06 * 
1460: 80 2B 06 / 
1464: 80 29 06 U/MOD 
146C: 80 30 06 UM* 
1472: 80 32 06 NEGATE 
147B: 80 36 06 INVERT 
1484: 80 39 06 AND 
148A: 80 3B 06 OR 
148F: 80 3D 06 XOR 
1495: 80 3F 06 SHR 
149B: 80 41 06 SHL 
14A1: 80 43 06 2/ 
14A6: 80 45 06 2* 
14AB: 80 47 06 REV 
14B1: 80 49 06 MASK 
14B8: 80 4D 06 0= 
14BD: 80 51 06 = 
14C1: 80 55 06 > 
14C5: 80 5A 06 C@ 
14CA: 80 5C 06 W@ 
14CF: 80 5E 06 @ 
14D3: 80 60 06 C+! 
14D9: 80 62 06 C! 
14DE: 80 64 06 C@++ 
14E5: 80 68 06 W+! 
14EB: 80 6A 06 W! 
14F0: 80 6C 06 +! 
14F5: 80 6E 06 ! 
14F9: 80 70 06 CMOVE 
1501: 80 71 06 SET 
1507: 80 73 06 CLR 
150D: 80 75 06 BIT? 
1514: 80 7A 06 C++ 
151A: 80 7E 06 IC! 
1520: 80 80 06 PUSH4 
1528: 80 81 06 PUSH3 
1530: 80 82 06 PUSH2 
1538: 80 83 06 PUSH1 
1540: 80 86 06 LVAR 
1547: 80 8B 06 BVAR 
154E: 80 95 06 FALSE 
1556: 80 95 06 OFF 
155C: 80 95 06 0 
1560: 80 94 06 1 
1564: 80 93 06 2 
1568: 80 92 06 3 
156C: 80 91 06 4 
1570: 80 90 06 5 
1574: 80 8F 06 6 
1578: 80 8E 06 7 
157C: 80 8D 06 8 
1580: 80 97 06 0FF 
1586: 80 99 06 ON 
158B: 80 99 06 TRUE 
1592: 80 99 06 -1 
1597: 80 9B 06 BL 
159C: 80 BC 06 LSTACK 
15A5: 80 F3 06 EMIT 
15AC: 80 9D 06 P@ 
15B1: 80 9F 06 P! 
15B6: 80 A1 06 OUTSET 
15BF: 80 A3 06 OUTCLR 
15C8: 80 A4 06 OUTPUTS 
15D2: 80 A6 06 INPUTS 
15DB: 80 A8 06 PX 
15E0: 80 AB 06 CLKBIT 
15E9: 80 AD 06 CLKBITS 
15F3: 80 B6 06 REBOOT 
15FC: 80 AE 06 COG@ 
1603: 80 B2 06 COG! 
160A: 80 B8 06 HUBOP 
1612: 80 BA 06 STACKS 
161B: 80 BC 06 LSTACK 
1624: 80 BE 06 CALL 
162B: 80 C1 06 (XCALL) 
1635: 80 C6 06 (RCALL) 
163F: 80 C8 06 (WCALL) 
1649: 80 CA 06 (CMPJEQ) 
1654: 80 CE 06 (ELSE) 
165D: 80 D0 06 (IF) 
1664: 80 D2 06 (UNTIL) 
166E: 80 D4 06 (AGAIN) 
1678: 80 D6 06 (ADO) 
1680: 80 D9 06 (DO) 
1687: 80 DA 06 (LOOP) 
1690: 80 DC 06 (+LOOP) 
169A: 80 E4 06 (FOR) 
16A2: 80 E6 06 (NEXT) 
16AB: 80 E8 06 >R 
16B0: 80 EB 06 R> 
16B5: 80 ED 06 >L 
16BA: 80 EF 06 L> 
16BF: 80 F1 06 I 
16C3: 80 FF 06 BKEY 
16CA: 80 F3 06 EMIT 
16D1: 80 F6 06 (REG) 
16D9: 80 F8 06 @REG 
16E0: 80 FA 06 WAIT 
16E7: 80 FD 06 (PTRSTR) 
16F2: 80 FE 06 (CMPSTR) 
16FD: 82 C1 19 STOP 
1704: 82 C1 1A COGID 
170C: 82 C1 02 0<> 
1712: 82 C1 01 <> 
1717: 82 C1 05 0> 
171C: 82 C1 04 0< 
1721: 82 C1 03 < 
1725: 82 C1 06 U< 
172A: 82 C1 07 WITHIN 
1733: 82 C1 0A LEAVE 
173B: 82 C1 0B J 
173F: 82 C1 0C K 
1743: 82 C1 0D IX 
1748: 82 C1 2F ERASE 
1750: 82 C1 2E FILL 
1757: 82 C1 08 ms 
175C: 82 C1 09 CNT@ 
1763: 82 C1 23 KEY 
1769: A2 C1 0F \ 
176D: 82 C1 22 HEX 
1773: 82 C1 21 DECIMAL 
177D: 82 C1 20 BINARY 
1786: 82 F6 00 REG 
178C: 82 F6 3D BASE 
1793: 82 F6 2C DIGITS 
179C: 82 F6 36 HERE 
17A3: 82 F6 3A DELIM 
17AB: 82 C1 14 .S 
17B0: 82 C1 17 DUMP 
17B7: 82 C1 18 COGDUMP 
17C1: 82 C1 15 .STACKS 
17CB: 82 C1 16 DEBUG 
17D3: 82 C1 1B CLS 
17D9: 82 C1 1C SPACE 
17E1: 82 C1 1D BELL 
17E8: 82 C1 1E CR 
17ED: 82 C1 10 .H 
17F2: 82 C1 11 .B 
17F7: 82 C1 12 .W 
17FC: 82 C1 13 .L 
1801: 82 C1 3E . 
1805: 82 C1 31 >DIGIT 
180E: 82 C1 32 NUMBER 
1817: 82 C1 29 GETWORD 
1821: 82 C1 2B FINDSTR 
182B: 82 20 C1 ' 
182F: 82 C1 2D *VER 
1836: 82 C1 34 @PAD 
183D: 82 C1 35 +PAD 
1844: 82 C1 36 >CHAR 
184C: 82 C1 37 #> 
1851: 82 C1 38 <# 
1856: 82 C1 39 # 
185A: 82 C1 3A #S 
185F: 82 C1 3B .STR 
1866: 82 C1 3C STRLEN 
186F: 82 C1 3D U. 
1874: 82 F6 1C colors 
187D: 82 F6 20 pixels 
1886: 82 F6 32 names 
188E: 82 F6 36 here 
1895: 82 F6 38 codes 
189D: A2 C1 42 FOR 
18A3: A2 C1 43 NEXT 
18AA: A2 C1 44 DO 
18AF: A2 C1 45 ADO 
18B5: A2 C1 46 LOOP 
18BC: A2 C1 47 +LOOP 
18C4: A2 C1 48 IF 
18C9: A2 C1 49 ELSE 
18D0: A2 C1 4A THEN 
18D7: A2 C1 4B BEGIN 
18DF: A2 C1 4C UNTIL 
18E7: A2 C1 4D AGAIN 
18EF: A2 C1 4E : 
18F3: A2 C1 4F ; 
18F7: A2 C1 50 CREATE 
1900: 82 C1 51 AddXcode
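
Reading the WORDS code and the listing together, each dictionary entry appears to be the name characters (all below $80) followed directly by three bytes, the first of which has its high bit set, with a zero byte marking the end of the list. That layout is inferred from the post rather than documented, so treat the Python sketch below as an assumption-laden model; `walk_dictionary` and its `origin` parameter are hypothetical names.

```python
# Model of walking the name dictionary, under the inferred entry layout:
#   name bytes (< $80) | 3 bytes, first one > $7F | next entry ... | 0
def walk_dictionary(mem, origin=0):
    """Return (address, three_bytes, name) for each entry in mem.

    origin is the hub address that mem[0] corresponds to.
    """
    entries = []
    addr = 0
    while mem[addr] != 0:                  # DUP C@ 0= UNTIL
        end = addr
        while mem[end] <= 0x7F:            # NFA>CFA scans for a byte > $7F
            end += 1
        name = bytes(mem[addr:end]).decode("ascii")
        entries.append((origin + addr, bytes(mem[end:end + 3]), name))
        addr = end + 3                     # skip the three bytes ("3 +")
    return entries

# The first two entries of the listing above, rebuilt by hand.
image = (b"WORDS" + bytes([0x8A, 0xC1, 0x53]) +
         b"NFA>CFA" + bytes([0x8A, 0xC1, 0x52]) + b"\x00")
for address, attrs, name in walk_dictionary(image, origin=0x13C2):
    print(f"{address:04X}: {attrs.hex(' ').upper()} {name}")
```

The addresses it prints for the two entries, 13C2 and 13CA, agree with the listing, which is some evidence the inferred layout is right.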

Peter Jakacki · 2012-07-25 21:51

I have updated the top post to include a demo board binary and Forth source test samples in the zip attachment. So now it's easy just to load in a standard binary (set for 57.6K baud) and hookup a VGA monitor.

Here's that attachment anyway: TACHYON.zip

jmg · 2012-07-25 23:44

If you were to take an example from here, for all Boolean variables (aka pins)
http://www.automation-course.com/branching-in-il/
which links from
http://en.wikipedia.org/wiki/Instruction_list

shown below, and coded that in TACHYON, what would the result look like ?
It seems the braces in the Instruction List carry an implied stack ?

; Instruction list coded
   AND(
       OR  I0.1
       OR  I0.2
      )
   AND(
       OR I0.5
       OR I0.6
       )
   AND I2.0
   AND I2.5
   = Q2.3

; Algebraic coded this is 
 Q2.3 =  I2.0 AND I2.5 AND (I0.1 OR  I0.2) AND (I0.5 OR I0.6)

This link  http://users.isr.ist.utl.pt/~pjcro/courses/api0809/docs/API_I_C3_IL.pdf
has a slightly different style, and I think this is also equivalent to the above 

   AND(
       LD  I0.1
       OR  I0.2
      )
   AND(
       LD I0.5
       OR I0.6
       )
   AND I2.0
   AND I2.5
   ST  Q2.3

Peter Jakacki · 2012-07-26 00:32

jmg wrote: »

If you were to take an example from here, for all Boolean variables (aka pins)
http://www.automation-course.com/branching-in-il/
which links from
http://en.wikipedia.org/wiki/Instruction_list

shown below, and coded that in TACHYON, what would the result look like ?
It seems the braces in the Instruction List carry an implied stack ?

The thing with Forth is that you can tailor the way it compiles to suit your style or application. In this case I have added or modified a couple of words which would eventually be part of the kernel anyway. Following this I set the number base to octal to allow the port numbering system to be used where the last digit I assume is from 0..7, so it's octal. So if I interpreted the instructions correctly they could look like this example where I have spread the statement over a few lines and named it as well.

STL doesn't look like it's using a stack, from the programmer's perspective at least. Internally it may, but with a lot of these compilers the syntax is very rigid.

\ Let's define a couple of general-purpose extensions 
: IN  ( bit -- ) MASK P@ AND 0<> ;
: OUT    ( state pin -- )  px ;

\ change number base to octal to enhance readability of port numbering
8 BASE C!  

\ Now code the example
: PLC_DEMO
  0.1 IN 0.2 IN OR
  0.5 IN 0.6 IN OR
  AND
  2.0 IN AND 2.5 IN AND 
  2.3 OUT
;
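
To see that the postfix form computes the same thing as the algebraic form, here is a Python sketch that mimics PLC_DEMO with an explicit stack. The helper names (`pin`, `plc_demo`, `algebraic`) are made up for the example, and it assumes, as the octal trick suggests, that the "." is just a separator so that 2.5 is read as octal 25, i.e. pin 21.

```python
# Stack-based model of PLC_DEMO; helper names are hypothetical.
from itertools import product

def pin(label):
    """'2.5' -> 21: the '.' is ignored and the digits are read as octal."""
    return int(label.replace(".", ""), 8)

def plc_demo(high_pins):
    """Evaluate the PLC_DEMO word; high_pins is the set of pins reading 1."""
    stack = []
    def IN(label):
        stack.append(pin(label) in high_pins)
    def OR():
        b, a = stack.pop(), stack.pop()
        stack.append(a or b)
    def AND():
        b, a = stack.pop(), stack.pop()
        stack.append(a and b)
    IN("0.1"); IN("0.2"); OR()
    IN("0.5"); IN("0.6"); OR()
    AND()
    IN("2.0"); AND(); IN("2.5"); AND()
    return stack.pop()          # the value that "2.3 OUT" would write

def algebraic(i01, i02, i05, i06, i20, i25):
    """Q2.3 = I2.0 AND I2.5 AND (I0.1 OR I0.2) AND (I0.5 OR I0.6)"""
    return i20 and i25 and (i01 or i02) and (i05 or i06)

labels = ["0.1", "0.2", "0.5", "0.6", "2.0", "2.5"]
mismatches = 0
for bits in product([False, True], repeat=6):
    high = {pin(l) for l, b in zip(labels, bits) if b}
    if plc_demo(high) != algebraic(*bits):
        mismatches += 1
print("mismatches:", mismatches)
```

The loop tries all 64 input combinations, so it is an exhaustive check of the equivalence under the assumptions above.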

jmg · 2012-07-26 14:40

Thanks.

Peter Jakacki wrote: »

STL doesn't look like it's using a stack, from the programmer's perspective at least. Internally it may, but with a lot of these compilers the syntax is very rigid.

I did find some more info that mentions stack here,

http://claymore.engineer.gvsu.edu/~jackh/books/plcs/chapters/plc_il.pdf

An important concept in this programming language is the stack. (Note: if you use a calculator
with RPN you are already familiar with this.)

and I think they mean the brackets are an inferred/hidden stack, not quite the explicit stack of forth.

Peter Jakacki · 2012-07-27 01:43

The current version appears to be working well although I still have to build in a SAVE feature for backing-up downloaded user code for full auto-restore and run. But that will come soon. For now I have uploaded to youtube a short video where I load in some Forth code into a Prop board running Tachyon and show how well this bytecode handles bitmapped VGA without resorting to a GPU cog.

For reference this is the code I downloaded:

 
: START   CNT@ $10 @REG ! ;
: LAP       CNT@ $10 @REG @ - #80,000 / ;

\ **************************** VGA FUNCTIONS ********************

: CLRSCN        pixels W@ $6000 ERASE ;

: COLORS    colors W@ W! colors W@ DUP 2+ #382 CMOVE ;

$FF04 COLORS
{                                   
: PLOT ( x y -- ) 
    6 SHL OVER 3 SHR +  
    SWAP 7 AND MASK SWAP pixels W@ + SET 
    ;
}

: HLINE  ( x y length -- ) 
    ROT SWAP ADO I OVER PLOT LOOP DROP 
    ;
    
: VLINE  ( x y length -- ) 
    ADO DUP I PLOT LOOP DROP 
    ;

: ITEM    2* 2* @REG ;

: ITEMS  ( items -- ) 0 DO I ITEM ! LOOP ;

: RECT ( x1 y1 xlen ylen -- ) 
    4 ITEMS 
    3 ITEM @ 2 ITEM @ 1 ITEM @ HLINE 
    3 ITEM @ 1 ITEM @ + 2 ITEM @ 0 ITEM @ VLINE 
    3 ITEM @ 2 ITEM @ 0 ITEM @ VLINE 
    3 ITEM @ 2 ITEM @ 0 ITEM @ + 1 ITEM @ HLINE 
    ;

: BOXES
    CLRSCN 
    #200 0 DO I I 50 50 RECT 4 +LOOP 
    $C0 $C0 $30 FOR 2DUP $80 $40 RECT SWAP 4 + SWAP 4 - NEXT 2DROP 
;


: SLANTS  180 0 DO 100 0 DO I J + I PLOT LOOP 4 +LOOP ;

\ : RSLANTS  180 0 DO 100 0 DO I 180 J - + 100 I - PLOT LOOP 4  +LOOP ;



: X   8 @REG ;
: Y #10 @REG ;
: VCR    0 X W! ;
: VLF    #34 Y W+! ;
: HOME        VCR 0 Y W! ;
HOME
: CHAR ( ch -- ) 
       DUP 2/ 7 SHL $8000 + 
       ( ch addr ) 
       20 0 DO DUP @ 3RD 1 AND IF 2/ THEN 
       10 0 DO DUP 1 AND IF X W@ I + Y W@ J + PLOT THEN 2/ 2/ LOOP 
       DROP 4 +  
       LOOP 2DROP #18 X W+! X W@ #500 > IF VCR VLF THEN ;

: CTRL
    DUP $0D = IF VCR DROP EXIT THEN
    DUP $0A = IF VLF DROP EXIT THEN
    DUP $0C = IF HOME CLRSCN DROP EXIT THEN
    DUP $01 = IF HOME DROP EXIT THEN
    DUP $1B = IF DROP R> DROP EXIT THEN
    CHAR
    ;
: VEMIT  DUP 20 < IF CTRL ELSE CHAR THEN ;


: DEMO
    CLRSCN HOME
    BEGIN KEY VEMIT AGAIN
    ;
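
The address arithmetic in the PLOT word of the listing above can be followed more easily in a Python sketch. It assumes the 512x384 one-bit-per-pixel layout mentioned earlier in the thread (64 bytes per row, hence the "6 SHL") and that MASK turns a bit number into a single-bit mask; the function name `plot` and the bytearray standing in for video memory are just for illustration.

```python
# Model of PLOT ( x y -- ): offset = y*64 + x/8, bit = x mod 8.
BYTES_PER_ROW = 512 // 8       # 64 bytes per scan line, i.e. "6 SHL"
ROWS = 384

def plot(bitmap, x, y):
    """Set pixel (x, y) in a 1-bit-per-pixel bitmap; returns (offset, mask)."""
    offset = (y << 6) + (x >> 3)   # 6 SHL OVER 3 SHR +
    mask = 1 << (x & 7)            # 7 AND MASK
    bitmap[offset] |= mask         # pixels W@ + SET
    return offset, mask

bitmap = bytearray(BYTES_PER_ROW * ROWS)   # $6000 bytes, all pixels off
print(plot(bitmap, 9, 2))
```

The bottom-right pixel (511, 383) lands in the last byte of the $6000-byte bitmap, which is a quick sanity check on the row size.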

Christof Eb. · 2012-07-27 06:22

Hi Peter,
I am very impressed by your direction in implementing Forth. I like the byte code which is directly the address. Simple and fast, wow!
I have played just a little bit with tachyon. I can switch on a LED at port P7: 7 MASK OUTSET. I have not been able to do this with the LED on Port P23 of the demoboard: "23 MASK OUTSET".
I cannot see a number with two digits (22) on the stack using ^d. But if I use "." I can print it.

What is your idea of the system, when will it be complete for you? Will it be a self- hosted system with SD-card and keyboard support?
I understand that the stack in cog RAM is key to a very fast implementation, but I am a little bit frightened about what can be done with such a short stack.

Christof

mindrobots · 2012-07-27 06:32

Wicked Fast, Peter!

I like the loop compiling from the input stream - very useful for testing/playing!

Peter Jakacki · 2012-07-27 06:55

Hi Christof,
Thanks, I'm trying to keep it simple, both in terms of the kernel source and one step compilation plus in terms of just using it. Yes, I will be implementing full SD and keyboard drivers etc, either in Forth itself perhaps running in another cog or a stripped down version from the OBEX. The default number base is hex so that may be the problem then? I will update the binary to the latest version too. You can always force the number to decimal either as #23 or 23d and the same goes for forcing hex or binary as well as ASCII and control characters as in "#" and ^D.

Yes, the stack "seems" a little small but stack abuse is rife in a lot of Forth code and should really be reined in. In fact the return stack gets misused by many programs because they have no way of manipulating data easily on the data stack and when the return stack is not restored properly, then whamo, crash ..... not good. So you see I have the loop stack just for loops or even for temporary values and they can even be addressed easily too using the I,IX, and J type words etc. The other thing that helps is the use of the fixed register area which is used internally and also to handle four or more parameters. Eventually I will have local variables created from the stack comment ( -- ) and saved in these registers. Manipulating and juggling values on the data-stack is fine when there are only 2 or 3 values but over that it becomes annoying. Anyway my original plan was to try and keep it small and tidy on the stack but allow for a mechanism which could automatically come into action when the stack overflowed and start saving these values to hub memory but only when necessary which is obviously required to make the system robust (but slower during these exceptions).

Peter Jakacki · 2012-07-27 07:12

mindrobots wrote: »

Wicked Fast, Peter!

I like the loop compiling from the input stream - very useful for testing/playing!

Thanks Rick, that's what always used to annoy me with every Forth I've ever played with (including my own) in that you had to create a definition to run a quick command just because it contained looping and/or branching etc. So compiling word by word has advantages and when it is executed on an <enter> it runs at the same speed and the exact same way it would when it's built into a definition. The VGA stuff I'm doing is really just to provide a test environment for Tachyon at present but it looks like it would be good to continue with this stuff and turn it into a development system on a chip rather than having to rely on PCs or other platforms. When Prop II comes along I will be in a good position to port this stuff over fairly easily I think.

BTW, when I did some testing and tweaking of my serial receive driver the other day I found out why it wouldn't work above 3M baud. It turns out that Minicom is quite happy to issue the 3.5M and 4M baud commands to the Linux drivers but the FTDI chip just reverts to a slow speed instead. From the timing it looks like I could go faster again if only the PC could keep up!

Christof Eb. · 2012-07-27 08:01

Hi Peter,
great if this will be a full self hosting system!

_DECIMAL 22 MASK OUTSET ok
does not switch on the LED.

I downloaded the new version, but I still cannot see a 22d on the stack. Only after a DUP are there two entries of 16 hex. See the terminal protocol below.
Perhaps you want to have a look at it, although I don't want to break your actual work.

Christof

Propeller .:.:--TACHYON--:.:. Forth V1.0 rev120727.1800
_22d ok

DATA STACK
01CE: 000001CE A55AA55A
01D0: 00000000 00000000 00000000 00000000
01D4: 00000000 00000000 00000000 00000000
01D8: 00000000 00000000
RETURN STACK
01DA: 00001007 00000CBB
01DC: 00000C60 00000ABC 00000AA2 00000A44
01E0: 000009DF 000009DF 000009D7 00000000
01E4: 00000000 00000000 00000000 00000000
LOOP STACK
01E8: 00000000 00000000 00000000 00000000
01EC: 00000000 00000000 00000000 00000000
REGISTERS
093C: 00 00 F0 01 00 00 00 00 00 00 00 00 00 00 00 00 ................
094C: 00 00 00 00 00 00 00 00 00 00 00 00 24 00 00 00 ............$...
095C: 00 20 00 00 16 00 00 00 00 00 00 00 02 00 00 00 . ..............
096C: BC 09 00 00 00 00 E0 14 00 00 A4 01 A6 01 20 0D .............. .
097C: 01 10 10 32 64 00 00 00 00 00 00 00 00 00 00 00 ...2d...........
098C: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
099C: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
09AC: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

COMPILATION AREA
01A4: 81 16 7C 5C 76 07 FF 5C 06 96 7F E8 0E 01 7C 5C ..|\v..\......|\
01B4: 76 07 FF 5C 0E 97 7F E8 AD 5F FF 5C 0E 01 7C 5C v..\....._.\..|\
01C4: 76 07 FF 5C 76 07 FF 5C 76 07 FF 5C 0E 01 7C 5C v..\v..\v..\..|\
01D4: 0E 9D 7F EC CE 97 BF A0 33 01 7C 5C CF 97 BF A0 ........3.|\....
DUP ok
_
DATA STACK
01CE: 000001CE 00000016
01D0: 00000016 A55AA55A 00000000 00000000
01D4: 00000000 00000000 00000000 00000000
01D8: 00000000 00000000

Peter Jakacki · 2012-07-27 08:30

Ah, I have pins 16..23 setup for VGA, that would explain why you can't set the output. As for the 22 not appearing on the stack the problem could be with the earlier version's whitespace "feature" in that it would compile but not execute a line if it had a trailing whitespace. Try it without a space after it and just hit <enter> or else download the latest binary at the top post as I have just updated it before. I tried it just now:

Propeller .:.:--TACHYON--:.:. Forth V1.0 rev120727.1800 
_ ok
_.S ok

STACK: 00000000 00000000 00000000 A55AA55A _ ok
_22d ok
.S ok

STACK: 00000000 00000000 A55AA55A 00000016 _

Here's a new video that shows a bit more detail that you should find interesting

Christof Eb. · 2012-07-27 08:56

Hi, Peter,
as the cog outputs are "ored", this is not a problem of the hardware.
I had already tried rev120727.1800.
_22d MASK OUTSET ok works
_DECIMAL 22 MASK OUTSET ok does not work.

_DECIMAL ok
_22 MASK OUTSET ok does work

Christof

Peter Jakacki · 2012-07-27 09:06

Yes, the DECIMAL is compiled but not executed until you hit enter, however the numbers are converted before that happens. The best way is to either set the base beforehand as you have done or else explicitly force the number to decimal, either as #22 or 22d, so that there is no confusion. I just checked and the OUTA and DIRA registers are being set correctly, although I have VGA hooked up to these pins so I haven't connected an LED.

_22d MASK OUTSET ok
_1F4 COG@ . ok
40400000 _ ok
_1F6 COG@ . ok
40400000 _ ok
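
The DECIMAL behaviour explained above, where numbers are converted as they are parsed but DECIMAL only takes effect when the line runs, can be reproduced in a small Python model. The names `run_line` and `BASES`, and the dictionary holding the state, are hypothetical; only the ordering is the point.

```python
# Model of "convert numbers at parse time, change base at run time".
BASES = {"DECIMAL": 10, "HEX": 16, "BINARY": 2}

def run_line(line, state):
    """Compile a whole line first, then execute it, as on <enter>."""
    code = []
    for word in line.split():
        if word in BASES:
            # Compiled now, but the base only changes when executed.
            code.append(("base", BASES[word]))
        else:
            # Converted now, using whatever base is current at parse time.
            code.append(("lit", int(word, state["base"])))
    for op, arg in code:
        if op == "base":
            state["base"] = arg
        else:
            state["stack"].append(arg)

state = {"base": 16, "stack": []}
run_line("DECIMAL 22", state)    # 22 is still parsed as hex here
run_line("22", state)            # now DECIMAL has taken effect
print(state["stack"])
```

In this model the first line leaves $22 (34 decimal) on the stack, which is not a usable pin number on a 32-pin port, while the 22 on the following line is parsed as decimal. That matches what Christof saw.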

New youtube video showing some more graphics and scrolling as well as a little bit with the serial terminal speed

Christof Eb. · 2012-07-27 09:46

Thanks for the explanation, Peter and keep on with this!
I will next try your forth for a little experiment. The interactive way and the speed should be just optimal for such things.
I want to compare a transistor amplifier with a valve amplifier. I want to give them short burst signals like beep-pause-beep-pause... as input. This shall be done with the demoboard. Should be quite simple with your forth, but I am no forth guru at all....
Due to the difference of the output impedance I assume, that a difference of the loudspeaker input signal at the end of the beeps could be visible with an oscilloscope.
Thanks for sharing Christof

D.P · 2012-07-27 16:43

I was struggling with the previous version's "white space" damage, I think, using the Prop BOE and BST, but since rev120727.1800 all is well.

Also trying to use "download" with google docs inserts a bunch of garbage into the source (MAC OSX, FireFox 13) so I for one appreciate the zips. Probably just pebkac.

Appreciate all of your work, look forward to trying your bluetooth module.

prof_braino · 2012-07-29 08:36

I like the graphics. Are you going to include a source editor in your package?

I tested the HC06 Bluetooth module yesterday; I was able to use the terminal from two houses away.

Peter Jakacki · 2012-07-29 15:59

prof_braino wrote: »

I like the graphics. Are you going to include a source editor in your package?

I tested the HC06 Bluetooth module yesterday; I was able to use the terminal from two houses away.

Certainly it is my intention to add a source code editor since I can have VGA. So there will be support for keyboard, mouse, and SD card as well as audio etc. I want to make this as standalone as possible although most of my projects won't need all of these things. Since the 512x384 bitmap takes up a bit of memory, well most of it, I might have to allow for an enhanced text and graphics mode. For the moment even with the 512x384 bitmap I can simply let code overflow into where the lower video memory is and move the "start of video" pointer up accordingly. This means that junk from the ROM would be displayed at the bottom of the screen, but I can blank this by setting the tiles for these to the same foreground and background colors.

Although it is possible to add extra chips to do more wonderful things with the video I'd rather concentrate on getting the most out of the Prop itself and also so that when Prop II pops up I'm ready to port and play!

At present I am working on making the user code persistent and autostarting with the backup to EEPROM.
