Tachyon V4 "DAWN" - exploring new worlds

Peter Jakacki · 2017-01-18 03:42

I would probably just put in enough cog code to load multiple sequential blocks. Even if the files were fragmented, the cluster size is still 32kB, so there is no need to even follow the cluster chain.

Also SD cards are formatted with a 2MB partition before the FAT32 partition to optimize how the SD card controller handles wear leveling and prioritizing FAT tables etc. A quick scan of an 8GB card shows it is all blank in the first 4194304 bytes except for sector 0.

So we could even just stick an image or images somewhere in there so the bootloader wouldn't even need to worry about the FAT, it could just init the card and load in from a starting sector straight into RAM.
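As a rough sketch of the arithmetic behind this idea (in Python rather than Tachyon), assuming 512-byte sectors, a 32kB hub image, and the blank 4194304-byte region seen in the scan above. The function names are hypothetical and only illustrate where raw images could sit:

```python
SECTOR_SIZE = 512          # bytes per SD sector (assumed standard size)
IMAGE_SIZE = 32 * 1024     # one full hub image, 32kB
GAP_BYTES = 4194304        # blank region reported by the scan above


def sectors_per_image(image_size=IMAGE_SIZE, sector_size=SECTOR_SIZE):
    """Number of whole sectors needed to hold one image (rounded up)."""
    return -(-image_size // sector_size)


def image_start_sector(slot, first_sector=1):
    """First sector of image number `slot`, skipping sector 0 which is in use."""
    return first_sector + slot * sectors_per_image()


def max_images(gap_bytes=GAP_BYTES):
    """How many images fit in the blank gap after sector 0."""
    usable_sectors = gap_bytes // SECTOR_SIZE - 1
    return usable_sectors // sectors_per_image()
```

With these assumptions a bootloader would only need a starting sector and a count of 64 sectors per image, with no FAT handling at all.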

Anyway I'm doing the FTP style load over the ping-pong network so it doesn't matter as much but either way I could also save the image to the SD card that way. The main thing is that I want to treat the lower 32k of EEPROM as boot ROM.

Peter Jakacki · 2017-01-29 08:03

PING PONG+ update
As many of you know, I have a multidrop communications network built into Tachyon called ping-pong, named for the one-for-one transactions between a master and a slave Prop. This network mainly supports RS-485 for long distances but is also usable with simple single I/O networks, usually with a small value resistor in series with the pin. The ping-pong nature allows me to effect virtual full-duplex one-on-one connections over long distances that operate as if I am plugged directly into the serial console port, so that I can interact with Tachyon the same way.

Over the last week or so I have added 9-bit mode so that I can transmit 8-bit binary data transparently if needed, and some of the $100-$1FF address codes are reserved as network command codes. Using these codes one Prop can access the hub memory of another Prop in the background and use this to clone an image onto the whole network of matching devices. At present at 2Mbps I can handle transfers at around 100kB/sec, so a full hub write takes around 300ms. Because groups of devices can be selected, I can have the basic networking and perhaps Tachyon preloaded in EEPROM on each device, yet have the master download the final image at startup to all similar devices in parallel. Since this is asynchronous, no special tricks such as matching clocks are needed, plus even standard MCUs with 9-bit mode UARTs can be used, and indeed this is the plan too.
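The quoted figures can be checked with a few lines of Python. This is only a back-of-envelope sketch: the 11 bits per character (1 start + 9 data + 1 stop) is an assumption about the framing, not something stated in the post.

```python
HUB_BYTES = 32 * 1024      # Propeller 1 hub RAM size
THROUGHPUT = 100_000       # bytes per second, the figure quoted above


def transfer_ms(n_bytes, bytes_per_sec=THROUGHPUT):
    """Time in milliseconds to move n_bytes at the given throughput."""
    return 1000.0 * n_bytes / bytes_per_sec


def raw_char_rate(baud, bits_per_char=11):
    """Characters per second in one direction on the wire.
    11 bits assumes 1 start + 9 data + 1 stop (an assumption)."""
    return baud / bits_per_char


full_hub_ms = transfer_ms(HUB_BYTES)        # roughly 328 ms
wire_rate = raw_char_rate(2_000_000)        # roughly 182,000 chars/sec
```

Under that framing assumption the raw one-way rate at 2Mbps is a little under twice the quoted 100kB/sec, which would be consistent with a one-for-one exchange, though that is my reading rather than a measured breakdown.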

The network speed has been tested up to 4Mbps but I have some tweaking to do to get it to run reliably at that rate. This includes the HSUART communications ROM too as this is used for the master.

Anyway, this mode might be beyond most simple LED blinker uses but I'm stoked that I can now develop on one board and immediately upgrade the software on a whole network in the blink of an eye. My current project uses about 40+ Props in such a network. Now to polish it up and I may add some modes.

PING PONG COMMANDS:
SELECT n - select a point to point connection with a device, 0 deselects all
GLOBAL - select all devices in listen only mode
GROUP n - select a group of devices that match in listen only mode
INDUCT - add another device as a member to the current selection
WRITE addr cnt <data> - write to hub memory
READ addr cnt <data> - read from hub memory
HALT - halt all other cogs in the selected devices
INIT addr par cog - load a cog in the selected devices
RESET - reset all devices
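To illustrate how 9-bit mode keeps binary data transparent, here is a minimal Python sketch of how a receiver might separate plain data from the $100-$1FF codes. The function name is hypothetical, and no actual command code values are shown because the post does not list them.

```python
def classify(word9):
    """Split a 9-bit word into (kind, low 8 bits).

    Words $000-$0FF are treated as plain 8-bit data, words $100-$1FF as
    address/command codes (the split described in the post above).
    """
    if not 0 <= word9 <= 0x1FF:
        raise ValueError("not a 9-bit value")
    kind = "address" if word9 & 0x100 else "data"
    return kind, word9 & 0xFF
```

Because the ninth bit marks the code range, any 8-bit value can pass through as data without escaping.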

Shawn Lowe · 2017-01-30 08:42

Peter-
I personally find it really cool when users such as you and jonnymac use the Propeller in industrial settings. As a maintenance technician, I see PLCs all over the place, and talking to coworkers you would think that nothing else could exist in an industrial environment. Then I come home and see it proved otherwise. Awesome job!

Peter Jakacki · 2017-01-30 08:59

Thanks Shawn, I enjoy using the Prop. Even without P2 I can certainly squeeze a lot out of it, and more each day it seems. It's a shame that real P1 use is languishing because so many have been holding off for P2 (for years), and it seems it's mostly because of language choices. However, I'm having a lot of fun at the moment cloning from one Prop to many with the new ping-pong loader commands: I type "CLONE" and the selected units immediately start up as clones.

Of course I can also specify a file image to use instead. I'm not just "cloning" around either, as you know, this is real serious industrial stuff.

RS_Jim · 2017-01-31 16:26

Peter,
I like what I see with your new Tachyon. I have two questions: what is the best way to start learning Forth, and have you ever tried running your ping-pong net wirelessly, like via XBee?
Jim

Peter Jakacki · 2017-01-31 18:57

Hi Jim, I think the best way to learn Forth on an MCU vs PC is to play with the hardware. So if you had a ping sensor, or LEDs, or motors etc, then try simple things and build up from there. This way you are learning the language by using it to do something, and see the results. Once you start seeing how quick and interactive it is you will get hooked. Avoid "learning Forth" on a PC and avoid trying to convert code from one language to another. Normally there are different and better ways to implement functions in Forth.

However as to your question regarding using ping-pong over the likes of xbee I would think that this wouldn't be suitable as ping-pong relies on fast transmit/receive switching which is fine for I/O pins and RS485 but wireless loves its preambles. You might think that the Xbee's UART "transparent data mode" should work with ping-pong but it is still RF and it would have to packetize even a single character much like Ethernet. So you could do it but I don't think that the effective speed would be great. However, don't let me stop you although I might hook up some simple RF modules and try this myself.

MJB · 2017-02-01 00:02

I was thinking about Ping-Pong over wireless/RF or maybe even ESP8266 as well, to extend the network without cables.
But I did not spend enough time to understand Ping-Pong.
Is it possible to ping-pong on a command/response or 'line' level instead of the character level you mention above?
For low speed that would be OK for my applications.

Peter Jakacki · 2017-02-01 00:21

Well, even though ppnet (might be easier to refer to it that way) does ping-pong at the byte level, it also works with blocks of data when it reads and writes hub memory, so I guess this is possible. I will look into it as it might be useful for me to bridge the hardwired network in some cases. I haven't really had time to play with the ESP chips yet, but since we can use these in telnet mode they should work well too.

Peter Jakacki · 2017-02-12 03:05

Even though V4 is still in the testing folder I have been using it in new designs in place of V3. This has helped me fill in the holes so to speak and I have included various optimizations as well. Take for instance the humble high level toggle.

V4 250kHz toggle

( 0011 $18C0  ok )   LAP 1000000 FOR 29 HIGH 29 LOW NEXT LAP .LAP
384000224 cycles at 96000000Hz  or 4000.002ms

V3 186kHz toggle

LAP 1000000 FOR 29 HIGH 29 LOW NEXT LAP .LAP 5.333secs ok

Contrast this with pure Spin, which comes in at 9.71kHz (@96MHz clock), over 25 times slower than V4.

pub toggle
  dira[29] := 1
  repeat
    outa[29] := 1
    outa[29] := 0

The simpler looking Spin that uses !outa[29] is much slower again at 7.19kHz
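For anyone wanting to check how the toggle rates fall out of the LAP results, here is a small Python sketch. It only reworks the numbers printed above: one loop pass is one full HIGH/LOW period, and the Spin figure is taken as quoted.

```python
CLOCK_HZ = 96_000_000
LOOPS = 1_000_000


def khz_from_cycles(cycles, loops=LOOPS, clock_hz=CLOCK_HZ):
    """Toggle frequency in kHz from a cycle count over `loops` periods."""
    return clock_hz * loops / cycles / 1000.0


def khz_from_seconds(seconds, loops=LOOPS):
    """Toggle frequency in kHz from elapsed seconds over `loops` periods."""
    return loops / seconds / 1000.0


v4_khz = khz_from_cycles(384_000_224)   # V4 LAP result, about 250 kHz
v3_khz = khz_from_seconds(5.333)        # V3 LAP result, about 187.5 kHz
spin_khz = 9.71                         # figure quoted for pure Spin
ratio = v4_khz / spin_khz               # a bit over 25
```

The V3 value computed from 5.333 seconds comes out slightly above the 186kHz in the heading, so treat these as rounded figures.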

So I may be moving V4 across to the main folder and putting V3 and V4 into sub-folders.

BTW, using the fast pin method V4 can toggle at 1MHz

( 0019 $18C0  ok )   29 MASK MPIN 1000000 LAP FOR H L NEXT LAP .LAP
96000176 cycles at 96000000Hz  or 1000.001ms

The fast pin allows high level to pulse lines faster than 1us, or precisely 1us with "L L L H"
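The 1us figure can be broken down per word with a short Python sketch. It assumes H, L and NEXT each take the same time, which matches the 333ns quoted for these words elsewhere in this thread.

```python
CLOCK_HZ = 96_000_000

cycles_per_loop = 96_000_176 / 1_000_000           # from the LAP result above
cycles_per_word = round(cycles_per_loop) // 3      # H, L and NEXT
ns_per_word = cycles_per_word * 1e9 / CLOCK_HZ     # about 333 ns


def pulse_ns(words):
    """Width of a pulse that lasts for `words` consecutive fast-pin words."""
    return words * ns_per_word
```

So three words in a row at one level give 1us, and a single H or L gives a 333ns step.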

rbehm · 2017-02-12 10:05

Hi Peter,
I just downloaded V4, thanks for making it public.
It's indeed much faster. With V3 I had to set a char delay of 1ms to be able to load any file (with a 6MHz xtal). This now works without any char delay, nice.
But I have the problem that after loading extend-v4.fth it hangs: no boot-up message, nothing, not even the serial TX is active. There is nothing special on my board, just 64K EEPROM and some SPI devices.
BTW there is an error in EXTEND-V4.FTH at line 163. It complains that id is unknown.

Cheers from Taiwan
Reinhardt, putting Propellers into real helicopters.

Peter Jakacki · 2017-02-12 11:11

Hi Reinhardt,
4.1 is the current version to use, as ID! replaces the " ok" in the prompt with a 3-character ID, which is very useful for me when I am talking to other Props over PINGNET.
Be aware of the defaults that I have been testing with, as I use 6MHz and 921600 baud.

Propeller .:.:--TACHYON--:.:. Forth V4.1 DAWN 410170212.1230

MODULES LOADED: 
18C0: EXTEND.fth          Primary extensions to TACHYON+ kernel  - 170201-0430

AUTORUN BOOT
Loading cog 3 E4E2 F32     
*** ROMS ***
0,848 VGA32x15  
0,352 HSUART    
1,900 F32       
CODE:$2D06 =11014 bytes   NAME:$5DB2 =5710 bytes   DATA:$76D4 =196 bytes    =12460 bytes free    Data Stack (0)
--------------------------------------------------------------------------------
( 0001 $2D06  ok )   " PBJ" ID!
( 0002 $2D06 PBJ )

Cog usage

COG 0 Tachyon
COG 1 Console and PINGNET
COG 2 Timers
COG 3 Loader or F32

rbehm · 2017-02-12 14:01

Thanks Peter,
I will try it.
Am I correct that V4 is case sensitive?

Peter Jakacki · 2017-02-12 16:11

Yes, V4 and V3 are case sensitive, although I did include a feature in V3 so that if it didn't find the word in the dictionary it would convert it to uppercase and try again. I have considered making V4 case insensitive though, or at least including the option.

rbehm · 2017-02-13 05:18

Ok tested V4.1. After adapting to my board it works as it should.
How about making every word lower case? Then we would not be forced to wear out the shift key.

Peter Jakacki · 2017-02-13 05:46

I'll make it case insensitive

ErNa · 2017-02-13 07:48

To me the best would be: support lower case, but don't allow ambiguities. That is, once a word is defined as WrdDef (word definition), a differently cased variant such as WrDDef (write double defined) is no longer allowed.

Peter Jakacki · 2017-02-13 08:05

I'm looking at how I want to handle it now but I prefer not enforcing anything at all if I can help it. Words such as wrddef WRDDEF WrDDef should really be stored as they are but then that complicates the search routines which need to be fast. So the only real way around it is to do the fast search, if it fails then do another slower search which might be ok in interactive mode but block mode ( TACHYON ... END ) should be exact match I think.
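The two-stage search described here could look roughly like the following Python sketch. It is not Tachyon's dictionary code: the dict stands in for the fast search and the function name and `interactive` flag are hypothetical.

```python
def find_word(name, dictionary, interactive=True):
    """Return the entry stored for `name`, or None if it is not found.

    Names are stored exactly as defined. The exact match is tried first;
    only interactive mode falls back to a slower case-folded scan.
    """
    if name in dictionary:                  # fast path: exact match
        return dictionary[name]
    if not interactive:                     # block mode: exact match only
        return None
    folded = name.lower()                   # slow path: linear scan
    for stored, entry in dictionary.items():
        if stored.lower() == folded:
            return entry
    return None
```

The cost of the slow scan is only paid after a miss, so exact-case source loads at full speed while typing at the console stays forgiving.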

Peter Jakacki · 2017-02-13 13:35

Here are some low level timings comparing V4 with V3. What's interesting is that while V3 had a few fast constants in code, V4 can handle 15-bit constants easily.

Timings are for 96MHz on both

Operation       V4        V3
Push literal    833ns     833ns..1333ns..1833ns..2000ns
DROP            833ns     666ns..1000ns (>4 items)
32-bit push     1166ns    3000ns
DO LOOP         500ns     500ns
FOR NEXT        333ns     333ns
DUP             666ns     833ns
SWAP            500ns     333ns
NOP             333ns     333ns
1+              333ns     333ns
8<<             333ns     333ns

So operations that do not push or pop take the same time on both and really reflect the time it takes to fetch the instruction. Pushing literals is much faster, though. V3 reserved some opcodes for fast constants, but these varied in execution time, with the fastest being 0 at 833ns if there were no more than 3 items on the stack, and the slowest taking 2us. V4 handles all literals up to 15 bits in 833ns and uses an internal stack rather than V3's assigned external stack. Both versions still have 4 fixed locations in the cog for the top items of the stack. SWAP being a fraction slower on V4 has me a little confused since the code is identical!

The faster data stack push and pops in V4 make a big difference in overall speed too. I/O operations are faster too as HIGH and LOW are opcodes and then there is PIN to specify a fast pin and H and L to set them high and low with 333ns execution times. So H L H will generate a 1us high pulse with a 333ns notch in the middle of it.

I think the reason for the slightly longer SWAP in V4 is due to the one extra instruction in the doNEXT loop before it executes SWAP causing it to wait longer for the hub on the next doNEXT.

Peter Jakacki · 2017-02-14 01:57

The WS2812 RGB LED timing is normally implemented in PASM due to the speed required. Just for a lark I thought I'd try it totally in high level code. This routine outputs to a single LED but could just as easily output a whole array of course. No PASM required!

pub XLED ( ggrrbb pin -- )	DUP PIN L MASK SWAP 8 REV 24 FOR H SHROUT L NEXT 2DROP ;
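For readers puzzling over the "8 REV" step, here is a Python model of the bit order only, not the timing. It assumes REV behaves like the Propeller rev instruction (reverse the low 32 minus n bits and zero the rest) and that SHROUT emits the least significant bit while shifting right; both are my reading of the word names.

```python
def rev(value, bits):
    """Reverse the low (32 - bits) bits of value, like PASM `rev`."""
    width = 32 - bits
    out = 0
    for _ in range(width):
        out = (out << 1) | (value & 1)
        value >>= 1
    return out


def ws2812_bits(ggrrbb):
    """Bits in the order they would leave the pin for one 24-bit colour."""
    data = rev(ggrrbb, 8)          # 8 REV: reverse the low 24 bits
    bits = []
    for _ in range(24):            # 24 FOR ... NEXT
        bits.append(data & 1)      # SHROUT: send the LSB ...
        data >>= 1                 # ... and shift right
    return bits
```

Reversing first means the right shift sends the original most significant bit first, so the green byte of ggrrbb goes out before red and blue.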

D.P · 2017-02-14 03:49

Congrats, that's impressive speed. So I assume V3 will stay where it is and we should be moving to V4 going forward?

Peter Jakacki · 2017-02-14 04:31

V4 will get all the bells and whistles, some of which it may be missing at present, but it is the version that runs faster and takes up less memory overall. If that is the case, why do we need to keep maintaining or developing V3, or V2 or V1 for that matter?
V4 has internal data stacks, no hub stacks required, still has room for more cog instructions, has faster constants and literals. Pingnet fully supported. The list goes on.

Peter Jakacki · 2017-02-19 16:55

EXTEND has been patched to work correctly with 32kB EEPROMs as part of the SAVEROM setup. Before the patch, this ended up wiping all the cog images for the kernel back in the first 32k, since access higher than 32k simply mirrors back to the first 32k.
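The mirroring effect is easy to see in a tiny Python model. This assumes the device simply ignores address bits above its size, which is what "mirrors back" implies; the function name is hypothetical.

```python
EEPROM_32K = 32 * 1024


def physical_address(addr, size=EEPROM_32K):
    """Address actually hit on a device of `size` bytes (a power of two)."""
    return addr & (size - 1)
```

A write aimed at $8000 therefore lands on $0000 of a 32kB part, right on top of the kernel image, while a 64kB part keeps the two halves separate.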

proplem · 2017-02-23 20:14

@Peter,

I was stuck in the Juno thread, so I didn't see the fix.
The problem with the hardware reset of the LameStation is solved.

Thank you very much!

David Betz · 2017-02-23 20:18

How do you like the LameStation? I see he has them on sale for half price.

proplem · 2017-02-23 20:20

@Peter, I understood [PLOT] loads the runmod, but how is the lcdmem memory byte array passed with these COGREG! words? Can you explain this a bit?

And how does PLOT know whether it is working with color or black and white?

Best regards,
proplem

proplem · 2017-02-23 20:46

David Betz wrote: »

How do you like the LameStation? I see he has them on sale for half price.

Hi David, LameStation is a nice toy to learn with. I bought it for (me and) my sons to wake their interest by using Tachyon. With Tachyon it is fun to interactively dive into this small microcomputer. Sound will be possible, graphics is on the way, joystick reading, buttons, a great tool to learn (Tachyon)!

I must say that the plexiglass has broken under the intense usage of my 7 year young boy. But it is possible to improve that before usage and avoid the break.

regards, proplem

Edit: I forgot to mention that I must use a PropPlug to program it with Tachyon. Over the RS232 plug I was not able to program it.

David Betz · 2017-02-23 20:52

Thanks for the info. The PropPlug is not a problem for me. If I get one I may get back in touch to find out what your fix is for the fragile plexiglass enclosure. :-)

MJB · 2017-02-23 22:04

proplem wrote: »

@Peter, I understood [PLOT] loads the runmod, but how is the lcdmem memory byte array passed with these COGREG! words? Can you explain this a bit?

And how does PLOT know whether it is working with color or black and white?

Best regards,
proplem

from Juno thread and V3 / V4.1 source:
looks like black&white (FG/BG)

                org REG0
                res 3
pixshift        res 1       --- COGREG 3
pixeladr        res 1      ---  COGREG 4


' PLOT MODULE
' Used for VGA/TV or LCD graphics
' pixshift is always a multiple of two, 512 pixels/line = 6 etc
'
                        org     _RUNMOD
' PLOT ( x y -- )
_PLOT                   shl     tos,pixshift    ' 2^n bytes/Y line
                        mov     X,tos+1
                        shr     tos+1,#3        ' byte offset in line
                        add     tos,tos+1       ' byte offset in frame
                        add     tos,pixeladr    ' byte address in memory
                        and     X,#7    ' get bit mask
                        mov     tos+1,#1
                        shl     tos+1,X
                        jmp     #SET

' SET ( mask caddr -- ) Set bit(s) in hub byte



ALIAS RUNMOD PLOT
word X
word Y
: !LCD
	lcdmem lcdmemlen ERASE
	lcdmem 4 COGREG!              --- store the address of the array to COGREG 4 ALIAS pixeladr        
        4 3 COGREG!                        --- store the pixel shift  to COGREG 3 ALIAS pixshift    
        [PLOT]
	X W~ Y W~
	;
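To make the data flow concrete, here is a Python model of what the PLOT listing above computes from the two COGREG values. It is a sketch for illustration: the function names are hypothetical, and the bytearray stands in for the lcdmem buffer in hub RAM.

```python
def plot_target(x, y, pixshift, pixeladr):
    """Return (byte address, bit mask) for pixel (x, y), as PLOT computes it."""
    line_offset = y << pixshift      # shl tos,pixshift: 2**pixshift bytes per line
    byte_in_line = x >> 3            # shr tos+1,#3: 8 pixels per byte
    mask = 1 << (x & 7)              # and X,#7 then shl: bit within the byte
    return pixeladr + line_offset + byte_in_line, mask


def set_pixel(frame, x, y, pixshift):
    """Model of the jump to SET: OR the mask into the buffer byte."""
    addr, mask = plot_target(x, y, pixshift, 0)
    frame[addr] |= mask
```

With the pixel shift of 4 stored by !LCD, each line is 16 bytes or 128 pixels, and each pixel is a single bit, which is why this path is monochrome.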

proplem · 2017-02-24 04:39

MJB wrote: »

from Juno thread and V3 / V4.1 source:
looks like black&white (FG/BG)

Yes? You know my assembler is worse than FORTH :-)) But the demos Peter created all look very well colored. So these use other routines of PLOT?

MJB · 2017-02-24 12:11

I learnt PASM (Assembler in general) AND Forth
from reading Peter's original V0.x Tachyon source code.

If you can program in C or any other usual language ASM/PASM is not so difficult.

YES - most of Peter's displays look color.
I modified one of his QVGA drivers for a display he did not support - works ..

      mov     tos+1,#1
      shl     tos+1,X
      jmp     #SET

This snippet takes a 1, shifts it to the required bit position X,
and jumps to the code that sets a bit mask into a byte.

so it sets one single bit == black/white

The color LCD displays get written directly to the LCD not into a buffer.
Color VGA will need a buffer ... I need to read a bit in the source ;-)

I think [plot] was introduced as a runmod after having been fixed part of a very old V1.x kernel for monochrome VGA 512x384 bitmap.
