Tachyon 5 - Old 'straight line' algorithm - for vector graphics, multi-stepper control, sound synths

bob_g4bbybob_g4bby Posts: 107
edited 2020-10-11 - 09:54:09 in Propeller 1
I first saw this in a computer magazine back in around 1979. There was always rivalry over who could draw 'straight lines' of pixels the fastest. There were a lot of algorithms more complicated than they needed to be! The algorithm not only applies to vector graphics; so for example where two or more steppers need to move from a start position, ending simultaneously at the target position. In the Tachyon code that follows, the number of axes is set with constant AXES. It's currently a 3 axis engine. The start and end positions are set for the demonstration in tables MYSTART and MYGOAL. Here's the code:-
--- Straight line algorithm for Tachyon 5v7
--- Useful for anywhere where a bunch of numbers have to change in fixed non-integer ratios to arrive simultaneously at a target
--- e.g. vector graphics, multi-stepper control, sound synthesis, lighting faders

PUBLIC
3  := AXES		--- number of axes to run
AXES 4* := AXESL	--- byte loop count for the arrays of longs
AXES longs POSN		--- Start and present position array
AXES longs GOAL		--- the target array
AXES longs ACC		--- axis accumulator

PRIVATE
AXES longs DELTA	--- set to GOAL minus initial POSN
long TBASE		--- timebase, set from the biggest DELTA
long SLCOUNT		--- counter to keep track of progress

pri [SL]	( -- )		--- slarray indexer
I 4* +
;

--- inc or dec the I'th member of POSN, dependent on whether
--- I'th member of DELTA is pos or neg
pri STEPPOSN ( -- )
POSN [SL]
DELTA [SL] @ 0<
IF -- ELSE ++ THEN
;

pri !DELTAS	( -- )	--- initialise DELTAS
AXES FOR
	I 4* >R
	GOAL R@ + @
	POSN R@ + @
	-
	DELTA R> + !
     NEXT
;

--- TBASE set to the largest DELTA
pri !TBASE	( --  )
0 AXES
FOR
	DELTA [SL] @ ABS MAX
NEXT
TBASE !
;

--- initialise straight line engine for new line
pub !SLINIT	( starts goals -- )
ACC AXESL 0 FILL	--- zero the accumulators
GOAL AXESL CMOVE	--- set the target position
POSN AXESL CMOVE	--- set the starting position
0 SLCOUNT ! 
!DELTAS			--- calculate all goals minus starts
!TBASE			--- TBASE set to the largest DELTA
;

--- step one place along the 'straight line', POSN is set to new position
--- flg returned true if arrived at GOAL, else false
pub SLSTEP	( -- flg )
AXES FOR
	ACC [SL] DUP @
	DELTA [SL] @ ABS +	( ptr-to-ACC ACC+|DELTA| -- )
	DUP			( ptr-to-ACC ACC+|DELTA| ACC+|DELTA| -- )
	TBASE @ >		( ptr-to-ACC ACC+|DELTA| -- )
	IF
		TBASE @ -	( ptr-to-ACC ACC+|DELTA|-TBASE -- )
		STEPPOSN	--- dec POSN if DELTA -ve, else inc POSN	
	THEN
	SWAP !			( -- ) --- and save new ACC value 		 	
     NEXT
     SLCOUNT @ TBASE @ =
     SLCOUNT ++ 	 
;

--- Test words

TABLE MYSTART -15 , 10 , 13 ,	--- test start values
TABLE MYGOAL 15 , 21 , -7 ,	--- test goal values

--- test of !SLINIT
pub test!SLINIT	( -- )
MYSTART MYGOAL !SLINIT
;

--- print AXES size array of longs
pub .ARR ( adr -- )
AXES FOR
	DUP [SL] @ . SPACE
     NEXT
     DROP
;

--- Print all variables
pub .SL	( -- )
CR
." POSN    = " POSN .ARR CR
." GOAL    = " GOAL .ARR CR
." DELTA   = " DELTA .ARR CR
." TBASE   = " TBASE @. CR
." SLCOUNT = " SLCOUNT @. CR
;

--- demo that straight line alg. works
pub SLDEMO	( -- )
MYSTART MYGOAL !SLINIT		--- initialise straight line engine
." STRAIGHT LINE DEMO" CR
.SL CR
." POSN steps by coordinates ..." CR
BEGIN
	SLSTEP
	POSN .ARR CR
UNTIL
." End of straight line" CR
;

--- time the demo, so no printing in the loop
pub SLTIME	( -- )
MYSTART MYGOAL !SLINIT		--- initialise straight line engine
." STRAIGHT LINE TIMING DEMO" CR
.SL CR
." POSN steps by coordinates ..." CR
LAP
BEGIN
	SLSTEP
UNTIL
LAP
TBASE @ . ." Steps in " AXES . ."  Axes took" CR
.LAP CR
." End of straight line" CR
;

Run SLDEMO to see the coordinates at each step. Run SLTIME to get some idea of how long the engine takes. So - I bet there's clever ways of making this faster - or has anyone got an even faster algorithm?. Might be good as part of a ROM? :-)

Comments

  • MaciekMaciek Posts: 148
    edited 2020-10-08 - 18:29:23
    I run your code and that is what I got

    TF5> SLTIME --- STRAIGHT LINE TIMING DEMO

    POSN = -15 10 13
    GOAL = 15 21 -7
    DELTA = 30 11 -20
    TBASE = 30
    SLCOUNT = 0

    POSN steps by coordinates ...
    30 Steps in 3 Axes took
    67,920 cycles = 849.000us
    End of straight line
    ok

    Seems pretty impressive to me.

    Now a little off topic - I couldn't copy-paste larger portions of the program let alone do it in one go. It somehow got messed up and I had to do it in small chunks to make it work. There must be some more clever method to upload a bulkier code all at once.
    The question is how to do it efficiently ?
    The way I did it is simply a no go for more than a couple of lines.
    Might it be my terminal setting for the line delay is too small ?
  • bob_g4bbybob_g4bby Posts: 107
    edited 2020-10-08 - 19:18:48
    Hi Maciek,

    I use tera term under windows with 0 ms inter-character delay and 15 ms end of line delay and that seems to load any size of file reliably. I agree, it sounds like a terminal delay issue.

    I think the code as it stands is fast enough for stepper motor control, but would need converting to assembly language for drawing many straight lines on a video screen, probably.
  • There are some simple optimisations that can probably make a big difference to this timing. Such things as "I 4*" instead of a slow R@ since it is rarely used and so it's high level.

    I use ascii-xfr with a line delay of around 3ms while leaving the minicom terminal connected. It's very fast that way.
  • Minicom is what I use with Tachyon, or e4thcom for other forths. Thanks for the tip.
  • bob_g4bbybob_g4bby Posts: 107
    edited 2020-10-09 - 07:26:33
    Me and Doug Seaton (and others) at Ferranti Computer Systems had built single board Nascom Z80 computers - we had a 'computer club' going too. Doug was a brainy chap, always exploring the maths, and the add-on board he designed had a programmable character set. Using assembly language to draw fast straight lines and circles, we made demos of tumbling space ships and so on, faster than the commercial home computers could do it. I remember the line drawing program was fastest if it was made from self-modifying code. We sweated the bytes out of those tiny routines!

    Doug ( our resident cpu designer) had dreamt up a z80 emulator from two 4-bit TI slice ALUs and a shed load of bipolar roms to do the microcoding. We collected the bits together, but it never made the prototype stage. It would have run at the heady heights of 20MHz, instead of the 2MHz that the first z80s ran at. Doug would have liked the propeller chips, he'd have had all eight cogs whizzing hard! I remember his hastily scribbled pages of cordic code for rotating stuff, so echoes there of the p2.
  • Peter JakackiPeter Jakacki Posts: 9,707
    edited 2020-10-09 - 07:37:43
    This is the script I use from a Linux terminal while leaving minicom connected in another terminal.
    # transfer text file with extra line delay 
    # usage: ./s <file> <port> <delay>
    # examp:  ./s EXTEND 0 3
     ascii-xfr -sn -l $3 Forth/$1.FTH > /dev/ttyUSB$2
    

    Tachyon 5v7 iirc was upgraded for faster serial downloads so you can experiment with the line delay but 3 to 5 should be ok and 15 is way over the top. But this assumes you are loading in block mode using TACHYON <code> END which strips away the usual echo and prints line numbers and messages with a final report. USB serial suffers a lot when you are echoing everything back along with prompts etc and then the line delay is no longer distinct on the serial end and many lines may be appear on the serial rx line without any delay. I should do a proper capture of that too one day.


    BTW, the code optimization is taking into account that it is not efficient nor recommended to use the return stack for juggling and that R@ is provided as a convenience but it is horribly inefficient.
    pub R@						R> R> DUP >R SWAP JUMP ;
    
    On the other hand a simple I 4* is very fast, so either call [SL] or better still, just code it in directly as I 4* +

    Remember what happens if you push data onto the return stack and forget to pop it? CRASH!
  • Thanks Peter, that'll speed up the old edit compile test loop.
  • @bob_g4bby BTW Bob, thanks for sharing that code! :)
  • bob_g4bbybob_g4bby Posts: 107
    edited 2020-10-09 - 08:17:27
    R@ is used in setting up the line, but not in SLSTEP, where most time is spent in the loop. So only a modest gain to be had there. Inlining [SL] is useful, will try that, when I get up (tablet, pre-breakfast). I suspect the ABS word could go, if the step direction was all setup during initialisation, rather than decided every time in STEPPOSN called within SLSTEP. Need to get rid of that IF ELSE THEN.

    It's good to bat these small bits of code around each other and benefit from others speed tricks - + it might get people reading tachyon, with the aid of the glossary, even if it's only to translate the algorithm to another language! It's best to start off written in a simple way so others can easily understand the algorithm + you've got something that at least works. Then start speeding it up progressively, which can make it harder to read.

    If there was another array of longs, each member set either 1or -1 ready to add to POSN and the DELTA array made all positive values, both these things done at initialisation, then SLSTEP wouldn't need any decisions about direction, nor the ABS word every step
  • bob_g4bby wrote: »
    I first saw this in a computer magazine back in around 1979. There was always a competition for who could draw 'straight lines' of pixels the fastest. There were a lot of algorithms more complicated than they needed to be! The algorithm not only applies to vector graphics; so for example where two or more steppers need to move from a start position, ending simultaneously at the target position.

    This is cool stuff. By chance do you remember the magazine / issue(s) where this competition was published? I love the "archeology" of computers and would really like to track this stuff down.

    -joe
  • bob_g4bbybob_g4bby Posts: 107
    edited 2020-10-09 - 17:38:03
    Joe, I'm sorry, I don't remember the details. When I say 'competition' it was just readers letters on the subject + there was always friendly rivalry in our computer club to write faster graphics routines. It was a time of fast change - from painfully slowly entering machine code on switches through cassette tape based loading. The big change for me was building my own floppy disk interface and loading from 360kbyte 5-1/4" floppies. What a speed-up!
  • bob_g4bby wrote: »
    Joe, I'm sorry, I don't remember the details. When I say 'competition' it was just readers letters on the subject + there was always friendly rivalry in our computer club to write faster graphics routines. It was a time of fast change - from painfully slowly entering machine code on switches through cassette tape based loading. The big change for me was building my own floppy disk interface and loading from 360kbyte 5-1/4" floppies. What a speed-up!

    No worries. I remember reading in I think "PC Tech Journal" articles by Michael Abrash about all kinds of wizardly ways to speed up EGA/VGA access. Also, back in the TRS-80 days, there was a magazine that challenged how to write the "fastest" game-of-life engine for that machine.

    Fun times!
    -j
  • bob_g4bbybob_g4bby Posts: 107
    edited 2020-10-12 - 12:54:29
    Here's a refinement of the straight line algorithm which takes 85% the time of the version above, so a modest speed up. SLSTEP has been made faster by in-lining two words, [SL] and STEPPOSN. An extra array called INCR is initialised with +1 or -1 to avoid any decisions about count direction in each call to SLSTEP.
    --- Straight line algorithm for Tachyon 5v7
    --- Useful for anywhere where a bunch of numbers have to change in fixed non-integer ratios to arrive simultaneously at a target
    --- e.g. vector graphics, multi-stepper control, sound synthesis, lighting faders
    --- This version has eliminated the direction decisions made in SLSTEP, to speed up the algorithm a bit
    
    PUBLIC
    3  := AXES		--- number of axes to run
    AXES 4* := AXESL	--- byte loop count for the arrays of longs
    AXES longs POSN		--- Start and present position array
    AXES longs GOAL		--- the target array
    AXES longs ACC		--- axis accumulator
    AXES longs INCR		--- value to be added to POSN on each step (1 or -1)
    
    PRIVATE
    AXES longs DELTA	--- set to GOAL minus initial POSN
    long TBASE		--- timebase, set from the biggest DELTA
    long SLCOUNT		--- counter to keep track of progress
    
    
    --- inc or dec the I'th member of POSN, by adding I'th member of INCR
    pri STEPPOSN ( -- )
    POSN I 4* + DUP @
    INCR I 4* + @
    + SWAP !
    ;
    
    
    --- initialise DELTA and INCR
    pri !DELTA&INCR	( -- )
    AXES FOR
    	I 4* >R
    	GOAL R@ + @
    	POSN R@ + @
    	- DUP 0<
    	INCR I 4* + >R		--- INCR = -1 if GOAL<POSN, else = 1
    	IF -1 ELSE 1 THEN
    	R> ! 
    	ABS DELTA R> + !	--- DELTA = |GOAL - POSN| 
         NEXT
    ;
    
    --- TBASE set to the largest DELTA
    pri !TBASE	( --  )
    0 AXES
    FOR
    	DELTA I 4* + @ MAX
    NEXT
    TBASE !
    ;
    
    --- initialise straight line engine for new line
    pub !SLINIT	( starts goals -- )
    ACC AXESL 0 FILL	--- zero the accumulators
    GOAL AXESL CMOVE	--- set the target position
    POSN AXESL CMOVE	--- set the starting position
    0 SLCOUNT ! 
    !DELTA&INCR		--- calculate all goals minus starts and set up directions
    !TBASE			--- TBASE set to the largest DELTA
    ;
    
    --- step one place along the 'straight line', POSN is set to new position
    --- flg returned true if arrived at GOAL, else false
    pub SLSTEP	( -- flg )
    AXES FOR
    	ACC I 4* + DUP @
    	DELTA I 4* + @ ABS +
    	DUP
    	TBASE @ >
    	IF			--- if ACC[I] is > than TBASE, then time to step POSN[I] on 
    		TBASE @ -
    		POSN I 4* + DUP @
    		INCR I 4* + @
    		+ SWAP !	--- POSN[I]=POSN[I]+INCR[I]	
    	THEN
    	SWAP !			--- and save new ACC[I] value 		 	
         NEXT
         SLCOUNT @ TBASE @ =	--- produce flg
         SLCOUNT ++
    ;
    
    --- Test words
    
    TABLE MYSTART -15 , 10 , 13 ,	--- test start values
    TABLE MYGOAL 15 , 21 , -7 ,	--- test goal values
    
    --- test of !SLINIT
    pub test!SLINIT	( -- )
    MYSTART MYGOAL !SLINIT
    ;
    
    --- print AXES size array of longs
    pub .ARR ( adr -- )
    AXES FOR
    	DUP I 4* + @ . SPACE
         NEXT
         DROP
    ;
    
    --- Print all variables
    pub .SL	( -- )
    CR
    ." POSN    = " POSN .ARR CR
    ." GOAL    = " GOAL .ARR CR
    ." DELTA   = " DELTA .ARR CR
    ." INCR    = " INCR .ARR CR
    ." TBASE   = " TBASE @. CR
    ." SLCOUNT = " SLCOUNT @. CR
    ;
    
    --- demo that straight line alg. works
    pub SLDEMO	( -- )
    MYSTART MYGOAL !SLINIT		--- initialise straight line engine
    ." STRAIGHT LINE DEMO" CR
    .SL CR
    ." POSN steps by coordinates ..." CR
    BEGIN
    	SLSTEP
    	POSN .ARR CR
    UNTIL
    ." End of straight line" CR
    ;
    
    --- time the demo, so no printing in the loop
    pub SLTIME	( -- )
    MYSTART MYGOAL !SLINIT		--- initialise straight line engine
    ." STRAIGHT LINE TIMING DEMO" CR
    .SL CR
    LAP
    BEGIN
    	SLSTEP
    UNTIL
    LAP
    TBASE @ . ." Steps in " AXES . ."  Axes took" CR
    .LAP CR
    ." End of straight line" CR
    ;
    
    So-ooo, I dare you to do better!
  • Hey Bob, you could probably get rid of the return stack shuffling in this routine too even though it is only initialization.
    btw, Isn't >R IF -1 ELSE 1 THEN R> better coded with SWAPs?
    --- initialise DELTA and INCR
    pri !DELTA&INCR	( -- )
    	AXES FOR
    	  GOAL I 4* + @
    	  POSN I 4* + @
    	  - DUP 0<
    	  INCR I 4* + SWAP		--- INCR = -1 if GOAL<POSN, else = 1
    	    IF -1 ELSE 1 THEN
    	  SWAP ! 
    	  ABS DELTA I 4* + !	--- DELTA = |GOAL - POSN| 
        NEXT
    ;
    

    Keep the code coming. It helps me to see how I can or where I need to improve Tachyon itself. I keep thinking that SLSTEP could be optimized somehow, maybe not, but maybe. I noticed that there isn't much difference if you define I 4* + as a routine.
    TF5> LAP I 4* + LAP .LAP --- 208 cycles = 2.080us  ok                                      
    TF5> : IL+ I 4* + ;                                                                        
    TF5> LAP IL+ LAP .LAP --- 304 cycles = 3.040us  ok   
    
    But if it were coded as PASM it wouldn't ned to push and pop the stack and would only 320ns (@96MHz).
  • Yes Peter, I bet there's more cycles to come out of it yet. Paying more attention to using kernel and minimising high level words in SLSTEP might pay off. As you say, >R and R> are pretty slow words + it's only to easy to get 'em unbalanced and then it's reset button time. BTW The more you use tachyon, the more you realise that the best tools result from constant refinement by people who actually use the tool themselves.
  • bob_g4bbybob_g4bby Posts: 107
    edited 2020-10-13 - 10:57:29
    So, SLSTEP ends with
    SLCOUNT @ TBASE @ =	--- produce flg
    SLCOUNT ++
    
    But, in keeping track of when the line finishes, why count up? If we count down instead and keep that counter on the stack we don't need SLCOUNT and we go a bit faster:-
    --- Straight line algorithm for Tachyon 5v7
    --- Useful for anywhere where a bunch of numbers have to change in fixed non-integer ratios to arrive simultaneously at a target
    --- e.g. vector graphics, multi-stepper control, sound synthesis, lighting faders
    --- This version keeps a down counter on the stack to tell when the line is done
    
    PUBLIC
    3  := AXES		--- number of axes to run
    AXES 4* := AXESL	--- byte loop count for the arrays of longs
    AXES longs POSN		--- Start and present position array
    AXES longs GOAL		--- the target array
    AXES longs ACC		--- axis accumulator
    AXES longs INCR		--- value to be added to POSN on each step (1 or -1)
    
    PRIVATE
    AXES longs DELTA	--- set to GOAL minus initial POSN
    long TBASE		--- timebase, set from the biggest DELTA
    
    
    
    --- inc or dec the I'th member of POSN, by adding I'th member of INCR
    pri STEPPOSN ( -- )
    POSN I 4* + DUP @
    INCR I 4* + @
    + SWAP !
    ;
    
    
    --- initialise DELTA and INCR
    pri !DELTA&INCR	( -- )
    AXES FOR
    	I 4* >R
    	GOAL R@ + @
    	POSN R@ + @
    	- DUP 0<
    	INCR I 4* + >R		--- INCR = -1 if GOAL<POSN, else = 1
    	IF -1 ELSE 1 THEN
    	R> ! 
    	ABS DELTA R> + !	--- DELTA = |GOAL - POSN| 
         NEXT
    ;
    
    --- TBASE set to the largest DELTA
    pri !TBASE	( --  )
    0 AXES
    FOR
    	DELTA I 4* + @ MAX
    NEXT
    TBASE !
    ;
    
    --- initialise straight line engine for new line
    pub !SLINIT	( starts goals -- stepcount )
    ACC AXESL 0 FILL	--- zero the accumulators
    GOAL AXESL CMOVE	--- set the target position
    POSN AXESL CMOVE	--- set the starting position
    !DELTA&INCR		--- calculate all goals minus starts and set up directions
    !TBASE			--- TBASE set to the largest DELTA
    TBASE @ 1+		--- A stepcount left on stack. ready for SLSTEP to decrement
    ;
    
    --- step one place along the 'straight line', POSN is set to new position
    --- flg returned true if arrived at GOAL, else false
    pub SLSTEP	( stepcount -- stepcount )
    AXES FOR
    	ACC I 4* + DUP @
    	DELTA I 4* + @ ABS +
    	DUP
    	TBASE @ >
    	IF			--- if ACC[I[ is > than TBASE, then time to step POSN[I] on 
    		TBASE @ -
    		POSN I 4* + DUP @
    		INCR I 4* + @
    		+ SWAP !	--- POSN[I]=POSN[I]+INCR[I]	
    	THEN
    	SWAP !			--- and save new ACC[I] value 		 	
         NEXT
    1-				--- decrement the stepcount
    ;
    
    --- Test words
    
    TABLE MYSTART -15 , 10 , 13 ,	--- test start values
    TABLE MYGOAL 15 , 21 , -7 ,	--- test goal values
    
    
    --- print AXES size array of longs
    pub .ARR ( adr -- )
    AXES FOR
    	DUP I 4* + @ . SPACE
         NEXT
         DROP
    ;
    
    --- Print all variables
    pub .SL	( -- )
    CR
    ." POSN    = " POSN .ARR CR
    ." GOAL    = " GOAL .ARR CR
    ." DELTA   = " DELTA .ARR CR
    ." INCR    = " INCR .ARR CR
    ." TBASE   = " TBASE @. CR
    ;
    
    --- demo that straight line alg. works
    pub SLDEMO	( -- )
    ." STRAIGHT LINE DEMO" CR
    MYSTART MYGOAL !SLINIT		--- initialise straight line engine ( -- loopcount )
    .SL CR
    ." POSN steps by coordinates ..." CR
    BEGIN
    	SLSTEP			( -- loopcount )
    	POSN .ARR CR		( -- loopcount )
    	?DUP 0=
    UNTIL
    ." End of straight line" CR	( -- )
    ;
    
    --- time the demo, so no printing in the loop
    pub SLTIME	( -- )
    MYSTART MYGOAL !SLINIT		--- initialise straight line engine
    ." STRAIGHT LINE TIMING DEMO" CR
    .SL CR
    LAP
    BEGIN
    	SLSTEP			( -- loopcount )
    	?DUP 0=
    UNTIL
    LAP
    TBASE @ . ." Steps in " AXES . ."  Axes took" CR
    .LAP CR
    ." End of straight line" CR
    ;
    
    So this takes 71% of the time of the original.
  • Congrats, Bob.
    That is a huge improvement in speed. Lateral thinking works.

    It's a pity today's education is mainly based on tests, tests, tests. You just memorize the answers and the brain correlates them with the questions.
    I do not envy the next generations if this trend continues. It wasn't like that when I was a schoolboy many decades ago...
  • Thanks Maciek, it's been fun to squeeze as much out of it as possible.

    My memory for stuff I wasn't all that interested in has never been good. For that and other reasons I didn't do that well in exams. Ever since I left uni, though, I've enjoyed learning and using stuff I was interested in. Now I'm retired, that's even more so with so much knowledge available on-line. I used to buy Practical Wireless and other magazines - no more, there is far more on the internet.

    e.g. I installed Punyforth a few days ago on a little wifi processor module, the ESP-12 from ebay. Can be programmed to be an IoT device by itself or act as a serial-to-wifi bridge for P1 or P2. Yet another platform to explore and build into projects. Don't think I'm going to start automating the home though! Enough breaks down without making it more complex ;-)
  • Practical Wireless - haven’t heard that name in decades :)

    Used to read those in my youth years in the 60’s. IIRC one issue came with a little plastic pouch with 3 miniature screwdrivers - 2 flat (yellow and green) and 1 Philips (red). Still have and use them :smiley:
Sign In or Register to comment.