A faster way?

RichardF · 2007-07-08 19:56

I need a method which finds the highest of 8 frequencies coming in from my 8 TSL230 sensors and records which sensor was highest. My first cut at this is simple logic but it may not be the fastest way. freqx is simultaneously updated from 4 cogs running ctra and ctrb. freqx and CameFrom are global variables and will be used to·direct my PEBOT towards the brightest light.

PUB ComCen
·HighFreq := 0
·CameFrom := 0
·repeat
·· if freq1 > HighFreq
···· HighFreq := Freq1
···· CameFrom := 1
·· if freq2 > HighFreq
···· HighFreq := Freq2
···· CameFrom := 2
·· if Freq3 > HighFreq
···· HighFreq := Freq3
···· CameFRom := 3
·· if freq4 > HighFreq
···· HighFreq := Freq4
···· CameFrom := 4
·· if Freq5 > HighFreq
···· HighFreq := Freq5
···· CameFrom := 5
·· if freq6 > HighFreq
···· HighFreq := Freq6
···· CameFrom := 6
·· if Freq7 > HighFreq
···· HighFreq := Freq7
···· CameFrom := 7
·· if freq8 > HighFreq
···· HighFreq := Freq8
···· CameFrom := 8····

Would appreciate finding· more efficient/faster code keeping it in Spin.
Thanks,
Richard

CardboardGuru · 2007-07-08 22:01

Slightly faster.

PUB ComCen
 HighFreq := Freq1
 CameFrom := 1
 repeat
   if freq2 > HighFreq
     HighFreq := Freq2
     CameFrom := 2
   if Freq3 > HighFreq
     HighFreq := Freq3
     CameFRom := 3
   if freq4 > HighFreq
     HighFreq := Freq4
     CameFrom := 4
   if Freq5 > HighFreq
     HighFreq := Freq5
     CameFrom := 5
   if freq6 > HighFreq
     HighFreq := Freq6
     CameFrom := 6
   if Freq7 > HighFreq
     HighFreq := Freq7
     CameFrom := 7
   if freq8 > HighFreq
     HighFreq := Freq8
     CameFrom := 8

Beau Schwabe · 2007-07-09 02:00

Richard,

How about...

HighFreq := (Freq1<<3|0)#>(Freq2<<3|1)#>(Freq3<<3|2)#>(Freq4<<3|3)#>(Freq5<<3|4)#>(Freq6<<3|5)#>(Freq7<<3|6)#>(Freq8<<3|7)

CameFrom := HighFreq&$7+1
HighFreq := HighFreq>>3

...In the case of a Tie between any of the Frequencies, the frequency with the highest index value will determine the winner.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Beau Schwabe

IC Layout Engineer
Parallax, Inc.

Post Edited (Beau Schwabe (Parallax)) : 7/9/2007 2:05:06 AM GMT

RichardF · 2007-07-09 09:38

Thanks, Beau, I will give your code a try, but I have some studying to do to understand it.
Richard

mirror · 2007-07-09 10:23

Obviously, only 29 bits of frequency information are processed with Beau's code. Practically this is not a concern, unless you were using the frequency variable in a quasi-floating point kind of way (ie upper 16 bits for frequency, lower 16 bits for fraction of frequency). In this case each term of the expression should be
HighFreq := (Freq1&!7|0)#>(Freq2&!7|1) etc

with the final resolution being

CameFrom := HighFreq&7+1
HighFreq := HighFreq&!7

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔

CardboardGuru · 2007-07-09 10:30

It's a cunning piece of code. The #> operator gives the higher of 2 numbers. But what Beau has done here is to chain them together, so it returns the highest of a list of numbers.

HighFreq := Freq1 #> Freq2 #> Freq3 #> Freq4 #> Freq5 #> Freq6 #> Freq7 #> Freq8

That's nice functionality that hadn't occurred to me before. I'll have to review my code to see whether I can improve some code with this.

But the real clever thing is finding a way to keep the CameFrom variable together with the appropriate Freq. Beau has put it in the lower 3 bits of each Freq that is compared. And shifted the Freqency to the left 3 bits to make room. (Freqx<<3|y)
The lower 3 bits aren't going to make a difference to the comparison if the Freqs are different. And if they are the same, you probably don't care which is returned. (Beau's code will return the last of the matching Freqs, yours returns the first.)

The other difference to your code is that yours will cope with 32 bit frequencies, and Beau's 29 bit. But it's a fair bet that your frequencies don't have anything like that many bits anyway.

deSilva · 2007-07-09 12:04

I like this thread very much, which shows also the need for more discussions about code optimization. Some time ago I used a benchmark framework that convinced me to change to assembler

In SPIN you need 10 mys for the smallest things - this adds up... I benchmarked the above mentioned pieces of code. The result is given in kHz, as I strobed it using a frequency counter on an output pin. If you want to know the time, take 1/2f, so 50 kHz means 10 mys, as two loops are performed within one cycle.

   1. Original, variables as locals                     5,5 kHz
   2. Original,variables in VAR (first memory cells)    6 kHz
   3. Original, variables as VAR, higher memory         4,6 kHz
   4. same when variables as DAT
   5. Beaus solution (as 2.):                           3,4 KhZ
   6. modified by me:                                   4,6 kHz

Ad 6. I tried to get rid of Beaus shifts, which are expensive, assuming that the frequencies contain small positive values only

HighFreq := (Freq1)#>(Freq2|$1_0000000)#>(Freq3|$2_0000000) ....

Edit: Had some bad typos in it ....

Post Edited (deSilva) : 7/9/2007 12:18:29 PM GMT

CardboardGuru · 2007-07-09 13:02

deSilva,

Interesting that the original should be faster. Beau's is shorter in object code as well as source code. I don't see why bit shifts should be expensive as the underlying machine code in the interpreter can shift any number of bits with a single 4 cycle instruction. Unless the interpreter does it by a dumb brute force 1 bit at a time method.

Your modification won't work. The freq that is ORed with $7_0000000 will always be largest...

Beau Schwabe · 2007-07-09 14:38

Richard,

Hmmm... 3,4 KHz ...I wonder if you 'pre-shifted' the Frequency value as part of your routine that detects the frequency before you did a comparison for the highest frequency, if that would improve the overall speed.

I realize you wanted to keep this in Spin, but alternatively you could re-write just this function as an assembly call.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Beau Schwabe

IC Layout Engineer
Parallax, Inc.

deSilva · 2007-07-09 19:54

So dumb of me fiddling with the MSB

But I was interested in timings only. I think what Beau suggests is to let the "pre-shifting" be done by the COGs doing the meassurements. If they know they can as well include their identification; thus saving the OR as well....

In executing "Byte Code" the length of elementary operations is of not much concern in relation to the execution overhead. Byte Code is most (time) efficient when you execute operations not readily available as machine code (i.e. 32-bit arithmetic on 8-bit machines, floating point, access to "hash" vectors,...)

rokicki · 2007-07-09 20:27

Mostly, though, the reason the sequential Spin wins
is because the conditionals will be false most of the time.
As a matter of fact, for every execution, assuming every
permutation is possible (which is probably somewhat
strong), only 1.59 out of the 7 conditionals will be true
on average. That is, we only execute the guarded
block approximately 23% of the time.

deSilva · 2007-07-09 21:01

Hmm... may be I just considered the "best case"

You made a very good point rockiki!

   2. Original,variables in VAR (first memory cells)
     2a: Best case (highest freq first)    6 kHz
     2b: Worst case (all pathes)           4,6 kHz
     2c: random values                     5,4 kHz 
   5. Beau's solution (with SHIFT and OR): 3,4 KhZ

deSilva · 2007-07-09 21:45

I added some text-output to the program, and the frequency counter is now just for double checking; the values have changed as all overhead is now subtracted by some auto-calibration

This was not absolutly correct in the old program...

   2. Original,variables in VAR (first memory cells)
     2a: Best case (highest freq first)     72 mys
     2b: Worst case (all pathes)           131 mys
     2c: random values                      80 mys 
   5. Beau's solution (with SHIFT and OR): 135 mys

mirror · 2007-07-09 21:48

Beau Schwabe (Parallax) said...

I realize you wanted to keep this in Spin, but alternatively you could re-write just this function as an assembly call.

Sometimes translating·Spin to assembly is NOT an option. Even if you'd like the speed increase sometimes it's just not possible. My current application uses all 8 cogs, each of them doing their own·separate·task - 7 of the cogs are already in assembly, but·of neccessity the supervisory task needs to be written in Spin·(too much varied functionality to be written in assembly). I've faced a couple of situations trying to figure out the best solution to a problem in terms of performance (primarily speed as opposed to size).
I think this thread is great, I'd love to see more cold analysis of execution times of various common problems solved in Spin.

deSilva · 2007-07-09 23:17

@mirror
I fully understand your point; I solved that kind of problem - well, not yet - by reserving one cog for "utilities", similar to FLOAT32. All SMALL things that have to run fast are collected there.

Just for fun I should give one version of assembly for the problem; my counter showed 130 kHz, translating to 4 mys per loop. This is astonishingly slow compared to spin (speed-up only 20 to 30).

Cross-checking: 8x 39 ticks = 312 *12,5 ns = 3.9 mys plus a little bit more....
Waiting for the HUB really hurts

 ORG 0
 highest
    or dira, : port

 ' PAR points to data block with ten entries:
 ' cameFrom, highFreq, freq1  to freq8
 ' first two will be set by this programm 
 ' we loop from end to beginning

  mov :faddr, par
  add :faddr, #4*(8+1) 
  mov :freqN, #8
  mov :highFreq, #0

  
:loop8
  rdlong :freq, :faddr
  min :highFreq, :freq   WC  'a big misnomer
  if_C mov :cameFrom, :freqN

  sub :faddr, #4 
  djnz :freqN, #:loop8
  

  wrlong :highFreq,:faddr
  sub :faddr, #4
  wrlong :cameFrom, :faddr

  xor outa, : port
  jmp #:loopN     'repeat forever to meassure strobe at pin 30
  
  : port long 1<<30
  :faddr long 0
  :freqN long 0
  :highfreq long 0
  :freq long 0
  :cameFrom long 0 
  FIT 496

Post Edited (deSilva) : 7/9/2007 11:23:53 PM GMT

mirror · 2007-07-09 23:58

I think that #> and <# have an unfortunate naming (limit minimum and limit maximum) in the Propeller manual. As a result, every time I think about using them I have to look at the manual again. How about this as a proposal for naming these two operators:

In English:
- A > B is read "A is greater than B" or "is A greater than B"
- A < B is read "A is less than B" or "is A less than B"

If # is read as "pick" then:
- A #> B could be read as "Pick the greater of A and B"
- A <# B could be read as "Pick the lesser of A and B"

So the operator names then become:
- #> "pick greater"
- <# "pick lesser"

Similarly:
A #>= B reads "pick the greater value of A and B and assign it to A"

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔

codemonkey · 2007-07-10 02:26

After seeing Beau's code, then heading right to the spin manual to figure out what they do, then having to read the description to find out what the heck they really do [noparse][[/noparse]i guess that's the difference between an IC Layout Engineer and a code monkey], i mentally noted:

a #> b

ahhh, that's the MAX(a, b) function and similarly

a#<b

is the MIN(a, b) function. And the "=" part works like any other assignment thingie (eg. +=). If you wanna throw in the word "pick" for the "#", sure, fine. It already has 3 different meanings, why not a fourth?

Now i just have to figure out a good place to use them [noparse]:)[/noparse]

mirror · 2007-07-10 05:46

codemonkey said...
After seeing Beau's code, then heading right to the spin manual to figure out what they do, then having to read the description to find out what the heck they really do [noparse][[/noparse]i guess that's the difference between an IC Layout Engineer and a code monkey], i mentally noted:

a #> b

ahhh, that's the MAX(a, b) function and similarly

a#<b

is the MIN(a, b) function. And the "=" part works like any other assignment thingie (eg. +=). If you wanna throw in the word "pick" for the "#", sure, fine. It already has 3 different meanings, why not a fourth?

Now i just have to figure out a good place to use them [noparse]:)[/noparse]

Except that in the manual:
#> is called Limit Minimum
<# is called Limit Maximum

So:
Limit Minimum is the same as max(a,b)
Limit Maximum is the same as min(a,b)

I do find the Parallax naming of these two operators a·bit confusing.

deSilva · 2007-07-10 07:09

This discussion seems to start everytime a "newbie" discovers theses operators

The mess is even deeper in assembly where "limit min" is just abbreviated "min" (see my code example three postings above), although - as everyone is aware of - is the mathematical MAX-function.

The main application for min and max however is not as in our example where you "widen" a certain range (looking for the highest frequency, starting with e.g. 0) but in "clipping" algorithms.

It is extremely simple to garantee a value to be in an a,b-range by writing

value := a #> value <#  b

For me, keeping this "pattern" in mind helped a lot!

I gave a wrong line yesterday in the early morning, seems the "pattern" slipped my mind

Thanks to MIRROR!!

Whats more important, the above example can be performed by just two machine instructions!!

Post Edited (deSilva) : 7/10/2007 11:48:39 AM GMT

deSilva · 2007-07-10 07:39

On using machinecode to speed-up "small" operations

In this example it was possible to reduce a 100 mys SPIN code sequence to 4 mys pure machine code. To utilize this advantage fully it is extremely important to design the interface to "the other" COG to not need a copying of too many variables in SPIN. I.e. the parameter blocks have to be constructed very carefully!

In the above example, you can either let the COG compute the algorithm permanently (the "data-flow" approach) or - more typically - start it by storing a specific value into a communication cell and waiti for the result; in my example:

cameFrom :=0
repeat while cameFrom == 0

This interface however will take more than 13 mys, reducing the generic speed-up considerable (by a factor of 3!). In fact the while loop is superfluous if "the other" COG performs the request faster than a single line of SPIN code.

ErNa · 2007-07-10 08:24

48mys: FMax := ((F1 #> F2) #> (F3 #> F4)) #> ((F5 #> F6) #> (F7 #> F8))

mirror · 2007-07-10 08:40

deSilva said...
This discussion seems to start everytime a "newbie" discovers theses operators
The mess is even deeper in assembly where "limit min" is just abbreviated "min" (see my code example three postings above), although - as everyone is aware of - is the mathematical MAX-function.

The main application for min and max however is not as in our example where you "widen" a certain range (looking for the highest frequency, starting with e.g. 0) but in "clipping" algorithms.

It is extremely simple to garantee a value to be in an a,b-range by writing
value := a <# value #> b
For me, keeping this "pattern" in mind helped a lot!

Whats more important, the above example can be performed by just two machine instructions!!

Being entirely theoretical, the manual states:
<quote on Pg 261>
X := Y + 21 <# 250
The above example adds 21 to Y and limits the result to a maximum value to 250.
</quote>

So to limit a value in the range a to b (where a < b) then I would think you need to write it as:
value := a #> value <# b

Have I misread the manual?

Post Edited (mirror) : 7/10/2007 8:59:14 AM GMT

deSilva · 2007-07-10 11:43

No, you have not! But it looked "nice" when I wrote it down

And I only use it an assembler... I'll change my posting to avoid misunderstandings. Thank you!

deSilva · 2007-07-10 11:58

ErNa said...
48mys: FMax := ((F1 #> F2) #> (F3 #> F4)) #> ((F5 #> F6) #> (F7 #> F8))

My Benchmark even says 39 mys

*)
You assume, the coding (Shift, or) has ben done by the data provider.
But you also have to add the de-coding:

CameFrom := HighFreq&$7+1
HighFreq := HighFreq>>3

Giving additional 21 mys

So we now have constant 60+ mys in contrast to changing 70 to 130 mys of the original code.

*) You most likely used DAT variables oder variables in higher memory rather than low memory variables ; in that case I have 50 mys very close to your 48 mys.

Post Edited (deSilva) : 7/10/2007 12:05:44 PM GMT

ErNa · 2007-07-10 12:26

I just wanted to show a different way. In the sense of parallelism: A data source generates data objects, these are send to a free processor. This processor evalutate two consecutive element and forwards the greater value to an processor in the next level. He also could send the smaller to another. This can establish an sorting algorithm.

deSilva · 2007-07-10 12:31

And it is the only technique - AFAIK - to really speed up searching in an unsorted vector: "Divide and Conquer"!

Fred Hawkins · 2007-07-10 14:11

mirror said...

codemonkey said...
After seeing Beau's code, then heading right to the spin manual to figure out what they do, then having to read the description to find out what the heck they really do [noparse][[/noparse]i guess that's the difference between an IC Layout Engineer and a code monkey], i mentally noted:

a #> b

ahhh, that's the MAX(a, b) function and similarly

a#<b

is the MIN(a, b) function. And the "=" part works like any other assignment thingie (eg. +=). If you wanna throw in the word "pick" for the "#", sure, fine. It already has 3 different meanings, why not a fourth?

Now i just have to figure out a good place to use them [noparse]:)[/noparse]

Except that in the manual:
#> is called Limit Minimum
<# is called Limit Maximum

So:
Limit Minimum is the same as max(a,b)
Limit Maximum is the same as min(a,b)

I do find the Parallax naming of these two operators a·bit confusing.

Makes me wish for a c preprocessor to rename the obscure.

Tracy Allen · 2007-07-10 16:59

mirror said...

Except that in the manual:

#> is called Limit Minimum

<# is called Limit Maximum

So:

Limit Minimum is the same as max(a,b)

Limit Maximum is the same as min(a,b)

I do find the Parallax naming of these two operators a bit confusing.

Yes, the shape of the operators make it seem that they have directionality, but in fact they don't. They are commutative, in the mathematical sense that
a #> b = b #> a
and
a <# b = b <# a

That is clear when you think of them as MAX(a,b)=MAX(b,a) or MIN(a,b)=MIN(b,a) respectively.

Also, they have the associate property (that ErNa used):
(a #> b ) #> c = a #> (b #> c)

As a mnemonic I think of #> as , "number greatest" (The arrow points UP the number line), and <# as "lowest number" (The arrow points DOWN the number line). But no directional "than" as in greater or less than ....

The Stamp PBASIC has the same issue with its MIN and MAX operators that have to be stood on their head.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Tracy Allen
www.emesystems.com

Post Edited (Tracy Allen) : 7/10/2007 5:42:54 PM GMT

mpark · 2007-07-11 06:42

Well, you could just read "a #> b" as "the greater of a and b".

I can kind of see where the "limit minimum" name comes from, though I think it's an unfortunate choice. You have to forget the mathematical sense of min(a,b) and go with the common English usage of minimum in phrases such as "$5 minimum" and "minimum order: 3 pcs". In other words, "#>" sets a lower limit. It is commutative, but I think it works best if the lower limit is on the right of "#>". Some hypothetical Spin examples:

  bet := inputAmount #> 5            ' read as "minimum bet: $5" (or "lower limit: $5" or "at least $5")
  orderQty := desiredQty #> 3        ' minimum order: 3
  SetThrottle( desiredSpeed #< 60 )  ' max speed: 60 mph

Tracy Allen · 2007-07-11 16:53

I like that mnemonic, #> as the "greater of a and b" and <# as the "lesser of a and b".

In many usages, it will not be known in advance which side will be the limit. That is the case with Beau's deucedly clever formula, where all variables are peers.

I first came up to this kind of function in an analog computer, where potentiometers actually set the lower and upper limits of control or output variables. When the lab started implementing it on a PDP 7, they carried over the same logic, with lower and upper limits, but I think they called them floor and ceiling.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Tracy Allen
www.emesystems.com

deSilva · 2007-07-11 17:44

Tracy Allen said...
In many usages, it will not be known in advance which side will be the limit.

In fact. This is why you need this function

MAX and MIN are most heavily used operations, and it was very wise tzo implement them in hardware! Higher programming languages sometimes have "generic" MAX and MIN functions, allowing you to write:
MAX ( a,b,c,d,e,...) ode MIN (a,b,c,d,e,....
)

So there is nothing peculiar with this operation except that it has got the wrong name - as many things with the Propeller are counter experience and counter intuition.

Tracy Allen said...
I like that mnemonic, #> as the "greater of a and b" and <# as the "lesser of a and b".

I also think this is a good memory hook... I have more difficultities in assembly where it is called MIN and MAX, the opposite way around

Post Edited (deSilva) : 7/11/2007 5:49:14 PM GMT

A faster way?

Comments