PNut/Spin2 Latest Version (v51 - New POW, LOG2, EXP2, LOG10, EXP10, LOG, EXP floating-point ops)

ersmith · 2021-09-23 22:03

@evanh said:
Good stuff Wuerfel_21. And quick. I'm sure Eric is happy there is someone else with the skills to help.

Very happy. Ada has contributed so much, not just in the obvious things she's added to flexspin, but also behind the scenes with bug reports, suggestions, and testing. She's really a great programmer!

Rayman · 2021-09-23 22:54

@cgracey said:

@Rayman said:
Great to see floating point in Spin2!

Does Spin2 use up more of the cog memory now?

The Spin2 interpreter is now 4,784 bytes long, occupying $00000..$012AF. The floating-point instructions added a total of 554 bytes of code and data to the interpreter, increasing its size by 13%.

Wait, that must be HUB space, right? Is cog usage still the same?

cgracey · 2021-09-24 03:28

@Rayman said:

@cgracey said:

@Rayman said:
Great to see floating point in Spin2!

Does Spin2 use up more of the cog memory now?

The Spin2 interpreter is now 4,784 bytes long, occupying $00000..$012AF. The floating-point instructions added a total of 554 bytes of code and data to the interpreter, increasing its size by 13%.

Wait, that must be HUB space, right? Is cog usage still the same?

Sorry, I didn't answer your actual question.

Cog RAM usage is exactly the same. No change there.
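
As a quick sanity check on the figures quoted above, here is a small Python sketch (not part of PNut; the variable names are made up for illustration). It only confirms that $00000..$012AF spans 4,784 bytes and that 554 bytes is roughly a 13% increase over the previous interpreter size.

```python
# Figures are taken from the post above; the names are illustrative only.
first_addr = 0x00000
last_addr = 0x012AF
interpreter_bytes = last_addr - first_addr + 1   # inclusive address range
float_bytes = 554                                # code + data added for floating point

previous_bytes = interpreter_bytes - float_bytes
growth_percent = 100 * float_bytes / previous_bytes

print(interpreter_bytes)          # 4784
print(previous_bytes)             # 4230
print(round(growth_percent, 1))   # 13.1
```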

ersmith · 2021-09-24 14:06

@"Stephen Moraco" said:
@Wuerfel_21 thanks for clarifying, I jumped the gun... I'll look forward to the new operations arriving when they do!

They're in github now.

evanh · 2021-09-25 08:22

Cool! floating_point_demo.spin2 now works with flexspin.

evanh · 2021-09-25 08:46

Hmm, just investigated converting the clock frequency calculations back into floats. But it doesn't really look any better unless all variables are converted back. Otherwise the code looks just as untidy with lots of float() and round() sprinkled around.

Really, muldiv65() does what is needed there. After all, the objective is to find the best set of integer components of clock-mode for a HUBSET instruction.

pik33 · 2021-09-25 09:29

@ersmith said:

@"Stephen Moraco" said:
@Wuerfel_21 thanks for clarifying, I jumped the gun... I'll look forward to the new operations arriving when they do!

They're in github now.

@evanh said:
Cool! floating_point_demo.spin2 now works with flexspin.

What I can see now on Github is the commit from 10 days ago (2021.09.15) without any floating point debug/operators

evanh · 2021-09-25 09:55

Ah, yeah, I'm not using the website. Using the "git pull" and "make" commands from shell instead. Eric has updated the master dev branch in the Git repository. One advantage of having a Linux box. These tools are just there, even if they're a mystery to work sometimes.

pik33 · 2021-09-25 10:54

@evanh said:
Ah, yeah, I'm not using the website. Using the "git pull" and "make" commands from shell instead. Eric has updated the master dev branch in the Git repository. One advantage of having a Linux box. These tools are just there, even if they're a mystery to work sometimes.

I have several raspberries to try this.
Edit: fresh build on rpi400 doesn't understand new operators. "Unexpected ."

evanh · 2021-09-25 11:41

Is that the make? Or is that a flexspin error?

Do a make clean and make it again. And make sure you're actually using the fresh build of flexspin.

pik33 · 2021-09-25 11:46

I did git pull, make, all went ok and I have a working flexprop on my RPi400. It says it is 5.9.2
However if I tried

c:= a /. b

it told me "unexpected ." while trying to compile this for a P2

evanh · 2021-09-25 12:04

It may not be part of flexprop. Did you build flexspin alone? git clone https://github.com/totalspectrum/spin2cpp

Or you can download the zip from the website and unzip then git pull and make from that.

evanh · 2021-09-25 12:13

Yeah, that'll be it. My flexspin version string is Version 5.9.3-beta-v5.9.2-21-gd463d1ea Compiled on: Sep 25 2021

ersmith · 2021-09-25 12:13

@pik33 said:
I did git pull, make, all went ok and I have a working flexprop on my RPi400. It says it is 5.9.2
However if I tried

c:= a /. b

it told me "unexpected ." while trying to compile this for a P2

Sorry, I had only updated spin2cpp, not flexprop. Try again now (and if you're still having trouble do a "make clean" before "make install").

pik33 · 2021-09-25 13:10

Now it is 5.9.3 and it compiled /.

ersmith · 2021-09-25 14:17

@cgracey : I've found a few bugs in the Spin2 floating point routines. There are a few general categories:

(1) Rounding errors: IEEE 32 says results should be rounded to the nearest even bit pattern. I think Spin2 may always be rounding up.
(2) Overflow errors: Many operations which should give infinite results give NaN patterns (but not always the "official" NaN)
(3) Negative zero: there are a few cases where $8000_0000 needs to be treated specially.
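
For readers who don't have the binary32 layout memorized, the following Python sketch decodes the kinds of bit patterns that show up in the results. It is a host-side illustration only, not Spin2 code, and the `classify` helper is a hypothetical name. It shows that $7F80_0000 and $FF80_0000 are the two infinities, that any pattern with an all-ones exponent and a non-zero fraction is a NaN, and that $8000_0000 is negative zero.

```python
def classify(bits: int) -> str:
    """Classify a 32-bit IEEE 754 binary32 bit pattern (illustrative helper)."""
    sign = "-" if bits >> 31 else "+"
    exponent = (bits >> 23) & 0xFF      # 8 exponent bits
    fraction = bits & 0x7FFFFF          # 23 fraction bits
    if exponent == 0xFF:
        return "NaN" if fraction else sign + "Inf"
    if exponent == 0:
        return sign + ("subnormal" if fraction else "0")
    return sign + "normal"

for pattern in (0x7F800000, 0xFF800000, 0x7FC00000,
                0x7FFFFFFF, 0x80000000, 0x00000001):
    print(f"${pattern:08X} -> {classify(pattern)}")
```

Both $7FC0_0000 and $7FFF_FFFF decode as NaN here, which is why a result of $7FFF_FFFF where $7F80_0000 was expected counts as an overflow error rather than a cosmetic difference.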

I've attached a .zip file with the test program I used to generate these.

IEEE 32 bit binary tests
Add tests: 
  Error: 7F800000 +. 7F800000 -> 7FFFFFFF expected 7F800000
  Error: FF800000 +. FF800000 -> FFFFFFFF expected FF800000
  Error: 7F800000 +. 7F000000 -> 7FC00000 expected 7F800000
  Error: 7F800000 +. FF000000 -> 7F000000 expected 7F800000
  Error: FF800000 +. 7F000000 -> FF000000 expected FF800000
  Error: FF800000 +. FF000000 -> FFC00000 expected FF800000
  Error: 7F000000 +. 7F800000 -> 7FC00000 expected 7F800000
  Error: 7F000000 +. FF800000 -> FF000000 expected FF800000
  Error: FF000000 +. 7F800000 -> 7F000000 expected 7F800000
  Error: FF000000 +. FF800000 -> FFC00000 expected FF800000
  Error: 7F7FFFFE +. 7F7FFFFE -> 7FFFFFFE expected 7F800000
  Error: FF7FFFFE +. FF7FFFFE -> FFFFFFFE expected FF800000
  Error: 3F800001 +. 3F800000 -> 40000001 expected 40000000
  Error: BF800001 +. BF800000 -> C0000001 expected C0000000
  Error: C0000000 +. C0000001 -> C0800001 expected C0800000
  Error: 40000000 +. 40000001 -> 40800001 expected 40800000
  Error: 7F7FFFFE +. 7F7FFFFF -> 7FFFFFFF expected 7F800000
  Error: FF7FFFFE +. FF7FFFFF -> FFFFFFFF expected FF800000
  Error: 7F000001 +. 7F000000 -> 7F800001 expected 7F800000
  Error: FF000001 +. FF000000 -> FF800001 expected FF800000
  Error: 7E800001 +. 7E800000 -> 7F000001 expected 7F000000
  Error: FE800001 +. FE800000 -> FF000001 expected FF000000
  Error: 7EFFFFFE +. 7EFFFFFF -> 7F7FFFFF expected 7F7FFFFE
  Error: FEFFFFFE +. FEFFFFFF -> FF7FFFFF expected FF7FFFFE
  Error: 40000000 +. 34000000 -> 40000001 expected 40000000
  Error: C07FFFFF +. B3FFFFFF -> C0800000 expected C07FFFFF
  Error: 3F800000 +. 3F800001 -> 40000001 expected 40000000
  Error: BF800000 +. BF800001 -> C0000001 expected C0000000
 Elapsed microseconds: 225
 count: 344 passed: 316
Sub tests: 
  Error: 3F800001 -. BF800000 -> 40000001 expected 40000000
  Error: BF800001 -. 3F800000 -> C0000001 expected C0000000
  Error: 7F800000 -. FF800000 -> 7FFFFFFF expected 7F800000
  Error: FF800000 -. 7F800000 -> FFFFFFFF expected FF800000
  Error: 7F800000 -. FF000000 -> 7FC00000 expected 7F800000
  Error: 7F800000 -. 7F000000 -> 7F000000 expected 7F800000
  Error: FF800000 -. FF000000 -> FF000000 expected FF800000
  Error: FF800000 -. 7F000000 -> FFC00000 expected FF800000
  Error: 7F000000 -. FF800000 -> 7FC00000 expected 7F800000
  Error: 7F000000 -. 7F800000 -> FF000000 expected FF800000
  Error: FF000000 -. FF800000 -> 7F000000 expected 7F800000
  Error: FF000000 -. 7F800000 -> FFC00000 expected FF800000
  Error: 7F7FFFFE -. FF7FFFFE -> 7FFFFFFE expected 7F800000
  Error: FF7FFFFE -. 7F7FFFFE -> FFFFFFFE expected FF800000
  Error: C0000000 -. 40000001 -> C0800001 expected C0800000
  Error: 40000000 -. C0000001 -> 40800001 expected 40800000
  Error: 7F7FFFFE -. FF7FFFFF -> 7FFFFFFF expected 7F800000
  Error: FF7FFFFE -. 7F7FFFFF -> FFFFFFFF expected FF800000
  Error: 7F000001 -. FF000000 -> 7F800001 expected 7F800000
  Error: FF000001 -. 7F000000 -> FF800001 expected FF800000
  Error: 7E800001 -. FE800000 -> 7F000001 expected 7F000000
  Error: FE800001 -. 7E800000 -> FF000001 expected FF000000
  Error: 7EFFFFFE -. FEFFFFFF -> 7F7FFFFF expected 7F7FFFFE
  Error: FEFFFFFE -. 7EFFFFFF -> FF7FFFFF expected FF7FFFFE
  Error: 40000000 -. B4000000 -> 40000001 expected 40000000
  Error: C07FFFFF -. 33FFFFFF -> C0800000 expected C07FFFFF
 Elapsed microseconds: 305
 count: 316 passed: 290
Mul tests: 
  Error: C0000000 *. 7F800000 -> FFFFFFFF expected FF800000
  Error: C0800000 *. FF800000 -> 7FFFFFFF expected 7F800000
  Error: 40A00000 *. 7F800000 -> 7FFFFFFF expected 7F800000
  Error: 40E00000 *. FF800000 -> FFFFFFFF expected FF800000
  Error: C1100000 *. 7F000000 -> FFFFFFFF expected FF800000
  Error: C0400000 *. FF000000 -> 7FC00000 expected 7F800000
  Error: BF000001 *. 00000001 -> 80000000 expected 80000001
  Error: BFC00000 *. 80000001 -> 00000001 expected 00000002
  Error: C0200001 *. 00000001 -> 80000002 expected 80000003
  Error: C0600000 *. 80000001 -> 00000003 expected 00000004
  Error: 7F800000 *. 7F800000 -> 7FFFFFFF expected 7F800000
  Error: FF800000 *. 7F800000 -> FFFFFFFF expected FF800000
  Error: 7F800000 *. FF800000 -> FFFFFFFF expected FF800000
  Error: FF800000 *. FF800000 -> 7FFFFFFF expected 7F800000
  Error: 7F800000 *. C0400000 -> FFFFFFFF expected FF800000
  Error: FF800000 *. 40C00000 -> FFFFFFFF expected FF800000
  Error: FF800000 *. C1000000 -> 7FFFFFFF expected 7F800000
  Error: 7F000000 *. 7F800000 -> 7FFFFFFF expected 7F800000
  Error: FE800000 *. 7F800000 -> FFFFFFFF expected FF800000
  Error: 7F800000 *. FF000000 -> FFFFFFFF expected FF800000
  Error: FF800000 *. FE800000 -> 7FFFFFFF expected 7F800000
  Error: 7F800000 *. 7EFFFFFF -> 7FFFFFFF expected 7F800000
  Error: FE7FFFFF *. 7F800000 -> FFFFFFFF expected FF800000
  Error: 7F800000 *. FF7FFFFF -> FFFFFFFF expected FF800000
  Error: FF7FFFFF *. FF800000 -> 7FFFFFFF expected 7F800000
  Error: 00800000 *. 7F800000 -> 40800000 expected 7F800000
  Error: 81000000 *. 7F800000 -> C1000000 expected FF800000
  Error: 7F800000 *. 81000000 -> C1000000 expected FF800000
  Error: FF800000 *. 80800000 -> 40800000 expected 7F800000
  Error: 7F800000 *. 00FFFFFF -> 40FFFFFF expected 7F800000
  Error: 80800001 *. 7F800000 -> C0800001 expected FF800000
  Error: 7F800000 *. 80800001 -> C0800001 expected FF800000
  Error: 80FFFFFF *. FF800000 -> 40FFFFFF expected 7F800000
  Error: 00000001 *. 7F800000 -> 35000000 expected 7F800000
  Error: 80000003 *. 7F800000 -> B5C00000 expected FF800000
  Error: 7F800000 *. 80000002 -> B5800000 expected FF800000
  Error: FF800000 *. 80000004 -> 36000000 expected 7F800000
  Error: 7F800000 *. 007FFFFF -> 407FFFFE expected 7F800000
  Error: 807FFFFF *. 7F800000 -> C07FFFFE expected FF800000
  Error: 7F800000 *. 807FFFFF -> C07FFFFE expected FF800000
  Error: 807FFFFF *. FF800000 -> 407FFFFE expected 7F800000
  Error: C03FFFFE *. 7F000000 -> FFBFFFFE expected FF800000
  Error: C09FFFFE *. FF000000 -> 7FFFFFFF expected 7F800000
  Error: C0DFFFF9 *. 7F000000 -> FFFFFFFF expected FF800000
  Error: C1100001 *. FF000000 -> 7FFFFFFF expected 7F800000
  Error: 7F000000 *. 7F000000 -> 7FFFFFFF expected 7F800000
  Error: FF7FFFFD *. 7F000000 -> FFFFFFFF expected FF800000
  Error: 7F000000 *. FE800004 -> FFFFFFFF expected FF800000
  Error: FF000005 *. FF000001 -> 7FFFFFFF expected 7F800000
  Error: 7F000009 *. 7F7FFFFA -> 7FFFFFFF expected 7F800000
  Error: FE7FFFF9 *. 7F000000 -> FFFFFFFF expected FF800000
  Error: 7F000000 *. FE800000 -> FFFFFFFF expected FF800000
  Error: FF7FFFFF *. FF7FFFFF -> 7FFFFFFF expected 7F800000
  Error: FEFFFFF7 *. 7E800001 -> FFFFFFFF expected FF800000
  Error: 7F000000 *. FF000000 -> FFFFFFFF expected FF800000
  Error: 7F000000 *. 7F7FFFFE -> 7FFFFFFF expected 7F800000
  Error: FF7FFFFD *. FF000001 -> 7FFFFFFF expected 7F800000
  Error: 7EFFFFFD *. C0000008 -> FF800006 expected FF800000
  Error: 3F800002 *. 7F7FFFFE -> 7F800001 expected 7F800000
  Error: 7F000009 *. C0C00002 -> FFFFFFFF expected FF800000
  Error: FF7FFFFD *. C0400001 -> 7FFFFFFF expected 7F800000
  Error: 00000001 *. 3F7FFFFF -> 00000000 expected 00000001
  Error: 00000001 *. BF7FFFFF -> 80000000 expected 80000001
  Error: 00FFFFFF *. 3F000000 -> 007FFFFF expected 00800000
  Error: 00FFFFFF *. BF000000 -> 807FFFFF expected 80800000
 Elapsed microseconds: 490
 count: 334 passed: 269
Div tests: 
  Error: 7F800000 /. 00000000 -> FFFFFFFF expected 7F800000
  Error: FF800000 /. 00000000 -> FFFFFFFF expected FF800000
  Error: 7F800000 /. 80000000 -> FFFFFFFF expected FF800000
  Error: FF800000 /. 80000000 -> FFFFFFFF expected 7F800000
  Error: FF800000 /. 40000000 -> FF000000 expected FF800000
  Error: 7F800000 /. C0400000 -> FEAAAAAB expected FF800000
  Error: FF800000 /. C0800000 -> 7E800000 expected 7F800000
  Error: 7F800000 /. 40A00000 -> 7E4CCCCD expected 7F800000
  Error: FF800000 /. 40C00000 -> FE2AAAAB expected FF800000
  Error: 7F800000 /. C0E00000 -> FE124925 expected FF800000
  Error: FF800000 /. C1000000 -> 7E000000 expected 7F800000
  Error: 3F800000 /. 7F800000 -> 00200000 expected 00000000
  Error: C0000000 /. 7F800000 -> 80400000 expected 80000000
  Error: 40400000 /. FF800000 -> 80600000 expected 80000000
  Error: C0800000 /. FF800000 -> 00800000 expected 00000000
  Error: 40A00000 /. 7F800000 -> 00A00000 expected 00000000
  Error: C0C00000 /. 7F800000 -> 80C00000 expected 80000000
  Error: 40E00000 /. FF800000 -> 80E00000 expected 80000000
  Error: C1000000 /. FF800000 -> 01000000 expected 00000000
  Error: 7F000000 /. 7F800000 -> 3F000000 expected 00000000
  Error: FE800000 /. 7F800000 -> BE800000 expected 80000000
  Error: 7F000000 /. FF800000 -> BF000000 expected 80000000
  Error: FE800000 /. FF800000 -> 3E800000 expected 00000000
  Error: 7EFFFFFF /. 7F800000 -> 3EFFFFFF expected 00000000
  Error: FE7FFFFF /. 7F800000 -> BE7FFFFF expected 80000000
  Error: 7F7FFFFF /. FF800000 -> BF7FFFFF expected 80000000
  Error: FF7FFFFF /. FF800000 -> 3F7FFFFF expected 00000000
  Error: 7F800000 /. 7F000000 -> 40000000 expected 7F800000
  Error: FF800000 /. 7E800000 -> C0800000 expected FF800000
  Error: 7F800000 /. FF000000 -> C0000000 expected FF800000
  Error: FF800000 /. FE800000 -> 40800000 expected 7F800000
  Error: 7F800000 /. 7EFFFFFF -> 40000001 expected 7F800000
  Error: 7F800000 /. FE7FFFFF -> C0800001 expected FF800000
  Error: 7F800000 /. FF7FFFFF -> BF800001 expected FF800000
  Error: FF800000 /. FF7FFFFF -> 3F800001 expected 7F800000
  Error: 7F800000 /. 00800000 -> 7FFFFFFF expected 7F800000
  Error: FF800000 /. 01000000 -> FFFFFFFF expected FF800000
  Error: 7F800000 /. 81000000 -> FFFFFFFF expected FF800000
  Error: FF800000 /. 80800000 -> 7FFFFFFF expected 7F800000
  Error: 7F800000 /. 00FFFFFF -> 7FFFFFFF expected 7F800000
  Error: FF800000 /. 00800001 -> FFFFFFFF expected FF800000
  Error: 7F800000 /. 80800001 -> FFFFFFFF expected FF800000
  Error: FF800000 /. 80FFFFFF -> 7FFFFFFF expected 7F800000
  Error: 7F800000 /. 00000001 -> 7FFFFFFF expected 7F800000
  Error: FF800000 /. 00000003 -> FFFFFFFF expected FF800000
  Error: 7F800000 /. 80000002 -> FFFFFFFF expected FF800000
  Error: FF800000 /. 80000004 -> 7FFFFFFF expected 7F800000
  Error: 7F800000 /. 007FFFFF -> 7FFFFFFF expected 7F800000
  Error: FF800000 /. 007FFFFF -> FFFFFFFF expected FF800000
  Error: 7F800000 /. 807FFFFF -> FFFFFFFF expected FF800000
  Error: FF800000 /. 807FFFFF -> 7FFFFFFF expected 7F800000
  Error: 3F800000 /. 00000000 -> FFFFFFFF expected 7F800000
  Error: C0000000 /. 00000000 -> FFFFFFFF expected FF800000
  Error: 40400000 /. 80000000 -> FFFFFFFF expected FF800000
  Error: C0800000 /. 80000000 -> FFFFFFFF expected 7F800000
  Error: 40A00000 /. 00000000 -> FFFFFFFF expected 7F800000
  Error: C0C00000 /. 00000000 -> FFFFFFFF expected FF800000
  Error: 40E00000 /. 80000000 -> FFFFFFFF expected FF800000
  Error: C1000000 /. 80000000 -> FFFFFFFF expected 7F800000
  Error: 7F000000 /. 00000000 -> FFFFFFFF expected 7F800000
  Error: FE800000 /. 00000000 -> FFFFFFFF expected FF800000
  Error: 7F000000 /. 80000000 -> FFFFFFFF expected FF800000
  Error: FE800000 /. 80000000 -> FFFFFFFF expected 7F800000
  Error: 7EFFFFFF /. 00000000 -> FFFFFFFF expected 7F800000
  Error: FE7FFFFF /. 00000000 -> FFFFFFFF expected FF800000
  Error: 7E7FFFFF /. 80000000 -> FFFFFFFF expected FF800000
  Error: FEFFFFFF /. 80000000 -> FFFFFFFF expected 7F800000
  Error: 00800000 /. 00000000 -> FFFFFFFF expected 7F800000
  Error: 81000000 /. 00000000 -> FFFFFFFF expected FF800000
  Error: 01000000 /. 80000000 -> FFFFFFFF expected FF800000
  Error: 80800000 /. 80000000 -> FFFFFFFF expected 7F800000
  Error: 00FFFFFF /. 00000000 -> FFFFFFFF expected 7F800000
  Error: 80800001 /. 00000000 -> FFFFFFFF expected FF800000
  Error: 00800001 /. 80000000 -> FFFFFFFF expected FF800000
  Error: 80FFFFFF /. 80000000 -> FFFFFFFF expected 7F800000
  Error: 00000001 /. 00000000 -> FFFFFFFF expected 7F800000
  Error: 80000003 /. 00000000 -> FFFFFFFF expected FF800000
  Error: 00000002 /. 80000000 -> FFFFFFFF expected FF800000
  Error: 80000004 /. 80000000 -> FFFFFFFF expected 7F800000
  Error: 007FFFFF /. 00000000 -> FFFFFFFF expected 7F800000
  Error: 807FFFFF /. 00000000 -> FFFFFFFF expected FF800000
  Error: 007FFFFF /. 80000000 -> FFFFFFFF expected FF800000
  Error: 807FFFFF /. 80000000 -> FFFFFFFF expected 7F800000
  Error: 7B000000 /. 05000000 -> 7FFFFFFF expected 7F800000
  Error: 7F7FFFFF /. 00000001 -> 7FFFFFFF expected 7F800000
  Error: 7F000000 /. 007FFFFF -> 7FFFFFFF expected 7F800000
  Error: 00FFFFFF /. 40000000 -> 007FFFFF expected 00800000
  Error: 5E800000 /. 00000000 -> FFFFFFFF expected 7F800000
  Error: 46FFFE00 /. 00000000 -> FFFFFFFF expected 7F800000
  Error: C6FFFE00 /. 00000000 -> FFFFFFFF expected FF800000
 Elapsed microseconds: 740
 count: 379 passed: 289
Sqrt tests: 
  Error: FSQRT 80000000 -> FFFFFFFF expected 80000000
  Error: FSQRT 7F800000 -> 5F800000 expected 7F800000
 Elapsed microseconds: 753
 count: 96 passed: 94
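
A few of the "expected" values in the add tests can be reproduced on a PC with the Python sketch below. This is not the attached test program, just a host-side cross-check under two assumptions: that adding the two inputs as doubles is exact for these cases, and that CPython's `struct` packing of a double into an `f` rounds to nearest with ties to even (which I believe holds on common platforms). The helper names are hypothetical.

```python
import struct

INF_BITS = 0x7F800000

def to_float(bits: int) -> float:
    """Reinterpret a 32-bit pattern as a binary32 value (widened to a Python float)."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def to_bits(value: float) -> int:
    """Round a Python float to binary32 and return its bit pattern."""
    try:
        return struct.unpack("<I", struct.pack("<f", value))[0]
    except (OverflowError, struct.error):
        # Finite value too large for binary32: IEEE round-to-nearest gives infinity.
        return INF_BITS if value > 0 else INF_BITS | 0x80000000

def reference_add(a_bits: int, b_bits: int) -> int:
    return to_bits(to_float(a_bits) + to_float(b_bits))

# (a, b, expected) triples copied from the add-test output above.
cases = [
    (0x3F800001, 0x3F800000, 0x40000000),
    (0x40000000, 0x34000000, 0x40000000),
    (0x7EFFFFFE, 0x7EFFFFFF, 0x7F7FFFFE),
    (0x7F7FFFFE, 0x7F7FFFFE, 0x7F800000),
    (0xC07FFFFF, 0xB3FFFFFF, 0xC07FFFFF),
    (0x7F800000, 0xFF000000, 0x7F800000),
]
for a, b, expected in cases:
    got = reference_add(a, b)
    print(f"{a:08X} +. {b:08X} -> {got:08X} (expected {expected:08X})")
```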

evanh · 2021-09-25 14:32

Rounding errors: IEEE 32 says results should be rounded to the nearest even bit pattern

Gee, never tried to consider that level of specificity myself. I'm guessing there is some particular method that produces that as a natural outcome ...

ersmith · 2021-09-25 14:33

Oh, and another (minor) bug: DEBUG(fdec(x)) is printing "3.402824e+38" instead of "Inf" for $7f80_0000.

ersmith · 2021-09-25 14:40

And another oddity: NaN ==. NaN is returning FALSE (as it should), but NaN <>. NaN is also returning FALSE. I'm not sure if this is a bug, exactly, but in most other languages I've seen NaN <>. NaN would return TRUE, just like !(NaN ==. NaN).

~~EDITED: Whoops! It turns out that NaN <>. NaN should be false, according to the spec, so this is a case where Spin2 is doing the right thing, and gcc is not.~~

EDITED further: I should have gone back to the actual standard rather than relying on web pages. The IEEE spec is very careful in specifying that any comparison involving NaN should return unordered, which is distinct from equal (and hence x "not equal" x should be TRUE if x is NaN).
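
For comparison, this is how the unordered rule looks in a language whose float comparisons follow IEEE 754. The sketch uses Python on a PC, not Spin2, so it says nothing about what the Spin2 operators currently return; it only illustrates the behaviour the standard describes.

```python
import math

nan = float("nan")

print(nan == nan)            # False: NaN is unordered, so it is never "equal"
print(nan != nan)            # True: "not equal" holds for unordered operands
print(nan < nan, nan > nan)  # False False: ordered comparisons all fail
print(math.isnan(nan))       # True: the reliable way to test for NaN
```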

ersmith · 2021-09-25 16:56

@evanh said:

Rounding errors: IEEE 32 says results should be rounded to the nearest even bit pattern

Gee, never tried to consider that level of specificity myself. I'm guessing there is some particular method that produces that as a natural outcome ...

No, it's just that if you have a value like say 1.5 that's exactly half way between integers, then there's no unique "nearest" value to round to. Always rounding up results in a small bias away from 0. This can matter a lot in scientific and financial applications. For an integer example, if you have inputs that are exactly 1.5 and 2.5 and you round up then you get 2 and 3 (the sum is 5), whereas if you round to nearest even you get 2 and 2 ( the sum is 4, the same as the sum of the original inputs). This special rounding only needs to happen if the input is exactly half way between possible outputs of the rounding; otherwise, of course, you round to the nearest number.
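
The 1.5 / 2.5 example above can be reproduced with a few lines of Python. This is only an illustration of the two tie-breaking rules on a PC; the helper names are hypothetical, and it relies on Python's built-in `round()` sending exact ties to the even integer.

```python
import math

def round_half_up(x: float) -> int:
    """Ties always go up (towards +infinity)."""
    return math.floor(x + 0.5)

def round_half_even(x: float) -> int:
    """Ties go to the even neighbour; Python's round() already does this."""
    return round(x)

inputs = [1.5, 2.5]
half_up = [round_half_up(x) for x in inputs]
half_even = [round_half_even(x) for x in inputs]

print(sum(inputs))                 # 4.0
print(half_up, sum(half_up))       # [2, 3] 5
print(half_even, sum(half_even))   # [2, 2] 4
```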

evanh · 2021-09-25 17:02

Okay, got it.

What I've been doing previously is adding 50% of the divisor to the dividend before division. But the means by which I get 50% is simple truncation (SHR #1) ...

evanh · 2021-09-25 17:28

I think SHR #2 then SHL #1 would work. In other words, clear the lsb of the 50% divisor. Nope, failed.
Best I got is: ((dividend + (divisor>>1) - 1) | 1) / divisor. Grr, that fails too.
Maybe this time: first copy dividend bit1 to bit0, then (dividend + (divisor>>1)) / divisor. Damn it! Way past bedtime.
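
For reference, the target behaviour is easy to state when both the quotient and the remainder are available. The Python sketch below is a plain specification of "round to nearest, ties to even" for non-negative integers, not a proposal for how to do it in PASM; the function name is hypothetical.

```python
def div_round_half_even(dividend: int, divisor: int) -> int:
    """Integer division rounded to nearest, exact ties going to the even quotient.

    Assumes dividend >= 0 and divisor > 0.
    """
    quotient, remainder = divmod(dividend, divisor)
    twice = 2 * remainder
    # Round up when past the halfway point, or exactly halfway with an odd quotient.
    if twice > divisor or (twice == divisor and quotient & 1):
        quotient += 1
    return quotient

print(div_round_half_even(3, 2))   # 1.5 -> 2
print(div_round_half_even(5, 2))   # 2.5 -> 2
print(div_round_half_even(7, 2))   # 3.5 -> 4
print(div_round_half_even(8, 3))   # 2.67 -> 3
```

An exact tie can only occur when the divisor is even, so an odd divisor never needs the special case.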

evanh · 2021-09-26 15:37

Right, I've cheated and used the Cordic's pipeline to give me both results at once.

```
pub  div33ieee( dividend, divisor ) : r | s
    org
                mov     r, divisor
                shr     r, #1           wc      ' odd/even check
                add     dividend, r             ' + 50% divisor
'               rep     @.rend, #1
                qdiv    dividend, divisor
    if_nc       sub     dividend, #1            ' even
    if_nc       qdiv    dividend, divisor       ' even
    if_nc       getqx   s                       ' even
    if_nc       testb   s, #0           wc      ' even
                getqx   r
.rend
    if_nc       mov     r, s                    ' even
    end
```

EDIT: Belatedly attached some rounding test results.
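
My reading of the routine above, written out as a Python model: compute the quotient with the +50% bias, and for an even divisor also compute the quotient of the biased dividend minus one, then keep whichever of the two is even. This is a sketch of the arithmetic only. It ignores 32-bit wrap-around and CORDIC timing, assumes non-negative inputs, and the function names are hypothetical. It is checked against a straightforward remainder-based reference.

```python
def div33ieee_model(dividend: int, divisor: int) -> int:
    """Arithmetic model of the two-quotient trick (not cycle- or width-accurate)."""
    biased = dividend + (divisor >> 1)      # + 50% divisor
    first = biased // divisor               # first QDIV result
    if divisor & 1:
        return first                        # odd divisor: exact ties cannot occur
    second = (biased - 1) // divisor        # second QDIV result, even divisor only
    return first if first & 1 == 0 else second

def reference(dividend: int, divisor: int) -> int:
    """Round to nearest, ties to even, using quotient and remainder."""
    q, r = divmod(dividend, divisor)
    if 2 * r > divisor or (2 * r == divisor and q & 1):
        q += 1
    return q

mismatches = [(n, d) for d in range(1, 40) for n in range(0, 400)
              if div33ieee_model(n, d) != reference(n, d)]
print(len(mismatches))   # 0
```

The two quotients differ only when the biased dividend is an exact multiple of the divisor, which is precisely the tie case, so picking the even one gives ties-to-even without a separate remainder check.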

evanh · 2021-09-27 07:05

Here's the muldiv65() version:

```
pub  muldiv65ieee( mult1, mult2, divisor ) : r | s
    org
                rep     @.rend1, #1             ' IRQ shield
                qmul    mult1, mult2
                mov     r, divisor
                shr     r, #1
                testb   divisor, #0     wz      ' odd(1)/even(0)
                getqx   mult1
                getqy   mult2
.rend1
                add     mult1, r        wc      ' + 50% divisor
                addx    mult2, #0

                rep     @.rend2, #1             ' IRQ shield
                setq    mult2
                qdiv    mult1, divisor
    if_nz       sub     mult1, #1       wc      ' minus one
    if_nz       subx    mult2, #0
    if_nz       setq    mult2
    if_nz       qdiv    mult1, divisor
    if_nz       getqx   s
    if_nz       testb   s, #0           wz      ' odd(1)/even(0)
                getqx   r
.rend2
    if_nz       mov     r, s                    ' even
    end
```

evanh · 2021-09-27 08:01

I figure Chip's original muldiv64(), including two IRQ shields, should take 124 sysclock ticks. In reality it takes more because of skipping and other overheads. But, for comparison, I'll use 124 ticks.

```
pub  muldiv64( mult1, mult2, divisor ) : r
    org
                rep     @.rend1, #1             ' IRQ shield
                qmul    mult1, mult2            ' 4
                getqx   mult1                   ' 60
.rend1
                getqy   mult2

                rep     @.rend2, #1             ' IRQ shield
                setq    mult2
                qdiv    mult1, divisor          ' 68
                getqx   r                       ' 124
.rend2
    end
```

My extended muldiv65() comes to 132 ticks. An extra 8 ticks between the QMUL and QDIV.

```
pub  muldiv65( mult1, mult2, divisor ) : r
    org
                rep     @.rend1, #1             ' IRQ shield
                qmul    mult1, mult2            ' 4
                mov     r, divisor
                shr     r, #1
                getqx   mult1                   ' 60
                getqy   mult2
.rend1
                add     mult1, r        wc
                addx    mult2, #0

                rep     @.rend2, #1             ' IRQ shield
                setq    mult2
                qdiv    mult1, divisor          ' 76
                getqx   r                       ' 132
.rend2
    end
```

And finally, muldiv65ieee() comes to 142 ticks.
All versions additionally have variability of up to +7 more for hub-op alignment.
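
To make the difference in results (as opposed to timing) concrete, here is a Python model of what I understand the three routines to compute: truncation, round-half-up, and round-half-to-even. It uses Python's unbounded integers, so it assumes unsigned inputs and a quotient that fits in 32 bits, and it says nothing about cycle counts. The Python functions merely borrow the names of the routines above.

```python
def muldiv64(mult1: int, mult2: int, divisor: int) -> int:
    """(mult1 * mult2) / divisor, truncated."""
    return (mult1 * mult2) // divisor

def muldiv65(mult1: int, mult2: int, divisor: int) -> int:
    """Adds 50% of the divisor first, so exact ties round up."""
    return (mult1 * mult2 + (divisor >> 1)) // divisor

def muldiv65ieee(mult1: int, mult2: int, divisor: int) -> int:
    """Like muldiv65, but exact ties go to the even quotient."""
    biased = mult1 * mult2 + (divisor >> 1)
    first = biased // divisor
    if divisor & 1:
        return first                       # odd divisor: no exact ties
    second = (biased - 1) // divisor
    return first if first & 1 == 0 else second

for product_args in ((5, 1, 2), (7, 1, 2), (10, 1, 4), (3, 3, 4)):
    print(product_args,
          muldiv64(*product_args),
          muldiv65(*product_args),
          muldiv65ieee(*product_args))
```

The three only disagree on inputs whose exact result has a fractional part of one half or more, and `muldiv65` and `muldiv65ieee` only disagree on exact ties.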

cgracey · 2021-10-14 04:17

I just posted a new version 35q at the top of this thread.

The main symbol table was increased from 64KB to 256KB. Others went from 4KB to 32KB.

https://drive.google.com/file/d/1Op4YOyFIul-7-UhRq7Cdv-OfGx5afb5M/view?usp=sharing

cgracey · 2021-12-17 03:52

I've got the mouse feedback working in DEBUG. This should be pretty self-explanatory if you look at the code.

I had to slow the screen capture way down to make a sufficiently-small gif file, but this is reading almost 250 mouse positions per second, since I set the USB serial driver latency to 4ms.

For some reason, my screen-capture program is dragging a huge inverse box around with the cursor. Try to ignore that.

The way this works is you pass GETMOUSE a pointer to a 4-long record (either in Spin2 or PASM regs). It puts the mouse's x position, y position, left button, and right button into the four longs. If the cursor is outside of the PLOT window, then x = -1, y = -1, left = 0, right = 0. If the mouse cursor is in the window, it will return properly scaled and oriented coordinates and each button will read -1 if pressed or 0 if not pressed.
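
As an illustration of the record layout described above, this Python sketch decodes a 16-byte image of the four longs. Everything here is hypothetical: the `MouseState` type, the `decode_mouse_record` helper, and the idea of having the record as little-endian bytes on a host are all assumptions for the sake of the example, not part of PNut or the DEBUG protocol. It also assumes that x = -1 together with y = -1 can be treated as "cursor outside the window".

```python
import struct
from typing import NamedTuple, Optional

class MouseState(NamedTuple):
    x: int
    y: int
    left: bool
    right: bool

def decode_mouse_record(record: bytes) -> Optional[MouseState]:
    """Decode four signed 32-bit little-endian longs: x, y, left, right."""
    x, y, left, right = struct.unpack("<4i", record)
    if x == -1 and y == -1:
        return None                       # assumed: cursor outside the PLOT window
    # Buttons read -1 when pressed and 0 when released.
    return MouseState(x, y, left == -1, right == -1)

inside = struct.pack("<4i", 120, 45, -1, 0)    # made-up sample values
outside = struct.pack("<4i", -1, -1, 0, 0)

print(decode_mouse_record(inside))
print(decode_mouse_record(outside))
```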

Francis Bauer · 2021-12-17 08:22

@cgracey said:

I've got the mouse feedback working in DEBUG. This should be pretty self-explanatory if you look at the code.

I had to slow the screen capture way down to make a sufficiently-small gif file, but this is reading almost 250 mouse positions per second, since I set the USB serial driver latency to 4ms.

For some reason, my screen-capture program is dragging a huge inverse box around with the cursor. Try to ignore that.

The way this works is you pass GETMOUSE a pointer to a 4-long record (either in Spin2 or PASM regs). It puts the mouse's x position, y position, left button, and right button into the four longs. If the cursor is outside of the PLOT window, then x = -1, y = -1, left = 0, right = 0. If the mouse cursor is in the window, it will return properly scaled and oriented coordinates and each button will read -1 if pressed or 0 if not pressed.

Great work Chip. This looks promising...

mwroberts · 2021-12-17 14:14

@cgracey Mouse feedback in debug is fantastic!!! This allows for a user interface... buttons and sliders and... You should start a new thread so people know about this... When will you release a new pnut with this feature? I imagine it will be a while before it makes it to propeller tool...
