Propeller 2 Spin - any news?

Bill Henning Posts: 6,445
edited 2013-03-18 - 06:28:37 in Propeller 2
Now that I have my DE2-115 running, I am very interested in how Spin2 is doing...

Comments

  • Cluso99 Posts: 15,776
    edited 2013-03-10 - 21:15:26
    Bill, I have not progressed any further on conversion of my faster P1 version. IIRC I posted the latest code in its own P2 thread.

    Of course, I am interested to hear what Chip's view on P2 Spin is.
  • cgracey Posts: 12,461
    edited 2013-03-10 - 21:19:14
    I started working on it, but haven't done much in several months, as we've been wrapping the silicon up. And then there's all that chip documentation to finish....
  • Cluso99 Posts: 15,776
    edited 2013-03-10 - 21:31:50
    Chip,
    I fully understand you have been too busy...
    Can I ask a few questions about your spin2 thoughts?

    I was interested in using my faster P1 spin as a base because its quicker decoding freed up valuable cog space, which permitted most of the other spin codes to be sped up, particularly the maths ones. I had thought that the decoding would fit in the CLUT, but I now think the CLUT is better used for the stacks. The reason is that almost every spin code contains at least one push and one pop to/from the stack, whereas decoding happens only once per code. Hence the speed yield will be much better using the CLUT as stacks. Do you agree?

    Are you happy to use something like my vector table in hub to decode each spin code? Each spin code reads a long containing 3 addresses and 5 bits of flags from hub. The addresses point to the routines to be performed for that spin code.
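A minimal sketch of how one entry of such a vector table could be packed and unpacked, written in Python purely for illustration. The field order, the 9-bit address width (three 9-bit cog addresses plus 5 flag bits happen to fill a 32-bit long) and the names `pack_vector`, `unpack_vector` and `decode` are assumptions, not the layout of the actual interpreter.

```python
# Hypothetical layout of one 32-bit table entry (an assumption for illustration):
#   bits 31..27  five flag bits
#   bits 26..18  third routine address (9 bits)
#   bits 17..9   second routine address (9 bits)
#   bits  8..0   first routine address (9 bits)
ADDR_BITS = 9
ADDR_MASK = (1 << ADDR_BITS) - 1   # 0x1FF, enough for 512 cog longs
FLAG_BITS = 5
FLAG_MASK = (1 << FLAG_BITS) - 1   # 0x1F


def pack_vector(addr0, addr1, addr2, flags):
    """Pack three routine addresses and five flag bits into one long."""
    for addr in (addr0, addr1, addr2):
        if not 0 <= addr <= ADDR_MASK:
            raise ValueError("address does not fit in 9 bits")
    if not 0 <= flags <= FLAG_MASK:
        raise ValueError("flags do not fit in 5 bits")
    return (flags << 27) | (addr2 << 18) | (addr1 << 9) | addr0


def unpack_vector(entry):
    """Split a long back into (addr0, addr1, addr2, flags)."""
    return (
        entry & ADDR_MASK,
        (entry >> 9) & ADDR_MASK,
        (entry >> 18) & ADDR_MASK,
        (entry >> 27) & FLAG_MASK,
    )


def decode(table, bytecode):
    """One table read per bytecode: index by the byte, then unpack the long."""
    return unpack_vector(table[bytecode])
```

The point of the scheme is that decoding costs a single indexed read per bytecode; everything the interpreter needs to dispatch is in that one long.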

    Are there any extras you can think of that we could start implementing while we wait for you to get some free time?
  • jazzed Posts: 11,803
    edited 2013-03-10 - 22:11:00
    Cluso99 wrote: »
    Are there any extras you can think of that we could start implementing ...?

    Greater than 16 bit addresses? Maybe that has already been fixed.

    I would be worried about stack growth > clut size.
    Anybody could always use a different version of the interpreter if necessary though.
  • cgracey Posts: 12,461
    edited 2013-03-10 - 22:25:22
    Cluso99 wrote: »
    Chip,
    I fully understand you have been too busy...
    Can I ask a few questions about your spin2 thoughts?

    I was interested in using my faster P1 spin as a base because its quicker decoding freed up valuable cog space, which permitted most of the other spin codes to be sped up, particularly the maths ones. I had thought that the decoding would fit in the CLUT, but I now think the CLUT is better used for the stacks. The reason is that almost every spin code contains at least one push and one pop to/from the stack, whereas decoding happens only once per code. Hence the speed yield will be much better using the CLUT as stacks. Do you agree?

    Are you happy to use something like my vector table in hub to decode each spin code? Each spin code reads a long containing 3 addresses and 5 bits of flags from hub. The addresses point to the routines to be performed for that spin code.

    Are there any extras you can think of that we could start implementing while we wait for you to get some free time?

    I was planning on sticking with byte codes for density, since RDBYTEC is pretty fast. The other certainty would be to use the stack RAM as the run-time stack. That would place certain limits on call depth, but would be very efficient and fast. From where I left off, I was figuring Spin on Prop2 was going to be about 30x faster than on Prop1.
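To make the call-depth limit concrete, here is a small Python sketch of a fixed-capacity run-time stack. The capacity and the number of longs consumed per call are parameters, not figures from this thread; `FixedStack`, `StackOverflow` and `max_call_depth` are hypothetical names used only for illustration.

```python
class StackOverflow(Exception):
    """Raised when a push would exceed the fixed stack capacity."""


class FixedStack:
    """A run-time stack held in a fixed block of longs (stack RAM stand-in)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [0] * capacity
        self.sp = 0  # index of the next free slot

    def push(self, value):
        if self.sp == self.capacity:
            raise StackOverflow("run-time stack is full")
        self.slots[self.sp] = value & 0xFFFFFFFF  # keep values to 32 bits
        self.sp += 1

    def pop(self):
        if self.sp == 0:
            raise IndexError("pop from empty stack")
        self.sp -= 1
        return self.slots[self.sp]


def max_call_depth(capacity, longs_per_call):
    """Upper bound on nesting if every call uses the same number of longs."""
    return capacity // longs_per_call
```

For example, with a hypothetical 256-long stack and 8 longs used per call (return state, parameters, locals), `max_call_depth(256, 8)` gives 32 nested calls before a push fails. Real programs vary in frame size, so this is only an upper-bound estimate.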
  • Cluso99 Posts: 15,776
    edited 2013-03-11 - 00:37:07
    cgracey wrote: »
    I was planning on sticking with byte codes for density, since RDBYTEC is pretty fast.

    Agreed. No reason to change this.
    cgracey wrote: »
    The other certainty would be to use the stack RAM as the run-time stack. That would place certain limits on call depth, but would be very efficient and fast.

    This sounds good. I have not examined the interpreter from real usage point of view.
    cgracey wrote: »
    From where I left off, I was figuring Spin on Prop2 was going to be about 30x faster than on Prop1.

    WOW. I thought maybe somewhere between 10x and 20x, which would be very acceptable.

    The biggest improvements I made were with the (simple) maths bytecodes. I also implemented your(?) improved multiply and divide routines. Many bytecodes also improved because I was able to inline some of the push/pop and other called routines (due to cog space restrictions).

    Some of the lesser-used opcodes could be moved to LMM, leaving more room to streamline some of the bytecode execution.
  • David Betz Posts: 13,699
    edited 2013-03-11 - 04:39:29
    cgracey wrote: »
    I was planning on sticking with byte codes for density, since RDBYTEC is pretty fast. The other certainty would be to use the stack RAM as the run-time stack. That would place certain limits on call depth, but would be very efficient and fast. From where I left off, I was figuring Spin on Prop2 was going to be about 30x faster than on Prop1.
    I don't know if you want to take this into account, but a small stack might interfere with using Femto Basic on the P2, since I believe it uses a recursive parser. Of course, typical Basic expressions are fairly simple, so that might not make a big difference.
  • potatohead Posts: 9,989
    edited 2013-03-11 - 08:40:01
    30X? Heck, SPIN might end up on par with PASM on P1. :)
  • Bill Henning Posts: 6,445
    edited 2013-03-11 - 13:11:35
    I wonder if an intermediate port of Spin would be useful...

    - lower 64KB is used for Spin code sections (any free space could be used by PASM code as buffers)
    - upper 64KB is used for DAT
    - this should allow keeping to just 16 bit pointers, albeit by implementing a split I&D scheme

    This may reduce the work needed to get a sub-optimal Spin running.
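A rough Python sketch of the split I&D idea above: the interpreter keeps 16-bit pointers and adds a fixed base depending on whether the access is to code or to data. The constant and function names are hypothetical, and the exact bank boundaries are taken from the list above.

```python
BANK_SIZE = 0x10000   # 64KB reachable through a 16-bit pointer
CODE_BASE = 0x00000   # lower 64KB: Spin code sections
DATA_BASE = 0x10000   # upper 64KB: DAT


def _check(ptr16):
    """Reject anything that is not a valid 16-bit pointer."""
    if not 0 <= ptr16 < BANK_SIZE:
        raise ValueError("pointer does not fit in 16 bits")
    return ptr16


def code_address(ptr16):
    """Hub address for a bytecode fetch."""
    return CODE_BASE + _check(ptr16)


def data_address(ptr16):
    """Hub address for a DAT access."""
    return DATA_BASE + _check(ptr16)
```

With this mapping, the same 16-bit value refers to two different hub locations: `code_address(0x1234)` is `0x01234`, while `data_address(0x1234)` is `0x11234`. The cost is that the interpreter has to know, for every access, which space a pointer belongs to.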
  • Cluso99 Posts: 15,776
    edited 2013-03-11 - 17:21:35
    Bill Henning wrote: »
    I wonder if an intermediate port of Spin would be useful...

    - lower 64KB is used for Spin code sections (any free space could be used by PASM code as buffers)
    - upper 64KB is used for DAT
    - this should allow keeping to just 16 bit pointers, albeit by implementing a split I&D scheme

    This may reduce the work needed to get a sub-optimal Spin running.
    I thought that restricting spin to the lower 64KB of hub would be fine, at least to get it running. That is double what we have on the P1. I don't think anyone ran out of space for spin programs... it was the buffer space for video, etc., and of course these can be tasked to fit in the upper 64KB.
  • Bill Henning Posts: 6,445
    edited 2013-03-11 - 17:26:45
    You are right; for the first "Spin" (pun intended) no need for split I&D, as you say, the last 64KB can be reserved for buffers etc.
  • pedward Posts: 1,583
    edited 2013-03-13 - 23:03:15
    cgracey wrote: »
    I was planning on sticking with byte codes for density, since RDBYTEC is pretty fast. The other certainty would be to use the stack RAM as the run-time stack. That would place certain limits on call depth, but would be very efficient and fast. From where I left off, I was figuring Spin on Prop2 was going to be about 30x faster than on Prop1.

    Wow, that's something like 15MIPS!?
  • jazzed Posts: 11,803
    edited 2013-03-13 - 23:20:04
    Cluso99 wrote: »
    I thought that restricting spin to the lower 64KB hub would be fine, at least to get it running. That is double what we have on the P1. I don't think anyone ran out of space for spin programs...

    I certainly exhausted code space. :frown:
  • Cluso99 Posts: 15,776
    edited 2013-03-14 - 00:59:46
    jazzed wrote: »
    I certainly exhausted code space. :frown:
    Do you mean 32KB of spin bytecode??? Or just filled the whole 32KB with buffers, spin, pasm, etc???
  • jazzed Posts: 11,803
    edited 2013-03-14 - 07:27:50
    Cluso99 wrote: »
    Do you mean 32KB of spin bytecode??? Or just filled the whole 32KB with buffers, spin, pasm, etc???

    I said code space. One project was a g-code decoder and x-y stepper driver (driven by one pin toggles). It was almost entirely SPIN except for the full-duplex serial driver and some data. That g-code program is lost except that I sent it to someone in email once.
  • Cluso99 Posts: 15,776
    edited 2013-03-14 - 20:05:40
    Thanks jazzed. I was not aware of anyone who had almost a whole 32KB of spin code.

    Guess there may be a reason to have the 64KB version initially, and a second 64KB+ version later (it should not be restricted to 128KB, as hopefully we may get a P2B with >128KB hub in the future, if there is enough demand).
  • FredBlais Posts: 368
    edited 2013-03-18 - 06:28:37
    I think SPIN is essential for the Propeller. From my experience with the first Prop, here are some things that I would change (just my opinion...)

    Function calls : functions should be called with the ( ) at the end. Right now, I'm sometimes confused between functions and variables.
    Objects : It would be nice to be able to get the content of an object's global variable directly with the dot operator, without going through a function call.
    Operators : I think that <= >= != would be more standard than =< => <>
    Structures : would make the code easier to understand than this
    labels[label_cnt*3] := commands[pc+1] 'Store label id in an array
    labels[label_cnt*3+1] := pc 'Store the PC address for later use
    labels[label_cnt*3+2] := 0 'Not looping
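
To illustrate the structures point, here is a Python sketch of what the three assignments above would look like with named fields instead of a stride-3 array. The `Label` type, its field names and `store_label` are hypothetical, inferred only from the comments in the Spin lines; this is not a proposed Spin2 syntax.

```python
from dataclasses import dataclass


@dataclass
class Label:
    label_id: int      # was labels[label_cnt*3]
    pc: int            # was labels[label_cnt*3+1]
    looping: int = 0   # was labels[label_cnt*3+2], 0 means not looping


def store_label(labels, commands, pc):
    """Append one label record; the id is the byte following the command."""
    label = Label(label_id=commands[pc + 1], pc=pc)
    labels.append(label)
    return label
```

Each record then reads as `labels[n].pc` rather than `labels[n*3+1]`, so the meaning of each slot no longer depends on remembering the offset.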
    

    Just my 2 cents :)