Spin2 Interpreter

cgracey Posts: 11,734
edited 2019-06-30 - 08:43:05 in Propeller 2
Here is the Spin2 interpreter. It includes the brief code that boots up the initial Spin2 instance in cog0, and even a little bytecode program to toggle P0.

The compiler just needs to initialize four longs with the startup data and then append the object blob after the interpreter. Then, you have a Spin2 application ready to load and execute.

I'm at the point now where I just need to modify the compiler to handle the latest changes I've made to the interpreter and then Spin2 should come to life in PNut.exe. Jeff is ready to fit the new Spin2 compiler into the PropellerTool.exe and make the modifications needed to the syntax highlighter.

This interpreter really packs in the functionality using XBYTE. The entire interpreter, including initialization code, is 3,564 bytes. It will probably grow to 4KB once all the smart pin auxiliary bytecodes and routines are added.

Cog registers $000..$140 are free for user PASM programs and data, while $1E0..$1EF serve as the RESULT/parameters/locals conduit for inline assembly and cog/hub calls. Auxiliary bytecodes can be added without any impact on cog registers.

In this example, a 250MHz clock is selected and the following program loops at 3.906MHz:
REPEAT
  PINNOT(0)
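
As a quick sanity check on those figures, the clock frequency and loop rate together imply how many clocks one pass through the loop takes. The Python sketch below is only arithmetic on the numbers quoted above; the names are illustrative, and it assumes the quoted 3.906MHz is a rounding of 3.90625MHz.

```python
def clocks_per_loop(sysclk_hz: float, loop_rate_hz: float) -> float:
    """Return how many system clocks one loop iteration takes."""
    return sysclk_hz / loop_rate_hz

SYSCLK_HZ = 250_000_000      # 250MHz clock selected in the example
LOOP_RATE_HZ = 3_906_250     # assumption: "3.906MHz" is 3.90625MHz rounded

clocks = clocks_per_loop(SYSCLK_HZ, LOOP_RATE_HZ)

# Each PINNOT toggles the pin once, so a full high/low cycle on P0
# takes two loop iterations.
pin_cycle_hz = LOOP_RATE_HZ / 2

print(clocks)        # 64.0
print(pin_cycle_hz)  # 1953125.0
```

So the REPEAT/PINNOT loop costs 64 clocks per iteration under these assumptions, and P0 completes a full cycle at about 1.953MHz.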

This file should assemble and run on the current P2 Eval board. (I say "should" because I'm not sure if I've used any operators that might not be in the last release of PNut.exe.)


Comments

  • Peter Jakacki Posts: 8,702
    edited 2019-06-30 - 09:00:13
    Way to go Chip, well done.

    The best I got out of TAQOZ was about 2.36MHz using 48 PIN 0 FOR T NEXT, where NEXT counts down to zero starting from zero and T toggles the selected pin.

    It would be really nice if the Spin2 compiler could produce a listing though.

    CORRECTION: the pin outputs 2.23MHz but the toggle rate is 4.46MHz

    Tachyon Forth - compact, fast, forthwright and interactive
    P2 --- TAQOZ INTRO & LINKS --- P2 SHORTFORM DATASHEET --- TAQOZ RELOADED - 64kB binary with room to spare
    P1 --- Latest Tachyon with EASYFILE --- Tachyon Forth News Blog --- More
    Brisbane, Australia
  • cgracey Posts: 11,734
    edited 2019-06-30 - 08:54:05
    The loop rate is 3.906MHz, so the P0 cycle rate is 1.953MHz. Was your cycle rate 2.36MHz? What frequency were you running at?

    Yes, a listing will happen at some point.
  • cgracey wrote: »
    The loop rate is 3.906MHz, so the P0 cycle rate is 1.953MHz. Was your cycle rate 2.36MHz? What frequency were you running at?

    Yes, a listing will happen at some point.

    I just posted a correction. The scope doesn't always show the correct frequency; however, I measured 56 cycles per loop:
    TAQOZ# 48 PIN 1,000,000 LAP FOR T NEXT LAP .LAP --- 56,000,089 cycles= 233,333,704ns @240MHz ok
    
    So that brings it to a toggle rate of 4.464MHz @250MHz
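
The conversion from the LAP output to a toggle rate can be reproduced with a few lines of arithmetic. This is a sketch in Python using only the numbers printed above; the variable names are made up for illustration, and treating the extra 89 cycles as fixed measurement overhead is an assumption.

```python
TOTAL_CYCLES = 56_000_089      # cycles reported by .LAP
ITERATIONS = 1_000_000         # loop count passed to FOR
MEASURE_CLK_HZ = 240_000_000   # clock used for the measurement
TARGET_CLK_HZ = 250_000_000    # clock used for the comparison

# Assumption: the leftover 89 cycles are one-off overhead outside the loop.
cycles_per_loop = TOTAL_CYCLES // ITERATIONS

# Elapsed time at the measurement clock, in whole nanoseconds.
elapsed_ns = TOTAL_CYCLES * 1_000_000_000 // MEASURE_CLK_HZ

# One toggle per loop; a full pin cycle needs two toggles.
toggle_rate_hz = TARGET_CLK_HZ / cycles_per_loop
pin_freq_hz = toggle_rate_hz / 2

print(cycles_per_loop)                 # 56
print(elapsed_ns)                      # 233333704
print(round(toggle_rate_hz / 1e6, 3))  # 4.464
print(round(pin_freq_hz / 1e6, 3))     # 2.232
```

The 2.232MHz pin frequency agrees with the roughly 2.23MHz reading mentioned in the earlier correction.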

  • cgracey wrote: »
    The loop rate is 3.906MHz, so the P0 cycle rate is 1.953MHz. Was your cycle rate 2.36MHz? What frequency were you running at?

    Yes, a listing will happen at some point.

    I just posted a correction. The scope doesn't always show the correct frequency; however, I measured 56 cycles per loop:
    TAQOZ# 48 PIN 1,000,000 LAP FOR T NEXT LAP .LAP --- 56,000,089 cycles= 233,333,704ns @240MHz ok
    
    So that brings it to a toggle rate of 4.464MHz @250MHz

    Actually, I can get that SAME rate if the bytecodes are on copacetic boundaries for the RDFAST.

    I just found that putting in four extra bytecodes before the loop gets me to a 4.464MHz toggle rate.
  • So, I think we both hit the limit.
  • Peter Jakacki Posts: 8,702
    edited 2019-06-30 - 10:19:14
    cgracey wrote: »
    I just found that putting in four extra bytecodes before the loop gets me to a 4.464MHz toggle rate.

    Luv it :)

    These are the sections of code that get executed in TAQOZ btw.
    ' TAQOZ wordcode interpreter stub
    doCALL          call    fx                      ' could call cog or hub code - use ret to return
    doNEXT          rdword  fx,PTRA++               ' read word code instruction
    doCODE          cmp     threaded,fx wc          ' wordcode below threaded are cog or hubexec - just call
            if_nc   jmp     #doCALL                 ' just call if it is asm code - either cog or in

    ' TOGGLE the selected pin
    _T      _ret_   drvnot  pinreg

    ' NEXT ( -- ) Decrement count (on loop stack) and loop until 0, then pop loop stack
    forNEXT         djz     index,#POPBRANCH        ' exit loop
            _ret_   mov     PTRA,branchadr          ' loop again

    
    edit: added doCALL line and aligned

  • There's a huge advantage to keeping the top of the stack in a register, like we are doing, but I can't figure out how to efficiently keep more than the top of the stack in registers.
  • Great work, Chip!

    Except that now I will have to update Catalina and re-do all the C/Spin interoperability ... and just when I thought I was nearly finished! :)

    Is there a definition of the updated Spin2 language? Or is this the original Spin?
    Catalina - a FREE ANSI C compiler for the Propeller.
    Download it from http://catalina-c.sourceforge.net/
  • cgracey Posts: 11,734
    edited 2019-06-30 - 09:25:47
    Here are the bytecodes that make my loop:
    		byte	bc_con_0		'constant 0
    		byte	bc_pinnot		'pinnot
    		byte	bc_jmp,-3 & $7F		'loop to bc_con_0
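
The expression -3 & $7F packs a small negative branch offset into a single byte with bit 7 clear. The Python sketch below shows that encoding and the matching sign extension. It assumes that the one-byte form read by RFVARS carries 7 data bits with bit 6 acting as the sign, which is my reading of the instruction and not something stated in this post; the helper names are hypothetical.

```python
def encode_rel7(offset: int) -> int:
    """Encode a branch offset in -64..63 as one byte with bit 7 clear."""
    if not -64 <= offset <= 63:
        raise ValueError("offset does not fit in 7 signed bits")
    return offset & 0x7F

def decode_rel7(byte: int) -> int:
    """Sign-extend a 7-bit value; bit 6 is treated as the sign bit."""
    value = byte & 0x7F
    return value - 0x80 if value & 0x40 else value

encoded = encode_rel7(-3)
print(hex(encoded))          # 0x7d
print(decode_rel7(encoded))  # -3
```

Under that assumption the jump's operand byte is $7D, and adding the decoded -3 to the bytecode pointer lands back on bc_con_0.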
    

    Here are the XBYTE code snippets that execute to make the loop (6-clock overhead per bytecode):
    bc_con_0	pusha	x
    		mov	x,pa
    	_ret_	sub	x,#con_n1-$1FF
    
    bc_pinnot	drvnot	x
    	_ret_	popa	x
    
    bc_jmp		rfvars	w
    		add	pb,w
    	_ret_	rdfast	#0,pb
    

    It's the pusha and popa that drag things down, mostly.
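
A rough clock budget makes that point concrete. The loop runs three bytecodes, each paying the 6-clock XBYTE overhead, and the rest of the loop time goes to the snippets themselves, including their hub accesses. The Python sketch below uses the 64-clock and 56-clock loop times implied by the rates quoted elsewhere in this thread; how the remaining clocks divide between pusha, popa, and rdfast is not measured here.

```python
BYTECODES_PER_LOOP = 3       # bc_con_0, bc_pinnot, bc_jmp
XBYTE_OVERHEAD_CLOCKS = 6    # per-bytecode overhead quoted above

overhead = BYTECODES_PER_LOOP * XBYTE_OVERHEAD_CLOCKS

# Loop times implied by 3.906MHz and 4.464MHz at 250MHz (assumed exact).
budget = {}
for total_clocks in (64, 56):
    budget[total_clocks] = total_clocks - overhead

print(overhead)  # 18
print(budget)    # {64: 46, 56: 38}
```

In other words, dispatch accounts for 18 clocks either way, so the 8-clock difference between the two cases comes entirely from the snippet instructions.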
  • RossH wrote: »
    Great work, Chip!

    Except that now I will have to update Catalina and re-do all the C/Spin interoperability ... and just when I thought I was nearly finished! :)

    Is there a definition of the updated Spin2 language? Or is this the original Spin?

    I don't have a definition, yet, but I'll get one together as soon as things are working well.
  • jmg Posts: 13,993
    cgracey wrote: »
    Actually, I can get that SAME rate if the bytecodes are on copacetic boundaries for the RDFAST.

    I just found that putting in four extra bytecodes before the loop gets me to a 4.464MHz toggle rate.


    Does that mean speed will bounce around, depending on code alignment ?
    Is there an ALIGN mechanism to prevent that from occurring ?
  • jmg wrote: »
    cgracey wrote: »
    Actually, I can get that SAME rate if the bytecodes are on copacetic boundaries for the RDFAST.

    I just found that putting in four extra bytecodes before the loop gets me to a 4.464MHz toggle rate.


    Does that mean speed will bounce around, depending on code alignment ?
    Is there an ALIGN mechanism to prevent that from occurring ?

    I don't even know what the rule for optimal alignment would be. It would take some experimenting and reasoning. If we figure it out, though, it would really kill performance to start adding filler bytecodes to get some things to loop fast.
  • cgracey Posts: 11,734
    edited 2019-06-30 - 09:41:27
    ...And I think the hub-relative location of 'x' for the pusha and popa may have more effect than the rdfast does on timing, since there are two 'x' actions.

    It would be impossible to compile-time align run-time things like stack pointers.
  • Thanks Chip!
    I'm finishing off my debugger now, and this will be a nice test of the XBYTE tracing. :)
    Melbourne, Australia
  • ozpropdev wrote: »
    Thanks Chip!
    I'm finishing off my debugger now, and this will be a nice test of the XBYTE tracing. :)

    Great. Man, remember how tricky it was to get everything working right on the FPGA? Hopefully, there are no silicon bugs lurking.
  • cgracey wrote: »
    ozpropdev wrote: »
    Thanks Chip!
    I'm finishing off my debugger now, and this will be a nice test of the XBYTE tracing. :)

    Great. Man, remember how tricky it was to get everything working right on the FPGA? Hopefully, there are no silicon bugs lurking.
    Lots of miles travelled here on FPGA and P2 silicon. I'm pretty confident your efforts will be rewarded.
  • evanh Posts: 8,033
    edited 2019-06-30 - 11:07:05
    Chip,
    A completely off topic question: Is there any conditional compile in Pnut v32i? eg: "if const < 10" sort of thing?
    "We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
    By doing that, we can more accurately measure their mass, and determine whether
    scientists have systematically been underestimating how much matter they contain."
  • cgracey wrote: »
    There's a huge advantage to keeping the top of the stack in a register, like we are doing, but I can't figure out how to efficiently keep more than the top of the stack in registers.

    In the P2 ZPU interpreter I had two XBYTE tables, essentially one for "top of stack in register" and one for "top two stack items in registers". Push opcodes would transition from the one register state to the two register one (and if already in two would have to hit memory). Math opcodes would typically consume the top two registers and transition back to one item in register. It did add book-keeping complexity, but the ZPU instruction set is pretty simple so it all fit.

    My JIT compiler toolkit has a stack optimizer that lets you keep up to N stack items in registers. That relies on knowledge of where jumps go so we don't ever jump into the middle of a sequence (and every time there's a branch the stack state has to be reset).
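
The one-register/two-register idea can be modelled in a few lines. What follows is a toy Python sketch, not the ZPU interpreter or the JIT toolkit: the class, its field names, and the restriction to push and add are all assumptions made for illustration. The in_regs field plays the role that the choice of active XBYTE table would play in a real interpreter.

```python
class CachedStack:
    """Toy stack machine keeping the top one or two items in 'registers'."""

    def __init__(self):
        self.mem = []      # memory-resident part of the stack (stands in for hub RAM)
        self.r0 = 0        # top of stack, valid when in_regs >= 1
        self.r1 = 0        # second item, valid when in_regs == 2
        self.in_regs = 0   # number of items currently cached: 0, 1 or 2
        self.mem_ops = 0   # memory pushes and pops, the cost being avoided

    def push(self, value):
        if self.in_regs == 2:
            # Both registers are full, so the older one spills to memory.
            self.mem.append(self.r1)
            self.mem_ops += 1
        if self.in_regs >= 1:
            self.r1 = self.r0
        self.r0 = value
        self.in_regs = min(self.in_regs + 1, 2)

    def add(self):
        if self.in_regs == 2:
            # Both operands are cached: no memory access, drop to one register.
            self.r0 = self.r1 + self.r0
            self.in_regs = 1
        elif self.in_regs == 1:
            # Only the top is cached: the other operand comes from memory.
            self.r0 = self.mem.pop() + self.r0
            self.mem_ops += 1
        else:
            raise IndexError("stack underflow")

s = CachedStack()
for v in (1, 2, 3):
    s.push(v)
s.add()   # 2 + 3, both in registers
s.add()   # 1 from memory + 5
print(s.r0, s.mem_ops)  # 6 2
```

A push followed directly by a math operation never touches memory in this model, which is the case the two-table scheme is meant to speed up.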
  • Chip,
    Are you really going to get PropTool updated with Spin2???

    I know it just works, but it lacks even the most basic features, like conditional compile, etc. Surely we can give it a miss and move on to something better??? There is Eric's fastspin. We can use Visual Studio Code to edit, which is open source and cross platform. Then there is OpenSpin, which Roy is likely to update.
    My Prop boards: P8XBlade2, RamBlade, CpuBlade, TriBlade
    Prop OS (also see Sphinx, PropDos, PropCmd, Spinix)
    Website: www.clusos.com
    Prop Tools (Index) , Emulators (Index) , ZiCog (Z80)
  • evanh wrote: »
    Chip,
    A completely off topic question: Is there any conditional compile in Pnut v32i? eg: "if const < 10" sort of thing?

    Not yet. I recognize the need to add that functionality, though.
  • ersmith wrote: »
    cgracey wrote: »
    There's a huge advantage to keeping the top of the stack in a register, like we are doing, but I can't figure out how to efficiently keep more than the top of the stack in registers.

    In the P2 ZPU interpreter I had two XBYTE tables, essentially one for "top of stack in register" and one for "top two stack items in registers". Push opcodes would transition from the one register state to the two register one (and if already in two would have to hit memory). Math opcodes would typically consume the top two registers and transition back to one item in register. It did add book-keeping complexity, but the ZPU instruction set is pretty simple so it all fit.

    My JIT compiler toolkit has a stack optimizer that lets you keep up to N stack items in registers. That relies on knowledge of where jumps go so we don't ever jump into the middle of a sequence (and every time there's a branch the stack state has to be reset).

    Thanks for those explanations, Eric. That gives me something to think about.
  • Cluso99 wrote: »
    Chip,
    Are you really going to get PropTool updated with Spin2???

    I know it just works, but it lacks even the most basic features, like conditional compile, etc. Surely we can give it a miss and move on to something better??? There is Eric's fastspin. We can use Visual Studio Code to edit, which is open source and cross platform. Then there is OpenSpin, which Roy is likely to update.

    It is something that Parallax wrote and has complete knowledge of. So, that is our primary platform.
  • PropTool being closed source makes it a REALLY bad primary platform, also being Delphi makes it obsolete.

    Parallax simply must make a new cross platform editor or endorse/use one of the existing ones from the community.
  • Roy Eltham wrote: »
    PropTool being closed source makes it a REALLY bad primary platform, also being Delphi makes it obsolete.

    Parallax simply must make a new cross platform editor or endorse/use one of the existing ones from the community.

    One step at a time. PropellerTool is a nice step up from PNut, is it not?
  • cgracey wrote: »
    Roy Eltham wrote: »
    PropTool being closed source makes it a REALLY bad primary platform, also being Delphi makes it obsolete.

    Parallax simply must make a new cross platform editor or endorse/use one of the existing ones from the community.

    One step at a time. PropellerTool is a nice step up from PNut, is it not?
    Roy was given the x86 code for Spin1 and created a portable work-alike compiler. Couldn't the same be done with the PropTool? Have someone sign an NDA to get access to the closed source and have them create an open source version. As I understand it, the only closed source part is the edit widget. How hard could it be to create a clean room implementation of that using something like Qt?

  • cgracey Posts: 11,734
    edited 2019-07-01 - 00:26:43
    David Betz wrote: »
    cgracey wrote: »
    Roy Eltham wrote: »
    PropTool being closed source makes it a REALLY bad primary platform, also being Delphi makes it obsolete.

    Parallax simply must make a new cross platform editor or endorse/use one of the existing ones from the community.

    One step at a time. PropellerTool is a nice step up from PNut, is it not?
    Roy was given the x86 code for Spin1 and created a portable work-alike compiler. Couldn't the same be done with the PropTool? Have someone sign an NDA to get access to the closed source and have them create an open source version. As I understand it, the only closed source part is the edit widget. How hard could it be to create a clean room implementation of that using something like Qt?

    That could be done, but we are days away from having PropellerTool working. New chips will be here in a month. I want Spin2 to be in a proven state by then. We could implement Spin2 in many different ways going forward. I just need something that works ASAP - something that will let me prove the interpreter and get syntax issues sorted out. It's going to be a very iterative process.

  • cgracey wrote: »
    David Betz wrote: »
    cgracey wrote: »
    Roy Eltham wrote: »
    PropTool being closed source makes it a REALLY bad primary platform, also being Delphi makes it obsolete.

    Parallax simply must make a new cross platform editor or endorse/use one of the existing ones from the community.

    One step at a time. PropellerTool is a nice step up from PNut, is it not?
    Roy was given the x86 code for Spin1 and created a portable work-alike compiler. Couldn't the same be done with the PropTool? Have someone sign an NDA to get access to the closed source and have them create an open source version. As I understand it, the only closed source part is the edit widget. How hard could it be to create a clean room implementation of that using something like Qt?

    That could be done, but we are days away from having PropellerTool working. New chips will be here in a month. I want Spin2 to be in a proven state by then. We could implement Spin2 in many different ways going forward. I just need something that works ASAP - something that will let me prove the interpreter and get syntax issues sorted out. It's going to be a very iterative process.
    That's understandable.

  • cgracey wrote: »

    Yes, a listing will happen at some point.

    Good. It's very important to have a listing. I found with the P1 (until I started using BST) that without a listing file it was often very difficult to understand exactly how the 32kB of memory was laid out without a fair bit of reverse engineering and head scratching. On the P2, memory might not be scarce enough to need to reuse every spare block, as I often had to on the P1, but the listing is still important when integrating with other languages, and it can be useful for debugging code as well.
  • cgracey wrote: »
    evanh wrote: »
    Chip,
    A completely off topic question: Is there any conditional compile in Pnut v32i? eg: "if const < 10" sort of thing?
    Not yet. I recognize the need to add that functionality, though.
    Ha! Fortuitous timing in terms of topically raising the feature. I hadn't actually read this topic, I was just fishing for options to help me right now.
    "We suspect that ALMA will allow us to observe this rare form of CO in many other discs.
    By doing that, we can more accurately measure their mass, and determine whether
    scientists have systematically been underestimating how much matter they contain."
  • David Betz,
    PropTool is written in Delphi and uses some closed source libs or something. Porting it to a reasonable language would be kind of silly. The better path would be to make something new from scratch using Qt. You can get a reasonable IDE up and running in Qt in a couple of days. I've done the basics a couple of times but then gave up because it was easier to just use VS Code or Notepad++ myself.

    There needs to be an open source cross platform IDE/compiler that is official, and it needs to be available at launch.