Spin32: If we had a 32 bit Spin compiler/interpreter how would it work?
jazzed
Posts: 11,803
I've been toying with 32-bit words for the Spin interpreter's key variables (dcurr, pbase, vbase, etc.) and have something that works so far within HUB RAM. A further question, beyond the subject line: can an interface to fast XMM memory be squeezed into the interpreter? TBD
1. Is it possible to just copy everything from HUB RAM to XMM and flip a switch?
2. Given that the ROM occupies $8000-$FFFF, and stack grows up, a big hack would be needed for stack.
3. Is it possible to just make the stack always start above $FFFF to push the stack over the "ROM Hole" code?
4. Maybe a big dummy array would serve to fix the "ROM Hole" instead of asking for a special compiler mode?
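To put a number on the "ROM Hole" in questions 2 to 4, here is a rough Python sketch (not Spin, and not taken from any existing compiler) of how large a dummy array would have to be so that whatever follows it lands at or above $10000. The function names and the idea that padding is counted in longs are assumptions made for illustration only.

```python
# Sketch only: sizes the padding needed to step over the Propeller 1 ROM.
HUB_RAM_END = 0x8000   # first address past hub RAM
ROM_END = 0x10000      # first address past the ROM at $8000-$FFFF

def in_rom_hole(addr):
    """True if addr falls inside the ROM region."""
    return HUB_RAM_END <= addr < ROM_END

def dummy_array_longs(image_end):
    """Longs of padding so the next item after image_end starts at or above ROM_END.

    image_end is a hypothetical 'end of everything placed so far' address.
    """
    if image_end >= ROM_END:
        return 0
    gap_bytes = ROM_END - image_end
    return (gap_bytes + 3) // 4   # round up to whole longs

if __name__ == "__main__":
    print(dummy_array_longs(0x7000))
```

With an image ending at $7000 the padding comes to $9000 bytes, which shows why a compiler mode might be preferable to carrying that much dead space in the image.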
I understand that Homespun allows for images > 32K. Is there any limit at all in Homespun today that would prevent a "Spin32" from working?
TIA
--Steve
Comments
Just an idea for a 32 bit SPIN memory model:
We have 8 COGs, so let's say each COG gets 3kB of HUB RAM: a 1kB page for code and 2 x 1kB pages for data (one for source and one for destination addresses, in case these are too far apart from each other). The 32-bit address can then easily be mapped: the 10 LSBs give the address within the page, and the remaining MSBs show where to find that page in external memory. If an operation accesses data outside the current page, it has to wait until an external memory driver has swapped the page. Otherwise it can run at full speed. And 1kB of SPIN code can hold quite a lot.
This makes 8x3kB = 24kB. The rest of the HUB-RAM (8kB) can be used for COG to COG communication, (and maybe for stack?)
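The address split described above can be sketched in a few lines of Python. This is only a model of the idea: `PagedWindow`, `hub_base` and the `swaps` counter are made-up names, and the "swap" is just a counter standing in for the wait on an external memory driver.

```python
# Sketch only: 10-bit page offset, remaining bits select the page in XMM.
PAGE_BITS = 10
PAGE_SIZE = 1 << PAGE_BITS          # 1kB page
PAGE_MASK = PAGE_SIZE - 1

COGS = 8
HUB_RAM = 32 * 1024
PAGED_BYTES = COGS * 3 * PAGE_SIZE  # 8 x 3kB
LEFT_OVER = HUB_RAM - PAGED_BYTES   # remainder for cog-to-cog communication

def split_address(addr):
    """Return (page number, offset within page) for a 32-bit address."""
    addr &= 0xFFFFFFFF
    return addr >> PAGE_BITS, addr & PAGE_MASK

class PagedWindow:
    """One 1kB hub page that mirrors a single page of external memory."""

    def __init__(self, hub_base):
        self.hub_base = hub_base
        self.page = None    # no page resident yet
        self.swaps = 0

    def translate(self, addr):
        page, offset = split_address(addr)
        if page != self.page:
            # Stand-in for waiting while a driver swaps the page in.
            self.page = page
            self.swaps += 1
        return self.hub_base + offset
```

Accesses that stay inside the resident page cost nothing extra in this model; only a page change increments `swaps`.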
The main problem I see is XMM (SRAM) bus contention between multiple cogs if the interpreter were to use XMM for anything other than loading overlays of code (or a cache). Individual byte/word/long access would be monumentally slow because of the contention handling, given the way the interpreter works.
@jazzed:
Changing to 32 bits for dcurr, pbase, vbase should be fairly simple as IIRC it is actually 32 bit within the cog now. However, when these are pushed and pulled from the stack, a long would need to be pushed & popped.
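A small Python model of the push/pop point: if the saved values are packed as 16-bit words, an address of $10000 or above simply does not fit, while a long per field does. The choice of four fields and their order is an assumption for illustration, not the interpreter's documented frame layout.

```python
import struct

# Sketch only: two hypothetical call-frame encodings.
FRAME16 = struct.Struct("<4H")   # four 16-bit words, 8 bytes per frame
FRAME32 = struct.Struct("<4I")   # four 32-bit longs, 16 bytes per frame

def push_frame(stack, fmt, pbase, vbase, dbase, pcurr):
    """Append one packed frame to the stack (the stack grows upward)."""
    stack.extend(fmt.pack(pbase, vbase, dbase, pcurr))

def pop_frame(stack, fmt):
    """Remove and return the most recently pushed frame as a tuple."""
    frame = fmt.unpack(bytes(stack[-fmt.size:]))
    del stack[-fmt.size:]
    return frame

if __name__ == "__main__":
    stack = bytearray()
    push_frame(stack, FRAME32, 0x10000, 0x12000, 0x810, 0x10040)
    print(len(stack), pop_frame(stack, FRAME32))
    try:
        push_frame(stack, FRAME16, 0x10000, 0, 0, 0)
    except struct.error as err:
        print("16-bit frame cannot hold $10000:", err)
```

The cost is visible in the sizes: every frame doubles from 8 to 16 bytes in this model.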
1. Yes, but only for 1 modified interpreter. This could be done with only the rdxxxx & wrxxxx instructions changed to access XMM. Due to contention, only 1 interpreter could be run, or else contention must be managed using a lock or something similar.
2. I don't agree. It just depends on where to place the stack to ensure it has enough space.
3. I believe so, but see my answer to MagIO below.
4. Yes.
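For answer 1, the "lock or something similar" can be modelled in Python, with `threading.Lock` standing in for a Propeller hardware lock and threads standing in for several interpreter cogs. `SharedXmm`, the address latch and `run_interpreters` are invented for this sketch; a real driver would be PASM toggling pins.

```python
import threading

# Sketch only: a shared external RAM where each access is a multi-step
# bus transaction that must not be interleaved with another cog's access.
class SharedXmm:
    def __init__(self, size):
        self._mem = bytearray(size)
        self._lock = threading.Lock()   # stands in for a hardware lock
        self._addr_latch = 0

    def write_long(self, addr, value):
        with self._lock:                # own the bus for the whole transaction
            self._addr_latch = addr     # step 1: latch the address
            a = self._addr_latch
            self._mem[a:a + 4] = (value & 0xFFFFFFFF).to_bytes(4, "little")

    def read_long(self, addr):
        with self._lock:
            self._addr_latch = addr
            a = self._addr_latch
            return int.from_bytes(self._mem[a:a + 4], "little")

def run_interpreters(xmm, count, longs_each):
    """Run 'count' pretend interpreters, each writing its own region."""
    def worker(index):
        base = index * longs_each * 4
        for i in range(longs_each):
            xmm.write_long(base + 4 * i, (index << 16) | i)

    threads = [threading.Thread(target=worker, args=(n,)) for n in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Every single long pays for taking and releasing the lock, which is the per-access overhead the contention discussion is worried about.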
@MagIO:
There is really a push & pop. Don't forget, you need considerable space for the video buffer in hub.
IMHO, a mechanism for copying code blocks from XMM may be the best, served up by a single cog. The other alternative may be to fetch a block of code from XMM (using contention locks) and then execute it from hub, sort of like a cache, but only acting on the code, not stack or variables.
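A code-only cache along those lines might behave like the following Python model. The block size, the number of hub slots and the direct-mapped placement are all assumptions picked to keep the sketch short; nothing here reflects an existing driver.

```python
# Sketch only: a direct-mapped cache for code bytes fetched from XMM.
class CodeCache:
    def __init__(self, xmm, block_size=256, num_blocks=4):
        self.xmm = xmm                      # bytes-like backing store
        self.block_size = block_size
        self.num_blocks = num_blocks
        self.tags = [None] * num_blocks     # which XMM block each slot holds
        self.blocks = [b""] * num_blocks    # hub copies of those blocks
        self.misses = 0

    def fetch_byte(self, addr):
        """Return the code byte at addr, copying its block into hub on a miss."""
        block_no = addr // self.block_size
        slot = block_no % self.num_blocks
        if self.tags[slot] != block_no:
            start = block_no * self.block_size
            self.blocks[slot] = bytes(self.xmm[start:start + self.block_size])
            self.tags[slot] = block_no
            self.misses += 1                # one slow XMM block transfer
        return self.blocks[slot][addr % self.block_size]

if __name__ == "__main__":
    image = bytes(range(256)) * 8           # 2kB of pretend bytecode
    cache = CodeCache(image)
    for pc in range(0, 600):
        cache.fetch_byte(pc)
    print("misses:", cache.misses)
```

Sequential bytecode fetches only touch XMM once per block, so the lock is taken once per block rather than once per byte.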
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Links to other interesting threads:
· Home of the MultiBladeProps: TriBlade, RamBlade, SixBlade, website
· Single Board Computer: 3 Propeller ICs and a TriBladeProp board (ZiCog Z80 Emulator)
· Prop Tools under Development or Completed (Index)
· Emulators: CPUs Z80 etc; Micros Altair etc; Terminals VT100 etc; (Index) ZiCog (Z80), MoCog (6809)
· Prop OS: SphinxOS, PropDos, PropCmd   Search the Propeller forums (uses advanced Google search)
My cruising website is: www.bluemagic.biz   MultiBlade Props: www.cluso.bluemagic.biz
Are push and pop supported by SPIN language or do you talk about SPIN bytecode?
Real video is not something the Propeller is really good at. So, for simple text output, some kB can still be used. Maybe the page size can be kept flexible, so if you really need graphics you can reduce it to 512 bytes. Each COG running PASM code or old SPIN will free up extra memory for other things.
For really good graphics external RAM is needed anyway - or maybe a second Propeller or other support (TFT with controller + RAM / FPGA / XMOS)
I'm not sure that means what you think it means. There are both push and pop. There is also a pop in the context of "discard x items off the stack".
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Life may be "too short", but it's the longest thing we ever do.
Thing is, there will "never" be enough room in one COG for the interpreter and the XMM fetch code.
So the solution is to have a separate COG do the read/write. Not only does it solve most contention, but it would also allow Spin32 to easily run on any XMM hardware solution provided a driver exists. It seems locks would still be required to manage ownership of the IO COG.
Bill's VMCOG interface when complete would serve the solution nicely I'm sure. Using 2 COGs for 32 bit addressable Spin -vs- 1 COG is a fair trade for big programs assuming a working compiler is available.
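The separate-IO-COG arrangement can be modelled in Python with a worker thread as the driver cog and a queue as the mailbox. This is not the VMCOG interface (which is not described in this thread); `XmmDriver` and its request format are invented for the sketch. The queue plays the role of the ownership lock, since it serialises requests from all callers.

```python
import queue
import threading

# Sketch only: one "driver cog" owns the external RAM; interpreters post
# requests to its mailbox and wait for the reply.
class XmmDriver:
    def __init__(self, size):
        self._mem = bytearray(size)
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            request = self._requests.get()
            if request is None:             # stop sentinel
                break
            op, addr, value, reply = request
            if op == "w":
                self._mem[addr:addr + 4] = value.to_bytes(4, "little")
                reply.put(None)
            else:
                reply.put(int.from_bytes(self._mem[addr:addr + 4], "little"))

    def _call(self, op, addr, value=0):
        reply = queue.Queue(maxsize=1)      # per-request reply slot
        self._requests.put((op, addr, value, reply))
        return reply.get()                  # block until the driver answers

    def read_long(self, addr):
        return self._call("r", addr)

    def write_long(self, addr, value):
        self._call("w", addr, value & 0xFFFFFFFF)

    def stop(self):
        self._requests.put(None)
        self._thread.join()
```

Swapping in different XMM hardware would only mean replacing the body of `_run`, which is the portability argument made above.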
Post Edited (jazzed) : 2/4/2010 7:26:39 PM GMT
Ok, I see what you mean about "user accessible". I guess exposing the stack to the user takes away a lot of the simplicity and safety of spin.
If you are a bit of an experimenter, you can experiment directly with the spin bytecodes using bstc. It implements a bytecode() primitive that directly injects the code into the method.
Just use comma delimited numbers to squirt in whatever you like.
The only obvious limitations I see are really in the headers. Both .binary and object. If you blew them out to 32 bits then you could pretty much place anything wherever you liked.
There is nothing preventing you from putting PBASE = $10000, VBASE = PBASE+CODESIZE, and DBASE = $10+NEW_INTERPRETER_SIZE.
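Spelled out as arithmetic, that placement looks like this in Python. The code and interpreter sizes passed in are made-up numbers; only the three formulas come from the post.

```python
# Sketch only: base addresses for code above the ROM, stack in hub RAM.
HUB_RAM_END = 0x8000

def spin32_layout(code_size, interpreter_size, pbase=0x10000):
    """Return the three base addresses using the formulas from the post."""
    layout = {
        "PBASE": pbase,                     # object/program base, above the ROM
        "VBASE": pbase + code_size,         # variables follow the code
        "DBASE": 0x10 + interpreter_size,   # stack base stays low in hub RAM
    }
    if layout["DBASE"] >= HUB_RAM_END:
        raise ValueError("stack base would fall outside hub RAM")
    return layout

if __name__ == "__main__":
    for name, value in spin32_layout(0x2000, 0x800).items():
        print(f"{name} = ${value:X}")
```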
Why not ask Chip what he is intending to do for the upcoming device? I suspect it's a problem that has already been thought through, and it would be logical to do it in such a fashion as it does not require completely re-architecting the compiler.
I thought about waiting for Parallax several times over the last few years ;) Maybe Chip or someone else not buried in PropII development will speak up. Trying to make something compatible with what Parallax is doing would offer everyone a head start.
Running everything offset from $10000 is not a bad start. I guess the problem with PBASE = $10000, etc... is that hardware device drivers need to have buffers at $0-$7FFF for fast memory access. That could be adjusted if control variable addresses were pointers .... As in other times, the preprocessor conditionals using SPIN32 or whatever could serve to differentiate versions for private library code.
I've managed to save 6 longs in the original PNUT interpreter by replacing MULT and SQRT PASM thanks to CessnaPilot and Lonesock. Looking for other savings. I'm not sure what it will take to finish. An LMM type spin interpreter may be a better approach. Interesting challenge.
Since DAT variables/code are "global", wouldn't they always be referenced by Spin in their original address regardless of pbase, dbase, etc... ?
Nope. DAT variables are only "global" to the object. All addresses are relative to PBASE. The hacky @@@ operator was invented to return the real hub address.
Using @label in spin when referencing an object will return the hub address of whatever it is you are asking for, but it's a runtime thing in the interpreter.
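The PBASE-relative addressing can be illustrated with a tiny Python model: the same DAT offset resolves to different hub addresses depending on the base of the object it belongs to. The function name and the example bases are assumptions for illustration.

```python
# Sketch only: resolving an object-relative DAT offset at run time.
def hub_address(pbase, dat_offset):
    """Absolute address of a DAT label stored as an offset from PBASE."""
    return (pbase + dat_offset) & 0xFFFFFFFF

if __name__ == "__main__":
    offset = 0x0024                      # same label offset in two objects
    print(hex(hub_address(0x0010, offset)))
    print(hex(hub_address(0x10000, offset)))
```

Because the base is added at run time, moving PBASE to $10000 changes every resolved address without touching the stored offsets.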
Initially I thought my version of the Interpreter would not fit in cog, so I built my overlay loader and used that to place some of the less used routines in hub. Once I had the decode working I found I had plenty of space (plus a big speed improvement) and began improving the other sections of the code. The hub decoding takes 256 longs in hub, but would be common for all copies of my interpreter. This allowed me to simplify the arithmetic code. I then still had enough room to place the overlays back into the interpreter.
IIRC there is still enough room for direct XMM access, but of course it would require locks. I believe it would be faster to do the access to XMM directly in cog even with locks. Otherwise, you are going via another cog, which accesses SRAM (or wherever), to hub to cog and vice versa. Direct access bypasses the hub.
If Bill's VMCOG works then perhaps it may be no slower than direct access. We must wait and see.
So, what I am saying is, there is room in the interpreter and that will also gain a runtime improvement of 20-25% which may then be traded for XMM access.
It would be useful if you summarize your overlay approach. I may adopt this if it's not a nightmare.
My faster spin interpreter also has a thread in the tools link.
VMCOG could be used to provide an on-propeller (ie no PC needed) Spin debugger!
It would only work for "pure" Spin code, but would it not be cool for beginners to the prop? Combined with Sphinx?
In a related thought... the proposed XSpin virtual machine ought to be able to speed up Sphinx immensely, as it would get rid of temp files etc.
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
www.mikronauts.com E-mail: mikronauts _at_ gmail _dot_ com 5.0" VGA LCD in stock!
Morpheus dual Prop SBC w/ 512KB kit $119.95, Mem+2MB memory/IO kit $89.95, both kits $189.95 SerPlug $9.95
Propteus and Proteus for Propeller prototyping 6.250MHz custom Crystals run Propellers at 100MHz
Las - Large model assembler Largos - upcoming nano operating system
jazzed: I will answer shortly. Yes, I spawned a new version and added each mod separately, so it appears that v260C_007F is the latest and indeed may be fully functional (just checking). My comments indicate it SHOULD be a fully operational version. I believe the regression worked. You will of course need to test it. I had done a lot of testing of the individual sections as I optimised them, but I recall that I was unravelling the code around the j6??? sections to see if I could speed it up with the newfound free space.
Post Edited (Cluso99) : 2/6/2010 4:51:09 AM GMT