Spin Interpreter Needed — Parallax Forums

Spin Interpreter Needed

jazzed Posts: 11,803
edited 2010-06-23 04:19 in Propeller 1
We need a Spin interpreter that can fetch instructions from external memory such as I2C EEPROM, RAM, SD card, etc.

I have been able to demonstrate with the Propeller JVM with a small and simple on board cache that it is possible to use I2C EEPROM like this effectively (or some other address mapped device such as SRAM, DRAM, or Flash). With Bill's VMCOG, it would also be possible to use SDCARD as the physical store for Spin byte-codes.
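
To illustrate the caching idea, here is a minimal host-side sketch in Python rather than PASM. It puts a small direct-mapped cache between the byte-code fetch and a slow address-mapped store. The line size, line count, and class names are assumptions made for illustration; they are not taken from the Propeller JVM source.

```python
class SlowStore:
    """Stands in for an I2C EEPROM or similar address-mapped device."""
    def __init__(self, image: bytes):
        self.image = image
        self.block_reads = 0          # counts expensive device accesses

    def read_block(self, addr: int, size: int) -> bytes:
        self.block_reads += 1
        return self.image[addr:addr + size]


class ByteCodeCache:
    """Direct-mapped cache; LINE_SIZE and LINES are hypothetical values."""
    LINE_SIZE = 16
    LINES = 8

    def __init__(self, store: SlowStore):
        self.store = store
        self.tags = [None] * self.LINES
        self.data = [b""] * self.LINES

    def read_byte(self, addr: int) -> int:
        line_addr = addr - (addr % self.LINE_SIZE)
        index = (line_addr // self.LINE_SIZE) % self.LINES
        if self.tags[index] != line_addr:          # miss: refill the line
            self.data[index] = self.store.read_block(line_addr, self.LINE_SIZE)
            self.tags[index] = line_addr
        return self.data[index][addr - line_addr]


store = SlowStore(bytes(range(256)))
cache = ByteCodeCache(store)
fetched = [cache.read_byte(a) for a in range(32)]   # sequential byte-code fetch
```

Because byte codes are mostly fetched sequentially, the 32 fetches here cost only two device reads; a jump to a distant address would cost one more.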

I have a version of a Big Spin Interpreter that is very close to achieving this goal, but there are limits that I just can't seem to get around (my time is also a limit). Given that, I'm posting what I have in case someone is willing to take up the cause. The example runs, and it is easy to see the changes required to support > 64KB (HUB+ROM) code space, but it does not support external memory yet because there is no room to put in a byte-code fetcher.

Changes from Chip's original interpreter are easy to spot because they look like the block below: the original code is commented out inside the { ... } block, and the replacement follows it.

{
                        rdword  dcall,dcall             'set old dcall
                        wrword  pcurr,dbase             'set return pcurr
                        add     dbase,#2                'set call dbase
'}
'{
                        rdlong  dcall,dcall             'set old dcall
                        wrlong  pcurr,dbase             'set return pcurr
                        add     dbase,#4                'set call dbase
'}
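
The reason for widening the frame can be shown with a small Python model (a sketch, not interpreter code). It assumes only what the snippet above shows: the return pcurr is stored at dbase, and dbase then advances by the field size. The function names and the example address are hypothetical, and only the return-address field of the frame is modeled.

```python
import struct

def push_return_word(hub: bytearray, dbase: int, pcurr: int) -> int:
    """Original layout: 16-bit return address (wrword), dbase advances by 2."""
    struct.pack_into("<H", hub, dbase, pcurr & 0xFFFF)   # upper bits are lost
    return dbase + 2

def push_return_long(hub: bytearray, dbase: int, pcurr: int) -> int:
    """Modified layout: 32-bit return address (wrlong), dbase advances by 4."""
    struct.pack_into("<I", hub, dbase, pcurr)
    return dbase + 4

hub = bytearray(16)
pcurr = 0x1_2345                      # an address above the 64 KB boundary

next_w = push_return_word(hub, 0, pcurr)
saved_w = struct.unpack_from("<H", hub, 0)[0]

next_l = push_return_long(hub, 8, pcurr)
saved_l = struct.unpack_from("<I", hub, 8)[0]
```

With the word-sized field the saved address comes back as $2345, so a return would land in the wrong place; the long-sized field preserves $12345.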



What is needed to finish is a byte-code fetcher. Only a few places need to change (one example is shown below). If someone can save enough space in the COG (the wall I can't seem to penetrate), it becomes possible to add a "spinner mailbox" that posts the address to another COG, which then fetches the data. I'm hoping this does not require a huge rewrite of the interpreter, because of the amount of testing that would involve (and the chance of introducing bugs). There is some waste in the interpreter as it stands, in the "mask" variables and some of the math routines, but not enough to make any of this easy.

loop                    mov     x,#0                    'reset x

                        rdbyte  op,pcurr                'get opcode
                        add     pcurr,#1
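
A rough Python model of the mailbox idea follows, as a sketch only. The mailbox layout (one request long, one reply long), the IDLE marker, and all class names are assumptions. On real hardware the fetcher would run in its own COG in parallel; here it is stepped explicitly so the example stays single-threaded.

```python
class Mailbox:
    """Two hub longs shared by the cogs; the layout is hypothetical."""
    IDLE = -1

    def __init__(self):
        self.request = self.IDLE   # address posted by the interpreter
        self.reply = 0             # byte returned by the fetcher


class FetcherCog:
    """Stands in for the cog that owns the external memory."""
    def __init__(self, mailbox, external_memory: bytes):
        self.mailbox = mailbox
        self.memory = external_memory

    def service(self):
        if self.mailbox.request != Mailbox.IDLE:
            self.mailbox.reply = self.memory[self.mailbox.request]
            self.mailbox.request = Mailbox.IDLE    # signals completion


class Interpreter:
    def __init__(self, mailbox, fetcher):
        self.mailbox = mailbox
        self.fetcher = fetcher
        self.pcurr = 0

    def fetch_opcode(self) -> int:
        """Replaces 'rdbyte op,pcurr' followed by 'add pcurr,#1'."""
        self.mailbox.request = self.pcurr
        while self.mailbox.request != Mailbox.IDLE:
            self.fetcher.service()     # on hardware the other cog runs in parallel
        self.pcurr += 1
        return self.mailbox.reply


box = Mailbox()
interp = Interpreter(box, FetcherCog(box, bytes([0x38, 0x05, 0x32])))
ops = [interp.fetch_opcode() for _ in range(3)]
```

The interpreter side only needs to post an address, spin until the request is cleared, and read the reply, which is why the mailbox code can be small compared with a full memory driver.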



Well, have a look. If the collective power of the forum can get together and solve the problem, the Propeller community can benefit in a big way. Thanks for reading.

Cheers,
--Steve

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Propeller Pages: Propeller JVM

Comments

  • Dave Hein Posts: 6,347
    edited 2010-06-21 17:34
    The square root operation uses 14 longs. I have thought about using that space to implement an LMM interpreter within the Spin interpreter that could execute inline PASM code. Maybe you could use that space. Does anybody ever use the square root operator in Spin?

    You could continue to support the square root operator by implementing it in the EEPROM fetcher cog.

    Dave
  • Bill Henning Posts: 6,445
    edited 2010-06-21 17:36
    I don't have time to do this, however I took a quick look, and I have some suggestions:

    - remove WAITVID() ... no one will be able to write a video driver in spin until Prop2 at least

    FOR NOW

    - remove SQRT (someone else already suggested this)
    - remove STRSIZE
    - remove STRCOMP

    Note ops below all take 3 args, can share arg code

    - remove BYTEFILL
    - remove WORDFILL
    - remove LONGFILL
    - remove BYTEMOVE
    - remove WORDMOVE
    - remove LONGMOVE

    THEN

    - Use free space to make a small "FCACHE" area
    - Add a "fetch_byte_from_pcur_and_inc_pcur" subroutine
    - Add a TINY LMM interpreter, only complex enough to support three arguments to a function
    - Use LMM/FCACHE to put back in SQRT, STR*, *FILL, *MOVE

    The ops selected for conversion to LMM/FCACHE will only take a small performance hit, but save more than enough memory!
    Due to the limited scope of changes, only changed ops need thorough testing.
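
As a rough illustration of the LMM part of this proposal, the Python sketch below models a fetch-and-execute loop running a hub-resident BYTEFILL. The tuple encoding and the tiny instruction set are invented for the example; a real LMM loop fetches native PASM longs with rdlong and executes them in place, and FCACHE would copy the tight loop into the cog first.

```python
def lmm_run(program, regs, hub):
    """Fetch one instruction at a time from 'hub-resident' code and execute it."""
    pc = 0
    while True:
        op, a, b = program[pc]       # models: rdlong instr,pc
        pc += 1                      # models: add pc,#4
        if op == "wrbyte":           # hub[regs[b]] = low byte of regs[a]
            hub[regs[b]] = regs[a] & 0xFF
        elif op == "add":            # regs[a] += b (immediate)
            regs[a] += b
        elif op == "djnz":           # decrement regs[a]; branch to b if nonzero
            regs[a] -= 1
            if regs[a] != 0:
                pc = b
        elif op == "ret":
            return

# BYTEFILL(dest, value, count) expressed as LMM code; count must be >= 1 here
BYTEFILL = [
    ("wrbyte", "value", "dest"),
    ("add", "dest", 1),
    ("djnz", "count", 0),
    ("ret", None, None),
]

hub = bytearray(8)
lmm_run(BYTEFILL, {"dest": 2, "value": 0xAA, "count": 4}, hub)
```

The three-argument shape (dest, value, count) matches the note above that the *FILL and *MOVE ops can share argument-handling code.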

    ALSO

    Doing it this way allows limited in-line LMM code in Spin code :)

    EDIT:

    Dave posted while I was writing this response off-line... I think it was David who I saw suggest removing SQRT for a small LMM loop before!

    My additional suggestions should free up more longs :)

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    www.mikronauts.com E-mail: mikronauts _at_ gmail _dot_ com
    My products: Morpheus / Mem+ / PropCade / FlexMem / VMCOG / Propteus / Proteus / SerPlug
    and 6.250MHz Crystals to run Propellers at 100MHz & 5.0" OEM TFT VGA LCD modules
    Las - Large model assembler Largos - upcoming nano operating system

    Post Edited (Bill Henning) : 6/21/2010 5:42:59 PM GMT
  • jazzed Posts: 11,803
    edited 2010-06-21 18:25
    Dave Hein said...
    The square root operation uses 14 longs. I have thought about using that space to implement an LMM interpreter within the Spin interpreter that could execute inline PASM code. Maybe you could use that space. Does anybody ever use the square root operator in Spin?

    You could continue to support the square root operator by implementing it in the EEPROM fetcher cog.

    Dave

    Dave,

    This is a fair idea. I suppose some flags could be set in the fetcher cog's command mailbox to multiplex features without giving up too much address space. To me, a separate COG that abstracts away the hardware is ideal and is worth losing a COG over. Off-loading some code to the separate cog would hopefully allow some simple changes. The problem is in pulling apart the math spaghetti code to allow such a feature :) All of it could be moved easily, but that would impact performance.

    Thanks,
    --Steve

  • jazzed Posts: 11,803
    edited 2010-06-21 18:30
    Bill,

    You're right. Lots of things can be removed from the Spin definition, but I fear that would not serve the community very well. Longer term some redefinition would be useful especially to allow things like in-line LMM/PASM.

    If there was a way through a third-party compiler to specify SpinLight or whatever, then that might work, but I doubt BradC, mpark or the next guy/gal to do a compiler would be up for that. I have no idea how one would add LMM stuff at this point.

    IMHO, a Spin-clone that works by fetching from a separate COG would be better than redefining the language near term.

    Thanks for giving up some of your time,
    --Steve
  • Dave Hein Posts: 6,347
    edited 2010-06-21 18:33
    Spin bytecode $3C is the only unused single-byte code available. I think an "if_nc_and_z jmp #lmm_interp" instruction could be added somewhere after the "jF" label to jump to an LMM interpreter based on a $3C opcode. Bill, I like your idea about moving the str*, *fill and *move instructions out of the cog. They provide the biggest memory savings, and they typically work on many bytes at a time.

    I'd also like to point out that bigger chunks of the Spin interpreter could be temporarily swapped out if a larger FCACHE area was needed. After the large FCACHE is no longer needed, the portion of the Spin interpreter that was swapped out would be copied back into the cog.

    Dave
  • Bill Henning Posts: 6,445
    edited 2010-06-21 19:51
    Hi Steve,

    I think you missed the part about adding them back in as LMM/FCACHE'd code :)

    At the user level there would be no difference, except a tiny slowdown on the startup of those Spin bytecodes!

    The Spin byte codes would be the same, except maybe a "LMM" intro byte.
  • Bill Henning Posts: 6,445
    edited 2010-06-21 19:53
    Hi Dave,

    $3C could be it for running in-line code!

    Or, we could hijack "cognew(0,0)" to mean "go to next long boundary and exec LMM code"

    As for the replacements, same byte codes, except they would now run FCACHE'd LMM code

    You are very right about swapping out part of the interpreter... but I think removing the str* *fill and *move would make enough room for LMM/FCACHE, and as you noticed, they typically run a tight loop - perfect for FCACHE - which is why I selected them. Minimal execution time impact!
  • jazzed Posts: 11,803
    edited 2010-06-21 20:32
    Bill Henning said...
    Hi Steve,

    I think you missed the part about adding them back in as LMM/FCACHE'd code :)

    You're right, I did miss that. It doesn't look like a small change though.

    Seems like j3 and j4 case, lookup, lookdown and friends are more swappable
    and minimum code change friendly.

    I'm worried that any swapping will cause Spin to be not cognew-able though.

    Thanks,
    --Steve

  • Bill Henning Posts: 6,445
    edited 2010-06-21 21:47
    cognew will need to launch bigspinvm anyway, so it would be compatible.
  • jazzed Posts: 11,803
    edited 2010-06-21 21:54
    Bill Henning said...
    cognew will need to launch bigspinvm anyway, so it would be compatible.

    Yes, it would work the first time for sure. What about for N cogs running the interpreter? Since the fetcher COG must be a blocking device anyway, there is no problem if the code is relocated as a service. With swap overlays, it seems more difficult to maintain re-entrancy.

    Thanks,
    --Steve

  • Bill Henning Posts: 6,445
    edited 2010-06-21 22:25
    The replacement LMM code (for str* *fill etc) would be hub-resident I assume, and shared by all instances - not an issue.

    The fetcher cog can be lock()'d, however multiple cogs contending for the fetcher may slow things down too much - in which case bigspin would be for the "main" large business logic app.

    If several big apps are needed, it may make more sense to make bigspin multi-task several spin contexts, switching every X byte codes.
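
A minimal sketch of that multi-tasking idea, in Python: each Spin context gets a fixed slice of byte codes before the interpreter moves on to the next one. The SpinContext fields and the slice length are assumptions for illustration; a real context would also have to save the stack and frame pointers.

```python
from collections import deque

class SpinContext:
    """Minimal per-task state; a real context would also hold the stack pointers."""
    def __init__(self, name, bytecode):
        self.name = name
        self.bytecode = bytecode
        self.pcurr = 0

    def done(self):
        return self.pcurr >= len(self.bytecode)


def run_round_robin(contexts, slice_len):
    """Execute slice_len byte codes from each context in turn."""
    order = []
    ready = deque(contexts)
    while ready:
        ctx = ready.popleft()
        for _ in range(slice_len):
            if ctx.done():
                break
            order.append((ctx.name, ctx.bytecode[ctx.pcurr]))  # "execute"
            ctx.pcurr += 1
        if not ctx.done():
            ready.append(ctx)
    return order


order = run_round_robin(
    [SpinContext("A", bytes([1, 2, 3])), SpinContext("B", bytes([7, 8]))],
    slice_len=2,
)
```

Since only one interpreter talks to the fetcher in this arrangement, there is no contention for the mailbox, at the cost of sharing one cog's throughput among all contexts.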

    Frankly, I would personally be THRILLED to just run ONE bigspin against VMCOG!
  • localroger Posts: 3,452
    edited 2010-06-21 23:11
    jazzed, I'm working on a system similar to this to run code out of EEPROM or Bill's FlexMem, but I'm not targeting Spin. Instead I'm creating a new language that is similar enough to Spin that converting objects should be pretty easy, but with much more control for faster Hub RAM overlays, a 512 KB code address space (four 1 Mbit EEPROMs), reclaiming the RAM used for COGNEW, sneak-around access for private resources (so all your code can call the debug methods from anywhere if you want), and one-click downloads like we get with the proptool and BST. I am polishing up the foundation documents right now and was thinking of posting them soon.

    One thing I did which was very useful was to create an instruction class of the form %00_xx_xxxx. It inserts the xx_xxxx bits into the INSTR field of a synthetic PASM instruction, forces generic source and destination registers, pops the arguments off the stack, executes the synthetic instruction with wc wz wr, then pushes the result and saves the flags for retrieval. This allows a relatively small amount of cog code to implement all of the PASM native math instructions in stack machine form, which frees up beaucoup cog RAM for the other primitives and the EEPROM or FlexMem drivers.
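
The bit manipulation can be sketched in Python using the Propeller 1 instruction layout (INSTR in bits 31..26, ZCRI in 25..22, CON in 21..18, DEST in 17..9, SRC in 8..0). The function name and the register addresses $1E0 and $1E1 are hypothetical stand-ins for the "generic source and destination"; the stack handling and flag saving described above are not modeled.

```python
def synth_instruction(bytecode: int, dest: int, src: int) -> int:
    """Build a PASM long from a %00_xxxxxx byte code."""
    assert bytecode >> 6 == 0, "only the %00_xxxxxx class is handled"
    instr = bytecode & 0x3F            # six bits for the INSTR field
    zcri = 0b1110                      # wz, wc, wr set; register (not immediate) source
    cond = 0b1111                      # always execute
    return (instr << 26) | (zcri << 22) | (cond << 18) | ((dest & 0x1FF) << 9) | (src & 0x1FF)

ADD = 0b100000                          # PASM ADD opcode
word = synth_instruction(ADD, dest=0x1E0, src=0x1E1)
```

One byte-code class therefore covers every PASM opcode that fits the two-operand pattern, which is where the cog RAM savings come from.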

    (Oh, and yes while it's a whole different way of optimizing I was inspired by LMM. Thanks, Bill.)
  • jazzed Posts: 11,803
    edited 2010-06-22 00:38
    @localroger, that sounds nifty. It's probably more interesting to the forum than Propeller JVM code. Anything friendly to today's Propeller architecture and typical programming that allows for big programs would be a winner. Being able to use serial storage is a big win too, and I can't wait to have an application that uses that increased power one way or another.

    Cheers,
    --Steve

  • Cluso99 Posts: 18,069
    edited 2010-06-22 03:30
    Here is my suggestion...

    Remove SQRT. You may also be able to remove some other functions. Use this space to implement what you need and get that working.

    Hopefully by then I can help and get my faster Spin Interpreter working. But remember, my version uses a hub table for decoding vectors. I can always regress my code to where I had it working but it is offline so I have to search my backups. The way I wrote the code was to first place an overlay handler into the code. That permitted me to free enough space to place the decode vectors in place (IIRC saved about 20+ instructions in speed), and remove the overlay again. Each phase was fully tested. Then I started speeding up each section of the code, verifying as I went. I included my debugger to verify results. I fed all variables into the section I sped up to verify it worked.

    FWIW I saved a huge amount of time in the maths section by utilising some of the saved space. Chip also found a faster way for one of the maths functions (divide or sqrt???).

    The last thing I did was make some changes to a group of functions by utilising the new space I found to unravel some complex code, and here is where I introduced a bug that I never really looked for. So it is only a matter of regression to the working version.

    Now, I certainly have enough space to implement LMM or overlays for more code functions. So I suggest you get them working and hopefully I can then help in getting the whole thing going. Otherwise I will dig out what I have done for you.

    The interpreter has $3C free. There are other sub-codes that are not used. Use $3C for now anyway.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Links to other interesting threads:

    - Home of the MultiBladeProps: TriBlade, RamBlade, SixBlade, website
    - Single Board Computer: 3 Propeller ICs and a TriBladeProp board (ZiCog Z80 Emulator)
    - Prop Tools under Development or Completed (Index)
    - Emulators: CPUs Z80 etc; Micros Altair etc; Terminals VT100 etc; (Index) ZiCog (Z80), MoCog (6809)
    - Prop OS: SphinxOS, PropDos, PropCmd; Search the Propeller forums (uses advanced Google search)
    My cruising website is: www.bluemagic.biz; MultiBlade Props: www.cluso.bluemagic.biz
  • Dr_Acula Posts: 5,484
    edited 2010-06-22 09:12
    jazzed is on a roll!

    Ok, I'm keen to see 'Big Spin' too. And I think a fantastic use for such a program would be the HTML display that jazzed has proposed in another very active thread. I just posted a long reply over there.

    And, just as a general vibe of the thing, I think this is entirely possible without having to wait for the propII.

    But in essence - add a serial RAM chip to a Propeller, free up the entire hub RAM for the video buffer, drop a couple of instructions out of the Spin interpreter cog to free up space for a serial RAM driver, then instead of loading instructions from hub to interpret, load them from the serial RAM chip, and re-add those removed Spin instructions using some (possibly slow, inefficient) code from the serial RAM chip. Later, to speed things up, think about clever cache code. But keep it simple to begin with - just load bytes one at a time from external serial RAM instead of hub RAM for the Spin interpreter. Is this possible? Could the hardware be as simple as one 8 pin memory chip? Or adding as many of these 70c SPI 8 kilobyte chips as needed? www.futurlec.com/Memory/23K640.shtml

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    www.smarthome.viviti.com/propeller

    Post Edited (Dr_Acula) : 6/22/2010 9:28:45 AM GMT
  • Bill Henning Posts: 6,445
    edited 2010-06-22 12:37
    Hi Dr_Acula,
    Dr_Acula said...
    jazzed is on a roll!

    Ok, I'm keen to see 'Big Spin' too. And I think a fantastic use for such a program would be the HTML display that jazzed has proposed in another very active thread. I just posted a long reply over there.

    And, just as a general vibe of the thing, I think this is entirely possible without having to wait for the propII.

    Right so far :)
    Dr_Acula said...
    But in essence - add a serial ram chip to a propeller, free up the entire hub ram for video buffer, drop a couple of instructions out of the spin interpreter cog to free up space for a serial ram driver, then instead of loading instructions from hub to interpret, load them from a serial ram chip, and re-add those removed spin instructions using some (possibly slow, inefficient code) from the serial ram chip. Later, to speed things up, think about clever cache code. But keep it simple to begin with - just load bytes one at a time from external serial ram instead of hub ram for the spin interpreter. Is this possible? Could the hardware be as simple as one 8 pin memory chip? Or adding as many of these 70c SPI 8kilobyte chips as needed? www.futurlec.com/Memory/23K640.shtml

    Sorry, not quite.

    Adding just a serial RAM chip would be quite slow for fetching byte codes directly: it would take about 8 µs per byte with a small/simple implementation of SPI, and about 3.2 µs with a 10 Mbps timer-based one, which would use up a timer from Spin and need much more memory in the cog.
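
These figures can be sanity-checked with a short Python calculation. It assumes that each random byte read costs 32 bit-times on the SPI bus (an 8-bit command, a 16-bit address, and 8 data bits), which is how 23K640-style serial SRAMs frame a read. The 4 Mbps rate for the simple bit-banged case is inferred from the 8 µs figure and is not stated in the thread.

```python
BITS_PER_RANDOM_READ = 8 + 16 + 8     # command + address + data (assumed framing)

def microseconds_per_byte(bit_rate_hz: float) -> float:
    return BITS_PER_RANDOM_READ / bit_rate_hz * 1e6

fast = microseconds_per_byte(10e6)    # timer-driven SPI at 10 Mbps
slow = microseconds_per_byte(4e6)     # a 4 Mbps bit-banged loop (assumed rate)
bytes_per_second_slow = 1e6 / slow
```

Under these assumptions the 10 Mbps case works out to 3.2 µs per byte and the simple case to 8 µs per byte, or about 125,000 byte codes fetched per second before any interpreter overhead.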

    What we were talking about was freeing enough space to add mailbox handling code, so it could talk to VMCOG, or enough space for Jazzed's caching eeprom code. Later there might be versions targeting particular memory designs directly.

    The 8 KB chips are a bad deal on a $/byte basis; the $1.50 32 KB ones cost about half as much per byte.

  • Bill Henning Posts: 6,445
    edited 2010-06-22 12:39
    I can't wait to see your faster Spin VM... I think it has a lot of promise. The external decode table is fine - 512 bytes of hub space is a small tradeoff for a speed increase!
  • Dr_Acula Posts: 5,484
    edited 2010-06-22 13:54
    Hmm "Adding just a serial ram chip would be quite slow for fetching byte codes directly, as it would take about 8us per byte with a small/simple implementation of SPI"

    Yes, that is a bit slow. What is that, 125 kHz? Yikes!

    OK, VMCOG is the smart long-term answer. Short term, I wonder if I have enough hardware sitting in front of me right now, with a 512K RAM chip that can be accessed at about 3.8 MHz per byte and a driver that is about 20 longs of PASM?

    SpinXMM certainly makes for interesting reading. Is the byte read code this bit?
    loop                    mov     x,#0                    'reset x
    
                            rdbyte  op,pcurr                'get opcode
                            add     pcurr,#1
    
    



    and could you just call a subroutine that reads from external ram instead? I'm sure it isn't that easy though. Got to find all the pcurr for instance.

    There probably is a catch somewhere. How far can calls go - 32k, 64k, something else?

  • Bill Henning Posts: 6,445
    edited 2010-06-22 14:13
    Yep, I do believe you have enough hardware :)

    The full 512KB could be supported for code pretty easily: just find all references to pcurr, and modify as needed.

    I think a great first step would be a bigspin that just supported a large binary image, used for code/constants/initial data values, and only executed code from xmm, using the hub for stack and variables.

    The reason I suggest that as a first step is that then (for now) RD{BYTE|WORD|LONG} WR{BYTE|WORD|LONG} as well as VAR and stack references don't have to change much (except for initializing VAR sections from data in the code image)

    After that worked, and people got used to the possibility of very large spin programs, VAR could be moved to xmm-only, allowing for HUGE arrays. Frankly, it may be worthwhile to ONLY move arrays, and leave simple global variables in the hub.

    DAT sections will be a bit of a pain in xmm until cognew etc get modified to only expect them in xmm, copying the code referenced by cognew to the hub before starting it.
  • lonesock Posts: 917
    edited 2010-06-22 14:36
    Semi-off-topic, but out of curiosity, if you are going to remove the SQRT code from the interpreter, is there any way to just shoe-horn in a tiny LMM kernel into the 12(?) longs currently used by SQRT? If that is the case, you could just call SQRT( address of LMM code ).

    Jonathan

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    lonesock
    Piranha are people too.
    (geek humor escalation => "There are 100 types of people: those who understand binary, and those who understand bit masks")
  • Bill Henning Posts: 6,445
    edited 2010-06-22 14:55
    That is exactly what I was proposing a bunch of messages up - except I would suggest tossing more stuff out, and re-implementing as LMM:

    - SQRT (rarely if ever used)
    - STR* (tight loops, perfect for FCACHE)
    - *FILL (tight loops, perfect for FCACHE)
    - *MOVE (tight loops, perfect for FCACHE)

    Tossing all of the above out should leave plenty of room for a small LMM interpreter, a small FCACHE area, and VMCOG interfacing code :)
  • jazzed Posts: 11,803
    edited 2010-06-22 14:57
    Bill Henning said...
    Yep, I do believe you have enough hardware :)

    The full 512KB could be supported for code pretty easy, just find all references to pcur, and modify as needed.

    That's already done in the file I posted. The interpreter works, but is obviously untested beyond 64KB.

  • Bill Henning Posts: 6,445
    edited 2010-06-22 15:00
    Oh I did not realize that! I just gave a cursory read of the code.

    REALLY NICE WORK STEVE!

    I can see I need to play with this once I am back from UPEW...
  • Cluso99 Posts: 18,069
    edited 2010-06-23 04:19
    lonesock: I was only proposing to ditch the sqrt and others to get the basic extensions working while I find the time to dig out a working version of the faster interpreter that has free space in it. I was seeing ~25% speed improvement, but I was going for more. Nothing has been profiled properly.
