Virtual interrupts using LMM? — Parallax Forums

Rayman Posts: 14,162
edited 2008-05-13 12:09 in Propeller 1
Just out of curiosity, I wonder if one could add interrupts to LMM code... I haven't looked at the LMM in depth, but it seems that the code doing the fetching could check some state to detect a pending interrupt.
Then maybe it could save the flag states and the current source location and jump to a different code location?

Comments

  • jazzed Posts: 11,803
    edited 2008-04-30 19:14
    Completely doable it seems.
    Of course one could add more LMM kernel services as necessary that would
    run natively; having an asynchronous mechanism would also be useful.

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    jazzed ... about living in http://en.wikipedia.org/wiki/Silicon_Valley

    Traffic is slow at times, but Parallax orders always get here fast 8)
  • hippy Posts: 1,981
    edited 2008-05-01 01:35
    Agreed; completely doable.

    Two options: add a check on an I/O pin or hub memory location to every fetch/execute of an LMM instruction, which could halve the speed of LMM execution; or check for an interrupt only after a jump into the LMM kernel, which gives the fastest linear LMM code execution but longer latency. Most LMM code will use a jump into the LMM kernel, but not all (i.e., branching done with immediate adds/subs on the LMM PC wouldn't).
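
To make the trade-off between those two options concrete, here is a minimal C model (not PASM, and not taken from any real LMM kernel). The toy opcodes, `lmm_run`, and the `irq_flag` word are all hypothetical; a real kernel executes the fetched PASM long directly rather than decoding it. The sketch only counts how many instructions run before a pending flag is noticed and how many flag checks were paid for.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy opcodes for the model only. */
enum { OP_WORK = 0, OP_KERNEL_JUMP = 1, OP_HALT = 2 };

typedef enum { POLL_EVERY_FETCH, POLL_ON_KERNEL_JUMP } poll_policy;

typedef struct {
    unsigned executed; /* instructions executed before stopping */
    unsigned polls;    /* number of times the flag was checked */
    int      taken;    /* 1 if a pending interrupt was noticed */
} lmm_result;

/* Run the toy program until OP_HALT, the end of the array,
   or until the interrupt flag is seen under the chosen policy. */
lmm_result lmm_run(const uint8_t *code, size_t len,
                   const volatile uint32_t *irq_flag, poll_policy policy)
{
    lmm_result r = { 0u, 0u, 0 };
    for (size_t pc = 0; pc < len; pc++) {
        uint8_t op = code[pc];            /* "fetch" from hub memory */
        if (op == OP_HALT)
            break;
        r.executed++;                     /* "execute" */
        if (policy == POLL_EVERY_FETCH || op == OP_KERNEL_JUMP) {
            r.polls++;                    /* pay for one flag check */
            if (*irq_flag != 0u) {
                r.taken = 1;              /* would vector to the handler here */
                break;
            }
        }
    }
    return r;
}
```

With the flag already set and a program of three ordinary instructions followed by a kernel jump, the per-fetch policy notices after 1 instruction and the kernel-jump policy after 4. With no flag set, the per-fetch policy pays one check per instruction while the kernel-jump policy pays one check in total.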
  • jazzed Posts: 11,803
    edited 2008-05-01 04:47
    Well, I thought you were talking about "software" interrupts like in the old PC BIOS :)
    since we can't really do a real-time PC load on a pin-state change.

    Of course, having IPC for embedded apps with a multitasking ability would be useful.

    But yes one could have a configurable "Native" poll routine that would at least be able
    to quickly check for state changes (less than 200us??), and raise status for the LMM.
    Seems like the end jump of the unrolled kernel would be a fair place to put the flag
    check and context switch (if enabled) since there will be a sync loss there anyway.

    I've attached a version of an interrupt module. It allows for ISR, IMR, and IPAR.
    I know you guys have been doing this longer than me, so suggestions are welcome :)

    Post Edited (jazzed) : 5/1/2008 4:53:42 AM GMT
  • Agent420 Posts: 439
    edited 2008-05-01 10:46
    Sorry to act like a total noob here, but what does LMM refer to? I did a quick search and looked in the manual but didn't find it. I'll probably be embarrassed by the answer.

    Thanks for the example code - been programming since 6502, but the Propeller is new to me.
  • Rayman Posts: 14,162
    edited 2008-05-01 10:52
    LMM (Large Memory Model) is a trick for running assembly code from hub RAM. Each cog has only 2 KB of its own RAM, but the hub has 32 KB, so there is much more room for code. The LMM kernel pulls instructions from the hub one at a time and executes them. It's slower than native assembly, but faster than Spin. It's the basis for the C compiler that ImageCraft is working on and a few other compiler projects too...

    Post Edited (Rayman) : 5/1/2008 12:57:35 PM GMT
  • Bob Lawrence (VE1RLL) Posts: 1,720
    edited 2008-05-01 11:32
    Agent420,

    You can read more about it at the URL below:

    propeller.wikispaces.com/Large+Memory+Model

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    Aka: CosmicBob
  • jazzed Posts: 11,803
    edited 2008-05-01 11:55
    Welcome Agent420. I bet lots of us cut our teeth on old CPUs.
    I built a 6502 based computer on a bread-board in 1987 :)

    Looking back at that example, the measured loop time is about 600ns at 80MHz.
    Moving IPAR & IMR settings out of the loop and using stop/start instead of
    dynamic pin and IMR selection makes idle loop-time ~200ns (250ns measured).

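    DAT
    pinMonEntry   org       0
                  mov       statptr, par            ' par   -> status long in hub
                  mov       maskptr, par
                  add       maskptr, #4             ' par+4 -> mask long (IMR)
                  mov       pinsptr, par
                  add       pinsptr, #8             ' par+8 -> pin-select long (IPAR)
                  'or        dira, #1               ' for measurement only
                  
    pinMonLoop    rdlong    IPAR, pinsptr           ' loop here for 600ns measured
                  andn      dira, IPAR              ' set input pins
    
    pinMonLoop2   rdlong    IMR,  maskptr           ' get IMR ... loop here 400ns
    
    pinMonLoop3   mov       IPAR, ina               ' get input 200ns loop with no event
                  and       IPAR, IMR wz            ' mask bit state
            if_nz wrlong    IPAR, statptr           ' if an event add 90 to 280 ns 
                  'xor       outa, #1               ' adds 50ns
                  jmp       #pinMonLoop3            

    ' Register reservations: not shown in the post, assumed to be in the attachment
    statptr       res       1                       ' hub address of status long
    maskptr       res       1                       ' hub address of mask long
    pinsptr       res       1                       ' hub address of pin-select long
    IPAR          res       1                       ' pin select, reused for masked input
    IMR           res       1                       ' interrupt mask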
    
    

    By the way, I'm finding C with LMM to be 4x to 8x faster than equivalent spin without function calls.



  • jazzed Posts: 11,803
    edited 2008-05-09 16:29
    Here is how to make LMM virtual interrupts work with ImageCraft C:

    ICC LMM USERn variables are variables that allow you to communicate with the kernel. These are also available if you want to use global variables with in-line asm.

    If USER0 is an interrupt "vector" flag and USER1 contains the address of a first level callback function, the kernel can save the current PC on the stack and then load that callback's address. The kernel code that makes this call is similar to ficall, but I found that using ficall itself does not work.

    The KEY FACTOR in making this work is to have the first level callback function that the kernel invokes do nothing but call another function. This forces the kernel to maintain context. If you have any variables at all in the first level callback, the system fails since the register context is not preserved.
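
The "first level callback does nothing but call another function" rule can be sketched in plain C as below. The names `irq_entry`, `irq_handler`, and `irq_count` are made up for illustration, and the step that stores the entry address where the kernel can find it (USER1 in the post) is left out because the thread does not show it.

```c
#include <stdint.h>

volatile uint32_t irq_count;   /* hypothetical: work done by the handler */

/* Second level: an ordinary function, free to use locals. */
void irq_handler(void)
{
    uint32_t seen = irq_count;
    irq_count = seen + 1u;
}

/* First level: the function whose address the kernel would jump to.
   It has no locals and no arguments; it only calls the real handler.
   Per the post, this nested call is what keeps the context intact. */
void irq_entry(void)
{
    irq_handler();
}
```

Anything that needs variables belongs in `irq_handler` (or below it), never in `irq_entry`.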

    As far as performance goes, the latency caused by checking the interrupt vector flag at the end of the unrolled instruction fetcher is trivial. The more unrolled the better for kernel processing time in the face of an interrupt storm. The interrupt latency suffers with larger unroll count, but that is not likely to be critical since we can use the propeller cogs for handling tasks in parallel (not referring to SMP just PASM devices).
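
A back-of-envelope model of that trade-off, assuming the flag is checked once at the end of an unrolled block of `unroll` fetch/execute slots and that the check costs `check_slots` slot-times. Both parameters are hypothetical; no measured numbers from the thread are used.

```c
/* Worst case: the flag is raised just after a check, so a full
   block of `unroll` instructions runs before the next check. */
unsigned worst_case_latency_slots(unsigned unroll)
{
    return unroll;
}

/* Fraction of kernel time spent on flag checks rather than LMM code. */
double check_overhead(unsigned unroll, double check_slots)
{
    return check_slots / ((double)unroll + check_slots);
}
```

With a one-slot check, unrolling 3 deep spends 25% of the time checking and 7 deep spends 12.5%, while the worst-case wait grows from 3 to 7 instruction slots.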

    ▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
    jazzed

    Post Edited (jazzed) : 5/9/2008 4:42:54 PM GMT
  • Rayman Posts: 14,162
    edited 2008-05-09 16:35
    Very nice! So, I guess it could actually work! I can see instances where people might want an interrupt ability... Especially when porting code from a different system...
  • ImageCraft Posts: 348
    edited 2008-05-09 20:11
    We can easily add a few more registers to the map for this purpose :) I want to think about this though, as the LMM kernel will at some point have hooks for debugging too, and it will need a similar mechanism.

    Thank you for the rough speed guideline too. Now we can say something concrete rather than "much faster" than Spin. This jibes with LMM C running at about 10% to 25% of native code speed, and Spin being about 40x to 80x slower than native code. The kicker is that we can improve on the LMM C performance. This is the low water mark. We will implement FCACHE too, and loops like strcpy etc. will scream...

    Let's see... so we effectively have a set of eight 8-12 MHz 32-bit processing engines, with the potential to run cog-native routines on a separate cog at the native 80 MHz speed. Not bad at all...
  • jazzed Posts: 11,803
    edited 2008-05-09 20:28
    Actually we need three registers for "soft interrupts".

    1. the "vector" register.
    2. first level callback function address (unless you can make a configurable library hook).
    3. global interrupt enable/disable - also used for "hardware" interrupts.

    The "hardware interrupts" using PASM like shown earlier can probably use a special "vector" cookie like -1 rather than some function address vector. The cookie would have to be reserved if you make a library hook.
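
The three registers and the -1 cookie could be modelled in C roughly as follows. This is only a sketch: the struct layout, `irq_poll`, the convention that a vector of 0 means "nothing pending", and acknowledging by clearing the vector are all assumptions, not ICC's actual register map.

```c
#include <stdint.h>

#define IRQ_HW_COOKIE 0xFFFFFFFFu        /* the "-1" cookie from the post */

typedef void (*irq_fn)(void);

typedef struct {
    volatile uint32_t vector;   /* 1. "vector" register, 0 = none (assumed) */
    irq_fn            callback; /* 2. first level callback address */
    volatile uint32_t enabled;  /* 3. global interrupt enable/disable */
} soft_irq_regs;

typedef enum { IRQ_NONE, IRQ_SOFT, IRQ_HARD } irq_kind;

/* What the kernel's check might do each time it looks at the registers. */
irq_kind irq_poll(soft_irq_regs *r)
{
    if (!r->enabled || r->vector == 0u)
        return IRQ_NONE;
    irq_kind kind = (r->vector == IRQ_HW_COOKIE) ? IRQ_HARD : IRQ_SOFT;
    r->vector = 0u;             /* acknowledge (assumed convention) */
    if (r->callback)
        r->callback();
    return kind;
}

/* Demo callback used only to show that dispatch happened. */
unsigned demo_hits;
void demo_callback(void) { demo_hits++; }
```

While `enabled` is 0 a pending vector is left untouched, so the event is deferred rather than lost.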

  • jazzed Posts: 11,803
    edited 2008-05-13 03:00
    Find attached a working example of LMM kernel interrupts with ImageCraft C.
    This was a pain to get working :)

    If anyone wants to formalize the method by requesting special variables, speak up.
    Richard has already shown willingness to adapt. I think feedback on a solution
    with changes if necessary would help cement the idea.

    Interrupts are the first step to a pre-emptive multitasking O/S. With the leftover
    cogs it would be great to create some form of symmetric multiprocessing.
    We'll probably exhaust memory before that point though :(


  • hippy Posts: 1,981
    edited 2008-05-13 12:09
    jazzed said...
    We'll probably exhaust memory before that point though :(

    I've been reflecting on that. The limitations for ICC apply equally to any LMM which interprets 32-bit wide PASM code.

    For a Prop I there can be 8K PASM instructions, which includes the LMM interpreter itself; the Prop II will allow 64K PASM instructions. My gut feeling is that 64K will be enough for most applications, so it's only the Prop I which could have problems. Even there, 8K may be good enough for many things, but note that any data RAM used comes out of that same 8K/64K longs.
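
The arithmetic behind those figures can be checked with a few lines of C. The 8K figure is the Prop I's 32 KB of hub RAM divided by 4 bytes per PASM long; the 64K figure corresponds to a 256 KB hub, which is inferred here from the post rather than stated in it. `lmm_code_longs` is a hypothetical helper name.

```c
#include <stdint.h>

#define BYTES_PER_LONG 4u

/* Longs left for LMM code after `data_bytes` of hub RAM go to data. */
uint32_t lmm_code_longs(uint32_t hub_bytes, uint32_t data_bytes)
{
    if (data_bytes >= hub_bytes)
        return 0u;
    return (hub_bytes - data_bytes) / BYTES_PER_LONG;
}
```

For example, `lmm_code_longs(32u * 1024u, 8u * 1024u)` shows that setting aside 8 KB for data on a Prop I leaves 6144 longs for code.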

    One option is to go for an Extended-LMM where generated PASM is held in external I2C Eeprom or even SD Card. That adds overhead in a number of places and will slow execution down but should be workable.

    I've only looked at execution flow not data handling so far but it certainly seems feasible. With the least kernel changes ICC supports 64K PASM instructions as is, with minor changes that rises to 256K PASM instructions, more complicated changes and that goes up further.

    This is based on using ICC with a home-brew assembler and linker working from the generated .s files to create an entirely 'run from Eeprom' linear image. More complicated solutions such as overlays would likely require changes within ICC code generation.

    The big overhead with Eeprom stored code is fetching time. That can be mitigated by using a separate Cog to fetch and cache Eeprom code as required, and it shouldn't overly complicate the kernel itself.
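
One way to picture the fetch-and-cache idea is a small direct-mapped line cache in front of a slow read routine. The C sketch below runs in a single thread rather than on a separate cog, and everything in it is an assumption for illustration: the line size, the number of lines, the function names, and `demo_eeprom_read`, which stands in for a real I2C read.

```c
#include <stdint.h>

#define LINE_LONGS 8u   /* longs per cache line (arbitrary) */
#define NUM_LINES  4u   /* lines held in hub RAM (arbitrary) */

typedef uint32_t (*slow_read_fn)(uint32_t long_index);

typedef struct {
    uint32_t     tag[NUM_LINES];    /* which line each slot holds */
    int          valid[NUM_LINES];
    uint32_t     data[NUM_LINES][LINE_LONGS];
    unsigned     misses;            /* slow line fills so far */
    slow_read_fn read_long;
} code_cache;

void cache_init(code_cache *c, slow_read_fn read_long)
{
    for (unsigned i = 0; i < NUM_LINES; i++)
        c->valid[i] = 0;
    c->misses = 0u;
    c->read_long = read_long;
}

/* Return the instruction long at index `pc`, filling a line on a miss. */
uint32_t cache_fetch(code_cache *c, uint32_t pc)
{
    uint32_t line = pc / LINE_LONGS;
    uint32_t slot = line % NUM_LINES;
    if (!c->valid[slot] || c->tag[slot] != line) {
        for (uint32_t i = 0; i < LINE_LONGS; i++)
            c->data[slot][i] = c->read_long(line * LINE_LONGS + i);
        c->tag[slot] = line;
        c->valid[slot] = 1;
        c->misses++;
    }
    return c->data[slot][pc % LINE_LONGS];
}

/* Stand-in for the Eeprom: returns a recognisable pattern. */
uint32_t demo_eeprom_read(uint32_t long_index)
{
    return long_index * 3u + 1u;
}
```

Linear code pays one slow fill per `LINE_LONGS` instructions in this model; tight loops that fit in the cache pay nothing after the first pass.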

    Is it practical? No idea. In the worst case of 1MHz I2C it could reduce the Propeller to being a 0.1 MIPS processor, but that could still be entirely acceptable to many if they insist on using a Prop I and want more than 8K.