Large Memory Model
Title:Large Memory Model
Author:SSteve
Published:Tue, 15 Mar 2011 08:05:09 GMT

The Propeller comes with two built-in programming environments: Spin, an interpreted high-level language, and native assembler (PASM).

Large Memory Model (LMM) is an alternative programming environment suggested by Bill Henning. It lies somewhere between native assembler and a virtual machine: a minimal virtual machine whose instructions are mostly a 1:1 mapping of the Propeller's native assembler instructions. The instructions reside in Hub memory and are copied one by one into Cog RAM to be executed. This means a program can occupy nearly 32KB, which is up to 8K instructions.

The basic virtual machine consists of just 4 lines of assembler:
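nxt    rdlong  instr,pc   ' fetch the next LMM instruction from Hub memory
       add     pc,#4      ' advance pc to the following long
instr  nop                ' placeholder! overwritten by the fetched instruction
       jmp     #nxt       ' loop (immediate address, hence the #)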
As can be seen, at nxt the instruction pointed to by the Program Counter (pc) is copied to Cog memory at instr (replacing the nop). Then, after the program counter has been incremented to the next long, the instruction is executed. The four instructions need at least 20 cycles, which misses the next hub access window (one every 16 cycles), so the loop takes 32 cycles per instruction. That is 8 times slower than native assembler at 4 cycles per instruction, but several times faster than Spin. However, the loop can be unrolled to make it faster - Bill suggests unrolling 4 times to get it just 5 times slower than native assembler.
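
The cycle counts quoted above can be checked with a small timing model. This is only a sketch in Python, not Propeller code, and it rests on simplifying assumptions: a cog gets one hub access slot every 16 clocks, rdlong takes 8 clocks once aligned with its slot, and every other instruction (including the fetched one) takes 4 clocks. The function name cycles_per_instruction is made up for this illustration.

```python
HUB_WINDOW = 16  # clocks between hub access slots for one cog (assumed model)
HUB_OP = 8       # clocks for rdlong once it is aligned with its slot
COG_OP = 4       # clocks for an ordinary cog instruction


def cycles_per_instruction(unroll, passes=100):
    """Average clocks per fetched instruction for a loop unrolled `unroll` times."""
    t = 0
    start = 0
    for p in range(passes + 1):
        if p == 1:
            start = t  # ignore the first pass, which starts aligned by construction
        for _ in range(unroll):
            t += (-t) % HUB_WINDOW  # rdlong stalls until the next hub slot
            t += HUB_OP             # rdlong instr,pc
            t += COG_OP             # add pc,#4
            t += COG_OP             # the fetched instruction (assumed 4 clocks)
        t += COG_OP                 # jmp #nxt at the end of the unrolled loop
    return (t - start) / (passes * unroll)


if __name__ == "__main__":
    for unroll in (1, 2, 4, 8):
        cpi = cycles_per_instruction(unroll)
        print(unroll, cpi, cpi / COG_OP)  # unroll factor, clocks, slowdown
```

Under these assumptions the model gives 32 clocks per instruction (8 times slower) for the plain loop and 20 clocks (5 times slower) when unrolled 4 times, matching the figures in the text. Fetched instructions that take longer than 4 clocks, such as hub accesses, would change the numbers.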

Additionally, note that the executed instruction could call a routine that lies elsewhere in Cog RAM. This technique can be used to create extra pseudo instructions that don't exist in the native instruction set. Indeed, some are needed for jumping to and calling LMM code elsewhere in Hub memory. Bill suggests these pseudo instructions:
* FJMP addr       - calls routine that replaces PC with long at PC, then jumps to nxt
* FCALL addr      - increments SP by 2, replaces PC with long at PC after it saves
                    PC+4 at SP
* FRET            - loads PC from word at SP, decrements SP by 2
* FBRC addr       - branch to far address if Carry flag is set
* FBRNC addr      - branch to far address if Carry flag is clear
* FBRZ addr       - branch to far address if Zero flag is set
* FBRNZ addr      - branch to far address if Zero flag is NOT set
* FCACHE [code] 0 - copies a block of code, terminated by NULL, into a cache area in
                    Cog memory, then executes it.
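
How the far jump, call and branch pseudo instructions take their target from the long that follows them can be illustrated with a toy interpreter. This is a Python sketch, not PASM, and most of it is assumed for the sake of the example: the tuple encoding, the ADD, EMIT and HALT operations, and the Python list standing in for the SP-based stack are all hypothetical. Only the handling of pc for FJMP, FCALL, FRET and FBRNZ follows the descriptions above.

```python
def run(hub, pc=0, max_steps=1000):
    """Run a toy LMM program held in `hub`, a dict of byte address -> long."""
    acc, zero, stack, out = 0, False, [], []
    for _ in range(max_steps):
        instr = hub[pc]          # nxt: rdlong instr,pc
        pc += 4                  #      add pc,#4 (pc now points past instr)
        op = instr[0]
        if op == "ADD":          # hypothetical stand-in for a native instruction
            acc += instr[1]
            zero = acc == 0
        elif op == "EMIT":       # hypothetical: record acc so the run is visible
            out.append(acc)
        elif op == "FJMP":       # replace pc with the long at pc
            pc = hub[pc]
        elif op == "FCALL":      # save pc+4 (address after the operand), then jump
            stack.append(pc + 4)
            pc = hub[pc]
        elif op == "FRET":       # return to the saved address
            pc = stack.pop()
        elif op == "FBRNZ":      # far branch if Zero is not set, else skip operand
            pc = hub[pc] if not zero else pc + 4
        elif op == "HALT":       # hypothetical: stop the model
            return out
    raise RuntimeError("step limit reached")


# Count down from 3, calling a subroutine at address 32 on each pass.
PROGRAM = {
    0: ("ADD", 3),
    4: ("FCALL",), 8: 32,     # operand long: address of the subroutine
    12: ("ADD", -1),
    16: ("FBRNZ",), 20: 4,    # operand long: loop back to the FCALL
    24: ("FJMP",), 28: 40,    # operand long: jump over the subroutine
    32: ("EMIT",),
    36: ("FRET",),
    40: ("HALT",),
}

if __name__ == "__main__":
    print(run(PROGRAM))
```

Because pc has already been advanced when the pseudo instruction executes, it points at the inline operand, which is why a single read of the long at pc yields the target address.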

Implementations

At the time of writing, a few people have experimented with the scheme and written some small LMM programs which do work. However there is as yet no definitive version of the scheme, and various tools implement slightly different versions of LMM code.
Bill Henning is working on an LMM Macro Assembler/Linker.
ImageCraft have implemented LMM in their C compiler.
Ross Higson has implemented LMM in his Catalina C compiler.
More details and much discussion can be found in the original thread (old forum).

LMM Kernel Specification - Pacito version


Ale500 (aka Pacito) has drafted a specification for an LMM that is usable even beyond the 32 kbytes imposed by the HUB RAM, using an external means of accessing more storage.

The specification for a Large Memory Model with access up to 512 klongs of code and 512 kbytes of data can be found here.

LMM Kernel Specification - AiChip Industries version


AiChip Industries have created a Large Memory Model Virtual Machine implementation (LmmVm) which is designed to be usable with the Propeller Tool. Some native Propeller instructions need to be modified for LMM usage, but those changes are designed to be achievable by hand rather than requiring any additional tools.

Details of the Large Memory Model with access up to 32 klongs of code or data and using cog-based registers can be found here.

LMM Kernel Specification - Phil Pilgrim (PhiPi) version


Phil Pilgrim (PhiPi) examined how the main LMM loop is usually written and found a way to overcome that loop's failure to hit the 'hub access sweet spots'. It is this failure that requires the LMM loop to be unrolled to achieve maximum throughput and the lowest speed degradation compared to native PASM execution.

The mechanism rests upon reversing the LMM code, so that the addresses of LMM code decrease sequentially rather than increase, and upon a clever sequence of PASM instructions that maximises LMM throughput.

Details of the Reversed Large Memory Model can be found here.

Thumb-style Code

An extension to the Large Memory Model scheme is a Thumb-style representation (similar to that developed by ARM(R)), where 16-bit (word) codes are used to represent a subset of the native 32-bit Propeller instructions. Word instructions are fetched from hub memory, decoded, expanded and then executed.
Although execution is consequently slower than with the Large Memory Model, up to 16K instructions can be held in hub memory. Execution should also be faster than interpreting an arbitrary bytecode such as that used by Spin.
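
The capacity figures follow directly from the instruction width. The short Python calculation below is only an illustration; it gives upper bounds that assume the whole 32KB of hub memory holds code, which a real program would not manage since data and the stack also live there. The helper name max_instructions is made up for this example.

```python
HUB_BYTES = 32 * 1024  # total hub RAM on the Propeller
LONG_BYTES = 4         # one LMM instruction is a 32-bit long
WORD_BYTES = 2         # one Thumb-style instruction is a 16-bit word


def max_instructions(bytes_per_instruction, hub_bytes=HUB_BYTES):
    """Upper bound on the number of instructions that fit in hub memory."""
    return hub_bytes // bytes_per_instruction


if __name__ == "__main__":
    print("LMM:  ", max_instructions(LONG_BYTES))   # 8K instructions
    print("Thumb:", max_instructions(WORD_BYTES))   # 16K instructions
```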

AiChip Industries has proposed a Thumb VM implementation and produced proof of concept code. Without appropriate development tools, using Thumb VM is not practical at the current time.

AiChip Industries' Thumb VM Proposal can be found here.

The original discussion thread and proof of concept can be found here.