LMM-Thumb
ImageCraft
Posts: 348
OK, all ye smart Propeller people. As you may know, we are prototyping some ideas for XMM-C (external memory model) for the Propeller: a plug-in module with anywhere from 128K bytes to multiple megabytes of memory. The compiler change for that is actually minimal, as in close to zero. The trick is the hardware design.
In any case, I am also thinking of going in the other direction: the C compiler emitting a compact form of LMM. There were some ideas about an LMM "Thumb" mode being bandied about before. The goal is of course to make the resulting program more compact, at the cost of higher interpretation overhead. As we have a 5x-10x speed improvement over Spin (and that is not counting the use of FCACHE), if we decrease the code size with a proportional slowdown, it may still be very attractive to some users. The normal LCC mode is of course still available.
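To make the size-versus-speed trade-off concrete, here is a minimal sketch of what a compact 16-bit encoding and its interpreter loop might look like. Everything in it is an assumption made up for illustration (the 4/4/8-bit field layout, the opcode names, the `Vm` struct, `enc` and `run`); it is not ICC's design and not real Propeller code, just plain C that models the extra decode step a "Thumb"-style LMM would pay per instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-bit encoding: [15:12] opcode, [11:8] dest register,
 * [7:0] immediate or source register. Invented for this sketch only. */
enum { OP_HALT = 0, OP_MOVI = 1, OP_ADD = 2, OP_ADDI = 3 };

typedef struct { uint32_t r[16]; } Vm;

/* Pack one compact instruction into a 16-bit word. */
static uint16_t enc(unsigned op, unsigned rd, unsigned arg) {
    return (uint16_t)(((op & 0xFu) << 12) | ((rd & 0xFu) << 8) | (arg & 0xFFu));
}

/* Fetch, decode and execute until OP_HALT or the end of the code.
 * Returns the number of instructions fetched (including the halt). */
static size_t run(Vm *vm, const uint16_t *code, size_t n) {
    assert(vm != NULL && code != NULL);
    size_t pc = 0, steps = 0;
    while (pc < n) {
        uint16_t w = code[pc++];
        unsigned op  = w >> 12;          /* decode cost a native   */
        unsigned rd  = (w >> 8) & 0xFu;  /* 32-bit LMM instruction */
        unsigned arg = w & 0xFFu;        /* would not have to pay  */
        steps++;
        switch (op) {
        case OP_MOVI: vm->r[rd] = arg;                 break;
        case OP_ADD:  vm->r[rd] += vm->r[arg & 0xFu];  break;
        case OP_ADDI: vm->r[rd] += arg;                break;
        default:      return steps;      /* OP_HALT or unknown opcode */
        }
    }
    return steps;
}
```

A five-instruction program in this toy encoding takes 10 bytes, where five 32-bit longs would take 20. The price is the shift-and-mask decode plus the dispatch on every instruction, which is the interpretation overhead being traded for size.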
In fact, it will probably be the case that the NC (non-commercial use) version supports only LMM-Thumb, and the STD version supports both.
Just thinking aloud right now. Appreciate any comments and feedback, and also any technical insight on the LMM-Thumb design.
// richard
Comments
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
www.mikronauts.com - a new blog about microcontrollers
Would this external memory be addressed like the larger internal memory when Prop II
comes out, or is it more like using an SD card for memory?
Also, is there a way to generate an array of 512 longs that can be used
to run a cog without LMM? I believe constructing such an array is one of the
ways to run independent cog code. Help file quote:
Could ICC not generate this table from C code without the LMM parts? Judging by how much code
SX/B can generate in an SX18/28 page (512 words, comparable to 512 longs in a cog), that
would allow us to write cog code without using assembly or manually extracting hex code from a binary
image created by proptool (where assembly is still necessary).
regards peter
Also, remember that a Cog's 512-word "page" is not quite the same as a code page in other micros. A Cog takes time to load 512 words. It's not a paging scheme per se.
***
As for XMM, I don't want to speculate too much on how things will look for Prop II, as I have no inside info on Prop II. However, I suspect the changes from LMM to XMM are minimal, and the change from LMM/XMM to Prop II is also minimal, but that's just a guess.
My work with overlays leads me to believe that overlays will be faster than LMM execution. I was achieving the sweet spot for loading, including variable length.
However, I don't know anything about your code and what could be achieved. Suffice to say that the code footprint stored externally will have more impact than anything else so this is where the work will need to be done.
So I guess what I am saying is
1. External code must be small (highest priority)
2. A mix of cog interpreter, overlay and LMM for the code to interpret the bytecode/wordcode or whatever.
What I see with the spin interpreter is that a lot of time is spent pushing and popping data. The architecture doesn't allow this to be circumvented, and pushing and popping is done to hub so that impacts performance immensely. (I am not criticising spin - we are just pushing the envelope).
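As a rough way to see how quickly that stack traffic adds up, here is a toy stack machine that counts every push and pop as one memory access. The opcode names, the `StackVm` struct and `stack_eval` are all hypothetical and have nothing to do with the real Spin bytecode or interpreter; the counter simply stands in for hub accesses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stack-machine opcodes, invented for this sketch. */
enum { S_PUSH, S_ADD, S_MUL };

typedef struct { uint8_t op; int32_t imm; } SInsn;

typedef struct {
    int32_t stack[16];
    size_t  sp;            /* index of the next free slot */
    size_t  mem_accesses;  /* stand-in for hub reads and writes */
} StackVm;

static void push(StackVm *vm, int32_t v) {
    assert(vm->sp < 16);   /* guard against overflow in the toy stack */
    vm->stack[vm->sp++] = v;
    vm->mem_accesses++;
}

static int32_t pop(StackVm *vm) {
    assert(vm->sp > 0);    /* guard against underflow */
    vm->mem_accesses++;
    return vm->stack[--vm->sp];
}

/* Run the program and return the value left on top of the stack. */
static int32_t stack_eval(StackVm *vm, const SInsn *code, size_t n) {
    for (size_t i = 0; i < n; i++) {
        switch (code[i].op) {
        case S_PUSH: push(vm, code[i].imm); break;
        case S_ADD: { int32_t b = pop(vm), a = pop(vm); push(vm, a + b); break; }
        case S_MUL: { int32_t b = pop(vm), a = pop(vm); push(vm, a * b); break; }
        default: break;    /* ignore unknown opcodes in this sketch */
        }
    }
    return pop(vm);
}
```

Evaluating 2 + 3 * 4 as PUSH 2, PUSH 3, PUSH 4, MUL, ADD costs ten stack accesses in this model: three pushes, three accesses for each of the two operators, and the final pop. A register-style encoding could do the same arithmetic without touching memory for the intermediate values.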
The ability to use ICC to compile driver code could be used for both versions of the Prop in the future. How compact would ICC-compiled native PASM code be?
Ron
[quote] I have also written a fast LMM style interpreter and overlay loader. My ClusoDebugger also uses an LMM model.
Have you released any copy or revision of this work on the forum?
▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
JMH - Electronics: Engineer - Programming: Professional
Post Edited (Quantum) : 11/12/2008 9:18:24 PM GMT
PASM and SPIN debug with Zero Footprint http://forums.parallax.com/forums/default.aspx?f=25&m=290946&p=2
The Overlay loader is here
Assembly Oververlay Loader for Cog FAST (renamed & released) http://forums.parallax.com/forums/default.aspx?f=25&m=272823
I have not released the ClusoInterpreter yet :-( but here is some discussion
Spin Interpreter - Faster??? http://forums.parallax.com/forums/default.aspx?f=25&m=273607&p=1&ord=d