Announcing CLMM (pronounced as Clem) Execute Code from the CLUT

ctwardell · 2012-12-16 23:31

The P2 continues to bear fruit!

Attached is a sample file containing an execution engine that runs code from the CLUT, increasing the amount of code that can run within a COG.

In order to not take advantage of the preview that Parallax has given us, the code is released under the MIT license.

What I call CLMM Engine#1 is in the attached spin file.

The CLMM #1 has three NOPs between the POPAR instruction that reads the instruction to execute and the slot where the instruction will be executed.
This gives timing that supports using SETSPA, ADDSPA, and SUBSPA for absolute and relative branching.
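To illustrate the idea for readers without the attachment, here is a rough toy model in Python of a loop that fetches "instructions" from a CLUT-like array through a stack pointer and branches by changing that pointer. It is not the code in the spin file. The tuple opcodes, the `acc` register, the 128-entry size, the fetch direction, and the choice to apply relative branches to the already-advanced pointer are all assumptions made for illustration, and the pipeline delay covered by the three NOPs is not modeled.

```python
# Toy model only: instructions are Python tuples, not P2 opcodes.
CLUT_SIZE = 128  # assumed size, for illustration only


def load_clut(program):
    """Return a CLUT-sized list holding `program`, padded with HALT."""
    if len(program) > CLUT_SIZE:
        raise ValueError("program does not fit in the toy CLUT")
    return list(program) + [("HALT", 0)] * (CLUT_SIZE - len(program))


def run_clmm(clut, max_steps=1000):
    """Fetch and execute toy instructions until HALT; return the accumulator."""
    spa = 0  # stands in for the stack pointer used as a program counter
    acc = 0  # hypothetical accumulator so the program has a visible result
    for _ in range(max_steps):
        op, arg = clut[spa % CLUT_SIZE]  # fetch step (POPAR in the engine)
        spa += 1                         # pointer moves on after every fetch
        if op == "ADD":
            acc += arg
        elif op == "SETSPA":             # absolute branch
            spa = arg
        elif op == "ADDSPA":             # relative branch forward
            spa += arg
        elif op == "SUBSPA":             # relative branch backward
            spa -= arg
        elif op == "HALT":
            break
        else:
            raise ValueError(f"unknown toy opcode {op!r}")
    return acc


demo = [
    ("ADD", 1),      # 0
    ("ADDSPA", 1),   # 1: skip the next entry
    ("ADD", 100),    # 2: skipped
    ("ADD", 10),     # 3
    ("SETSPA", 6),   # 4: jump straight to entry 6
    ("ADD", 1000),   # 5: skipped
    ("HALT", 0),     # 6
]

if __name__ == "__main__":
    print(run_clmm(load_clut(demo)))
```

Only entries 0 and 3 add to the accumulator here; the two branches skip entries 2 and 5.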

The CLMM #2 will use a standard loop instead of REP so that it can run while multitasking.

The CLMM #3 will use a tighter loop that precludes using SETSPA, ADDSPA, and SUBSPA but executes inline code faster.

Let's have fun with this and get back to exploring the P2.

Chris Wardell

CLMM_Engine1.spin

Please also see the following thread:

http://forums.parallax.com/showthread.php?144675-Challenge-Execute-Prop-II-code-from-it-s-CLUT-space-(CLMM)

ctwardell · 2012-12-16 23:35

Reserved for updates.

I'll get better docs out; I just wanted this released under MIT ASAP.

I've been seeing a lot more possibilities as I work on examples. For instance, a loop that alternates between using SPA and SPB would support two "threads" while consuming a single COG thread. So you could get 5 threads going: 4 of the new COG multitasking threads, with CLMM running on one of them and providing two threads (3 + 2 = 5).
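A minimal sketch of the alternating SPA/SPB idea, again as a Python toy rather than engine code: one loop keeps two pointers into the same array and fetches from each in turn. The string "instructions", the `HALT` marker, and the strict round-robin order are assumptions for illustration.

```python
def run_two_threads(clut, start_a, start_b, max_steps=100):
    """Interleave two instruction streams using two pointers in one loop."""
    sp = {"A": start_a, "B": start_b}  # stand-ins for SPA and SPB
    done = set()
    trace = []
    current = "A"
    for _ in range(max_steps):
        if len(done) == 2:
            break
        if current not in done:
            op = clut[sp[current]]     # fetch through this thread's pointer
            sp[current] += 1
            if op == "HALT":
                done.add(current)
            else:
                trace.append((current, op))
        current = "B" if current == "A" else "A"  # alternate every pass
    return trace


# Thread A occupies entries 0..2, thread B occupies entries 3..6.
demo_clut = ["a0", "a1", "HALT", "b0", "b1", "b2", "HALT"]

if __name__ == "__main__":
    for who, op in run_two_threads(demo_clut, 0, 3):
        print(who, op)
```

Each pointer only moves when its own thread fetches, so neither stream disturbs the other's position.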

You could also provide for "soft interrupts" by testing a condition in the CLMM loop while running on a thread using SPA. The interrupt handler would switch to using SPB, and the end of the handler would switch back to SPA.
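The soft interrupt idea could look roughly like this in the same toy style. The `IRET` marker, the `irq_at_step` trigger (standing in for whatever condition the loop would test), and the string "instructions" are hypothetical names chosen for the sketch.

```python
def run_with_soft_interrupt(clut, main_start, handler_start, irq_at_step,
                            max_steps=100):
    """Run main code via SPA; divert to a handler via SPB when triggered."""
    spa = main_start
    spb = handler_start
    in_handler = False
    trace = []
    for step in range(max_steps):
        # Condition tested once per pass of the loop (assumed trigger).
        if not in_handler and step == irq_at_step:
            in_handler = True
            spb = handler_start
        if in_handler:
            op = clut[spb]
            spb += 1
            if op == "IRET":        # end of handler: switch back to SPA
                in_handler = False
                continue
        else:
            op = clut[spa]
            spa += 1
            if op == "HALT":
                break
        trace.append(op)
    return trace


# Main code in entries 0..3, handler in entries 4..6.
demo_clut = ["m0", "m1", "m2", "HALT", "h0", "h1", "IRET"]

if __name__ == "__main__":
    print(run_with_soft_interrupt(demo_clut, 0, 4, irq_at_step=2))
```

Because SPA is left alone while the handler runs on SPB, the main code resumes exactly where it stopped without saving a return address.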

C.W.

ctwardell · 2012-12-17 01:17

Reserved for additional code posts.

12/17/2012 - I believe Engine #4 Rev 2 is the most stable and should be used for testing as of now. I'm seeing a few pipeline issues on the others as I've made some changes.

Engine #2
This engine supports multi-tasking by using jmp instead of rep and branching via direct SP manipulation.

Test 1, simple test of two engines using SPA and SPB
CLMM_Engine2 Test1.spin

Engine #3
Faster engine forgoes direct SP manipulation in exchange for improved speed, uses JMP instead of REP for potential multi-tasking support.

Test 1, simple test of two engines using SPA and SPB
CLMM_Engine3 Test1.spin

Engine #4
Faster engine forgoes direct SP manipulation in exchange for improved speed, uses REP instead of JMP for even more speed at the expense of multi-tasking support.

Test 1, simple test of two engines using SPA and SPB
CLMM_Engine4 Test1.spin

Rev 2 Test 1, added move immediate and added initial clmm pipeline flush.
CLMM_Engine4 Revision 2 Test1.spin

Heater. · 2012-12-17 01:21

That's interesting, 5 threads is great and all but it also means you can have one "native" thread and two CLMM threads. The native one running as fast as possible.

ctwardell · 2012-12-17 01:30

Heater. wrote:

That's interesting, 5 threads is great and all but it also means you can have one "native" thread and two CLMM threads. The native one running as fast as possible.

Yes, that too. Lots of ways to slice and dice.

I'm not sure how to name all the variants of "engines".

C.W.

ctwardell · 2012-12-17 06:37

Back at it after a little sleep.

I'm going to add a Move Immediate instruction to 1 through 4 and the missing pipeline flush to 3 & 4.

C.W.

Sapieha · 2012-12-17 08:19

Hi.

Now it is positive competition instead of a dumb war.

Thanks

jazzed · 2012-12-17 09:18

Nice work C.W.

Is it possible to fill the CLUT from HUB with instructions and execute them? I'm trying to make a better on-cog cache.

Is threading impossible with this model? Fetching and executing in separate threads could offer some advantage for LMM style programs.

ctwardell · 2012-12-17 10:01

Thanks Jazzed.

Yes, you can load the CLUT from any source of instructions you wish. Once one of the CLMM engines is chosen, the CLUT code will need to be encoded against the COG addresses that engine uses for extension function calls and returns.

I think Cluso has some code he worked out with Chip to fill the CLUT for overlays; that may be what you are looking for. Cluso doesn't have a DE0-Nano yet, so he can't work with it directly.

I think we will be able to use threading with one of the versions that uses JMP to close the CLMM loop instead of REPS. I just need to work out some pipeline issues.

C.W.

jazzed · 2012-12-17 10:26

To go further, what I'm interested in is having a thread that monitors a FIFO water-line that will fill the input side with more code assuming that the program hasn't jumped. If it has jumped, then such an operation would not be necessary. Also, the overlay load I remember seeing used REPS (or REPD) which may present threading problems as you mentioned.
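As a rough sketch of the water-line idea in Python (a toy, not P2 code): a refill step tops up a FIFO from a HUB-like list whenever the fill level drops to a low-water mark, while the execute step drains it. The `low_water` and `burst` values and the function name are assumptions, and jumps are not modeled, so this only covers straight-line code.

```python
from collections import deque


def run_with_prefetch(hub_code, low_water=2, burst=4):
    """Drain a FIFO of instructions, refilling it at a low-water mark."""
    fifo = deque()
    next_fetch = 0   # index of the next HUB entry to bring in
    executed = []
    refills = 0
    while True:
        # "Fetch thread": refill only when the FIFO is at or below the mark.
        if len(fifo) <= low_water and next_fetch < len(hub_code):
            chunk = hub_code[next_fetch:next_fetch + burst]
            fifo.extend(chunk)
            next_fetch += len(chunk)
            refills += 1
        if not fifo:
            break
        # "Execute thread": consume the oldest entry.
        executed.append(fifo.popleft())
    return executed, refills


if __name__ == "__main__":
    print(run_with_prefetch(list(range(10))))
```

On real hardware the two steps would run as separate tasks; here they simply take turns inside one loop.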

The CLUT is certainly turning out to be a great asset. I remember Chip discussing it and all the features last year, as well as adding push/pop access. In many ways it's a generic array data structure. Thank goodness we don't have to use self-modifying code anymore for such basic data access within the COG. I can't wait to use it as a packet buffer.

Cluso99 · 2012-12-17 14:12

There are so many uses I see for the CLUT. It certainly is not a waste in cogs that don't use video.
PUSH & POP are really MOV instructions after all, just that they use SPx as one of the addresses.

Next we will be asking Chip for more SPx registers, and more INDx and PTRx registers. Oh, and 2KB. I didn't see this as being the first item we'd be asking for.
