Self-modifying Interpretive Code?

Bruce Bates · 2006-02-22 14:51

Folks -

Perhaps someone could clarify something for me. I just went through the past messages, and found two responses which seem to conflict with one another, at one level or another:

One poster seemed to indicate that SPIN was an interpreter, or that it was an interpretive language. Obviously, the assembler language emits native code.

Another poster seemed to claim that self-modifying code was possible, even as much as I SHUDDER to think of the HORRORS inherent in that concept. NOTHING in the WORLD is more difficult to troubleshoot than self-modifying code written by someone who doesn't even know what those words mean!

In my experience (as incomplete as it may be) interpreters, by their very nature, don't generally support self-modifying code as a programming modality. It can happen in ERROR, but it generally isn't supported as a programming methodology. Perhaps self-modification is only possible as a programming methodology (forget errors) using the Propeller Assembler?

Comments would be appreciated.

Regards,

Bruce Bates

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔


Post Edited (Bruce Bates) : 2/22/2006 3:06:54 PM GMT

Jon Williams · 2006-02-22 14:58

The Propeller ROM contains the Spin interpreter which is loaded into a Cog that will execute Spin code. Interestingly, Spin code can be used to launch an assembly program into another Cog; in that case the Cog running assembly code does not load the Spin interpreter.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Jon Williams
Applications Engineer, Parallax

Bruce Bates · 2006-02-22 15:04

Jon -

Are you saying or implying that program "chaining" is possible, or just that SPIN can pass off control to an (extant and/or operational) assembler program?

The concept here is going to be the steepest part of the learning curve!

Regards,

Bruce Bates

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔

Paul Baker · 2006-02-22 16:00

Chaining is possible; in fact, a cog can direct itself to dump its interpreter, then load and run assembly. I am fairly certain that the reverse is also possible. Self-modifying code is possible: since all memory is von Neumann, there's nothing preventing you from changing one code instruction into another. This will be more easily accomplished in assembly because Spin bytecodes will not be published (too much of a tech support nightmare). Andre has created an overlay module to cope with swapping in extra pixel data in his machine, so different levels can have a different tileset, but there is nothing preventing you from overlaying additional code from external memory.

But if you use the IDE provided and follow generally accepted programming practices (such as only writing data to predefined variables), you aren't going to wander into the realm of self-modifying code. I only mentioned it as a capability for the very advanced programmers who can adequately implement the feature.

Oh wait, I think I understand the gist of your question. In all but the most extreme examples, chaining Spin code is not necessary, because the Spin bytecodes reside in the hub memory, which should be large enough for very lengthy programs.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
1+1=10

Post Edited (Paul Baker) : 2/22/2006 4:09:04 PM GMT

Tracy Allen · 2006-02-22 18:49

There is self-modification built into the assembler, in the way that subroutines are called. It is a stack-free mechanism. For example, the JMPRET D,S instruction operates by storing a JMP PC+1 into the address held in D (9 bits of the machine instruction) and then jumps to the address held in S (another 9 bits in the machine instruction). In most computers you would have a RET instruction with the return vector pulled off a stack. However in Propeller asm, the code is actually modified in place to make it a jump back to the original location.

The assembler also has mnemonics for CALL and RET, but if I understand it correctly, the assembler converts those to the same mechanism using the JMPRET D,S primitive. So self-modification is deeply ingrained in the instruction set. I don't think there are any native stack instructions, and no interrupts of course.
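
The call mechanism described above can be modeled in a few lines of Python. This is a toy model, not Propeller code: the 512-long `cog` list, the addresses, the `OPAQUE_UPPER_BITS` placeholder, and the assumption that the return address lands in the low 9 bits of the long named by D are all illustrative, based only on the description in this thread.

```python
# Toy model of a stack-free call: JMPRET patches the return address into
# the low 9 bits of another long, then jumps. Illustrative only.
ADDR_MASK = 0x1FF          # 9-bit cog address (assumed field position)
KEEP_MASK = 0xFFFFFE00     # the remaining upper 23 bits of a 32-bit long


def jmpret(cog, pc, d, s):
    """Store pc+1 in the low 9 bits of cog[d]; return the new pc (s)."""
    cog[d] = (cog[d] & KEEP_MASK) | ((pc + 1) & ADDR_MASK)
    return s & ADDR_MASK


OPAQUE_UPPER_BITS = 0xABCDE000   # hypothetical stand-in for the jump's opcode bits
SUB, SUB_RET = 0x040, 0x045      # hypothetical subroutine entry and its return jump

cog = [0] * 512
cog[SUB_RET] = OPAQUE_UPPER_BITS  # the 'RET': a jump whose target gets patched

# 'CALL' issued from address 0x010: patch the return jump, enter the subroutine.
pc_after_call = jmpret(cog, pc=0x010, d=SUB_RET, s=SUB)

# Executing the patched jump at SUB_RET lands on the instruction after the call.
pc_after_ret = cog[SUB_RET] & ADDR_MASK
```

Only the address bits of the return jump change; the rest of that long is left alone, which is why no stack is involved.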

The Spin language does have a stack, or multiple stacks for different cogs, but those are higher level constructs created on the fly.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Tracy Allen
www.emesystems.com

Dave_Bell · 2006-02-22 19:09

Then the code Martin posted earlier for blink_led is confusing:

Martin Hebel said...
' assign variable space for each processor
VAR
long stack0[20]
|
|

' start the same code in each hub controlling a different LED
PUB start
cognew(blink(23,1), @stack0)
|
|
etc.

I would assume this sets up a 20-word deep stack for each COG, then associates a process with each one, no?

Dave

Gadgetman · 2006-02-22 19:42

So...
Instead of storing the RET address on the stack, it stores it dynamically in the 'RET' instruction?

It doesn't save any memory, as the Address of the instruction to be modified must exist in the calling instruction, but it does do away with the need for a conventional Stack, at least for CALL type instructions.
And with the entire RAM for each COG addressable as registers(If I haven't misunderstood completely), there's no real need for a Stack when using Assembler, at least if you're careful of how you assign variables.

Still, I wouldn't really call that 'self-modifying code', as it only inserts the return address and doesn't change what the instruction does. It's more 'run-time address resolution'.
(It sounds good, so why not? )

I can't wait to play with it...

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Don't visit my new website...

Paul Baker · 2006-02-22 19:49

Dave_Bell said...
Then the code Martin posted earlier for blink_led is confusing:

Martin Hebel said...
' assign variable space for each processor
VAR
long stack0[20]
|
|

' start the same code in each hub controlling a different LED
PUB start
cognew(blink(23,1), @stack0)
|
|
etc.

I would assume this sets up a 20-word deep stack for each COG, then associates a process with each one, no?

Dave

Do not confuse assembly with Spin. Tracy is talking about the assembly method of "calling"; Martin's LED example is Spin code, which does require stack space.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
1+1=10

Martin Hebel · 2006-02-22 20:00

And perhaps not even stack space in that sense of the word, but variable storage locations for the object to use. The size has to be large enough to accommodate the variables created in the object (as I understand it).

-Martin

Dave_Bell · 2006-02-22 20:02

Ah - that makes sense, then.
I'm trying to recall legacy computers that took similar advantage of long instruction words, incorporating conditional bits on every instruction. This concept of storing the return address in the JMPRET instruction also sounds familiar, but I'm afraid my memory is fading about as fast as new technology is introduced!!

Dave

Paul Baker · 2006-02-22 20:04

Gadgetman said...
So...
Instead of storing the RET address on the stack, it stores it dynamically in the 'RET' instruction?

It doesn't save any memory, as the Address of the instruction to be modified must exist in the calling instruction, but it does do away with the need for a conventional Stack, at least for CALL type instructions.
And with the entire RAM for each COG addressable as registers(If I haven't misunderstood completely), there's no real need for a Stack when using Assembler, at least if you're careful of how you assign variables.

Still, I wouldn't really call that 'self-modifying code', as it only inserts the return address and doesn't change what the instruction does. It's more 'run-time address resolution'.
(It sounds good, so why not? )

I can't wait to play with it...

That's not quite what the instruction does. I don't have a full understanding of it myself yet, but we were provided an example of full-duplex asynchronous communication in which there were two procedures, one for receiving data and one for transmitting. Since the transmit line may be out of sync with the receive line, they are operated independently. Every few instructions in both routines was a JMPRET, which passed control back and forth between the two procedures. I don't have a copy of it on hand at the moment, but I don't think there was ever a return function. Instead, each procedure had a register holding a pointer: one for where the other procedure left off, the other for where the current procedure leaves off. So the switching would look like JMPRET RecvPointer, XmitPointer, where RecvPointer is a register containing the PC+1 address of the last instruction executed in the other procedure (the receiver), and XmitPointer contains the PC+1 address of the current JMPRET in the transmitter procedure.

Now it may be possible that CALL uses the location of the RET to store the JMP, so that each time it's called it gets rewritten with the calling CALL's PC+1, but I'm not absolutely sure about this.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
1+1=10

Post Edited (Paul Baker) : 2/22/2006 8:08:46 PM GMT

Bamse · 2006-02-22 20:18

Gadgetman said...

Instead of storing the RET address on the stack, it stores it dynamically in the 'RET' instruction?

How would that work with nested subroutines and more than one cog using the same subroutine ???
Doesn't each cog require a stack of its own ???

/Bamse

Martin Hebel · 2006-02-22 20:23

The main memory that all cogs access holds the Spin tokens; the cogs read these and interpret them. While the interpreter may do this, the original Spin code is not affected. A better example is your own assembly code, which runs directly in a cog and thus affects no other cog.

-Martin

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Martin Hebel

Disclaimer: ANY Propeller statements made by me are subject to my inaccurate understanding of my limited time with it!
Southern Illinois University Carbondale -Electronic Systems Technologies
Personal Links with plenty of BASIC Stamp info
and SelmaWare Solutions - StampPlot - Graphical Data Acquisition and Control

Paul Baker · 2006-02-22 20:23

The Propeller assembly does not natively support recursive programming (a subroutine calling itself). Since JMPRET allows you to specify where you store the return data, you could implement your own call stack. But it is better to use iterative programming instead of recursive (neither Chip nor Jeff has experienced the need for recursive subroutines).
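
To see why nesting works but recursion does not, here is a small Python sketch. It is a toy model under the assumption discussed in this thread that each subroutine has exactly one return slot, which every call overwrites; the subroutine names, the addresses, and the software stack helpers are hypothetical.

```python
# Toy model: one return slot per subroutine, overwritten by every call.
def call(ret_slots, name, return_to):
    """Model a CALL: remember where 'name' should return to."""
    ret_slots[name] = return_to


ret_slots = {}
call(ret_slots, "outer", 0x011)   # main calls outer
call(ret_slots, "inner", 0x051)   # outer calls inner: different slot, so nesting is fine
nested_ok = ret_slots == {"outer": 0x011, "inner": 0x051}

call(ret_slots, "outer", 0x061)   # inner calls outer again: recursion
lost_return = ret_slots["outer"] != 0x011   # the path back to main is gone


# A hand-rolled call stack (the workaround mentioned above) keeps every
# return address instead of overwriting a single slot.
def call_with_stack(stack, return_to):
    stack.append(return_to)


def ret_with_stack(stack):
    return stack.pop()


stack = []
call_with_stack(stack, 0x011)     # main -> outer
call_with_stack(stack, 0x061)     # outer -> outer (recursive)
unwound = [ret_with_stack(stack), ret_with_stack(stack)]
```

The second half costs extra instructions and memory on every call, which is one reason an iterative rewrite is usually the simpler choice.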

That's a good point, Martin. JMPRET is an assembly instruction; it is not available to Spin programs. Spin programs use the familiar function-calling structure you are used to.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
1+1=10

Post Edited (Paul Baker) : 2/22/2006 8:28:18 PM GMT

Gadgetman · 2006-02-22 20:25

As I understand it, Assembly programs are stored in local memory, so that would remove the need of guarding against crashes with other COGs, but nested subroutines is a different matter, yes...

I can't wait to read that part of the datasheet...

Bamse?
Where are you from?

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Don't visit my new website...

Paul Baker · 2006-02-22 20:27

Correct Gadgetman.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
1+1=10

Tracy Allen · 2006-02-22 23:25

Yes, as Paul noted, the JMPRET D,S mechanism can be more subtle. In the simplest case, which corresponds to a CALL and RET, the D and S would be literals resolved at compile time. But like all other asm instructions on Propeller, D and/or S can also be indirect, depending on bits that are stored in the machine code. In the full-duplex example Paul mentioned, the JMPRET uses the contents of two registers, rather than literals. In that way it is able to ping-pong back and forth between the routines, and it is quite a mind twist to see how that happens.

The indirect address mode allows one JMPRET to branch to different places determined at run time rather than at compile time.

For example, in the full duplex routine, there are two JMPRET instructions in the receive component and two in the transmit component, and those come up in relation to the delay in waiting for a byte to transmit/receive, and in waiting for each bit to complete (in relation to the system clock counter).

Using the indirect mode of addressing, each JMPRET cleverly knows which of the return addresses in the opposite routine is active, because it is stored in the indirect addressing register. So the routine is always ping-ponging back and forth between the correct sections of the code. One remarkable thing about this mechanism, which takes some study, is that there are no additional commands that set the contents of the registers (except one direct MOV at initialization). The whole mechanism is managed by the JMPRET D,S instructions with indirect reference for both D and S.
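
The ping-pong behaviour can be illustrated with a tiny interpreter in Python. This is not the full-duplex code referred to in the thread, only a toy model of the idea: the instruction tuples, the register names `rxcode` and `txcode`, and the addresses are all made up for the sketch. The only initialization is setting `txcode` to the transmitter's entry point, mirroring the single MOV mentioned above.

```python
# Toy interpreter: two routines hand control to each other with a modeled
# JMPRET that saves "my next address" in one register and jumps to the
# address held in the other.
def run(program, regs, pc, steps):
    """Execute 'steps' instructions; return the list of work items performed."""
    trace = []
    for _ in range(steps):
        op = program[pc]
        if op[0] == "work":
            trace.append(op[1])
            pc += 1
        elif op[0] == "jmpret":
            _, d, s = op
            target = regs[s]   # where the other routine wants to resume
            regs[d] = pc + 1   # where this routine will resume later
            pc = target
        else:                  # ("jmp", address)
            pc = op[1]
    return trace


PROGRAM = {
    # receiver (hypothetical addresses 0..4)
    0: ("work", "rx-a"),
    1: ("jmpret", "rxcode", "txcode"),
    2: ("work", "rx-b"),
    3: ("jmpret", "rxcode", "txcode"),
    4: ("jmp", 0),
    # transmitter (hypothetical addresses 10..14)
    10: ("work", "tx-a"),
    11: ("jmpret", "txcode", "rxcode"),
    12: ("work", "tx-b"),
    13: ("jmpret", "txcode", "rxcode"),
    14: ("jmp", 10),
}

# One-time setup: point txcode at the transmitter's first instruction.
trace = run(PROGRAM, {"rxcode": 0, "txcode": 10}, pc=0, steps=8)
```

After the setup, nothing but the modeled JMPRET ever writes the two registers, yet each routine always resumes exactly where it last gave up control.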

The indirect registers are ordinary registers, not any kind of special CPU register.

(Please don't ask for us to post the code. The conditions of talking about this stuff is that we can only paraphrase, not copy and paste.)

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Tracy Allen
www.emesystems.com

MacGeek117 · 2006-02-23 00:17

And I thought SX assembly was hard to learn!
RoboGeek

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
There are 3 kinds of people in the world,

the dreamers, the do-ers, and the "Oh, what's this button do"-ers.
Formerly bugg.
www.parallax.com
www.goldmine-elec.com
www.expresspcb.com
www.startrek.com

PLJack · 2006-02-23 01:16

This is so exciting!

I love the propeller concept.

Let's see if I understand correctly.

Suppose one of the cogs became unresponsive or locked up.
If you wrote the code to check for such an event you could have one of the other cogs stop it, copy over some startup code and then restart that cog.
Not the best example but it is a very powerful combination.

Is this true?

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
- - - PLJack - - -

Perfection in design is not achieved when there is nothing left to add.
It is achieved when there is nothing left to take away.

Tracy Allen · 2006-02-23 05:19

A watchdog function in one cog would be a definite possibility. Suppose you are waiting for a signal from some sensor or device, and for some reason (being suddenly unplugged, cosmic ray, whatever), the signal never comes back. Of course a small program loop can implement timeouts or alternative exit strategies, but that eats up time and slows down the loop. It would be nice to use the WAITPEQ or WAITPNE (pin equal, pin not equal) commands, which are simple and lightning fast to respond, but might wait forever if the pin state never matches the target.

WAITPNE (wait for pin not equal) takes a mask for which pins to watch, and a state to watch for on those selected pins. Say you are looking for a high on pin 0, and pin 1 will be the watchdog. For a simple solution you attach an RC circuit to pin 1, bring the pin low to discharge the capacitor, then let R start to charge it up. The mask is %11 and the starting state is %00. You use the WAITPNE command in Spin or its exact homologue in asm, and it will time out when either the signal arrives or the RCTIME expires, whichever comes first. Subsequent code would then branch depending on the timeout state. OR, get the same effect by using another cog to provide the watchdog timeout on pin 1. That cog could look dumbly for a zero on pin 1, then set it high after a fixed interval.
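
The pin-watchdog idea can be checked with a short Python model. It assumes the wait ends as soon as `(pins & mask) != state`, which is how the description above reads; the `samples` list stands in for successive readings of the input pins, and the function names are hypothetical.

```python
# Toy model of waiting on "pins not equal", with pin 0 as the signal and
# pin 1 as the watchdog timeout, as in the description above.
SIGNAL_BIT = 0b01     # pin 0
WATCHDOG_BIT = 0b10   # pin 1


def waitpne(samples, state, mask):
    """Return (tick, masked_pins) for the first sample where the masked pins differ from state."""
    for tick, pins in enumerate(samples):
        if (pins & mask) != state:
            return tick, pins & mask
    return None  # never released: this is the "wait forever" case


def classify(masked_pins):
    """Branch on what ended the wait."""
    return "signal" if masked_pins & SIGNAL_BIT else "timeout"


signal_case = waitpne([0b00, 0b00, 0b01], state=0b00, mask=0b11)
timeout_case = waitpne([0b00, 0b00, 0b10], state=0b00, mask=0b11)
stuck_case = waitpne([0b00, 0b00, 0b00], state=0b00, mask=0b01)  # no watchdog pin in the mask
```

With the watchdog pin included in the mask the wait always ends, and the code after it only has to look at which bit changed.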

A more drastic operational watchdog could also be made that requires cog A to send messages at a certain interval to cog B. The message could pass either through i/o or through the hub. If cog B does not receive the expected message, it could relaunch cog A.
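
A minimal sketch of that heartbeat scheme, in Python rather than Propeller code. The tick-based timing, the `timeout` value, and the idea that a relaunch gives cog A a fresh interval are assumptions made for the illustration.

```python
# Toy heartbeat watchdog: "cog B" checks once per tick whether "cog A"
# has sent a message recently, and records a relaunch if it has not.
def watchdog(heartbeat_ticks, timeout, total_ticks):
    """Return the ticks at which cog A would be relaunched."""
    beats = set(heartbeat_ticks)
    last_seen = 0
    relaunches = []
    for tick in range(total_ticks):
        if tick in beats:
            last_seen = tick
        elif tick - last_seen >= timeout:
            relaunches.append(tick)
            last_seen = tick   # assume the relaunched cog starts a fresh interval
    return relaunches


healthy = watchdog(range(0, 20, 3), timeout=5, total_ticks=20)   # beats every 3 ticks
stalled = watchdog([0, 3, 6], timeout=5, total_ticks=20)         # beats stop after tick 6
```

Here `stalled` records a relaunch each time five ticks pass without a message.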

Maybe this thread is getting kind of off topic of self-modifying interpreter code. Sorry about that, Bruce.

>And I thought SX assembly was hard to learn!
>RoboGeek

I guess that is a subject for another thread. Not surprisingly, Jeff and Chip think it is quite easy to learn! There are certainly some powerful instructions that can do a lot in one bite.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Tracy Allen
www.emesystems.com

Bruce Bates · 2006-02-23 07:21

Tracy -

Don't apologize at all. This has been most interesting.

At the moment, here's what has me concerned:

Every time you turn around, someone is saying "just use another COG" to blah, blah, blah. With 8 COGs, that's all well and good, but there IS a finite limit here. If one is using various COGs for timers, access methods, watchdogs, device drivers, etc., one quickly starts running out of COGs to run the (so-called) mainline program.

For my money, those functions just mentioned are generally viewed as overhead or "system" functions, although there is no integrator, supervisor or system manager, it would seem. 50% of me is an applications programmer, and 50% is a systems programmer, so I get VERY protective of BOTH of "my" pieces of the pie! If the system wants a BYTE of the pie, let it find its own!

Moreover, it seems to me that the most precious commodity on this processor, even more than RAM or ROM or I/O ports, is the number of available COGs, even though they can apparently be re-used time and time again during operation. Run out of COGs while you're writing your program, and it's back to square one on the drawing board?

OTOH some of the facets of this system are wholly intuitive by design. I asked myself the following, and was even able to answer it without the benefit of any documentation. In fact, I can think of at least two ways to accomplish it. Answers to the following query are NOT INVITED by those who have the documentation!

Suppose on one pass of the HUB you wanted the process to be COG1, COG2, COG3, COG4 and on a future pass you wanted it to be COG1, COG3, COG2, COG4. The answer is really quite simple, but it is a good conceptual exercise!

Since there will be an obvious delay between the haves and the have-nots regarding product documentation, it seems to me, from a self-education point of view, that the most productive use of time for us have-nots is to capture the essence and operation of the system, and worry less about the seemingly awesome system capabilities! Just one person's opinion I guess, and just as flawed as the next.

Regards,

Bruce Bates

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔

Paul Baker · 2006-02-23 16:09

Yes, for those who do not have access to the documentation (read my explanation in the "Propeller Datasheet" thread for my "one person's" logical reasoning why they aren't making them widely available yet), it's best to try to get a perceptual and qualitative view of the new system, rather than concentrate on the details.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
1+1=10

Guenther Daubach · 2006-02-27 14:54

Paul,

I absolutely agree with you! Propeller is not yet released, and neither are the docs, but it is amazing how much detailed speculation about Propeller features and possibilities there is in this and other threads. I don't think it's worth the time to speculate on each and every detail, or to have one of the few forum members with a Propeller on their desks figure out certain features by running test applications.

I think it's a good idea, to calm down a bit while waiting for the official release of Propeller and its documentation.

When I first learned that COGs hold program code and data in one common RAM area, I immediately wondered what would happen when a program goes "bananas", overwriting itself, as happened all the time on the good old PCs w/o protected mode. I'm pretty sure this can happen on the Propeller as well, especially with badly designed assembly code. Using Spin will, IMO, make it almost impossible for such unwanted code changes to happen.

As someone mentioned, in assembly it is possible to create "controlled" self-modifying code. Instead of using a jump table at the beginning of a state machine, one might consider modifying the constant part of a JMP instruction in order to jump to the next state handler the next time the code is called. But why should one do it this way? Maybe to squeeze out the last available word in memory.
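
The state-machine trick can be sketched in Python as a toy model. The single `dispatch` long, the handler addresses, the `OPAQUE_UPPER_BITS` placeholder, and the assumption that the jump target sits in the low 9 bits are all illustrative and not taken from any documentation.

```python
# Toy model: a single "JMP" long whose target field is rewritten by each
# state handler, so the next entry goes straight to the next state.
ADDR_MASK = 0x1FF
KEEP_MASK = 0xFFFFFE00
OPAQUE_UPPER_BITS = 0xABCDE000   # hypothetical stand-in for the opcode bits

# Hypothetical handler addresses: name of the state and the address to run next.
HANDLERS = {
    0x020: ("idle", 0x030),
    0x030: ("run", 0x040),
    0x040: ("done", 0x020),
}


def patch_target(instr, target):
    """Rewrite only the jump-target field of the instruction long."""
    return (instr & KEEP_MASK) | (target & ADDR_MASK)


def step(dispatch):
    """Follow the dispatch jump, run that handler, and re-patch the jump."""
    name, next_addr = HANDLERS[dispatch & ADDR_MASK]
    return name, patch_target(dispatch, next_addr)


dispatch = patch_target(OPAQUE_UPPER_BITS, 0x020)   # start in the idle state
visited = []
for _ in range(4):
    name, dispatch = step(dispatch)
    visited.append(name)
```

The saving over a jump table is that no table and no state variable are needed: the state lives inside the jump itself.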

Other instructions mentioned in this thread, like the JMPRET D, S instruction, do not, IMO, count as self-modifying code, just as a different method of setting up and restoring addresses. When you do a CALL in an SX application, PC+1 is pushed on the stack, and PC is loaded with the first address of the called subroutine. On a RET, the return address is popped off the stack and loaded into PC. Would you call this self-modifying code? No, because registers (the stack and PC) are affected but not instruction codes.

In Propeller assembly, this is similar: On a subroutine call, the return address (PC+1) is saved in a register (D), and PC is loaded with either the constant part of the instruction (#S), which is defined at assembly time and not at run time, or the contents of a register (S). On a return, PC is loaded with the value formerly stored in D, i.e. the return address. As a "side effect", since PC+1 is always saved in a register, such instructions make it possible to "ping-pong" between two code segments with no extra program coding (see Chip's full duplex serial tx/rx code). Again, such instructions do modify register contents but no instruction codes, so you can't call this "self-modifying code".

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Greetings from Germany,

G

Paul Baker · 2006-02-27 16:54

Yes, JMPRET wasn't intended for self-modifying code in its traditional understanding; its capability is rather limited. However, there are 3 assembly commands that were tailored for modifying other assembly commands. MOVS, MOVD and MOVI are designed to change the source register, the destination register, and the instruction code of the targeted assembly instruction.
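
What those three commands do to a 32-bit instruction long can be shown with bit masks in Python. The field positions used here (source in bits 8..0, destination in bits 17..9, instruction field in bits 31..23) are an assumption drawn from the Propeller instruction layout as later documented, not from this thread, and the functions are a model rather than Propeller code.

```python
# Toy model of field-wise instruction patching on a 32-bit long.
MASK32 = 0xFFFFFFFF
FIELD = 0x1FF          # each patched field is 9 bits wide (assumed)
S_SHIFT, D_SHIFT, I_SHIFT = 0, 9, 23


def _patch(instr, value, shift):
    """Replace one 9-bit field of instr, leaving every other bit unchanged."""
    cleared = instr & ~(FIELD << shift) & MASK32
    return cleared | ((value & FIELD) << shift)


def movs(instr, value):
    return _patch(instr, value, S_SHIFT)   # source field


def movd(instr, value):
    return _patch(instr, value, D_SHIFT)   # destination field


def movi(instr, value):
    return _patch(instr, value, I_SHIFT)   # instruction field


# Patching an all-ones long shows exactly which bits each command touches.
only_s_cleared = movs(MASK32, 0)
only_d_cleared = movd(MASK32, 0)
only_i_cleared = movi(MASK32, 0)
```

Under these assumptions, the five bits between the destination and instruction fields are not reachable by any of the three commands.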

I think I am trashing portions of the memory, because my current program is running a TV object and at a certain point in the program, artifacts appear on the screen. I'll have to hunt down what's leaking the data.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
1+1=10

Post Edited (Paul Baker) : 2/27/2006 4:57:36 PM GMT

Tracy Allen · 2006-02-27 18:05

Well, CALL and RET, the specific instance of JMPRET that the document prefaces with "to help keep you sane", specifically modifies the target address field of a JMP instruction. It may not be a major example of self-modifying code, but one certainly has to understand it and how this processor manages to get by without a stack. The calls can be nested, but the mechanism does not permit recursion.

The more advanced usage found in the full-duplex example uses the indirect register addressing mode, so it does not need to modify a JMP instruction per se. The indirect mode is what allows it to ping-pong between routines.

One simple usage for MOVS I can think of is to help with status, debug or error messages. Move the current status byte into the source field of a MOV in the message kicker-outer. Okay, the message could go in its own register, but there would be no need.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Tracy Allen
www.emesystems.com