The Secret SX Instructions

pjv · 2004-12-15 04:52

Hello Guenther;

I did some hardware checking today, and I need to post two minor corrections; perhaps you have already discovered this.

All of the two-level interrupt stacks are LIFO in nature, with the bottom level retaining the value "pushed" into it regardless of the number of "pops" given. I had incorrectly indicated it would fill with zeros when "popped".

This implies the stack will "reverse fill" with the value "pushed" into the second (bottom) level when "pops" are issued. This could be of some significance to very tricky programmers, and may be important for you in properly displaying the stacks' contents.

Secondly, although I was a little wishy-washy on it, I incorrectly indicated that in debug mode the oscillator pin shifted or swapped into W. I now believe it swaps with the carry bit, and that it is the OSC2 pin. More testing is required for me to be positive.

Peter

Peter Verkaik · 2004-12-19 12:23

I thought about thread switching inside the ISR and came up with a little framework.

Comments please.

I still have no guarantee that the call stack does not get corrupted. To be sure, switching should be done at a known level in the main code (flags??).

regards peter


Peter Verkaik · 2004-12-19 14:53

Little update, forgot to save/restore PC high bits (in M)

Threads limited to 3 using a single ram bank

regards peter

Guenther Daubach · 2004-12-19 19:32

Peter,

thanks for the info - I'll check the behavior of the shadow stacks on "real" silicon, and will then adapt the handling in SxSim accordingly.

No problem with OSC2 swapping because SxSim can't support this anyway, and the same is true for the "real" breakpoint handling as SxSim uses its own, allowing you to set as many breakpoints as you like in one session.

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
Greetings from Germany,

G

Guenther Daubach · 2004-12-19 21:28

Peter,

in the meantime, I did some testing on "real" SX silicon concerning the push/pop instructions, and my results are as follows:

When you do just one PUSH (say PUSH W), all subsequent POPs return the value you PUSHed once.

When you do two PUSHes, the first POP returns the most recently PUSHed value, whereas all subsequent POPs (no matter how many) return the value PUSHed before it. In other words, the stacks always "remember" the first PUSHed value (if there is one - I have not yet checked POPs without any previous PUSH).

When you do more than two PUSHes before a POP, the "oldest" PUSHed values get lost, and subsequent POPs return the two most recently PUSHed values.

I have modified SxSim to handle PUSHes and POPs like this, but before releasing a new version of SxSim I'd like some confirmation from you, or from anybody else who has checked the PUSH/POP instructions, that I'm right.

James Newton · 2004-12-19 22:34

Peter, the thread code looks good, but I would consider a slightly different approach:

Use the first 5 registers in each bank as the th_x registers and make the thread_index register select the bank. Then you can use fixed register locations to store and retrieve the task variables, you don't have to inc fsr (or even use it) and you are not limited to only 3 tasks. Also, the task can use the registers in the "local" (to that task) bank.

This also opens another possibility: If you set the rule that each task must only use registers in its own bank (or globals) and NOT FSR then you don't have to save and restore the FSR register. It can simply be set to the appropriate bank on each task switch.
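The bank-per-task layout suggested here can be sketched as a small model. This is Python rather than SX code, and it rests on assumptions not stated in the post: an SX28-style RAM map with 8 banks starting at $10, $30, ... $F0, and hypothetical helper names (bank_base, save_context, load_context). What the five saved registers actually hold is left open.

```python
NUM_BANKS = 8        # assumption: SX28-style layout with 8 RAM banks
CONTEXT_SLOTS = 5    # "the first 5 registers in each bank"

ram = bytearray(256)  # model of the data memory


def bank_base(task_index):
    """FSR value that selects a task's bank (assumed banks at $10, $30, ... $F0)."""
    if not 0 <= task_index < NUM_BANKS:
        raise ValueError("task index out of range")
    return (task_index << 5) | 0x10


def save_context(task_index, context):
    """Store a task's context at fixed offsets in its own bank."""
    if len(context) != CONTEXT_SLOTS:
        raise ValueError("context must have exactly 5 values")
    base = bank_base(task_index)
    ram[base:base + CONTEXT_SLOTS] = bytes(context)


def load_context(task_index):
    """Read a task's context back from its own bank."""
    base = bank_base(task_index)
    return list(ram[base:base + CONTEXT_SLOTS])


save_context(0, [1, 2, 3, 4, 5])
save_context(7, [9, 8, 7, 6, 5])
print(hex(bank_base(7)), load_context(0), load_context(7))
```

Because the task index alone determines the bank, no pointer has to be incremented while saving or restoring, which is the point of the suggestion.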

▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔▔
---
James Newton, Host of SXList.com
james@sxlist.com 1-619-652-0593 fax:1-208-279-8767
SX FAQ / Code / Tutorials / Documentation:
http://www.sxlist.com Pick faster!

Peter Verkaik · 2004-12-20 01:34

Hi James,

I thought about that, but that fragments memory, which is bad for large buffer arrays. I now even think thread switching inside the ISR is not all that good, because truly unconditional pre-emptive switching cannot be done without access to the call stack: if a thread gets switched out while in a call, the return address for that call is on top of the stack, so another thread that was also in a call when it was switched out cannot continue.
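The problem can be shown with a toy model of a single return stack shared by all threads. This is a Python sketch, not SX code; the names call_stack, call and ret and the string "addresses" are made up for illustration.

```python
call_stack = []  # one hardware return stack shared by every thread


def call(return_address):
    """Model of CALL: push the return address."""
    call_stack.append(return_address)


def ret():
    """Model of RET: pop whatever is on top, regardless of who pushed it."""
    return call_stack.pop()


# Thread B is inside a subroutine when it is switched out.
call("B: after call")
# Thread A runs, enters a subroutine of its own, and is switched out too.
call("A: after call")
# Thread B resumes and reaches its RET.
resumed_at = ret()
print(resumed_at)
```

Thread B returns into thread A's code, because the switcher has no way to swap the contents of the call stack along with the other registers.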

Perhaps it is better to let the main code call a thread switch routine.

Something like:

mov m,#(($+4)>>8) ;1 word, page for return to :retadr
mov w,#(($+3)&$FF) ;1 word, offset for return to :retadr
call @thread_switch ;2 words
:retadr ;this is where we return after thread_switch

That way I can use a fsr stack for both return addresses and arguments
and results. Each thread can have its own fsr stack so there is a real
context switch.
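The two offsets in that sequence differ because $ advances by one word between the two mov instructions. A quick check in Python (a sketch only; it assumes $ evaluates to the address of the instruction that contains it, and the function names are hypothetical):

```python
def return_address_operands(here):
    """Return (m, w) as assembled for a call sequence starting at address `here`."""
    m = (here + 4) >> 8           # mov m,#(($+4)>>8), with $ == here
    w = ((here + 1) + 3) & 0xFF   # mov w,#(($+3)&$FF), with $ == here + 1
    return m, w


def retadr(here):
    """Address of :retadr: mov m (1 word) + mov w (1 word) + call @ (2 words)."""
    return here + 4


# Include start addresses where the sequence straddles a 256-word boundary.
for here in (0x000, 0x0FD, 0x0FE, 0x3FC):
    m, w = return_address_operands(here)
    assert (m << 8) | w == retadr(here)
    print(hex(here), "->", hex(m), hex(w))
```

Both operands describe the same address even when the sequence crosses a 256-word boundary, since each expression is evaluated on the full address before it is split.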

How about that?
regards peter

pjv · 2004-12-20 03:04

Hello Guenther;

I looked at what you said, and I still get the results I had earlier described.

I suspect you are being bitten by the "DEBUG BUG". That is, if you are using the SX Key debugger to observe the results, that very process (as warned earlier) alters the value in the first stack levels, and hence the values of the subsequent "POPS".

In my test, I run a small program, and do not break into the debugger until it has fully completed. That way the SX Key stays out of the way during the push and pop tests.

In that program I put a value in W, then pushW, then put a different value in W, and again pushW. Then I popW and store the result in RAM, popW again and store it in another RAM location, and popW once more into a third RAM location.

This is the only way you can see what's really going on at execution time. If you break into the debugger during the process, "your results WILL vary".

As I recall, with early versions of the SX Key software, I found a similar situation in observing the content of the RTCC counter. Under debug, it shows a value different from the value it has at execution time. Try a "mov w, RTCC; mov RAM,W" sometime and then break into the debugger to show you the RAM contents. Then repeat that by single stepping the debugger, and see the different result.

I have not tested this with recent releases of the software, so now things may be different.

As we said before, you need to be careful with these hidden instructions.

Peter

pjv · 2004-12-20 03:09

Hello Peter;

You are absolutely correct. Context switching while in a call is not permitted. This does put a severe restriction on the utility of thread switching. So I limit the use of calls to only be used by the RTOS itself.

Oh, for access to the call stack.......

Peter

Guenther Daubach · 2004-12-20 07:48

Hello Peter,

in my test, I did not use the SX-Key debugger at all. Similar to what you had described, I stored a value in W (say 1) and pushed it. Next I stored another value in W (say 2), and pushed it again. Then I stored another value in W (3) just to verify that W gets modified by the subsequent pops. Then I popped W and stored its value in ra. I stored the result of the next pop in rb, and the result of another pop in rc (with all ports configured as outputs beforehand). Then I entered an endless loop and checked the levels of the output pins, which gave me the results I had described, i.e. ra = 2, rb = 1, rc = 1.

In another experiment, I only pushed one value (1), and again did three pops, again writing the results to the ports. In this case, all three ports were set to 1.

James Newton · 2004-12-20 09:23

Peter and Peter: yes, making our own call and return is probably for the best. But I could see some examples where the overhead is not warranted, for example where only one thread is allowed to use the call stack, and this little task switcher (either your 3-thread version or my idea for an 8-thread version) will make some projects easier to code.

For the complete call/return stack, see the code I posted on the 12th up above. Using a self-triggered interrupt makes "faking" a call/return stack pretty quick and avoids the need to load the return address before the call. I do so wish I could get away from using an IO pin, however. Is there any chance that the DEBUG secret instruction triggers a separate interrupt?

pjv · 2004-12-20 09:57

Hello Guenther;

I see what you did, and it sounds perfectly proper. So I can't follow the difference in results. I had reconfirmed my results prior to my post yesterday.

Could you please repeat what I did, and report back, and I'll do the same to yours? Something not adding up here......

Peter

pjv · 2004-12-20 18:56

Hello Guenther;

In reading your post of yesterday at 1:28 more carefully, I think we are both getting the same result. What threw me was your results with the single push and three pops, all returning a value of one.

Obviously the stack already had the value of one at its deepest level.

Consecutive "pop"s (that is, more than one) will repeatedly return the value of the deepest level of the stack regardless of the number of "pop"s, as posted in my "reverse fill" correction on Dec 14. The deepest stack level does not change with "pop"s, only with "push"es.
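The push/pop behavior described here can be mimicked with a tiny model. This is only a sketch in Python, not SX code: the class name ShadowStack and the power-up contents of zero are assumptions, and the model simply encodes "a push moves the top level down, a pop refills the top from the bottom, and the bottom keeps its value".

```python
class ShadowStack:
    """Two-level LIFO whose bottom level is changed only by a push."""

    def __init__(self, top=0, bottom=0):
        self.top = top
        self.bottom = bottom

    def push(self, value):
        # The old top moves down; whatever was in the bottom level is lost.
        self.bottom = self.top
        self.top = value & 0xFF

    def pop(self):
        # The top is returned and refilled from the bottom, which keeps its value.
        value = self.top
        self.top = self.bottom
        return value


stack = ShadowStack()
stack.push(1)
stack.push(2)
two_pushes = [stack.pop() for _ in range(3)]

# Same stack, not cleared: both levels still hold 1 from the run above.
stack.push(1)
one_push = [stack.pop() for _ in range(3)]

# A stack assumed to start out as all zeros behaves differently.
fresh = ShadowStack()
fresh.push(1)
fresh_one_push = [fresh.pop() for _ in range(3)]

print(two_pushes, one_push, fresh_one_push)
```

In this model a single push followed by three pops returns 1, 1, 1 only when the deepest level already held 1, which matches the explanation given for the single-push result.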

Peter

Guenther Daubach · 2004-12-20 23:04

Hello Peter,

I just tried the following code on a "real" SX silicon with the SX-Key debugger (I assume that this code is similar to what you have tried):

LIST Q = 37
DEVICE SX28L, TURBO, STACKX, OSCHS2
IRC_CAL IRC_FAST
FREQ 50_000_000
RESET Main

Main

clr $08
clr $09
clr $0a
mov w, #$af
dw $048 ; PUSH w
mov w, #$fe
dw $048 ; PUSH w
clr w
dw $04c ; POP w
mov $08, w
dw $04c ; POP w
mov $09, w
dw $04c ; POP w
mov $0a, w
break
clr w
jmp $

At the break, the debugger displayed the following register contents:

08 = FE
09 = AF
0A = AF

which is what I had expected. (BTW AFFE in German means "monkey").

I noticed another interesting effect: after executing this code for the first time, I manually changed the contents of the registers at 08, 09, and 0a to 00 in the SX-Key debugger, clicked the "Reset" button, and then "Run" again. To my surprise, the debugger did not stop at the break as expected but kept running the program endlessly.

It seems as if those PUSHes and POPs mess up the debugger even before a breakpoint is reached.

Therefore, I think the best method to test the "secret commands" is in run-only mode.

Here is the "non-debug-output-port" version of my test program:

LIST Q = 37
DEVICE SX28L, TURBO, STACKX, OSCHS2
IRC_CAL IRC_FAST
FREQ 50_000_000
RESET Main

MODE_DIR = $0f

Main
mode MODE_DIR
mov !ra, #%11111100
mov !rb, #%11111100
mov !rc, #%11111100
clr ra
clr rb
clr rc
mov w, #1
dw $048 ; PUSH w
mov w, #2
dw $048 ; PUSH w
mov w, #$ff
dw $04c ; POP w
mov ra, w
mov w, #$ff
dw $04c ; POP w
mov rb, w
mov w, #$ff
dw $04c ; POP w
mov rc, w
jmp $

When I run this program clocked by the SX-Key, the port outputs are:

RA: 2
RB: 1
RC: 1

i.e. exactly what I had expected.

pjv · 2004-12-20 23:36

Hello Guenther,

Yes, all is good.

Seems we are getting consistent results after all.

Onward.

Peter

Guenther Daubach · 2004-12-21 09:20

Hello Peter,

great - the next version of SxSim is almost ready for posting.

James Newton · 2005-01-23 06:41

Did anyone ever verify what "$044 sets the SX into debug mode" actually does? Is it really just a software interrupt?

Is there any way to cause an interrupt without hijacking one of the hardware resources such as port B or RTCC?

I'm trying to think of a way to use the secret instructions to implement a call and return that is not limited to half pages and is not limited to 8 levels.

Here is another idea: to make a call, load M, W with $+2, then page and jmp to the callx routine. After the jmp, DW the target address. The callx routine saves M and W to the stack, then does an IREAD to get the target address, which it then jumps to using the secret instructions.
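The flow of this idea can be followed in a small model. It is a Python sketch, not SX code: program memory is a dictionary, the addresses $105 and $2A0 are made up, and the retx helper that skips the inline DW on return is an assumption about how the matching return would work.

```python
def callx(program, m, w, return_stack):
    """Model of the proposed callx routine; returns the address jumped to."""
    dw_address = (m << 8) | w          # M:W points at the DW after the jmp
    return_stack.append(dw_address)    # M and W are saved as they are
    return program[dw_address]         # IREAD fetches the inline target address


def retx(return_stack):
    """Hypothetical matching return: resume just past the inline DW."""
    return return_stack.pop() + 1


program = {0x105: 0x2A0}   # hypothetical: "DW $2A0" stored at $105
return_stack = []

target = callx(program, 0x1, 0x05, return_stack)
resume = retx(return_stack)
print(hex(target), hex(resume))
```

The target address costs one program word at the call site, and the software return stack can be as deep as RAM allows, which is what removes the 8-level limit.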

Peter Verkaik · 2005-01-23 12:58

James,

Let's analyze your idea.

For my smallc port I have such a working call/return, but perhaps iread can optimize that.

For smallc the code to call a label:

mov DL,#($+10)&255 ;load return address, 2 words
mov DH,#($+8)>>8 ;2 words
mov AL,#_label&255 ;load target address, 2 words
mov AH,#_label>>8 ;2 words
jmp @callf ;jump to runtime routine to execute call, 2 words

This takes 10 words.

Now using iread:

mov DL,#($+6)&255 ;load iread address, 2 words
mov DH,#($+4)>>8 ;2 words
jmp @callf ;2 words
dw _label ;1 word

This takes 7 words.

Now using an (existing??) software interrupt that I name IJMP:

mov DL,#($+6)&255 ;load iread address, 2 words
mov DH,#($+4)>>8 ;2 words
IJMP ;software interrupt, 1 word
dw _label ;1 word

This takes 6 words.

Here is the callf for smallc:

call pushDX ;save return address (popped by x_return), 1 word
call move41 ;move target address from AX into DX, 1 word
jb CPU_STAT.0,@_sxintrpt ;we are in ISR, 2 words
x_callf_1
pushST ;align shadow stack, 1 word
pushW ;1 word
mov w,fsr ;save fsr state, 1 word
pushFSR ;1 word
mov m,DH ;get our dest into m:w, 2 words
mov w,DL ;1 word
pushPC ;push m:w onto shadow stack, 1 word
reti ;pops all registers and jumps to dest, 1 word

This takes 13 words.

Here is the callf for use with iread:

call pushDX ;save return address (popped by x_return), 1 word
jb CPU_STAT.0,@_sxintrpt ;we are in ISR, 2 words
pushST ;align shadow stack, 1 word
pushW ;1 word
mov w,fsr ;save fsr state, 1 word
pushFSR ;1 word
mov M,DH ;prepare to read target address, 2 words
mov w,DL ;1 word
iread ;read target address into M:W, 1 word
pushPC ;push m:w onto shadow stack, 1 word
reti ;pops all registers and jumps to dest, 1 word

This takes 13 words.

Here is the callf with software interrupt:

call pushDX ;save return address (popped by x_return), 1 word
jb CPU_STAT.0,@_sxintrpt ;we are in ISR, 2 words
popPC ;load address to read target from, 1 word
iread ;read target address, 1 word
pushPC ;push target address onto shadow stack, 1 word
reti ;pops all registers and jumps to dest, 1 word

This takes 7 words (if the return address is not in DX it takes quite a few more words).

It appears the iread method saves 3 words per call, but the software interrupt saves the most. If only we had this.

Unfortunately I cannot apply the iread method for smallc, because the CALL1 token must also be supported (target address in AX) and I cannot generate a dw instruction from a runtime value.

regards peter

James Newton · 2005-01-23 17:59

Rather than loading the return address into your DH, DL file registers, just load it into MW. Then in callf, save MW to your return stack, iread, pushPC, and reti.

Can't CALL1 be a separate routine from callf?

BTW, are we sure that pushW isn't actually pushMW? It doesn't make sense to me that they didn't save M.

Peter Verkaik · 2005-01-23 19:59

James,

I don't think M gets saved. I have seen several sources that used M inside the ISR, and these always saved M explicitly.

So I would be required to do so also. But that can be done.

Using MW would save me 2 words per call to a label. Assuming most calls will be calls to labels rather than to an address in the primary register, the end result would save me words.

New callf:

callf
mov DL,w ;1 word
mov DH,m ;2 words, destroys w
call pushDX ;save return address -1 (popped by x_return), 1 word
callf_1
jb CPU_STAT.0,@_sxintrpt ;we are in ISR, 2 words
pushST ;align shadow stack, 1 word
pushW ;1 word
mov w,fsr ;save fsr state, 1 word
pushFSR ;1 word
mov M,DH ;prepare to read target address, 2 words
mov w,DL ;1 word
iread ;read target address into M:W, 1 word
pushPC ;push m:w onto shadow stack, 1 word
reti ;pops all registers and jumps to dest, 1 word

This now requires 16 words.

For CALL1 (separate entry):

mov DL,w ;1 word
mov DH,m ;2 words
call pushDX ;1 word
mov DL,AL ;2 words
mov DH,AH ;2 words
jmp callf_1 ;1 word

So this requires 9 extra words.
Overall there will be words saved when there are 5 or more 'C' calls
which will be most likely.
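The break-even point follows from the figures given in this post: 2 words saved per call to a label against 9 extra words for the separate CALL1 entry. A quick check in Python (a sketch only; the function names are made up, and other costs can be tried through the parameters):

```python
def net_words_saved(calls, saved_per_call=2, fixed_extra=9):
    """Words saved overall after paying the one-time extra cost."""
    return calls * saved_per_call - fixed_extra


def break_even_calls(saved_per_call=2, fixed_extra=9):
    """Smallest number of calls for which the net saving is positive."""
    return fixed_extra // saved_per_call + 1


for calls in range(3, 8):
    print(calls, net_words_saved(calls))

print("first net saving at", break_even_calls(), "calls")
```

If the growth of callf itself from 13 to 16 words is also counted, fixed_extra becomes 12 and the same function gives 7 calls.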
Now if only we had this software interrupt !

I will adapt the smallc macro to use iread for function calls.
regards peter