Using ALTI for stacks

I was thinking about how to use ALTI effectively for stack manipulation (as was suggested in the "New Spin" thread). This is what I came up with (untested):
loop    SETD    sp, #in_val
        CALL    #push
        SETR    sp, #out_val
        CALL    #pop
        JMP     #loop

push    ALTI    sp, #%100_111       'substitute sp[17:9] into D, substitute sp[8:0] into S, then increment sp[8:0]
        WRLUT   0-0, 0-0
        RET

pop     ALTI    sp, #%010           'decrement sp[8:0]
        ALTI    sp, #%100_000_100   'substitute sp[27:19] into R, substitute sp[8:0] into S
        RDLUT   0-0, 0-0
        RET

sp      LONG    0
in_val  RES     1
out_val RES     1
If you wanted to avoid the call overhead, the de-normalized code would be:
loop    ALTI    sp, #%111           'substitute sp[8:0] into S, then increment sp[8:0]
        WRLUT   in_val, 0-0
        ALTI    sp, #%010           'decrement sp[8:0]
        ALTI    sp, #%100           'substitute sp[8:0] into S
        RDLUT   out_val, 0-0
        JMP     #loop
        RET

sp      LONG    0
in_val  RES     1
out_val RES     1
As noted, I don't think spacer instructions are needed, as you are not altering an instruction in the pipeline.
Does this look like what others were thinking? I'm not crazy about the need for two ALTI instructions in the POP, but that doesn't seem too bad overall. Simple PUSH/POP instructions would certainly be nicer, though.
Edit: I just realized that the RDLUT probably requires SETR, not SETD. Code is updated to reflect that.
Edit: Removed unnecessary instructions from inlined version
Comments
        MOV     in_val1, #6
        MOV     in_val2, #7

loop    SETD    sp, #in_val1
        CALL    #push               'push first operand on LUT stack
        SETD    sp, #in_val2
        CALL    #push               'push second operand on LUT stack
        CALL    #_mul
        SETR    sp, #out_val
        CALL    #pop                'pop result off LUT stack
        CMP     out_val, #42 WZ
 IF_E   JMP     #0                  'halt
        JMP     #loop               'maybe the answer will be different next time?

_mul    SETR    sp, #temp2
        CALL    #pop                'pop second operand off LUT stack
        SETR    sp, #temp
        CALL    #pop                'pop first operand off LUT stack
        MUL     temp, temp2
        SETD    sp, #temp
        CALL    #push               'push result on LUT stack
        RET

push    ALTI    sp, #%100_111       'substitute sp[17:9] into D, substitute sp[8:0] into S, then increment sp[8:0]
        WRLUT   0-0, 0-0
        RET

pop     ALTI    sp, #%010           'decrement sp[8:0]
        ALTI    sp, #%100_000_100   'substitute sp[27:19] into R, substitute sp[8:0] into S
        RDLUT   0-0, 0-0
        RET

sp      LONG    0
in_val1 RES     1
in_val2 RES     1
out_val RES     1
temp    RES     1
temp2   RES     1
It seems to me (totally unsubstantiated, mind you) that this will not perform much better than a hub stack using inlined PUSHx/POPx. Of course, you could also inline the push/pop operations in the above code to avoid the CALL/RET overhead, but at a cost of readability.
loop    ALTI    sp, #%111           'substitute sp[8:0] into S, then increment sp[8:0]
        WRLUT   #6, 0-0             'push first operand on LUT stack
        ALTI    sp, #%111           'substitute sp[8:0] into S, then increment sp[8:0]
        WRLUT   #7, 0-0             'push second operand on LUT stack
        CALL    #_mul
        ALTI    sp, #%010           'decrement sp[8:0]
        ALTI    sp, #%100           'substitute sp[8:0] into S
        RDLUT   out_val, 0-0        'pop result off LUT stack
        CMP     out_val, #42 WZ
 IF_E   JMP     #0                  'halt
        JMP     #loop               'maybe the answer will be different next time?

_mul    ALTI    sp, #%010           'decrement sp[8:0]
        ALTI    sp, #%100           'substitute sp[8:0] into S
        RDLUT   temp2, 0-0          'pop second operand off LUT stack
        ALTI    sp, #%010           'decrement sp[8:0]
        ALTI    sp, #%100           'substitute sp[8:0] into S
        RDLUT   temp, 0-0           'pop first operand off LUT stack
        MUL     temp, temp2
        ALTI    sp, #%111           'substitute sp[8:0] into S, then increment sp[8:0]
        WRLUT   temp, 0-0           'push result on LUT stack
        RET

sp      LONG    0
out_val RES     1
temp    RES     1
temp2   RES     1
Question to Chip: what happens if you mix ALTI and AUG? For instance, in the above code, what if I wanted to push a large immediate value? Is this allowed?
        ALTI    sp, #%111           'substitute sp[8:0] into S, then increment sp[8:0]
        WRLUT   ##7, 0-0            'push large immediate value on LUT stack
Oh well! I kinda figured, but was hopeful. That's bound to bite someone in the future, though. The assembler should probably flag it as an error.
The ALTxx instructions modify the next instruction, no matter what it is. The AUGS/AUGD instructions, however, will work around the ALTxx instructions. So, you'd do one or both AUGx instruction(s), then the ALTxx instruction, then the final instruction. That should work okay.
' Using LUT as a push/pop stack

CON     stktop  = $1FF

DAT     org

push_x  wrlut   x,ptr               'write x at lut[ptr]
        incmod  ptr,#stktop         'increment/wrap ptr
        ret

pop_x   decmod  ptr,#stktop         'decrement/wrap ptr
        rdlut   x,ptr               'read x at lut[ptr]
        ret

ptr     long    0
x       res     1
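For anyone who wants to check the pointer arithmetic without hardware, here is a rough Python model of the LUT stack above. It assumes INCMOD wraps to 0 when the pointer equals the limit and DECMOD wraps to the limit when the pointer is 0. The class and function names are made up for this sketch and are not part of any Propeller tool.

```python
STKTOP = 0x1FF  # same limit as stktop in the PASM above


def incmod(value, limit):
    """Assumed INCMOD behaviour: wrap to 0 once value equals limit, else add 1."""
    return 0 if value == limit else value + 1


def decmod(value, limit):
    """Assumed DECMOD behaviour: wrap to limit once value is 0, else subtract 1."""
    return limit if value == 0 else value - 1


class LutStack:
    """Hypothetical software model of push_x/pop_x; a list stands in for the LUT."""

    def __init__(self, top=STKTOP):
        self.top = top
        self.lut = [0] * (top + 1)
        self.ptr = 0

    def push(self, x):
        self.lut[self.ptr] = x                  # wrlut x,ptr
        self.ptr = incmod(self.ptr, self.top)   # incmod ptr,#stktop

    def pop(self):
        self.ptr = decmod(self.ptr, self.top)   # decmod ptr,#stktop
        return self.lut[self.ptr]               # rdlut x,ptr


def mul(stack):
    """Pop two operands and push their product, like _mul earlier in the thread."""
    second = stack.pop()
    first = stack.pop()
    stack.push(first * second)


stack = LutStack()
stack.push(6)
stack.push(7)
mul(stack)
result = stack.pop()
print(result)
```

Pushing 6 and 7, multiplying, and popping leaves the pointer back at 0, which is the same round trip the PASM test loop performs.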
All those ALTxx instructions are only useful for indirect addressing of cog registers.
' Using cog registers as a push/pop stack

CON     stksize = 100

DAT     org

push_x  altd    ptr,#stk            'set next d field
        mov     0,x                 'write x at stk[ptr]
        incmod  ptr,#stksize-1      'increment/wrap ptr
        ret

pop_x   decmod  ptr,#stksize-1      'decrement/wrap ptr
        alts    ptr,#stk            'set next s field
        mov     x,0                 'read x at stk[ptr]
        ret

ptr     long    0
stk     res     stksize
x       res     1
' Using REP+ALTI to automate register operations

CON     stksize = 100

DAT     org

scroll  sets    x,#stk+1            's field of x points to stk[1]
        setd    x,#stk+0            'd field of x points to stk[0]
        rep     #2,#stksize-1       'ready to scroll registers
        alti    x,#%111_111         'set next d and s fields, increment them in x
        mov     0,0                 'do 'mov stk[n+0],stk[n+1]', n = 0..stksize-2
        ret

stk     res     stksize
x       res     1
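The net effect of the REP+ALTI scroll can be sketched in a few lines of Python. This is only a model of the register moves described in the comments above, with a made-up size of 8 instead of 100 so the result is easy to read. It says nothing about timing or pipeline behaviour.

```python
STKSIZE = 8  # hypothetical small size; the PASM above uses 100

stk = list(range(STKSIZE))  # stk[n] initially holds n

d, s = 0, 1                   # setd x,#stk+0 / sets x,#stk+1
for _ in range(STKSIZE - 1):  # rep #2,#stksize-1
    stk[d] = stk[s]           # alti puts d and s into 'mov 0,0'
    d += 1                    # ...then increments both fields in x
    s += 1

print(stk)
```

Every entry moves down one slot and the last entry is left in place, so the top value ends up duplicated.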
That's good to know. This would also imply that you can't use AUGx to modify ALTx, which seems reasonable.
That's correct.
That does point out a minor issue with doing hub stacks with PTRx: they are not bounded. Of course, if you exceed your stack size, all bets are off anyhow. It's probably a toss-up whether it's worse to corrupt the bottom of your stack or non-stack memory.
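To make the trade-off concrete, here is a small Python sketch of the wrapping case: a stack whose pointer wraps (INCMOD/DECMOD style) overwrites its own oldest entry on overflow instead of running into other memory. The four-entry size and the function names are assumptions chosen to keep the example short.

```python
SIZE = 4  # hypothetical tiny stack so the wrap is easy to see

stack = [None] * SIZE
ptr = 0


def push(x):
    """Write at ptr, then advance with wrap-around (INCMOD-style)."""
    global ptr
    stack[ptr] = x
    ptr = 0 if ptr == SIZE - 1 else ptr + 1


def pop():
    """Step back with wrap-around (DECMOD-style), then read at ptr."""
    global ptr
    ptr = SIZE - 1 if ptr == 0 else ptr - 1
    return stack[ptr]


for value in ("a", "b", "c", "d", "e"):  # one push more than the stack holds
    push(value)

popped = [pop() for _ in range(SIZE)]
print(popped)
```

The bottom entry "a" is gone, but nothing outside the stack was touched. An unbounded pointer would instead have written "e" into whatever follows the stack.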
Yes it is. Didn't even consider that one.
In any case, stacks aren't really an issue. We've got reasonable ways to do them.
Nice!