@JonnyMac said:
Would it help to move your pasm code into the free interpreter space with setregs?
Huh, SETREGS() makes REGLOAD() completely redundant. Yeah, that can work but is quite space limited. Also, without any built-in allocation schemes to match, I'm not a fan of using any of those methods. And that goes for methods like LONGMOVE() too.
That still fails the non-allocated test. And just looks ugly!
I guess, if we're going for ugly, I might experiment with making a temporary solution of dual compile paths so that Flexspin is also catered for. Conditional compiling is supposed to be doable in Pnut now I think ... EDIT: or not. Eric posted a workaround solution - https://forums.parallax.com/discussion/comment/1543640/#Comment_1543640
Ah, and I found this in Chip's Spin2 doc too. It's as ugly as SETREGS is but it should provide the execute in place I'm after.
Hmm, except that example doesn't work with Pnut v37 ... I've now tried a couple of older Pnut releases, v35 and v34, and none of them work. They all report the same "Undefined symbol" error on x.
Evan, I'm not sure what you want done but here is the inline routine in Spin2_interpreter.spin2 (v37):
' a: In-line PASM
' b: REGEXEC(hubadr)
' c: REGLOAD(hubadr)
' d: CALL(anyadr)
'
inline setq #16-1 'a load local variables from hub into buff
rdlong buff,dbase 'a
bith pb,#31 'a set flag to restore local variable to hub
mov ptrb,pb 'a get bytecode ptr into ptrb
skip ##%11100100000111 'a x2 begin inline_pasm skip pattern
regexec_ skip ##%1111000000 '| b x2 begin REGEXEC skip pattern
regload_ mov ptrb,x '| b c get hubadr into ptrb
rdword w,ptrb++ 'a b c read start register
rdword y,ptrb++ 'a b c read length of pasm code, minus 1
setq y 'a b c read in code
altd w 'a b c
rdlong 0,ptrb++ 'a b c altd causes ptrb++ to inc by 4, not by (y+1)*4
_ret_ popa x '| | c REGLOAD done, pop stack
shl y,#2 'a | update bytecode ptr for inline_pasm
add y,ptrb 'a |
call_ mov w,x '| | d get CALL address
popa x '| b d pop stack
mov y,pb '| b d save bytecode ptr
mov z,ptra 'a b d save ptra
call w 'a b d call pasm code (can use pa/pb/ptra/ptrb/stack, C/Z=0)
mov pb,y wc 'a b d restore bytecode ptr
if_c setq #16-1 'a b d if inline_pasm, restore local variables to hub
if_c wrlong buff,dbase 'a b d
_ret_ mov ptra,z 'a b d restore ptra
16 longs of variables are transferred to/from cog RAM before/after code is copied.
Aside: could setd after rdword w,ptrb++ replace the altd that causes the ptrb++ bug?
What?! You replied to the very post with my example and request to Chip. I'm asking Chip to provide a Spin2 language mechanism for executing Pasm code directly in hubRAM without the need to copy it to cogRAM first. When a routine has no loops it gains no speed benefit from being run in cogRAM while still incurring the setup and copy overheads.
PS: In short, Flexspin has this feature already and if Pnut/Proptool got it too then I'd make use of it all the time.
Flexspin's language syntax is simple enough - ORG/END is for cogexec only, ASM/ENDASM is for optimiser to choose between hubexec and cogexec depending on what it thinks is best. If Pnut selected only hubexec for ASM/ENDASM then that'd be fine.
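For anyone skimming, here is a rough sketch of the two Flexspin forms described above, using a trivial add routine. The method names sum_cog and sum_auto are made up for illustration, and the ASM/ENDASM form is Flexspin-only, so Pnut is not expected to accept the second method as things stand.

```spin2
' Sketch only: sum_cog and sum_auto are hypothetical names.

PUB sum_cog(a, b) : r
  org                 ' ORG/END: inline code is run from cog RAM (cogexec)
    add a, b
    mov r, a
  end

PUB sum_auto(a, b) : r
  asm                 ' ASM/ENDASM (Flexspin): optimiser chooses hubexec or cogexec
    add a, b
    mov r, a
  endasm
```

Both methods should return a + b. The only intended difference is where the inline code executes from.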
@evanh said:
What?! You replied to the very post with my example and request to Chip. I'm asking Chip to provide a Spin2 language mechanism for executing Pasm code directly in hubRAM without the need to copy it to cogRAM first. When a routine has no loops it gains no speed benefit from being run in cogRAM while still incurring the setup and copy overheads.
What you want done is very easy if starting from scratch. Even if the code is not copied from hub to cog RAM, 16 longs will still be copied both ways. What does Flexspin do? I only use it for PASM.
@evanh said:
I'm asking Chip to provide a Spin2 language mechanism for executing Pasm code directly in hubRAM without the need to copy it to cogRAM first.
Re altd bug, couldn't existing code
rdword w,ptrb++ 'a b c read start register
rdword y,ptrb++ 'a b c read length of pasm code, minus 1
setq y 'a b c read in code
altd w 'a b c
rdlong 0,ptrb++ 'a b c altd causes ptrb++ to inc by 4, not by (y+1)*4
_ret_ popa x '| | c REGLOAD done, pop stack
shl y,#2 'a | update bytecode ptr for inline_pasm
add y,ptrb 'a |
be replaced with this?
rdword w,ptrb++ 'a b c read start register
setd patch,w 'a b c *** new ***
rdword y,ptrb++ 'a b c read length of pasm code, minus 1
setq y 'a b c read in code
' altd w *** not needed ***
patch rdlong 0,ptrb++ 'a b c *** ptrb correct ***
_ret_ popa x '| | c REGLOAD done, pop stack
' shl y,#2 *** not needed ***
mov y,ptrb ' *** changed ***
Methods can have up to 64KB of local variables, which can be bytes, words, and longs (default), in both singles and arrays.
Yeah, and the first 16 are in cogRAM.
_ret_ rcl PR0,#1
That's not going to happen - Copying to parameters, then to PRs, then back again - I was prepared to bend over for the ugliness but not for the double handling when avoiding that was the objective in the first place.
@evanh : Just to be clear: I think you're asking @cgracey to implement flexspin's inline ORGH feature, so that e.g.
PUB sum(a, b) : r
orgh
add a, b
mov r, a
end
would work just like the version with ORG, but instead of placing the inline assembly code in COG memory it goes into HUB memory and is executed directly from there. Does that sound right? (In flexspin terms ORGH/END is like ASM/ENDASM except the inline code isn't touched by the optimizer).
@ersmith said:
@evanh : Just to be clear: I think you're asking @cgracey to implement flexspin's inline ORGH feature, so that e.g.
PUB sum(a, b) : r
orgh
add a, b
mov r, a
end
would work just like the version with ORG, but instead of placing the inline assembly code in COG memory it goes into HUB memory and is executed directly from there. Does that sound right? (In flexspin terms ORGH/END is like ASM/ENDASM except the inline code isn't touched by the optimizer).
In flexspin are the variables copied from hub to cog RAM before calling the routine then copied back afterwards?
@ersmith said:
@evanh : Just to be clear: I think you're asking @cgracey to implement flexspin's inline ORGH feature, so that e.g.
PUB sum(a, b) : r
orgh
add a, b
mov r, a
end
would work just like the version with ORG, but instead of placing the inline assembly code in COG memory it goes into HUB memory and is executed directly from there. Does that sound right? (In flexspin terms ORGH/END is like ASM/ENDASM except the inline code isn't touched by the optimizer).
In flexspin are the variables copied from hub to cog RAM before calling the routine then copied back afterwards?
In flexspin the variables are always in cog RAM (at least by default).
@ersmith said:
@evanh : Just to be clear: I think you're asking @cgracey to implement flexspin's inline ORGH feature, so that e.g.
PUB sum(a, b) : r
orgh
add a, b
mov r, a
end
would work just like the version with ORG, but instead of placing the inline assembly code in COG memory it goes into HUB memory and is executed directly from there. Does that sound right? (In flexspin terms ORGH/END is like ASM/ENDASM except the inline code isn't touched by the optimizer).
Nice, yes, ideal for Chip to use that. Is ORGH/END inlining a new feature or have I just not seen it in the docs?
@evanh: ORGH / END has been in flexspin and documented since July (I think it was you who asked for hubexec without optimization in Spin2, wasn't it? The other languages already had it.)
I don't think it was me, I still do most testing in C.
My Spin work is more on the Prop1 these days, I have a small paying job there - Hacked together a stepper driver and Modbus driver from the OBEX and customised them for a custom made PCB with eight tiny stepper motors attached to each board in a multi-drop RS485 network. The boards and motors are located in a rotary sealing head on a packaging machine. Two slip-rings for power and two for data.
Each stepper driver manages a pair of motors. So four cogs used for them alone.
Comments
Put a stub in there with SETREGS and call some HUBexec code from there?
Mike
No. The inline routine in Spin2_interpreter.spin2 could be modified by anyone to not copy to cog RAM, in theory anyway.
Incidentally, I notice that PNut_v37.zip has a file named instructions_future.txt with contents:
I'm still clueless sorry. I don't know anything about using XBYTE.
Chip's example code won't work because x is a hub variable not a cog register.
VAR variables are all in hubRAM but aren't the first 16 or so locals in cogRAM?
Spin2 CALL works this way.
I'm asking for a new feature from Pnut. What is starting from scratch?
I just tried that above. It seems to be broken. Not sure if it ever was working to be honest.
Re altd bug, couldn't existing code
be replaced with this?
@evanh
From the docs.
Try this
Sorry, I just noticed this thread. Haven't been to the forum in a week.
I like Eric's ORGH/END approach.
I will review all this tomorrow.
Unrelatedly, now that we have this spiffy constant override feature, can we get conditional assembly?
i.e.
I posted a new v38 at the top of this thread.
Bug fixed from v37 that didn't allow parent-object CON blocks to use CON symbols from child objects.
Bug fixed in interpreter which caused ROTXY/POLXY/XYPOL to not work.
REPEAT-var returned to original behavior where var = (final value +/- step) after REPEAT.
All DEBUG displays now use gamma-corrected alpha blending for anti-aliasing.
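To make the REPEAT-var note concrete, here is a small sketch of how I read it. The method name demo and the variable names are made up, and the final value in the comment assumes "final value" means the TO value of the loop. I have not checked it against v38 itself.

```spin2
' Sketch only: demo, after, i and total are hypothetical names.

PUB demo() : after | i, total
  total := 0
  repeat i from 1 to 5        ' i takes 1, 2, 3, 4, 5
    total += i                ' total ends up as 15
  after := i                  ' going by the v38 note: 5 + 1 = 6
```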
FYI - I think the 2 BAT files need to be updated to reflect V38.
Ah, thank you. I will fix that.
I posted a new v39 at the top of this thread.
Note that I downloaded about three newer versions over the original file that I posted earlier tonight. I kept finding things that needed fixing.
I posted a new version (40) at the top of this thread. It adds 'REPEAT count WITH variable' to simplify 0..n-1 loops.
Here is the new zip file:
https://drive.google.com/file/d/1ZgSfEjS3-qSZa82fd0xXUs7AnWBsAzjX/view?usp=sharing
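For reference, a minimal sketch of what the new v40 loop form looks like next to the older FROM/TO form. The method names are made up for illustration, and this assumes the variable simply counts 0..n-1 as the announcement describes.

```spin2
' Sketch only: sum_old and sum_new are hypothetical names.

PUB sum_old() : total | i
  repeat i from 0 to 9        ' older form: both bounds spelled out
    total += i

PUB sum_new() : total | i
  repeat 10 with i            ' v40 form: i runs 0..9
    total += i
```

Both methods should return 45 (the sum of 0 through 9). Return values start at zero in Spin2, so total needs no explicit initialisation here.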