hub RAM, hub exec mode, memory allocation,... help needed

bozo · 2022-05-16 07:11

Hi all, I am hoping someone can set me straight on a couple of P2 things - I've been spending too much time trying to work this out, it's driving me crazy, and I think I must be missing something really, really obvious ... my apologies in advance for the length of this post ...

In the Propeller Tool, when I hit F8 to look at the compile info I see that $0- $12AF (4784 bytes) is taken up with Interpreter, followed by code/data, etc. And I assume that the data in this window is exactly what gets loaded in hub RAM later when I hit F10/F11.

Then, after my code is loaded in hub RAM, cog0 loads the interpreter from hub RAM and starts executing the first SPIN method it comes to in my top level object. Since cog RAM is limited to 512 longs ($0-$1FF), the interpreter code can occupy no more than 2k bytes (512 longs x 4 bytes/long) in the cog, right?

Question 1. So why then does the F8 compile window info show 4784 bytes of interpreter? Is it an error or is there other stuff between $0 and $12B0 that's not related to the SPIN2 interpreter? These 4784 bytes of 'interpreter' aren't loaded into cog0 are they? Even if the cog's LUT RAM is included that's a total of 4096 bytes in the cog, which is still a bit shy of 4784.

The next problem I have is understanding the mechanism for hub execute mode and ORGH.

In my DAT block I have some PASM code; it starts with an ORGH directive and is launched via the SPIN2 statement coginit(HUBEXEC_NEW, @myAsmCode, @myParams)

Unrelated to this code, and under an earlier ORG directive in my DAT block are some reservations for variables that are unrelated to myAsmCode:

asmVar1        res       1
asmVar2        res       1
asmVar3        res       1
asmVar4        res       1

these reservations occupy $1F to $22 in cog0 by virtue of the ORG directive which precedes some other PASM code before these 4 res lines.

My ORGH directive and my PASM2 code myAsmCode comes after these res lines.

When the new cog is launched to run @myAsmCode in hub exec mode, am I right in concluding that the code is relocated from its original location (somewhere in the code/data area that the F8 window illustrated) to $400 in hub RAM? Or was it compiled directly to $400 when I hit F10 in the Propeller Tool and it is uploaded to $400 when the P2 reboots?

When I use debug in cog0 to take a peek
debug(uhex(@myAsmCode),uhex(myAsmCode))
the result is :
Cog0 @myAsmCode = $1334, myAsmCode = $FD64_2636
indicating that the PASM code is sitting in hub RAM at $1334 before I launch cog1 to execute it.

But if I look at the F8 compile info window the code at $1334 (or at $400 for that matter) looks nothing like the $FD64_2636 that the debug statement yielded. Why is that?

Then later, after cog1 is launched and is running myAsmCode from hub, debugs tell me

Cog1  INIT $0000_1334 $0000_150C jump
Cog1  @myAsmCode = $E5DD_0196, #myAsmCode = $400

which I assume means that the code is being executed from $1334 and that cog1 believes the address of the label myAsmCode is $400.

On top of this, it appears that cog1 also knows about the res lines I had in the DAT block, since further debugs in cog1 give me
Cog1 #asmVar1 = $1F, 0, #asmVar2 = $20, 25, #asmVar3 = $21, 75, #asmVar4 = $22, 0

I guess this is OK, since these addresses are in the cog RAM and are totally separate from the same locations in cog0 or any other cog.

Question 2. So, isn't $400 hub RAM right in the middle of the area where the interpreter was loaded at the last reboot (i.e. between $0 and $12AF)? Has $400 been overwritten by the PASM code? Is the interpreter saved somewhere else or was it overwritten by the PASM code when the coginit/hubexec was called?

I have more Qs but will leave them, depending on what guidance you can give me on the above.
Thanks for reading through this far

evanh · 2022-05-16 08:14

Probably the first thing to say is Spin code is not machine code, it's byte code. So it doesn't get executed directly by any cog. This theoretically allows the code to be run from anywhere, including straight from a file on an SD card. Not that Proptool actually supports that.

Interpreter and user program are bundled together in one binary. Interpreter is initially executed from $0 (cogexec) but quickly redeploys itself outside of cogRAM. Proptool's Spin interpreter is broken up into a few parts. There is a small amount in cogRAM but most of it lives in hubRAM I believe. Hubexec allows it to be natively executed by a cog directly from hubRAM.

Proptool's Spin memory map has most of cogRAM, and maybe all of lutRAM, available for user allocation. User Spin programs live entirely in hubRAM. CogRAM is for data and can be used for special assembly routines.

Flexspin has a different memory map and also compiles 100% to machine code, but still uses hubexec by default. It keeps a larger number of routines in cogRAM, so much less cogRAM/lutRAM is free for the user. There is no support for interrupts or the regload() function. Spin programs run faster, of course.

Pure Pasm-only assembled code is different again - Very simple memory map of machine code starting at $0 (cogexec). No supporting environment, everything is user space. Flexspin fully supports this pure-pasm build option.

evanh · 2022-05-16 08:29

Most printed addresses will be hubRAM addresses.

Even for cogexec code, the code is loaded into cogRAM only when it is to be executed. It is stored in hubRAM and it'll be the hubRAM copy of the code that the print routine is accessing.

Consider hubRAM as main memory. CogRAM is the processor's general register set. It's just a really large register set that can also be an instruction cache ... sort of.

PS: Attaching an example program would help when referring to specific cases.

Ariba · 2022-05-16 11:05

@bozo said:
Hi all, I am hoping someone can set me straight on a couple of P2 things - I've been spending too much time trying to work this out, it's driving me crazy, and I think I must be missing something really, really obvious ... my apologies in advance for the length of this post ...

You aren't missing something obvious; it is just not obvious that Chip's compiler does not generate absolute addresses in DAT sections. This makes hubexec code really hard to use together with Spin code.

In the Propeller Tool, when I hit F8 to look at the compile info I see that $0- $12AF (4784 bytes) is taken up with Interpreter, followed by code/data, etc. And I assume that the data in this window is exactly what gets loaded in hub RAM later when I hit F10/F11.

Yes, this data goes into hub RAM as shown.

Then, after my code is loaded in hub RAM, cog0 loads the interpreter from hub RAM and starts executing the first SPIN method it comes to in my top level object. Since cog RAM is limited to 512 longs ($0-$1FF), the interpreter code can occupy no more than 2k bytes (512 longs x 4 bytes/long) in the cog, right?

No, a cog can also execute from LUT RAM, and most of the Spin interpreter routines live in LUT RAM. This is another 2k in hub that gets loaded into LUT RAM. The remainder of the 4784 bytes (4784 - 4096 = 688 bytes) may be some tables or other data, or maybe some hubexec code for the interpreter. Hope this answers your question 1:

Question 1. So why then does the F8 compile window info show 4784 bytes of interpreter? Is it an error or is there other stuff between $0 and $12B0 that's not related to the SPIN2 interpreter? These 4784 bytes of 'interpreter' aren't loaded into cog0 are they? Even if the cog's LUT RAM is included that's a total of 4096 bytes in the cog, which is still a bit shy of 4784.

The next problem I have is understanding the mechanism for hub execute mode and ORGH.

In my DAT block I have some PASM code; it starts with an ORGH directive and is launched via the SPIN2 statement coginit(HUBEXEC_NEW, @myAsmCode, @myParams)

Unrelated to this code, and under an earlier ORG directive in my DAT block are some reservations for variables that are unrelated to myAsmCode:
asmVar1        res       1
asmVar2        res       1
asmVar3        res       1
asmVar4        res       1
these reservations occupy $1F to $22 in cog0 by virtue of the ORG directive which precedes some other PASM code before these 4 res lines.

My ORGH directive and my PASM2 code myAsmCode comes after these res lines.

When the new cog is launched to run @myAsmCode in hub exec mode, am I right in concluding that the code is relocated from its original location (somewhere in the code/data area that the F8 window illustrated) to $400 in hub RAM? Or was it compiled directly to $400 when I hit F10 in the Propeller Tool and it is uploaded to $400 when the P2 reboots?

It's neither relocated nor compiled directly into hub RAM at $400; it just lands in hub RAM at the next free address when the compiler assembles the code for the object. ORGH (without an address parameter) just tells the compiler to generate code as if it started at $400 (to force hubexec code). You can also specify another address >$400, but the code will not be relocated to that address if the compiler is in Spin mode. It's different if you compile only PASM code without any Spin. So with Spin and hubexec you have to write the hubexec code using only relative addressing, so that the absolute addresses do not matter.
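
A minimal sketch of what "only relative addressing" can look like in practice. It assumes an LED on pin 56 (the pin used in the pure-PASM example elsewhere in this thread); the names main, blink and delay are made up for illustration and are not taken from bozo's program. The branch is written without the "\" absolute override, so the assembler can encode it relative to the program counter, and the loop should then run wherever the object lands in hub RAM:

```spin2
CON
  _clkfreq = 160_000_000
  delay    = _clkfreq / 2

PUB main()
  ' @blink is the real hub address of the code, resolved by Spin2 at run time
  coginit(HUBEXEC_NEW, @blink, 0)
  repeat                                    ' keep cog0 busy; the new cog runs on its own

DAT
                orgh                        ' assemble as if at $400 (forces hubexec)

blink           drvnot  #56                 ' toggle the LED pin
                waitx   ##delay             ' wait about half a second
                jmp     #blink              ' relative branch, so the load address does not matter
```

Writing jmp #\blink instead would force the absolute address the assembler assumed ($400 here), which is not where the code actually sits.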

When I use debug in cog0 to take a peek
debug(uhex(@myAsmCode),uhex(myAsmCode))
the result is :
Cog0 @myAsmCode = $1334, myAsmCode = $FD64_2636
indicating that the PASM code is sitting in hub RAM at $1334 before I launch cog1 to execute it.

But if I look at the F8 compile info window the code at $1334 (or at $400 for that matter) looks nothing like the $FD64_2636 that the debug statement yielded. Why is that?

That should work if the debug commands are in Spin code. Spin knows the real absolute addresses. It's hard to tell why this does not match without seeing your code.

Then later, after cog1 is launched and is running myAsmCode from hub, debugs tell me
Cog1  INIT $0000_1334 $0000_150C jump
Cog1  @myAsmCode = $E5DD_0196, #myAsmCode = $400
which I assume means that the code is being executed from $1334 and that cog1 believes the address of the label myAsmCode is $400.

Exactly, the real location and the address it is compiled for just don't match. Chip says this is not easy to correct, because every object is compiled separately and does not know at which address it will end up. I say this should be possible with another compiler pass. I mean, you can do it by hand if you just write ORGH $1334.
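
For data, one way to sidestep the mismatch is to let Spin resolve the address and hand it over in coginit's third parameter, which arrives in PTRA of the new cog (the $0000_150C in the INIT line above is such a value). The following is only a sketch under that assumption; entry, counter and delay are hypothetical names, PA is used as scratch simply because no cog registers are reserved here, and the debug line needs a debug-enabled build:

```spin2
CON
  _clkfreq = 160_000_000
  delay    = _clkfreq / 2

PUB main()
  ' Spin2 resolves @counter to its real hub address and passes it in PTRA
  coginit(HUBEXEC_NEW, @entry, @counter)
  repeat
    debug(udec(counter))                    ' watch the value change from cog0
    waitms(500)

DAT
                orgh                        ' labels below are assembled as if at $400+

entry           rdlong  pa, ptra            ' PTRA holds the real address of counter
                add     pa, #1
                wrlong  pa, ptra            ' write the incremented value back
                waitx   ##delay
                jmp     #entry              ' relative branch

counter         long    0
```

The PASM code never uses the label counter as an address, so the assembled address (somewhere above $400) never comes into play.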

On top of this, it appears that cog1 also knows about the res lines I had in the DAT block, since further debugs in cog1 give me
Cog1 #asmVar1 = $1F, 0, #asmVar2 = $20, 25, #asmVar3 = $21, 75, #asmVar4 = $22, 0

I guess this is OK, since these addresses are in the cog RAM and are totally separate from the same locations in cog0 or any other cog.

Question 2. So, isn't $400 hub RAM right in the middle of the area where the interpreter was loaded at the last reboot (i.e. between $0 and $12AF)? Has $400 been overwritten by the PASM code? Is the interpreter saved somewhere else or was it overwritten by the PASM code when the coginit/hubexec was called?

I think this should be clear now. It's not overwritten; the code is not at $400, it's just compiled as if it were there, but it is really at $1334 in hub RAM.
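
A small test program along these lines should reproduce the two different addresses side by side. It is a sketch, not bozo's actual code: it assumes a debug-enabled build and reuses the label name myAsmCode from the question, while main and .hold are made up.

```spin2
CON
  _clkfreq = 160_000_000

PUB main()
  debug(uhex(@myAsmCode))                   ' real hub address, known to Spin2 at run time
  coginit(HUBEXEC_NEW, @myAsmCode, 0)
  repeat

DAT
                orgh

myAsmCode
                debug(uhex(#myAsmCode))     ' address the assembler assumed for the label
.hold           jmp     #.hold              ' park the cog
```

With the numbers from this thread, the Spin line would report something like $1334 and the PASM line $400, though the first value depends on what else is in the build.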

I have more Qs but will leave them, depending on what guidance you can give me on the above.
Thanks for reading through this far

Andy

pik33 · 2022-05-16 12:09

The P2 memory map is:

$000..$1FF - COG
$200..$3FF - LUT

These are addresses in longs

$400-$7FFFF - HUB. Now the addresses are in bytes. And this means you can not execute the code from first 1024 bytes in the hub.

When a P2 starts, it loads the code to the hub and then starts the cog #0 from the hub address 0.
The result is you have the first 496 longs (= $7C0 bytes) in the cog and all the code in the hub. The cog can now load code from anywhere into LUT for execution, or even start executing from the hub by jumping to an address >= $400.
This means the interpreter code size is not in any way restricted to the cog or cog+lut size.

If you can use FlexProp, things are simpler there. There is no interpreter; the code is compiled directly to asm and a listing is generated so you can see what is there and where it is, but the rules are the same. The program is loaded into hub, then cog #0 is initialized and starts executing. After initialization it jumps to and executes the machine code from the hub.

evanh · 2022-05-16 13:01

That's code space. Data space is 3 separate maps.

macca · 2022-05-16 13:58

@bozo said:
In the Propeller Tool, when I hit F8 to look at the compile info I see that $0- $12AF (4784 bytes) is taken up with Interpreter, followed by code/data, etc. And I assume that the data in this window is exactly what gets loaded in hub RAM later when I hit F10/F11.

Yes, however if you have debug mode enabled, the data will get corrupted and you won't see what is actually loaded (this answers one of your later questions), so disable debug before looking at it.

Then, after my code is loaded in hub RAM, cog0 loads the interpreter from hub RAM and starts executing the first SPIN method it comes to in my top level object. Since cog RAM is limited to 512 longs ($0-$1FF), the interpreter code can occupy no more than 2k bytes (512 longs x 4 bytes/long) in the cog, right?

Question 1. So why then does the F8 compile window info show 4784 bytes of interpreter? Is it an error or is there other stuff between $0 and $12B0 that's not related to the SPIN2 interpreter? These 4784 bytes of 'interpreter' aren't loaded into cog0 are they? Even if the cog's LUT RAM is included that's a total of 4096 bytes in the cog, which is still a bit shy of 4784.

Because the interpreter uses cog, LUT and even hub-exec code, the overall size is larger than 2048 bytes: a portion runs in cog RAM (loaded at boot time), a portion in LUT RAM (loaded by the interpreter itself) and, as needed, the hub-exec code. The interpreter source code is available from the PNut package (not Propeller Tool) if you want to look at it.

When the new cog is launched to run @myAsmCode in hub exec mode, am I right in concluding that the code is relocated from its original location (somewhere in the code/data area that the F8 window illustrated) to $400 in hub RAM? Or was it compiled directly to $400 when I hit F10 in the Propeller Tool and it is uploaded to $400 when the P2 reboots?

No, the code resides where the compiler places it, it is never physically relocated.

When I use debug in cog0 to take a peek
debug(uhex(@myAsmCode),uhex(myAsmCode))
the result is :
Cog0 @myAsmCode = $1334, myAsmCode = $FD64_2636
indicating that the PASM code is sitting in hub RAM at $1334 before I launch cog1 to execute it.

But if I look at the F8 compile info window the code at $1334 (or at $400 for that matter) looks nothing like the $FD64_2636 that the debug statement yielded. Why is that?

See the note above about the debug mode and the info window.

Then later, after cog1 is launched and is running myAsmCode from hub, debugs tell me
Cog1  INIT $0000_1334 $0000_150C jump
Cog1  @myAsmCode = $E5DD_0196, #myAsmCode = $400
which I assume means that the code is being executed from $1334 and that cog1 believes the address of the label myAsmCode is $400.

Because the compiler generates a "fake" address starting from $400 for hub-execution code.
The rule of thumb is that any code at an address >= $400 is hub-exec code. However, when dealing with Spin code the compiler doesn't know exactly where the code resides in memory; it only knows the object-relative address, so it has to "trick" itself into believing that the hub-exec code resides at an address >= $400. This is why you see that difference. It is no different from the org $000 used to run cog code.
It is different when you have only PASM code: in that case it knows where the code resides and you can do something like this:

CON
    _clkfreq = 160_000_000
    delay    = _clkfreq / 2

DAT

                org   $000

start
                asmclk                      ' set clock
                jmp     #@main              ' jump to hub program

ct              res     1                   ' variables are in cog ram

'
' HUB Program
'
                orgh    $400

main
                getct   ct                  ' get current timer
.loop           drvnot  #56                 ' toggle output
                addct1  ct, ##delay         ' set delay to timer 1
                waitct1                     ' wait for timer 1 expire
                jmp     #.loop

I have more Qs but will leave them, depending on what guidance you can give me on the above.
Thanks for reading through this far

Hope this helps.

Wuerfel_21 · 2022-05-16 14:51

@macca said:

@bozo said:
In the Propeller Tool, when I hit F8 to look at the compile info I see that $0- $12AF (4784 bytes) is taken up with Interpreter, followed by code/data, etc. And I assume that the data in this window is exactly what gets loaded in hub RAM later when I hit F10/F11.

Yes, however if you have debug mode enabled, the data will get corrupted and you won't see what is actually loaded (this answers one of your later questions), so disable debug before looking at it.

It doesn't really "corrupt" anything per se (at least it shouldn't but idk if proptool is just busted), it just shifts all the addresses forward because the debug blob is prepended onto the program. On startup it relocates itself into debug RAM, locks that and then relocates the regular program to zero.

macca · 2022-05-16 15:01

@Wuerfel_21 said:

It doesn't really "corrupt" anything per se (at least it shouldn't but idk if proptool is just busted), it just shifts all the addresses forward because the debug blob is prepended onto the program. On startup it relocates itself into debug RAM, locks that and then relocates the regular program to zero.

I don't know if the data are corrupted or just picks something else because there is the debugger code at the beginning (probable, who knows the internals?), but the result is that "what you see is not what you get"

See this for reference:
https://forums.parallax.com/discussion/173219/propeller-tool-debug-statements-weirdness

And AFAIK, it wasn't fixed in PropTools 2.6.

evanh · 2022-05-16 15:06

Ah, that's pure-Pasm only. I doubt Proptool was tested with debug and without Spin. I've not tried to do that even in Flexspin, but then I use my own cog resident debug routines at that point.

Wuerfel_21 · 2022-05-16 15:24

@macca said:

I don't know if the data are corrupted or just picks something else because there is the debugger code at the beginning (probable, who knows the internals?), but the result is that "what you see is not what you get"

See this for reference:
https://forums.parallax.com/discussion/173219/propeller-tool-debug-statements-weirdness

And AFAIK, it wasn't fixed in PropTools 2.6.

Yeah, ok, I checked, PropTool is just busted. (cuts off displayed data due to not taking debug blob into account?)

evanh · 2022-05-16 15:32

@Wuerfel_21 said:
Yeah, ok, I checked, PropTool is just busted. (cuts off displayed data due to not taking debug blob into account?)

Is it just that Info window you guys are talking about? Does the runtime actually work with debug anyway?

BTW: That's not the only problem with that Info window, the clock mode info is broken too. And copy'n'paste don't work either.

JonnyMac · 2022-05-16 15:33

Yeah, ok, I checked, PropTool is just busted. (cuts off displayed data due to not taking debug blob into account?)

Does @"Jeff Martin" know about this?

evanh · 2022-05-16 15:37

@JonnyMac said:

Yeah, ok, I checked, PropTool is just busted. (cuts off displayed data due to not taking debug blob into account?)

Does @"Jeff Martin" know about this?

I've raised the lack of ability to dump the hex data to a file. I very rarely use Proptool, it can't connect to the hardware for me. Only use it to compare binaries against Pnut when there is some reason not to use Flexspin.

msrobots · 2022-05-16 21:18

As usual @Wuerfel_21 gave the right answer about debug and relocation, and as usual one needs to think the Chip way to understand what is happening.

And also as usual, when all else fails one needs to read the source, which is not easy with Chip's love for two-letter names.

Debug was added later and Chip is mostly/often looking for shortcuts.

Sometimes it is not easy to wrap your mind around Chip's thinking, but the compiled Spin binary is the same with or without debug.

The exception is that with debug enabled, a binary blob (the compiled debug part) simply gets added in front of the compiled Spin part. At startup with debug, the debug stuff starts in COG0, moves the compiled debug block into upper HUB RAM, moves the rest of the binary data (the Spin block) back to HUB address zero, enables debug, and then restarts cog0 with the compiled Spin program.

The reason for that might be that Chip can change the size of the debug binary without changing the compilation of the Spin part.

So with debug enabled two relocations are done: the debug binary gets relocated to the top and the main block back to 0. That is the basic idea, but it had flaws that Chip worked around to make it work.

All debug command strings are also compiled into the debugger, so they do not need space in the compiled Spin blob. So the debug binary changes depending on the parameters of the debug commands.

Proptool shows the binary HEX as it gets loaded, not the memory view after the event. That might need to be changed by adding a second memory HEX view:

one showing the binary as it gets loaded, like in Spin1, and a second one showing the real memory map after relocation, with the program at HUB 0 where it ends up and the debug binary at the top of HUB RAM where it ends up after the debug start.

That might lessen the confusion.

One bad thing I still find with Chip's Spin2 is that you can not really do absolute addressing with fixed addresses in your code, say for a screen buffer or such; you always need to ask Spin for the absolute address at runtime.

Spin1 has the constants _STACK and _FREE, allowing you to tell the compiler that you want to reserve x longs at the top of HUB, but in Spin2 this seems not to be implemented. I did ask Chip a while ago and he was not sure either.

That would allow you to at least have some HUB RAM to address directly.

Enjoy!

Mike

msrobots · 2022-05-16 21:31

And the same actually happens when loading to flash: a flash loader binary blob gets added in front of the compiled Spin block, gets executed at HUB 0 to do the job, and then overwrites itself so the program ends up at HUB and flash address 0.

The only absolute addresses in HUB you can currently use are those below $400. You can not execute code there (execution would jump to COG or LUT), but you can access them as HUB memory with wrxxx and rdxxx using absolute addressing.

I hope I could add a little bit more light to the foggy memory.

One nice thing of spin2 being fully relocatable and use just relative addressing is that it basically would allow to add and remove spin methods at runtime, loaded from sd or external ram, but as far as I know nobody tried that so far.

Mike

Jeff Martin · 2022-05-17 15:25

Thank you, @bozo for reporting this, @Wuerfel_21 and @msrobots for your explanations, everyone else who helped too, and also @JonnyMac for alerting me to this.

I should have known this (and kind of did) but either way, I wasn't thinking about the effect it'd have on the info display when I incorporated the debug-capable P2 compiler into the Propeller Tool. I didn't know this problem existed in the display.

At this moment I don't know exactly what solution I'll settle on, but it's important for the Propeller Tool to visually demonstrate (in the hex view) the memory image as it will be at the moment the user code begins to run. That means it must (by default) show the Spin2 Interpreter, user code, and data at the start of the image and indicate the debugger is at the end. I may also provide a toggle button to see the to-be-downloaded image, and I might as well include the flash loader there as well. The latter enhancements would require some rearranging or extra dynamic layout on that info display.

There are also some known bugs or missing support in the info display as it relates to P2 that I need to resolve.

Wuerfel_21 · 2022-05-17 15:41

@"Jeff Martin"

While we're at it, that issue I brought up on Zoom that one time but couldn't reproduce, wherein the cog address of a label would not be correctly displayed in the status bar? I figured that out.

The issue is twofold:

the cog address displayed is in units of bytes. I think it should be longs
addresses larger than $1FF (in bytes) display as "N/A" (so everything past the first quarter of cog ram, including LUT)

Jeff Martin · 2022-05-17 15:49

@Wuerfel_21 said:
While we're at it, that issue I brought up on Zoom that one time but couldn't reproduce, wherein the cog address of a label would not be correctly displayed in the status bar? I figured that out.

The issue is twofold:

the cog address displayed is in units of bytes. I think it should be longs

addresses larger than $1FF (in bytes) display as "N/A" (so everything past the first quarter of cog ram, including LUT)

Oh yes, great! Indeed, it should be in longs. If I understand correctly, part of the second item will be fixed by displaying longs instead of bytes, but needs to also be enhanced to display through LUT memory addresses.
