The addressing conundrum
Seairth
Posts: 2,474
in Propeller 2
Since this topic has now spread across 3 or more other threads, I figured this deserved a thread of its own.
As things stand now (2015-09-29 FPGA Image):
* All instruction addressing is byte-oriented
* Cog and LUT instruction addresses must still be long-aligned
* Hub instruction addresses below $1000 must be offset by one byte
* Hub data addressing is byte-oriented
* Cog and LUT data addressing is long-oriented
* All ORGx directives are byte-oriented
* All #immediate labels are byte-oriented.
* Any instruction that operates on words or longs must divide the #immediate by 2 or 4 to get the correct value
* All @relative labels are long-oriented
* All other labels (those indicating a register) are long-oriented.
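To make the "divide the #immediate by 2 or 4" rule concrete, here is a minimal Python sketch of what the programmer currently has to do by hand. The helper name `immediate_for` and the `UNIT_SIZE` table are made up for illustration; they are not part of any Parallax tool.

```python
# Hypothetical model of the current rule: #immediate labels are byte
# addresses, so a word- or long-oriented operand needs the label scaled down.
UNIT_SIZE = {"byte": 1, "word": 2, "long": 4}

def immediate_for(label_byte_addr, unit):
    """Convert a byte-oriented label into the value a byte/word/long
    oriented instruction expects (assumed: plain division by unit size)."""
    size = UNIT_SIZE[unit]
    if label_byte_addr % size:
        raise ValueError(f"label ${label_byte_addr:X} is not {unit}-aligned")
    return label_byte_addr // size

# A label at byte address $40 is long index $10 and word index $20.
print(hex(immediate_for(0x40, "long")))
print(hex(immediate_for(0x40, "word")))
```

The alignment check is my own addition; it simply flags the cases where the division would silently drop bits.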
Now... how do we make this simpler?
Comments
But I really appreciate the summary.
Things aren't quite as confusing as the list seems at first glance... For example... we can easily remember that cog and LUT addresses are long because we don't have byte and word variants for RDLUT and WRLUT... I will probably never write a hub instruction below $1000 because that is where the gurus put stuff... # only has nine bits... as always. The most complicated ORG statement I have ever written is... ORG, so I'm going to have to study ORG again :) etc. etc.
I think the biggest problem isn't the rules... it is knowing exactly what the rules are.
As you have said... to glean this summary, you had to go to multiple threads. There are many issues similarly spread around.
One way to fix this "conundrum" would be to place active help in the user's tools... a dropdown cheat sheet. Or... you type in RDLUT... the tool hints at the addressing and alignment rule or links to a more complete description, etc., etc. A great place for imagination and creativity.
Addressing is important, but functionality trumps convenience in my mind.
What is fun to me... is something that really works well. Learning how to do that is always going to be somewhat of a pain.
At present, important conversations are so exhaustive and spread over so many threads that the final state of many issues is essentially unknown to most.
Until the rules are all in one concise document, it is almost impossible to do anything anyway:)
After we have that, then most of us will need simple, well documented snippets... which will flood in.
It shouldn't have a concept of bytes or longs or addressing associated with it.
The instructions really are located at the non-long-aligned address in memory, but you can't call this misaligned.
It's not that hard:
Hubexec can execute from any address, cog and lut code can execute only from long aligned addresses under $1000.
Every non-long aligned address always selects hubexec, also in the 0..$1000 range.
So you can use this fact to execute hubexec in the lower memory.
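As a sanity check on the rule above, here is a small Python model of how an address would select the execution mode. The function name and the constant name are mine; the $1000 boundary and the alignment test come from the description above.

```python
# Sketch of the selection rule: any address at or above $1000, or any
# address that is not long-aligned, runs as hubexec; everything else
# (long-aligned and below $1000) is cog/LUT exec.
HUB_ONLY_START = 0x1000  # boundary quoted in the post above

def exec_mode(byte_addr):
    """Return "hub" or "cog/LUT" for a byte-oriented instruction address."""
    if byte_addr >= HUB_ONLY_START or (byte_addr & 0b11) != 0:
        return "hub"
    return "cog/LUT"

for addr in (0x000, 0x001, 0x400, 0x401, 0xFFC, 0x1000):
    print(f"${addr:04X} -> {exec_mode(addr)}")
```

This is why "orgh 1" lands in hubexec even though it sits below $1000: the low two bits are non-zero.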
Andy
I think all the examples have only one "orgh" directive and they are all "orgh 1". I assume the "1" is the absolute address and not an offset command.
What do you do if you want to add a second "orgh" starting point? Seems like you have to count lines... Then you have to wonder what to do if the first one gets bigger, right?
BTW: I think "orgh 1" "orgh 2" and "orgh 3" would all work, it's just "orgh 0" that would try to run in a cog...
Exactly.
Also, virtually any program that has 508KB of code (512KB - $1000) is sure to have at least $1000 bytes of non-machine-code data (bytecode, tables, strings, arrays, buffers, etc.). Those first $1000 bytes certainly won't be wasted if they can't be hubexec'd from.
One way this could be handled is to have the ORGx without a number simply mean "switch context". That way, even if you have something like:
In the above example, hub offset will still be increasing in the ORG context so that when you hit the second ORGH, it switches the context back with the correct offset. The same rule would apply to ORG, though this would probably be less commonly used.
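A rough Python sketch of the bookkeeping this would need in the assembler, assuming two location counters that are both kept in bytes. The class and method names are invented for illustration and do not describe how any existing assembler works.

```python
class LocationCounters:
    """Hypothetical model of ORG/ORGH as context switches."""

    def __init__(self):
        self.hub_offset = 0   # bytes emitted into the hub image so far
        self.cog_addr = 0     # address inside the current ORG block
        self.context = "hub"

    def orgh(self, addr=None):
        # ORGH with no number only switches context; the hub offset is kept.
        self.context = "hub"
        if addr is not None:
            self.hub_offset = addr

    def org(self, addr=None):
        # ORG with no number only switches context; the cog address is kept.
        self.context = "cog"
        if addr is not None:
            self.cog_addr = addr

    def emit(self, nbytes):
        # Everything ends up in the hub image, so the hub offset always
        # advances, even while assembling in the ORG context.
        self.hub_offset += nbytes
        if self.context == "cog":
            self.cog_addr += nbytes

    def here(self):
        return self.hub_offset if self.context == "hub" else self.cog_addr

lc = LocationCounters()
lc.orgh(0x1000)
lc.emit(8)        # two longs of hub code
lc.org(0)
lc.emit(12)       # three longs of cog code
lc.orgh()         # switch back without a number
print(hex(lc.here()))
```

With this model nobody has to count lines: the second ORGH picks up at whatever hub offset the preceding blocks have reached.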
The PC steps by 1 in cog and LUT exec, and by 4 in hub exec. This would solve the problem of cog and LUT addresses being all spread out by 4x. It would also fix the #reg problem. It would mean that cog and LUT code would be assembled differently from hub code because of address reckoning. This would be like Prop1, but with hub exec code having spread out addresses. Hub code would not be compatible with cog and LUT code because of the assembled addresses.
All this would achieve, in reality, would be a simpler perspective for the user. It would take a tiny bit more logic. Maybe it's what is needed.
It would also get rid of the overlapped cog/LUT exec and hub exec address spaces. It would cancel the entire long-alignment conundrum, as well.
It's 32bit, it should all be 32bit reads, in fact your byte stuff is just masking a 32bit read, correct?
jmp foo
The way PASM currently handles:
jmp #foo>>2
Then the assembler can know how to pack bits into the instruction and the meaning of # is not altered.
I agree - the best place to start, is to clean up the most common usages, and Labels with Jumps/branches are fundamental ASM core operations.
Let the tools help.
Adding segments also allows the tools to check what users are doing, more than a simple ORG
ORG with a 'magic number' is more cryptic than an explicit segment name.
Other MCUs have explicit opcodes for byte and word data handling, which keeps things clear.
This is why SEGMENT makes more sense than a hard ORG.
SEGMENT can include any align that segment demands, and allows easy merge of code.
SEGMENT also allows a linker to be used, which is more common in a mix of ASM and HLL
This also makes a case for SEGMENT, and in some segments certain data types may be prohibited (this can be tool-checked)
ie if byte and word are allowed only in HUB, that means .byte & .word storage allocators would give an error if used in COG or LUT segments, but .long is legal in all
(maybe with an align variant )
jmp foo puts the value of foo into the S portion of the instruction and leaves the I flag off. This causes it to use the register at foo to get the address to jump to.
jmp #foo puts the value of foo into the S portion of the instruction and sets the I flag on. This causes it to jump to the address of foo itself.
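In Python terms, the difference between the two forms comes down to something like the sketch below. It models only the S field and the I flag as described above; no bit positions or real encodings are implied, and the function name is made up.

```python
def jmp_target(s_field, i_flag, registers):
    """Return where a JMP would go, given the S field and the I flag."""
    if i_flag:
        # jmp #foo: S holds the target address itself.
        return s_field
    # jmp foo: S names a register that holds the target address.
    return registers[s_field]

registers = {0x10: 0x123}   # register $10 contains $123 (example values)
print(hex(jmp_target(0x10, False, registers)))   # jmp foo
print(hex(jmp_target(0x10, True, registers)))    # jmp #foo
```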
I'm actually in favor of just having all execution be long aligned, so the PC doesn't need the lower 2 bits. We only need word and byte access for data to/from the hub.
Other Assemblers just use simple labels (no prefix) and relative is a tool-choice.
Users need not worry about if a relative jump, or absolute jump is encoded.
Simple and clean. ( @Rn is used by MSP430, 8051...)
Edit: Sorry. I'm wrong. The PDP-11 used (R1) to indicate indirect through R1. Maybe it was the PDP-10....
DJZ (etc.) do not have an immediate form. Only register or relative. I don't think you can use AUGx with relative addressing.
I would even be fine with having hub access always be long oriented, and just have instructions for reading the proper bytes or words from a long address. So, for example, RDBYTE D,S,N where N is what byte to read from the hub address at S and place into D. This would be similar to the get/setbyte and get/setword instructions. In fact, we could even do the nibble versions if we wanted.
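Here is a small Python model of what such a RDBYTE D,S,N might do, assuming little-endian byte order within a long and a hub modelled as a list of 32-bit values. The function names are mine, and the proposed instruction does not exist in any FPGA image that I know of.

```python
def get_byte(long_value, n):
    """Byte n (0..3) of a 32-bit long, little-endian (assumed)."""
    return (long_value >> (8 * n)) & 0xFF

def rdbyte(hub_longs, s, n):
    """Model of the proposed RDBYTE D,S,N: read byte n of the long at
    long address s and return it (the value that would land in D)."""
    return get_byte(hub_longs[s], n)

def scan_bytes(hub_longs, start_byte, count):
    """Walk a byte array with the long-oriented form. The byte address
    has to be split into a long address and a byte index on every step."""
    return [rdbyte(hub_longs, a >> 2, a & 3)
            for a in range(start_byte, start_byte + count)]

hub = [0x44332211, 0x88776655]
print([hex(b) for b in scan_bytes(hub, 2, 4)])
```

The split into `a >> 2` and `a & 3` is the extra work a byte scan would take on under this scheme.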
The more I think about this, the more I think it's the way to go. I hope Chip likes it.
Slight correction. There are three forms for JMP:
In this case, it's long offset. I think. As I mentioned in the prior comment, I don't know if the 20-bit relative address is long or byte offsets.
But how do you scan byte arrays with a 3-operand opcode?
Data access has always been the gotcha in all of this, it still needs to be byte-granular.
To me, the Silicon is quite good as it is, it just needs a tool-clean-up pass.
Wouldn't this destroy linear byte and word addressing? We'd have to maintain separate counters to cycle through bytes and longs.
I've been stalled out, trying to come to some resolution about how to rein this mess in, because moving forward on the current path is dismal.
Here is what looks best to me:
$00000..$001FF = cog exec (register addressing is 1:1, PC steps by 1)
$00200..$003FF = LUT exec (register addressing is 1:1, PC steps by 1)
$00400..$FFFFF = hub exec (PC steps by 4, relative D,@S (9-bit immediate) branches are shifted left twice)
This keeps the cogs simple and fun, like they are on Prop1, which is a necessity. It also gets rid of any impetus to make overlapped cog/LUT/hub execution spaces.
There's no way we ought to clutter up the assembly language with all kinds of operators to overcome the current 4:1 hub:cog/LUT addressing ratio. Making it 1:1 keeps us sane, happy, and free.
This takes a few gates to implement and it doesn't slow the chip down.
This would make the Prop2 just like the Prop1, but without any hub alignment requirements for longs and words, and with the pleasant addition of hub exec. It even gets rid of the notion that hub exec instructions ought to be long-aligned for any reason. Hub execution and data access all become dirt simple with no alignment caveats.
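For anyone who wants to play with the proposed map, here is a minimal Python sketch of it. The function names are invented, and reading "shifted left twice" as a shift by two bit positions (a multiply by 4) is my interpretation. It does not model what the offset is taken relative to.

```python
# Proposed address map from the post above.
COG_LAST = 0x001FF   # $00000..$001FF = cog exec
LUT_LAST = 0x003FF   # $00200..$003FF = LUT exec
HUB_LAST = 0xFFFFF   # $00400..$FFFFF = hub exec

def region(pc):
    if 0 <= pc <= COG_LAST:
        return "cog"
    if pc <= LUT_LAST:
        return "lut"
    if pc <= HUB_LAST:
        return "hub"
    raise ValueError(f"PC ${pc:X} is outside the 20-bit address space")

def next_pc(pc):
    """PC steps by 1 in cog/LUT exec and by 4 in hub exec."""
    return pc + (4 if region(pc) == "hub" else 1)

def scaled_branch_offset(pc, imm9):
    """Relative 9-bit immediates are scaled by 4 only in hub exec
    (assumed reading of "shifted left twice")."""
    return imm9 * 4 if region(pc) == "hub" else imm9

print(hex(next_pc(0x010)), hex(next_pc(0x400)))
print(scaled_branch_offset(0x010, -3), scaled_branch_offset(0x400, -3))
```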